Improving OCR on Low-Quality Documents with AuraSR-v2 and MiniCPM-V 2.6

Name: Improving OCR on Low-Quality Documents with AuraSR-v2 and MiniCPM-V 2.6
Uploaded: 2024-08-10T07:30:32+00:00
Channel: TheAILearner
Description: Welcome, fellow learners! In this video, we'll explore how to combine two newly released open-source models to achieve better OCR results on low-quality...

TheAILearner · Beginner ·📰 AI News & Updates ·1y ago

Skills: Multimodal LLMs90%CV Basics80%Modern CV Models70%

Welcome, fellow learners! In this video, we'll explore how to combine two newly released open-source models to achieve better OCR results on low-quality scanned documents. The first model, AuraSR, is a GAN-based super-resolution model that enhances the quality of scanned document images. The second model is MiniCPM-V 2.6, a recently released multimodal LLM, which we'll use to extract text from the upscaled document images. Notebook - https://colab.research.google.com/drive/11_0W59kZBoSf7aSeB_tc-SAX06kMu-xX?usp=sharing MiniCPM-V 2.6 - https://huggingface.co/openbmb/MiniCPM-V-2_6 AuraSR-v2 - https://huggingface.co/fal/AuraSR-v2 #ocr #superresolution #aurasr #minicpm #documentscanning #machinelearning #deeplearning #opensource #imageprocessing #gan #llm #generativeadversarialnetworks #lowqualityimages #4xresolution

Watch on YouTube ↗ (saves to browser)