Improving OCR on Low-Quality Documents with AuraSR-v2 and MiniCPM-V 2.6
Welcome, fellow learners! In this video, we'll explore how to combine two newly released open-source models to achieve better OCR results on low-quality scanned documents. The first model, AuraSR, is a GAN-based super-resolution model that enhances the quality of scanned document images. The second model is MiniCPM-V 2.6, a recently released multimodal LLM, which we'll use to extract text from the upscaled document images.
Notebook - https://colab.research.google.com/drive/11_0W59kZBoSf7aSeB_tc-SAX06kMu-xX?usp=sharing
MiniCPM-V 2.6 - https://huggingface.co/openbmb/MiniCPM-V-2_6
AuraSR-v2 - https://huggingface.co/fal/AuraSR-v2
#ocr #superresolution #aurasr #minicpm #documentscanning #machinelearning #deeplearning #opensource #imageprocessing #gan #llm #generativeadversarialnetworks #lowqualityimages #4xresolution
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
More on: Multimodal LLMs
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
SpaceX–Cursor: The Layer You Don’t Control Owns Your Future in AI
Dev.to AI
From Rainforests to Recycling Plants: 5 Ways NVIDIA AI Is Protecting the Planet
NVIDIA AI Blog
Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.
Dev.to AI
AI Layoffs Are Building a Prisoner’s Dilemma That Could Collapse the Global Economy
Medium · AI
🎓
Tutor Explanation
DeepCamp AI