Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,346 lessons
Skills in this topic
CV Basics
beginner
Classify images with a pre-trained CNN
Modern CV Models
intermediate
Run YOLO for real-time object detection
Generative CV
advanced
Build a Stable Diffusion inference pipeline

Showing 225 reads from curated sources

OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago
Watershed Segmentation Using OpenCV
Explore the elegant intersection of nature-inspired algorithms and computer vision. This comprehensive technical guide unveils the powerful watershed segmentati
BAIR Blog 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 4mo ago
Information-Driven Design of Imaging Systems
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago
Enhancing Images: Adaptive Shadow Correction Using OpenCV
In this blog post, we'll tackle this challenge head-on with a practical approach to shadow correction using OpenCV. Our method leverages Multi-Scale Retinex (MS
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago
Smart Document Scanning with Live OCR using OpenCV.js
This blog explores how to build a smart, browser-based document scanner using OpenCV.js and live OCR. It covers document detection, perspective correction, inte
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago
OpenCV G-API: From Imperative to Declarative Pipelines
Explore OpenCV G-API and how it transforms image-processing pipelines from imperative to declarative with graph-based execution.
Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 5mo ago
Run FLUX.2 on Replicate
FLUX.2 brings professional-grade image generation and editing with unprecedented detail, multi-reference support, and enterprise efficiency.
DeepMind Blog 👁️ Computer Vision ⚡ AI Lesson 6mo ago
Teaching AI to see the world more like we do
Our new paper analyzes the important ways AI systems organize the visual world differently from humans.
DeepMind Blog 👁️ Computer Vision ⚡ AI Lesson 6mo ago
Image editing in Gemini just got a major upgrade
Transform images in amazing new ways with updated native image editing in the Gemini app.
Fast.ai Blog 👁️ Computer Vision ⚡ AI Lesson 7mo ago
How to Solve it With Code course now available
tl/dr: This is a copy of a one-off email I sent to all fast.ai forum users, with a long-overdue update. I had planned to send this email a year ago to let you k
Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 8mo ago
Which image editing model should I use?
Here is the ultimate comparison post on all the latest image editing models.
OpenAI News 👁️ Computer Vision ⚡ AI Lesson 8mo ago
Outbound coordinated vulnerability disclosure policy
GitHub Engineering 👁️ Computer Vision ⚡ AI Lesson 8mo ago
Post-quantum security for SSH access on GitHub
GitHub is introducing post-quantum secure key exchange methods for SSH access to better protect Git data in transit.
Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 10mo ago
Generate consistent characters
We compare the best image models for generating consistent characters from a single reference image.
BAIR Blog 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 10mo ago
Whole-Body Conditioned Egocentric Video Prediction
OpenAI News 👁️ Computer Vision ⚡ AI Lesson 1y ago
Introducing our latest image generation model in the API
Our latest image generation model is now available in the API via ‘gpt-image-1’—enabling developers and businesses to build professional-grade, customizable vis
OpenAI News 👁️ Computer Vision ⚡ AI Lesson 1y ago
Thinking with images
OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.
Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 1y ago
Replicate Intelligence #2
Faster image generation, AI-powered world simulator, insights on AI dataset complexity
Weaviate Blog 👁️ Computer Vision ⚡ AI Lesson 2y ago
Using Weaviate to Find Waldo
Dive into using Weaviate for image recognition to find the "needle in a haystack"!
Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago
A Dive into Text-to-Video Models
Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago
Universal Image Segmentation with Mask2Former and OneFormer
Weaviate Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago
How to build an Image Search Application with Weaviate
Learn how to build an image search application using the Img2vec-neural module in Weaviate.
Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago
Image Classification with AutoTrain
Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago
Automating image collection
Using CLIP and LAION5B to collect thousands of captioned images.
Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 5y ago
Weight Banding
Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.
Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 5y ago
High-Low Frequency Detectors
A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.
Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6y ago
An Overview of Early Vision in InceptionV1
An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'
Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6y ago
A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Two Examples of Useful, Non-Robust Features
Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 7y ago
Object Detection Part 4: Fast Detection Models
In Part 3, we have reviewed models in the R-CNN family. All of them are region-based object detection algorithms. They can achieve hig
Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago
Object Detection for Dummies Part 3: R-CNN Family
[Updated on 2018-12-20: Remove YOLO here. Part 4 will cover multiple fast object detection algorithms, including YOLO.] [Updated on 2018-12-27: Add bbox regress
Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago
Object Detection for Dummies Part 2: CNN, DPM and Overfeat
Part 1 of the “Object Detection for Dummies” series introduced: (1) the concept of image gradient vector and how HOG algorithm summarizes the inform
Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 8y ago
Feature Visualization
How neural networks build up their understanding of images
Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago
Object Detection for Dummies Part 1: Gradient Vector, HOG, and SS
I’ve never worked in the field of
OpenAI News 👁️ Computer Vision ⚡ AI Lesson 8y ago
Robust adversarial inputs
We’ve created images that reliably fool neural network classifiers when viewed from varied scales and perspectives. This challenges a claim from last week that