Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,346 lessons

Skills in this topic

3 skills

Classify images with a pre-trained CNN

Modern CV Models

Run YOLO for real-time object detection

Build a Stable Diffusion inference pipeline

Videos 1,121 Reads 225

Showing 225 reads from curated sources

OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago

Watershed Segmentation Using OpenCV

Explore the elegant intersection of nature-inspired algorithms and computer vision. This comprehensive technical guide unveils the powerful watershed segmentati

BAIR Blog 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 4mo ago

Information-Driven Design of Imaging Systems

OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago

Enhancing Images: Adaptive Shadow Correction Using OpenCV

In this blog post, we'll tackle this challenge head-on with a practical approach to shadow correction using OpenCV. Our method leverages Multi-Scale Retinex (MS

OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago

Smart Document Scanning with Live OCR using OpenCV.js

This blog explores how to build a smart, browser-based document scanner using OpenCV.js and live OCR. It covers document detection, perspective correction, inte

OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 4mo ago

OpenCV G-API: From Imperative to Declarative Pipelines

Explore OpenCV G-API and how it transforms image-processing pipelines from imperative to declarative with graph-based execution.

Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 5mo ago

Run FLUX.2 on Replicate

FLUX.2 brings professional-grade image generation and editing with unprecedented detail, multi-reference support, and enterprise efficiency.

DeepMind Blog 👁️ Computer Vision ⚡ AI Lesson 6mo ago

Teaching AI to see the world more like we do

Our new paper analyzes the important ways AI systems organize the visual world differently from humans.

DeepMind Blog 👁️ Computer Vision ⚡ AI Lesson 6mo ago

Image editing in Gemini just got a major upgrade

Transform images in amazing new ways with updated native image editing in the Gemini app.

Fast.ai Blog 👁️ Computer Vision ⚡ AI Lesson 7mo ago

How to Solve it With Code course now available

tl/dr: This is a copy of a one-off email I sent to all fast.ai forum users, with a long-overdue update. I had planned to send this email a year ago to let you k

Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 8mo ago

Which image editing model should I use?

Here is the ultimate comparison post on all the latest image editing models.

OpenAI News 👁️ Computer Vision ⚡ AI Lesson 8mo ago

Outbound coordinated vulnerability disclosure policy

GitHub Engineering 👁️ Computer Vision ⚡ AI Lesson 8mo ago

Post-quantum security for SSH access on GitHub

GitHub is introducing post-quantum secure key exchange methods for SSH access to better protect Git data in transit.

Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 10mo ago

Generate consistent characters

We compare the best image models for generating consistent characters from a single reference image.

BAIR Blog 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 10mo ago

Whole-Body Conditioned Egocentric Video Prediction

OpenAI News 👁️ Computer Vision ⚡ AI Lesson 1y ago

Introducing our latest image generation model in the API

Our latest image generation model is now available in the API via ‘gpt-image-1’—enabling developers and businesses to build professional-grade, customizable vis

OpenAI News 👁️ Computer Vision ⚡ AI Lesson 1y ago

Thinking with images

OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.

Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 1y ago

Replicate Intelligence #2

Faster image generation, AI-powered world simulator, insights on AI dataset complexity

Weaviate Blog 👁️ Computer Vision ⚡ AI Lesson 2y ago

Using Weaviate to Find Waldo

Dive into using Weaviate for image recognition to find the "needle in a haystack"!

Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago

A Dive into Text-to-Video Models

Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago

Universal Image Segmentation with Mask2Former and OneFormer

Weaviate Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago

How to build an Image Search Application with Weaviate

Learn how to build an image search application using the Img2vec-neural module in Weaviate.

Hugging Face Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago

Image Classification with AutoTrain

Replicate Blog 👁️ Computer Vision ⚡ AI Lesson 3y ago

Automating image collection

Using CLIP and LAION5B to collect thousands of captioned images.

Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 5y ago

Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.

Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 5y ago

High-Low Frequency Detectors

A family of early-vision neurons reacting to directional transitions from high to low spatial frequency.

Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6y ago

An Overview of Early Vision in InceptionV1

An overview of all the neurons in the first five layers of InceptionV1, organized into a taxonomy of 'neuron groups.'

Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 6y ago

A Discussion of 'Adversarial Examples Are Not Bugs, They Are Features': Two Examples of Useful, Non-Robust Features

Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 7y ago

Object Detection Part 4: Fast Detection Models

In Part 3, we reviewed models in the R-CNN family. All of them are region-based object detection algorithms. They can achieve hig

Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago

Object Detection for Dummies Part 3: R-CNN Family

[Updated on 2018-12-20: Remove YOLO here. Part 4 will cover multiple fast object detection algorithms, including YOLO.] [Updated on 2018-12-27: Add bbox regress

Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago

Object Detection for Dummies Part 2: CNN, DPM and Overfeat

Part 1 of the “Object Detection for Dummies” series introduced: (1) the concept of image gradient vector and how HOG algorithm summarizes the inform

Distill.pub 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 8y ago

Feature Visualization

How neural networks build up their understanding of images

Lilian Weng's Blog 👁️ Computer Vision ⚡ AI Lesson 8y ago

Object Detection for Dummies Part 1: Gradient Vector, HOG, and SS

I’ve never worked in the field of

OpenAI News 👁️ Computer Vision ⚡ AI Lesson 8y ago

Robust adversarial inputs

We’ve created images that reliably fool neural network classifiers when viewed from varied scales and perspectives. This challenges a claim from last week that