Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

2,373 lessons

Skills in this topic

3 skills

Classify images with a pre-trained CNN

Modern CV Models

Run YOLO for real-time object detection

Build a Stable Diffusion inference pipeline

Videos 1,145 Reads 1,228

Building MCP Servers with LangChain in Python

Computer Vision

Muhammad Moin Intermediate 1y ago

Drowsiness Detection with Vision AI | Improve Safety with AI

Computer Vision

Roboflow Intermediate 1y ago

Multimodal Open Source at Kyutai, From Online Demos to On-Device - Alexandre Défossez

Computer Vision

PyTorch Intermediate 1y ago

MedGemma LLM: Doctors, Meet Your AI Assistant 🧠

Computer Vision ⚡ AI Lesson

AI Anytime Intermediate 1y ago

[CVPR 2025] Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Computer Vision

anucvml Intermediate 1y ago

China’s ByteDance Just Dropped BAGEL — Multimodal AI Beast!

Computer Vision

Analytics Vidhya Intermediate 1y ago

How to Segment Your Audience in Mailchimp

Computer Vision ⚡ AI Lesson

Intuit Mailchimp Intermediate 1y ago

Intuit uses Google Cloud Document AI to further simplify tax prep for millions

Computer Vision

Google Cloud Intermediate 1y ago

Multimodal AI & Next Gen Databases | Data Brew | Episode 42

Computer Vision ⚡ AI Lesson

Databricks Intermediate 1y ago

RF-DETR, Batch Processing, Instant Training, Serverless Inference, and More | What's New in Roboflow

Computer Vision

Roboflow Intermediate 1y ago

Expedition Aya Kick Off Event

Computer Vision

Cohere Intermediate 1y ago

Build a Football Analysis System Using YOLO11 and Supervision

Computer Vision

Muhammad Moin Intermediate 1y ago

Seminar: Segment Anything - Meta AI (15-03-2025)

Computer Vision

IEC Seminar Intermediate 1y ago

Building a travel buddy with Gemma

Computer Vision

Google for Developers Intermediate 1y ago

New Way Now: Safe Rate helps homebuyers and owners save thousands with AI-powered mortgage assistant

Computer Vision

Google Cloud Intermediate 1y ago

Peter Tong - MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Computer Vision

Cohere Intermediate 1y ago

How Machines Find Patterns [Template Matching]

Computer Vision

Jia-Bin Huang Intermediate 1y ago

Next Multi trillion dollar industry?

Computer Vision

Full Disclosure Intermediate 1y ago

DeepSeek’s Janus-Pro-7B Crushes DALL·E 3! #deepseek #openai

Computer Vision

Analytics Vidhya Intermediate 1y ago

This Python module is your go-to for speech and image recognition!

Computer Vision ⚡ AI Lesson

Tech With Tim Intermediate 1y ago

Selling the Cause: Leveraging Marketing Strategies & Storytelling in Nonprofits

Computer Vision

The Nonprofit Prof Intermediate 1y ago

Not ElevenLabs, This new #1 Text to Speech AI is FREE!!!!

Computer Vision

1littlecoder Intermediate 1y ago

Next AI Project is Image Classification in Python🔍🤖

Computer Vision ⚡ AI Lesson

Tech With Tim Intermediate 1y ago

Best of 2024 in Vision [LS Live @ NeurIPS]

Computer Vision ⚡ AI Lesson

Latent Space Intermediate 1y ago

How to Do Email Segmentation the Right Way

Computer Vision ⚡ AI Lesson

Spark Bridge Digital | Email Marketing Agency Intermediate 1y ago

OpenAI DevDay 2024 | Multimodal apps with the Realtime API

Computer Vision

OpenAI Intermediate 1y ago

Ethan Norville EXPOSES Coronation Project Secrets

Computer Vision

Professor Charley T Intermediate 1y ago

MediaPipe Web: Bringing cross-platform AI tech to the browser

Computer Vision ⚡ AI Lesson

Chrome for Developers Intermediate 1y ago

Moondream: how does a tiny vision model slap so hard? — Vikhyat Korrapati

Computer Vision ⚡ AI Lesson

AI Engineer Intermediate 1y ago

Transformers.js: State-of-the-art Machine Learning for the web

Computer Vision ⚡ AI Lesson

Chrome for Developers Intermediate 1y ago

Stanford Seminar - Open-world Segmentation and Tracking in 3D

Computer Vision

Stanford Online Intermediate 1y ago

The Next Decade in AI and Computer Vision

Computer Vision ⚡ AI Lesson

a16z Intermediate 1y ago

Hairmony: Fairness-aware hairstyle classification

Computer Vision

Microsoft Research Intermediate 1y ago

AI vs. Machine Learning: Debunked

Computer Vision

Jean Lee Intermediate 1y ago

Multimodal RAG YT Video

Computer Vision

Srikantan Sankaran Intermediate 1y ago

Testing CA’s Computer Vision Robot Arm @LEGO @raspberrypi @Core-Electronics

Computer Vision

Creator Academy Australia Intermediate 1y ago

ExecuTorch Beta and on-Device Generative AI Support - Mergen Nachin & Mengtao (Martin) Yuan, Meta

Computer Vision

PyTorch Intermediate 1y ago

New Course: YOLOv12 – Custom Object Detection, Tracking & Web Apps

Computer Vision

Muhammad Moin Intermediate 1y ago

Build an AI-Powered Self-Serve Checkout & Cost Calculator in 10 Minutes (Almost)

Computer Vision

Roboflow Intermediate 1y ago

Measure Liquid Levels with AI | Build a Web App Powered by Computer Vision

Computer Vision

Roboflow Intermediate 1y ago

Pool Shot Predictor with OpenCV: Will the Ball Go Into the Pocket?

Computer Vision

Muhammad Moin Intermediate 1y ago

How to Train YOLO11 Instance Segmentation Models on Your Custom Dataset in Google Colab

Computer Vision

Muhammad Moin Intermediate 1y ago

Estimate Real Distance to Objects with Depth Pro and YOLO11

Computer Vision

Muhammad Moin Intermediate 1y ago

Florence-2: Create and Deploy a Custom Vision Language Model

Computer Vision

Roboflow Intermediate 1y ago

YOLO11: Performance Benchmark and Real World Use Cases

Computer Vision

Roboflow Intermediate 1y ago

Video Analytics with AI | Live Coding & Q&A (Oct 9th)

Computer Vision

Roboflow Intermediate 1y ago

GPT-4o: Fine-tune OpenAI's Multimodal Model | Live Coding & Q&A (Oct 3rd)

Computer Vision

Roboflow Intermediate 1y ago

YOLO11: How to Train for Object Detection | Live Coding & Q&A (Sep 30)

Computer Vision

Roboflow Intermediate 1y ago

📚 Continue on Coursera External links · Free to audit

AI and Disaster Management

📚 External: Coursera ↗

Introduction to Computer Vision and Image Processing

📚 External: Coursera ↗

📚 External: Coursera ↗

Create Image Captioning Models - Português Brasileiro

Self-Driving Car Specialization Course

📚 External: Coursera ↗

Aspectos conceptuales y operativos de la Telesalud

📚 External: Coursera ↗

📚 External: Coursera ↗

Form Parsing Using Document AI

📚 External: Coursera ↗

Create and Test a Document AI Processor

AI Technologies in Healthcare

📚 External: Coursera ↗

📚 External: Coursera ↗

Introduction to Vertex AI Embeddings: Text and Multimodal

Deep Learning Applications for Computer Vision

📚 External: Coursera ↗

Marketing Fundamentals Mastery: Apply, Analyze & Evaluate

📚 External: Coursera ↗

6G Evolution: Blockchain, Semantic Communications & Radar

📚 External: Coursera ↗

Landing.AI for Beginners: Build Data Visualization AI Models

📚 External: Coursera ↗

Customer Relationship Management

📚 External: Coursera ↗

Vision Models: Train and Evaluate

📚 External: Coursera ↗

Computer Vision: Face Recognition Quick Starter in Python

📚 External: Coursera ↗

Bases teóricas de la gestión de la salud y las lesiones

📚 External: Coursera ↗

Azure Practical - Cognitive Services

📚 External: Coursera ↗
