Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

2,366 lessons

Skills in this topic

3 skills

Classify images with a pre-trained CNN

Modern CV Models

Run YOLO for real-time object detection

Build a Stable Diffusion inference pipeline

Videos 1,145 Reads 1,221

Peng Xia - RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

Computer Vision

Cohere Advanced 1y ago

MediaPipe Web: Bringing cross-platform AI tech to the browser

Computer Vision ⚡ AI Lesson

Chrome for Developers Intermediate 1y ago

Multimodal Embeddings: Introduction & Use Cases (with Python)

Computer Vision

Shaw Talebi Beginner 1y ago

How to Build a Smart Parking System - License Plate Detection & OCR

Computer Vision ⚡ AI Lesson

Roboflow Beginner 1y ago

CODE to Fine-Tune NEW SmolVLM on Consumer GPU w QLoRA

Computer Vision ⚡ AI Lesson

Discover AI Beginner 1y ago

Demo Lecture-Image Processing-Computer Vision With Generative AI Bootcamp With Doubts Solving

Computer Vision

Krish Naik Beginner 1y ago

“Death of a Salesforce”: Why AI Will Transform the Next Generation of Sales Tech

Computer Vision ⚡ AI Lesson

a16z Beginner 1y ago

Insights from a Kaggle Grandmaster: Multimodal Models, Agents, Document AI & more

Computer Vision

Analytics Vidhya Beginner 1y ago

MedAI: Vision Language Models & Fine-Tuning (KnowAda)

Computer Vision

Discover AI Advanced 1y ago

Moondream: how does a tiny vision model slap so hard? — Vikhyat Korrapati

Computer Vision ⚡ AI Lesson

AI Engineer Intermediate 1y ago

Transformers.js: State-of-the-art Machine Learning for the web

Computer Vision ⚡ AI Lesson

Chrome for Developers Intermediate 1y ago

NLP Engineer & Computer Vision Engineer #codebasics #nlp #computervision #datajob #shorts

Computer Vision ⚡ AI Lesson

codebasics Beginner 1y ago

Gwanghyun (Bradley) Kim - BeyondScene: Higher-Resolution Human-Scene Generation

Computer Vision

Cohere Advanced 1y ago

Stanford Seminar - Open-world Segmentation and Tracking in 3D

Computer Vision

Stanford Online Intermediate 1y ago

Revolutionizing sign language with AI

Computer Vision ⚡ AI Lesson

TensorFlow Official Beginner 1y ago

Neuralift AI builds trust using W&B Weave

Computer Vision

Weights & Biases Beginner 1y ago

The Next Decade in AI and Computer Vision

Computer Vision ⚡ AI Lesson

a16z Intermediate 1y ago

[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu

Computer Vision ⚡ AI Lesson

Latent Space Beginner 1y ago

Single Shot Multibox Detector | SSD Object Detection Explained and Implemented

Computer Vision

ExplainingAI Beginner 1y ago

Hairmony: Fairness-aware hairstyle classification

Computer Vision

Microsoft Research Intermediate 1y ago

AI vs. Machine Learning: Debunked

Computer Vision

Jean Lee Intermediate 1y ago

YOLOv11: How to Train for Object Detection on a Custom Dataset | Step-by-step guide

Computer Vision

Roboflow Beginner 1y ago

Computer Vision Explained in 30s

Computer Vision

365 Data Science Beginner 1y ago

Multimodal RAG YT Video

Computer Vision

Srikantan Sankaran Intermediate 1y ago

New Way Now: Plenitude streamlines customer onboarding and fraud prevention with Google Cloud AI

Computer Vision

Google Cloud Beginner 1y ago

Unlocking Customer Insights with AI

Computer Vision

AI Wizardry Beginner 1y ago

Multi-Object Tracking with Ultralytics YOLO11

Computer Vision

Muhammad Moin Beginner 1y ago

Testing CA’s Computer Vision Robot Arm @LEGO @raspberrypi @Core-Electronics

Computer Vision

Creator Academy Australia Intermediate 1y ago

ExecuTorch Beta and on-Device Generative AI Support - Mergen Nachin & Mengtao (Martin) Yuan, Meta

Computer Vision

PyTorch Intermediate 1y ago

Blobs to Clips: Efficient End-to-End Video Data Loading - Andrew Ho & Ahmad Sharif, Meta

Computer Vision ⚡ AI Lesson

PyTorch Beginner 1y ago

How to Train YOLO11 Models on Your Custom Dataset in Google Colab

Computer Vision

Muhammad Moin Beginner 1y ago

Data As a Corporate Asset—the GenAI-era Take (Part 1)

Computer Vision

Microsoft Developer Beginner 1y ago

Free Live 3 Days Computer Vision and Object Detection Workshop

Computer Vision

Krish Naik Beginner 1y ago

The era of unbounded products: Designing for Multimodal IO: Ben Hylak

Computer Vision

AI Engineer Intermediate 1y ago

Handwriting Transcription with AI: Digitizing Documents Using Computer Vision

Computer Vision

Macgence Beginner 1y ago

Object Detection: Importance of High-Quality Data

Computer Vision ⚡ AI Lesson

Macgence Beginner 1y ago

AI for Business Transformation: Lessons from Healthcare

Computer Vision

Microsoft Research Beginner 1y ago

Web AI Summit 2024: State of client side machine learning

Computer Vision ⚡ AI Lesson

Chrome for Developers Beginner 1y ago

YOLO11: Performance Benchmark and Real World Use Cases

Computer Vision

Roboflow Intermediate 1y ago

Video Analytics with AI | Live Coding & Q&A (Oct 9th)

Computer Vision

Roboflow Intermediate 1y ago

How to use OCR | Get Started with Optical Character Recognition

Computer Vision

Roboflow Beginner 1y ago

GPT-4o: Fine-tune OpenAI's Multimodal Model | Live Coding & Q&A (Oct 3rd)

Computer Vision

Roboflow Intermediate 1y ago

YOLO11: How to Train for Object Detection | Live Coding & Q&A (Sep 30)

Computer Vision

Roboflow Intermediate 1y ago

Complete YOLO11 Object Detection Tutorial | Windows & Linux

Computer Vision

Muhammad Moin Beginner 1y ago

YOLO11 | Object Detection, Segmentation, Pose Estimation & Image Classification | Google Colab

Computer Vision

Muhammad Moin Beginner 1y ago

Using PyTorch for Monocular Depth Estimation Webinar

Computer Vision ⚡ AI Lesson

PyTorch Beginner 1y ago

Using RTSP Streams for Computer Vision | Tracking & Counting Objects

Computer Vision

Roboflow Intermediate 1y ago

“The Future of AI is Here” — Fei-Fei Li Unveils the Next Frontier of AI

Computer Vision ⚡ AI Lesson

a16z Beginner 1y ago

📚 Continue on Coursera External links · Free to audit

📚 External: Coursera ↗

Humanidades digitales (Digital Humanities)

Opens on Coursera ↗

📚 External: Coursera ↗

Introduction to Computer Vision with TensorFlow

Opens on Coursera ↗

📚 External: Coursera ↗

Optical Character Recognition (OCR) with Document AI (Python)

Opens on Coursera ↗

📚 External: Coursera ↗

Form Parsing with Document AI (Python)

Opens on Coursera ↗

📚 External: Coursera ↗

The Social Media Landscape

Opens on Coursera ↗

📚 External: Coursera ↗

Implement Real-Time Face Detection with OpenCV & Python

Opens on Coursera ↗

📚 External: Coursera ↗

Behavioral Marketing

Opens on Coursera ↗

📚 External: Coursera ↗

Build a DIY Multimodal Question Answering System with Vertex AI

Opens on Coursera ↗

📚 External: Coursera ↗

Infraestructura: Tecnologías Detrás de Recintos Inteligentes (Infrastructure: Technologies Behind Smart Venues)

Opens on Coursera ↗

📚 External: Coursera ↗

Self-Driving Car Specialization Course

Opens on Coursera ↗

📚 External: Coursera ↗

Supply Chain Sourcing

Opens on Coursera ↗

📚 External: Coursera ↗

Customer Relationship Management

Opens on Coursera ↗

📚 External: Coursera ↗

Refine Segmentation: Boost Your AI Vision

Opens on Coursera ↗

📚 External: Coursera ↗

Create and Test a Document AI Processor

Opens on Coursera ↗

📚 External: Coursera ↗

Event Management and Promotion Strategies

Opens on Coursera ↗

📚 External: Coursera ↗

Open Source Models with Hugging Face

Opens on Coursera ↗

📚 External: Coursera ↗

Implement Hand Gesture Recognition with OpenCV

Opens on Coursera ↗

📚 External: Coursera ↗

Aspectos conceptuales y operativos de la Telesalud (Conceptual and Operational Aspects of Telehealth)

Opens on Coursera ↗