Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

110 lessons


How Audi Uses AI to Transform Automotive Manufacturing at Scale | Amazon Web Services

Amazon Web Services Advanced 3w ago

TensorFlow: Advanced Techniques Specialization

DeepLearning.AI Advanced 1mo ago

AI Guidance for Physical Work

Y Combinator Advanced 2mo ago

YOLO26 Fine-Tuning | Detection and Instance Segmentation | Live Coding + Q&A (Jan 15th)

Roboflow Advanced 2mo ago

Anaximander: Interactive Orchestration and Evaluation of Geospatial Foundation Models

Microsoft Research Advanced 3mo ago

Anthony Fuller & Yousef Yassin - LookWhere? Efficient Visual Recognition by Learning Where to Look

Cohere Advanced 3mo ago

Basketball AI: Player Tracking, Team Detection, and Number Recognition with Python

Roboflow Advanced 4mo ago

Genomcore Drives Biomedical Research with AWS and AI | Amazon Web Services

Amazon Web Services Advanced 5mo ago

Ashmal Vayani - Seeing the World as It Speaks Multilingual, Culturally Aware Multimodal AI

Cohere Advanced 5mo ago

No, Apple isn’t trying to buy up all the 13 Pro Maxes.

The Verge Advanced 6mo ago

Qwen3-Omni: The First Open All-in-One AI?

What's AI by Louis-François Bouchard Advanced 6mo ago

Distilling Transformers and Diffusion Models for Robust Edge Use Cases [Fatih Porikli] - 738

The TWIML AI Podcast with Sam Charrington Advanced 9mo ago

VGG From Scratch – Deep Learning Theory & PyTorch Implementation (Full Course)

freeCodeCamp.org Advanced 9mo ago

Transforming Guest Experiences: GoTo Foods’ Data Journey with Amperity & Databricks

Databricks Advanced 9mo ago

Train YOLO on Custom Dataset | Object Detection Step-by-Step Tutorial

Samin Learns AI Advanced 9mo ago

FastVLM brings advanced computer vision to your phone...

NeuralNine Advanced 10mo ago

Find out how Nevada DETR achieved 4x faster approvals with Vertex AI

Google Cloud Advanced 12mo ago

PaliGemma – Making Gemma 2 see by adding a vision encoder

Google for Developers Advanced 1y ago

George Hotz | mixture of experts (like deepseek) on tinygrad sovereign AMD stack | AMD YOLO

george hotz archive Advanced 1y ago

Microsoft’s Phi-4 SLM: Open-Source AI for Text, Vision & Audio!

Analytics Vidhya Advanced 1y ago

Deepseek is back with VISION

1littlecoder Advanced 1y ago

Using Vertex AI for healthcare

Google Cloud Tech Advanced 1y ago

Enhance Generative AI Model Accuracy Through High-Quality Multimodal Data Processing

NVIDIA Developer Advanced 1y ago

YOLOv2 (YOLO9000) and YOLOv3 Explained

ExplainingAI Advanced 1y ago

New Video AI by META & Stanford Univ: APOLLO (7B)

Discover AI Advanced 1y ago

MedAI: Vision Language Models & Fine-Tuning (KnowAda)

Discover AI Advanced 1y ago

open-animal-tracks

Data Skeptic Advanced 1y ago

Bird Distribution Modeling with Satbird

Data Skeptic Advanced 1y ago

Beyond Language: The future of multimodal models in health, gaming, & AI | Microsoft Research Forum

Microsoft Research Advanced 1y ago

Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson

Latent Space Advanced 1y ago

JETSON AI LAB | Research Group Meeting (8/6/2024)

NVIDIA Developer Advanced 1y ago

Audience Segmentation Tips: 3 Ways to Segment Your Email List

Klaviyo Advanced 1y ago

Decoding Animal Behavior to Train Robots with EgoPet with Amir Bar - 692

The TWIML AI Podcast with Sam Charrington Advanced 1y ago

New Microsoft Vision Model has AMAZING TRICKS!!!

1littlecoder Advanced 1y ago

Robotics AI for Industrial Applications

Weights & Biases Advanced 1y ago

Build computer vision applications easily with Roboflow and Google Cloud

Google Cloud Advanced 1y ago

Search Engines Struggle to Process Text Efficiently: The Hidden Cost #seo

Koray Tuğberk GÜBÜR Advanced 2y ago

Shashanka Venkataramana and Valentinos Pariza - Franca Nested Matryoshka Clustering for Scalable Vi

Cohere Advanced 6mo ago

David Fan & Peter Tong - Scaling Language Free Visual Representation Learning

Cohere Advanced 8mo ago

RF-DETR Beat YOLOs on Real-time Object Detection | Fine-Tuning | Live Coding & Q&A (Mar 27th)

Roboflow Advanced 1y ago

YOLOE: Real-time Zero-shot Object Detection | Visual Prompting | Live Coding & Q&A (Mar 14th)

Roboflow Advanced 1y ago

Aya Vision - The Challenges & Breakthroughs

Cohere Advanced 1y ago

Shreyash Arya - B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable

Cohere Advanced 1y ago

Peng Xia - RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

Cohere Advanced 1y ago

Gwanghyun (Bradley) Kim - BeyondScene: Higher-Resolution Human-Scene Generation

Cohere Advanced 1y ago

Football AI | Community Q&A (Aug 29)

Roboflow Advanced 1y ago

Segment Anything 2 (SAM 2): Meta AI's Newest Model | Community Q&A (Jul 30)

Roboflow Advanced 1y ago

Case study on CLIP: Large Multi-Modal Models for Blind & Low Vision Users | Microsoft Research Forum

Microsoft Research Advanced 1y ago

Coursera Courses (open on Coursera, free to audit)

AI and Disaster Management

Unity: Design & Deform Meshes for 3D Geometry Control

Preparing Multimodal Data: Vision, Audio, and NLP Pipelines

Implement Hand Gesture Recognition with OpenCV

Deep Learning for Object Detection

Positioning: What you need for a successful Marketing Strategy

Supply Chain Sourcing

Process Images, Create Captioning AI Models

H2O Cloud AI Developer Services

Camera and Imaging

Build a DIY Multimodal Question Answering System with Vertex AI

Introduction to Vertex AI Embeddings: Text and Multimodal

Process Documents with Python Using the Document AI API

Refine Segmentation: Boost Your AI Vision

AI for Everyone (Spanish)

Image Segmentation, Filtering, and Region Analysis

Using Specialized Processors with Document AI (Python)

Interdisciplinarity in Thought and Practice
