Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

110 lessons
How Audi Uses AI to Transform Automotive Manufacturing at Scale | Amazon Web Services
Computer Vision
Amazon Web Services Advanced 3w ago
TensorFlow: Advanced Techniques Specialization
Computer Vision
DeepLearning.AI Advanced 1mo ago
AI Guidance for Physical Work
Computer Vision
Y Combinator Advanced 2mo ago
YOLO26 Fine-Tuning | Detection and Instance Segmentation | Live Coding + Q&A (Jan 15th)
Computer Vision
Roboflow Advanced 2mo ago
Anaximander: Interactive Orchestration and Evaluation of Geospatial Foundation Models
Computer Vision
Microsoft Research Advanced 3mo ago
Anthony Fuller & Yousef Yassin - LookWhere? Efficient Visual Recognition by Learning Where to Look
Computer Vision
Cohere Advanced 3mo ago
Basketball AI: Player Tracking, Team Detection, and Number Recognition with Python
Computer Vision
Roboflow Advanced 4mo ago
Genomcore Drives Biomedical Research with AWS and AI | Amazon Web Services
Computer Vision
Amazon Web Services Advanced 5mo ago
Ashmal Vayani - Seeing the World as It Speaks: Multilingual, Culturally Aware Multimodal AI
Computer Vision
Cohere Advanced 5mo ago
No, Apple isn’t trying to buy up all the 13 Pro Maxes.
Computer Vision
The Verge Advanced 6mo ago
Qwen3-Omni: The First Open All-in-One AI?
Computer Vision
What's AI by Louis-François Bouchard Advanced 6mo ago
Distilling Transformers and Diffusion Models for Robust Edge Use Cases [Fatih Porikli] - 738
Computer Vision
The TWIML AI Podcast with Sam Charrington Advanced 9mo ago
VGG From Scratch – Deep Learning Theory & PyTorch Implementation (Full Course)
Computer Vision
freeCodeCamp.org Advanced 9mo ago
Transforming Guest Experiences: GoTo Foods’ Data Journey with Amperity & Databricks
Computer Vision
Databricks Advanced 9mo ago
Train YOLO on Custom Dataset | Object Detection Step-by-Step Tutorial
Computer Vision
Samin Learns AI Advanced 9mo ago
FastVLM brings advanced computer vision to your phone...
Computer Vision
NeuralNine Advanced 10mo ago
Find out how Nevada DETR achieved 4x faster approvals with Vertex AI
Computer Vision
Google Cloud Advanced 12mo ago
PaliGemma – Making Gemma 2 see by adding a vision encoder
Computer Vision
Google for Developers Advanced 1y ago
George Hotz | mixture of experts (like deepseek) on tinygrad sovereign AMD stack | AMD YOLO
Computer Vision
george hotz archive Advanced 1y ago
Microsoft’s Phi-4 SLM: Open-Source AI for Text, Vision & Audio!
Computer Vision
Analytics Vidhya Advanced 1y ago
Deepseek is back with VISION
Computer Vision
1littlecoder Advanced 1y ago
Using Vertex AI for healthcare
Computer Vision
Google Cloud Tech Advanced 1y ago
Enhance Generative AI Model Accuracy Through High-Quality Multimodal Data Processing
Computer Vision
NVIDIA Developer Advanced 1y ago
YOLOv2 (YOLO9000) and YOLOv3 Explained
Computer Vision
ExplainingAI Advanced 1y ago
New Video AI by META & Stanford Univ: APOLLO (7B)
Computer Vision
Discover AI Advanced 1y ago
MedAI: Vision Language Models & Fine-Tuning (KnowAda)
Computer Vision
Discover AI Advanced 1y ago
open-animal-tracks
Computer Vision
Data Skeptic Advanced 1y ago
Bird Distribution Modeling with Satbird
Computer Vision
Data Skeptic Advanced 1y ago
Beyond Language: The future of multimodal models in health, gaming, & AI | Microsoft Research Forum
Computer Vision
Microsoft Research Advanced 1y ago
Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson
Computer Vision
Latent Space Advanced 1y ago
JETSON AI LAB | Research Group Meeting (8/6/2024)
Computer Vision
NVIDIA Developer Advanced 1y ago
Audience Segmentation Tips: 3 Ways to Segment Your Email List
Computer Vision
Klaviyo Advanced 1y ago
Decoding Animal Behavior to Train Robots with EgoPet with Amir Bar - 692
Computer Vision
The TWIML AI Podcast with Sam Charrington Advanced 1y ago
New Microsoft Vision Model has AMAZING TRICKS!!!
Computer Vision
1littlecoder Advanced 1y ago
Robotics AI for Industrial Applications
Computer Vision
Weights & Biases Advanced 1y ago
Build computer vision applications easily with Roboflow and Google Cloud
Computer Vision
Google Cloud Advanced 1y ago
Search Engines Struggle to Process Text Efficiently: The Hidden Cost #seo
Computer Vision
Koray Tuğberk GÜBÜR Advanced 2y ago
Shashanka Venkataramana and Valentinos Pariza - Franca: Nested Matryoshka Clustering for Scalable Vi
Computer Vision
Cohere Advanced 6mo ago
David Fan & Peter Tong - Scaling Language Free Visual Representation Learning
Computer Vision
Cohere Advanced 8mo ago
RF-DETR Beat YOLOs on Real-time Object Detection | Fine-Tuning | Live Coding & Q&A (Mar 27th)
Computer Vision
Roboflow Advanced 1y ago
YOLOE: Real-time Zero-shot Object Detection | Visual Prompting | Live Coding & Q&A (Mar 14th)
Computer Vision
Roboflow Advanced 1y ago
Aya Vision - The Challenges & Breakthroughs
Computer Vision
Cohere Advanced 1y ago
Shreyash Arya - B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
Computer Vision
Cohere Advanced 1y ago
Peng Xia - RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models
Computer Vision
Cohere Advanced 1y ago
Gwanghyun (Bradley) Kim - BeyondScene: Higher-Resolution Human-Scene Generation
Computer Vision
Cohere Advanced 1y ago
Football AI | Community Q&A (Aug 29)
Computer Vision
Roboflow Advanced 1y ago
Segment Anything 2 (SAM 2): Meta AI's Newest Model | Community Q&A (Jul 30)
Computer Vision
Roboflow Advanced 1y ago
Case study on CLIP: Large Multi-Modal Models for Blind & Low Vision Users | Microsoft Research Forum
Computer Vision
Microsoft Research Advanced 1y ago
📚 Coursera Courses Opens on Coursera · Free to audit
AI and Disaster Management
📚 Coursera Course ↗
Self-paced
Unity: Design & Deform Meshes for 3D Geometry Control
📚 Coursera Course ↗
Self-paced
Preparing Multimodal Data: Vision, Audio, and NLP Pipelines
📚 Coursera Course ↗
Self-paced
Implement Hand Gesture Recognition with OpenCV
📚 Coursera Course ↗
Self-paced
Deep Learning for Object Detection
📚 Coursera Course ↗
Self-paced
Positioning: What you need for a successful Marketing Strategy
📚 Coursera Course ↗
Self-paced