Multimodal RAG with GPT – Build Smarter Search & AI Systems

Free to audit · Opens on Coursera

Coursera · Beginner · RAG & Vector Search · 1mo ago
Updated in May 2025. This course now features Coursera Coach! A smarter way to learn with interactive, real-time conversations that help you test your knowledge, challenge assumptions, and deepen your understanding as you progress through the course.

This course equips you with the skills to build smarter AI-driven systems using Retrieval Augmented Generation (RAG) and multimodal technology. You'll dive into the principles behind RAG and how it powers systems like advanced search engines, chatbots, and recommendation systems. The course will provide hands-on experience, enabling you to create multimodal systems that utilize images, text, and other forms of data to provide more intelligent and context-aware solutions.

Starting with foundational knowledge, you will explore RAG systems, their components, and benefits. The course delves into how search capabilities can be integrated into multimodal systems and why this approach enhances both search and recommendation functionalities. You'll build multimodal search systems, creating embeddings and setting up a robust workflow to integrate different data types. You will also gain expertise in constructing a multimodal recommender system that combines RAG with GPT.

As you progress, you will experiment with embedding images and using them in a vector database, setting up end-to-end systems, and refining them using hands-on lessons. Furthermore, you'll add a user interface to your multimodal recommender system, creating a polished, interactive tool that can be deployed for real-world use. By the end, you will have built a comprehensive multimodal RAG system with a recommender engine, capable of delivering highly relevant results.

This course is ideal for AI enthusiasts, software developers, or data scientists looking to deepen their understanding of advanced search systems, recommendation algorithms, and the application of RAG in multimodal environments. A basic understanding of programming and machine learning concep
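To make the retrieval idea in the description concrete, here is a minimal sketch of the retrieve-then-prompt step of a multimodal RAG workflow. It is not taken from the course. It assumes that text and images have already been embedded into one shared vector space, and it uses hand-written 3-dimensional vectors in place of real embeddings and a plain Python list in place of a vector database. The names `INDEX`, `retrieve`, and `build_prompt` are hypothetical and exist only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy index: made-up vectors standing in for real text/image embeddings.
INDEX = [
    {"id": "red-sneaker.jpg", "modality": "image", "vector": [0.9, 0.1, 0.0]},
    {"id": "Red running shoes, size 42", "modality": "text", "vector": [0.8, 0.2, 0.1]},
    {"id": "blue-kettle.jpg", "modality": "image", "vector": [0.0, 0.2, 0.9]},
]


def retrieve(query_vector, index, k=2):
    """Return the k items whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, item["vector"]), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:k]]


def build_prompt(question, retrieved):
    """Assemble the retrieved items into a context block for a language model."""
    context = "\n".join(f"- [{item['modality']}] {item['id']}" for item in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # The query vector is also made up; a real system would embed the user's query.
    hits = retrieve([1.0, 0.0, 0.0], INDEX, k=2)
    print(build_prompt("Which red shoes do you have?", hits))
```

In a real system the vectors would come from an embedding model and the linear scan would be replaced by a vector database query. The overall shape stays the same: embed, retrieve the nearest items, then pass them to the model as context.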
