Automating Creativity: Building Gen Media Agents with ADK and MCP

Google Cloud · Intermediate · 🤖 AI Agents & Automation · 2h ago
Can you automate the entire creative process? Following up on Khulan’s deep dive into manual prompting, Katie Nguyen (Developer Relations Engineer) joins Stephanie Wong to show the programmatic side of creative AI. In this live-coding session, Katie demonstrates how to use the Agent Development Kit (ADK) to build a "Character Story Agent" that handles everything from character design to final video editing, all through natural language. Watch as they bring Lulu the Shih Tzu to life, moving from a simple text idea to a fully narrated, multi-scene video in minutes.

Key Technical Takeaways:

The Agentic Edge: Why agents are better than manual prompting for large stories. An agent maintains a "memory bank" of your character (Lulu), ensuring her bow and fur stay consistent across images, videos, and music.

ADK & MCP Servers: Katie explains how the Agent Development Kit uses Model Context Protocol (MCP) servers to connect Gemini to a suite of specialized tools:
    Nano Banana 2: For high-fidelity, consistent character image generation.
    Veo 3.1 Light: For fast, cost-effective 6-second video animations.
    Lyria 3 Pro: For generating custom musical scores based on the scene's mood.
    Gemini 3.1 Flash TTS: For expressive, human-like narration using new audio tags.

"Director" Skills: See how to offload complex logic into Agent Skills. Instead of bloating your system prompt, you can load a "Voice Director" or "Image Artist" skill that knows exactly how to get the best out of each model.

Automated Post-Production: How the agent runs open-source tools like FFmpeg from the command line to stitch together audio, video, and music without a human editor.

Self-Correction: Learn how an agent can "fact-check" its own outputs (for example, checking whether the audio ran over the video length) and automatically regenerate assets to fit.

"I’m an engineer, not a creative director. But with Gemini as my collaborator, I can describe a dog getting into trouble and let the agent handle the prompt engineering
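
The post-production and self-correction takeaways above can be illustrated with a minimal Python sketch. This is not the code from the session. The function names (build_mux_command, fit_narration), the file names, the music volume, and the 2.5 words-per-second speech estimate are all assumptions made for illustration. The only real tool referenced is the ffmpeg command line, using its standard -i, -filter_complex, -map, -c:v and -shortest options with the volume and amix filters.

```python
from typing import Callable, List, Tuple

VIDEO_SECONDS = 6.0  # clip length mentioned in the description


def build_mux_command(video: str, narration: str, music: str, output: str,
                      music_volume: float = 0.3) -> List[str]:
    """Build an ffmpeg argv that lays narration plus quieter music over a clip.

    The command is only constructed here; pass it to subprocess.run to execute.
    """
    # Lower the music (input 2), then mix it with the narration (input 1).
    filter_graph = (
        f"[2:a]volume={music_volume}[bed];"
        "[1:a][bed]amix=inputs=2:duration=longest[mix]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", narration, "-i", music,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[mix]",
        "-c:v", "copy",   # keep the generated video stream untouched
        "-shortest",      # stop at the shortest stream, i.e. the clip
        output,
    ]


def fit_narration(script: str, video_seconds: float,
                  synthesize: Callable[[str], float],
                  shorten: Callable[[str], str],
                  max_attempts: int = 3) -> Tuple[str, float]:
    """Regenerate narration until its duration fits inside the clip.

    synthesize(script) returns the narration length in seconds;
    shorten(script) returns a tighter script. Both are hypothetical
    stand-ins for calls the agent would make to its tools.
    """
    for _ in range(max_attempts):
        seconds = synthesize(script)
        if seconds <= video_seconds:
            return script, seconds
        script = shorten(script)
    raise RuntimeError("narration still too long after retries")


if __name__ == "__main__":
    # Stand-ins: estimate speech at 2.5 words/second, trim 5 words per retry.
    estimate = lambda s: len(s.split()) / 2.5
    trim = lambda s: " ".join(s.split()[:-5])

    script, seconds = fit_narration(" ".join(["word"] * 20),
                                    VIDEO_SECONDS, estimate, trim)
    print(f"narration fits: {seconds:.1f}s of {VIDEO_SECONDS:.1f}s")
    print(" ".join(build_mux_command("scene.mp4", "voice.wav",
                                     "score.mp3", "final.mp4")))
```

In a real agent, the duration check would read the actual length of the generated audio rather than estimate it from word count, and the shorten step would be another model call. The retry cap keeps the loop from regenerating assets indefinitely.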
