Build Hour: GPT-Realtime-2
Build with the next wave of realtime voice AI. In this Build Hour, you’ll learn how to use GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper to build low-latency voice agents that can translate live speech, reason across tools, operate apps, and support more natural voice-to-voice and voice-to-action experiences.
In this session, Teri Yu (Product) and Erika Kettleson (Solutions Engineering) will cover:
• Building with new realtime audio models for translation, streaming speech-to-text, and intelligent voice agents
• Using GPT-Realtime-2 capabilities like preambles, 128K context, parallel tool calling, domain understanding, context over turns, and controllable expressiveness
• Creating voice-powered workflows for shopping and product analytics dashboards
• Customer spotlight on how Sierra (https://sierra.ai/) designs production customer experience agents with guardrails, VAD tuning, tracing, redaction, evals, and customer-specific harnesses
👉 Realtime Voice Blog: https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/
👉 Voice Agents Docs: https://developers.openai.com/api/docs/guides/voice-agents
👉 Playground: https://platform.openai.com/audio/realtime
👉 Follow along with the code repo: http://github.com/openai/build-hours
👉 Sign up for upcoming live Build Hours: https://webinar.openai.com/buildhours
00:00 Welcome and intro
02:06 Realtime voice models overview
02:26 GPT-Realtime-Translate and GPT-Realtime-Whisper demo
04:36 GPT-Realtime-2: three ways to build with voice AI
05:14 What’s new in GPT-Realtime-2
06:58 Demo: Voice-powered search agent
12:32 Demo: Product analytics dashboard
17:24 What can you build with voice AI?
18:36 Customer spotlight: Sierra
29:56 Q&A
42:05 Resources & Upcoming Build Hours