Fast Models Need Slow Developers — Sarah Chieng, Cerebras

AI Engineer · Advanced · 🧠 Large Language Models · 54m ago
Codex Spark, a model Cerebras built with OpenAI, generates code at 1,200 tokens per second. The Sonnet and Opus families run at 40 to 60. At that gap of 20x or more, a context window that used to take ten minutes to fill now takes 30 seconds, and every habit built around slow generation starts producing technical debt at a scale nobody has dealt with before. Sarah Chieng from Cerebras covers what the playbook looks like in this regime. Running validation and linting at every step is now instant, so there is no excuse not to run it continuously. Generating 75 component variations across five sub-agents and cherry-picking the best one becomes practical where it was not before. And when context burns in 30 seconds, a four-file external memory system (agents, plan, progress, verify) is what keeps each new session from starting from scratch.
Speaker info:
- https://x.com/sarahchieng
- https://www.linkedin.com/in/sarah-chieng-888595139/
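The throughput figures in the description can be checked with a little arithmetic. The sketch below is only an illustration: the description does not state a context size, so the 36,000-token budget is an assumption inferred from "30 seconds at 1,200 tokens per second", and the function name `seconds_to_fill` is hypothetical.

```python
def seconds_to_fill(context_tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate `context_tokens` at a steady generation rate."""
    return context_tokens / tokens_per_second


# Rates quoted in the description (tokens per second).
FAST_TPS = 1200
SLOW_TPS_LOW, SLOW_TPS_HIGH = 40, 60

# Assumed budget: what a 1,200 tok/s model emits in 30 seconds.
# This number is inferred for illustration; it is not given in the source.
ASSUMED_BUDGET = FAST_TPS * 30

for label, tps in [
    ("slow (low end)", SLOW_TPS_LOW),
    ("slow (high end)", SLOW_TPS_HIGH),
    ("fast", FAST_TPS),
]:
    secs = seconds_to_fill(ASSUMED_BUDGET, tps)
    speedup = FAST_TPS / tps
    print(f"{label:16s} {tps:5d} tok/s -> {secs:6.0f} s ({secs / 60:.1f} min), "
          f"fast model is {speedup:.0f}x this rate")
```

Under that assumption, the ten-minute figure matches the 60 tokens per second end of the range (a 20x gap). At 40 tokens per second the same budget takes fifteen minutes, a 30x gap.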
