[Workshop] AI Engineering 201: Inference
Skills:
LLM Engineering80%
Optional introductory course for AI Engineers, free for all Summit attendees. Advanced knowledge of AI Engineering, led by instructor Charles Frye of the massively popular Full Stack LLM Bootcamp.
Part I: Running Inference
What is the workload?
Open vs Proprietary Models
Execution
End User Device
Over a Network
Serving Inference
Timestamps
0:00:00 Intro & Overview
0:03:52 What is Inference?
0:10:16 Proprietary Models for Inference
0:21:22 Open Models for Inference
0:30:41 Will Open or Proprietary Models Win Long-Term?
0:36:19 Q&A on Models
0:44:12 Inference on End-User Devices
1:04:32 Inference-as-a-Service Providers
1:10:00 Cloud Inference and Serverless GPUs
1:17:46 Rack-and-Stack for Inference
1:20:12 Inference Arithmetic for GPUs
1:27:07 TPUs and Other Custom Silicon for Inference
1:36:11 Containerizing Inference and Inference Services
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from AI Engineer · AI Engineer · 17 of 60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
▶
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
AI Engineer Summit 2023 — DAY 1 Livestream
AI Engineer
AI Engineer Summit 2023 — DAY 2 Livestream
AI Engineer
Principles for Prompt Engineering - Karina Nguyen (Claude Instant @ Anthropic)
AI Engineer
Announcing the AI Engineer Network: Benjamin Dunphy
AI Engineer
The 1,000x AI Engineer: Swyx
AI Engineer
Building AI For All: Amjad Masad & Michele Catasta
AI Engineer
The Age of the Agent: Flo Crivello
AI Engineer
See, Hear, Speak, Draw: Logan Kilpatrick & Simón Fishman
AI Engineer
Building Context-Aware Reasoning Applications with LangChain and LangSmith: Harrison Chase
AI Engineer
Pydantic is all you need: Jason Liu
AI Engineer
Building Blocks for LLM Systems & Products: Eugene Yan
AI Engineer
The Intelligent Interface: Sam Whitmore & Jason Yuan of New Computer
AI Engineer
Climbing the Ladder of Abstraction: Amelia Wattenberger
AI Engineer
Supabase Vector: The Postgres Vector database: Paul Copplestone
AI Engineer
[Workshop] AI Engineering 101
AI Engineer
The Hidden Life of Embeddings: Linus Lee
AI Engineer
[Workshop] AI Engineering 201: Inference
AI Engineer
The AI Pivot: With Chris White of Prefect & Bryan Bischof of Hex
AI Engineer
The AI Evolution: Mario Rodriguez, GitHub
AI Engineer
Move Fast Break Nothing: Dedy Kredo
AI Engineer
AI Engineering 201: The Rest of the Owl
AI Engineer
Building Reactive AI Apps: Matt Welsh
AI Engineer
Pragmatic AI with TypeChat: Daniel Rosenwasser
AI Engineer
Domain adaptation and fine-tuning for domain-specific LLMs: Abi Aryan
AI Engineer
Retrieval Augmented Generation in the Wild: Anton Troynikov
AI Engineer
Building Production-Ready RAG Applications: Jerry Liu
AI Engineer
120k players in a week: Lessons from the first viral CLIP app: Joseph Nelson
AI Engineer
The Weekend AI Engineer: Hassan El Mghari
AI Engineer
Harnessing the Power of LLMs Locally: Mithun Hunsur
AI Engineer
Trust, but Verify: Shreya Rajpal
AI Engineer
Open Questions for AI Engineering: Simon Willison
AI Engineer
Storyteller: Building Multi-modal Apps with TS & ModelFusion - Lars Grammel, PhD
AI Engineer
GPT Web App Generator - 10,000 apps created in a month: Matija Sosic
AI Engineer
Using AI to Build an Infinite Game: Jeff Schomay
AI Engineer
How to Become an AI Engineer from a Fullstack Background - Reid Mayo
AI Engineer
The Code AI Maturity Model and What It Means For You: Ado Kukic
AI Engineer
AI Engineer World’s Fair 2024 - Keynotes & Multimodality track
AI Engineer
From Text to Vision to Voice Exploring Multimodality with Open AI: Romain Huet
AI Engineer
The Making of Devin by Cognition AI: Scott Wu
AI Engineer
The Future of Knowledge Assistants: Jerry Liu
AI Engineer
Llamafile: bringing AI to the masses with fast CPU inference: Stephen Hood and Justine Tunney
AI Engineer
Open Challenges for AI Engineering: Simon Willison
AI Engineer
Lessons From A Year Building With LLMs
AI Engineer
From Software Developer to AI Engineer: Antje Barth
AI Engineer
Unlocking Developer Productivity across CPU and GPU with MAX: Chris Lattner
AI Engineer
Copilots Everywhere: Thomas Dohmke and Eugene Yan
AI Engineer
Fixing bugs in Gemma, Llama, & Phi 3: Daniel Han
AI Engineer
Low Level Technicals of LLMs: Daniel Han
AI Engineer
Emergence Launch: AI Agents and the future enterprise: Dr. Satya Nitta
AI Engineer
How Codeium Breaks Through the Ceiling for Retrieval: Kevin Hou
AI Engineer
What's new from Anthropic and what's next: Alex Albert
AI Engineer
Using agents to build an agent company: Joao Moura
AI Engineer
Decoding the Decoder LLM without de code: Ishan Anand
AI Engineer
Running AI Application in Minutes w/ AI Templates: Gabriela de Queiroz, Pamela Fox, Harald Kirschner
AI Engineer
Building with Anthropic Claude: Prompt Workshop with Zack Witten
AI Engineer
Building Reliable Agentic Systems: Eno Reyes
AI Engineer
10x Development: LLMs For the working Programmer - Manuel Odendahl
AI Engineer
Disrupting the $15 Trillion Construction Industry with Autonomous Agents: Dr. Sarah Buchner
AI Engineer
Hypermode Launch: Kevin Van Gundy
AI Engineer
Git push get an AI API: Ryan Fox-Tyler
AI Engineer
More on: LLM Engineering
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
DeepSeek Just Dropped V4. Here's What the Benchmarks Actually Tell You.
Dev.to · Om Shree
Article: Orchestrating Agentic and Multimodal AI Pipelines with Apache Camel
InfoQ AI/ML
Context Engineering for Developers: A Practical Guide (2026)
Dev.to AI
GPT-5.5 is here. So is DeepSeek V4. And honestly, I am tired of version numbers.
Dev.to AI
Chapters (13)
Intro & Overview
3:52
What is Inference?
10:16
Proprietary Models for Inference
21:22
Open Models for Inference
30:41
Will Open or Proprietary Models Win Long-Term?
36:19
Q&A on Models
44:12
Inference on End-User Devices
1:04:32
Inference-as-a-Service Providers
1:10:00
Cloud Inference and Serverless GPUs
1:17:46
Rack-and-Stack for Inference
1:20:12
Inference Arithmetic for GPUs
1:27:07
TPUs and Other Custom Silicon for Inference
1:36:11
Containerizing Inference and Inference Services
🎓
Tutor Explanation
DeepCamp AI