Why Bigger Context Windows Make AI Worse
Skills: LLM Engineering (80%)
► Try out Search Atlas with a 7-day free trial here: https://searchatlas.com/?utm_source=louis_bouchard&utm_medium=influencer_youtube&utm_campaign=q1_inf_cam&utm_content=primary_link
► Our recent webinar on AI engineering: https://youtu.be/ljOwBCdiHmg
► Learn more in our courses and social media: https://links.louisbouchard.ai/
► My Newsletter (My AI updates and news clearly explained): https://louisbouchard.substack.com/
Chapters:
0:00 Hey! Tap the Thumbs Up button and Subscribe. You'll learn a lot of cool stuff, I promise.
03:22 Why "More Tokens" Means Worse Results
04:52 "Lost in the Middle" Explained
05:36 The Cost & Complexity of Attention (N²)
07:34 1. Deterministic Trimming (Sliding Window)
08:18 2. Source-Level Filtering (Highest Impact)
09:21 3. Mechanical Compaction
10:06 4. Terminal Sequence Collapse
10:50 5. Semantic Summarization (Map Reduce vs. Stuffing)
12:10 6. Retrieval-Based Compaction & Contextual RAG
13:39 7. Knowledge Graphs & Graph RAG
14:40 8. Learned Prompt Compression (LLMLingua)
15:47 9. Multi-Tier Memory (MemGPT)
16:43 10. Agentic Context Engineering (ACE)
17:40 Bonus: Output Optimization Tricks
19:48 Best Practices: When (and When Not) to Compact
21:35 Multi-Agent & Model Routing Strategies
23:44 Actionable: Order of Operations for AI Engineers
#aiengineering #contextengineering #compaction