AI Memory Patterns: Save Tokens, Cut Costs | Chander Dhall | Azure Cosmos DB Conf 2026

Microsoft Developer · Intermediate ·🤖 AI Agents & Automation ·3d ago
AI systems must retain conversation history, tool outputs, and user context across multiple turns. Basic approaches can quickly inflate token usage or lose critical context. In this session, Chander Dhall (CEO of Cazton, 15-time Microsoft MVP) explores three memory patterns implemented using Azure Cosmos DB NoSQL: 1. Sliding Window Memory — summarization for recent and older turns 2. Hierarchical Memory — recent context, compressed history, and long-term tiers with intelligent retrieval 3. Entity-Based Memory Graphs — extracts and stores structured facts for precise recall You'll leave with concrete code patterns, guidance on selecting the right approach, and a reusable Cosmos DB schema for AI agent memory management. 👤 Connect with Chander Dhall 📝 Chander Dhall, CEO of Cazton, is a fifteen-time awarded Microsoft (AI) MVP, Microsoft Regional Director, Google Developer Expert, Azure Cosmos DB Cosmonaut and world-renowned technology leader in architecting and implementing solutions. In 2025, he was recognized as one of only twenty AI MVPs in the United States. He's not only rescued software development, cloud, big data, and AI teams, but also implemented successful projects under tight deadlines and difficult business constraints. His company, Cazton, has a proven track record of not just saving the client millions of dollars, but also providing expedited delivery time. Cazton's clients include Google, Microsoft, Thomson Reuters, Broadcom, AT&T, Dell, Bank of America, NBC Universal, American Express, Fandango, LinkedIn, VMware, McKesson, Macquarie Bank, and many other Fortune 500, mid-size and startup companies. In the field of AI, Chander has been at the forefront of building and deploying enterprise-grade solutions that leverage cutting-edge technologies like OpenAI's GPT models, Google's Gemini, Cohere's Command and Anthropic's Claude. He specializes in designing and fine-tuning generative AI models for real-world applications such as conversational AI, pred
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Related AI Lessons

Prompt Engineering Is Dead. System Design Is What Replaces It
Learn why system design is crucial for AI success and how it replaces prompt engineering, with a focus on structuring reality for effective AI implementation
Medium · Machine Learning
Two Minds, One Proof: The Phenomenology of Non-Biological Mathematical Collaboration
Explore the concept of non-biological mathematical collaboration and its phenomenology in AI systems
Medium · Machine Learning
Two Minds, One Proof: The Phenomenology of Non-Biological Mathematical Collaboration
Explore the concept of non-biological mathematical collaboration and its implications on AI and human collaboration
Medium · Data Science
Meta Deploys Unified AI Agents to Automate Performance Optimization at Hyperscale
Meta's new AI-driven platform uses unified AI agents to automate performance optimization at hyperscale, enabling self-optimizing systems
InfoQ AI/ML
Up next
Codex Browser Use IS INSANE! Controls Your Computer & Automates Everything!
WorldofAI
Watch →