AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind

AI Engineer · Intermediate · 🧠 Large Language Models · 39m ago
Gemini Nano weighs three to four gigabytes on device. Shipping that with every app is not realistic, which is why AICore installs it in the system once and every app shares it. Foreground apps get top priority, while background batch jobs queue and run overnight while the device is charging. The developer never manages any of that.

The tradeoff is reach. The ML Kit GenAI APIs require flagship devices from the last couple of years, whereas classic ML Kit for vision and OCR runs on a billion-plus devices without issue. Hybrid inference, launched a few weeks before this talk, falls back from Nano to Gemini Flash in the cloud when the on-device model is not available. An embedding API is coming soon for RAG-style solutions. For anything beyond that, LiteRT is the other path.

Speaker info:
- https://x.com/FMuntenescu
- https://www.linkedin.com/in/florina-muntenescu-314b8921
- https://github.com/florina-muntenescu
- https://linkedin.com/in/ogaymond
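The hybrid fallback described above (try the on-device model, use the cloud model when it is not available) can be illustrated with a small sketch. This is not the actual hybrid inference API from the talk: `TextModel`, `stub`, and `generate` are hypothetical names invented here, and the two models are stand-in stubs rather than Gemini Nano or Gemini Flash. The sketch only shows the routing decision, under the assumption that availability can be checked before each call.

```java
class HybridInferenceSketch {

    /** Hypothetical abstraction over a text model; not a real ML Kit or AICore type. */
    interface TextModel {
        boolean isAvailable();
        String generate(String prompt);
    }

    /**
     * Routes a prompt to the on-device model when it is available,
     * otherwise falls back to the cloud model.
     */
    static String generate(TextModel onDevice, TextModel cloud, String prompt) {
        if (onDevice.isAvailable()) {
            return onDevice.generate(prompt);
        }
        if (cloud.isAvailable()) {
            return cloud.generate(prompt);
        }
        // Neither path can serve the request, for example an unsupported device that is offline.
        throw new IllegalStateException("No model available for this request");
    }

    /** Builds a stand-in model that prefixes the prompt with its own name (assumption for the demo). */
    static TextModel stub(String name, boolean available) {
        return new TextModel() {
            @Override
            public boolean isAvailable() {
                return available;
            }

            @Override
            public String generate(String prompt) {
                return name + ": " + prompt;
            }
        };
    }
}
```

With an unavailable on-device stub, `HybridInferenceSketch.generate(stub("on-device", false), stub("cloud", true), "Summarize this note")` returns `cloud: Summarize this note`. Keeping the routing in one place means calling code does not need to know which model answered.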

Related AI Lessons

Structured Outputs at Scale: Three Approaches, One Clear Winner
Learn how constrained decoding outperforms prompt engineering for structured outputs in terms of reliability and speed
Medium · AI
I Stacked 4 More Context Layers on Top of RAG. Sonnet Got 12% Better. Haiku Got 14% Worse.
Adding context layers to RAG can improve performance, but may also have negative effects on certain models, highlighting the importance of careful evaluation and testing
Dev.to AI
I Was Scraping Google Scholar at 2am. There Had to Be a Better Way.
Learn how to efficiently collect academic data without scraping Google Scholar, and discover a better way to build a RAG pipeline
Dev.to AI
Up next
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)