AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind
Gemini Nano weighs three to four gigabytes on device. Shipping a copy with every app is not realistic, which is why AICore installs it in the system once and every app shares it. Foreground apps get top priority. Background batch jobs queue and run overnight while the device is charging. The developer never manages any of that.
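To make the priority rule concrete, here is a minimal sketch of the behavior described above: foreground requests are always served first, and background batch jobs wait until the device is charging. It is a toy model, not AICore's implementation. The class `SharedModelScheduler` and its methods are hypothetical names, and "overnight on charge" is simplified to a single `charging` flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only: this is NOT an AICore API.
// It models one shared model serving two classes of work.
final class SharedModelScheduler {
    private final Deque<String> foreground = new ArrayDeque<>();
    private final Deque<String> background = new ArrayDeque<>();

    void submitForeground(String job) {
        foreground.addLast(job);
    }

    void submitBackground(String job) {
        background.addLast(job);
    }

    /**
     * Returns the next job to run, or null if nothing is eligible.
     * Foreground work always wins; background work is released only
     * while the device is charging (an assumed stand-in for
     * "overnight on charge").
     */
    String next(boolean charging) {
        if (!foreground.isEmpty()) {
            return foreground.pollFirst();
        }
        if (charging && !background.isEmpty()) {
            return background.pollFirst();
        }
        return null;
    }
}
```

In this sketch a background job submitted before a foreground request still runs after it, and it stays queued as long as `charging` is false.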
The tradeoff is reach. The ML Kit GenAI APIs require flagship devices from the last couple of years, whereas classic ML Kit for vision and OCR runs on a billion-plus devices without issue. Hybrid inference, launched a few weeks before this talk, falls back from Nano to Gemini Flash in the cloud when the on-device model is not available. An embedding API is coming soon for RAG-style solutions. For anything beyond that, LiteRT is the other path.
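The fallback idea behind hybrid inference can be sketched in a few lines: try the on-device model when it is available, otherwise route the prompt to the cloud. The code below is an illustration under stated assumptions, not the actual SDK surface. `HybridInference`, `Result`, and the two model functions are hypothetical stand-ins for Gemini Nano and Gemini Flash.

```java
import java.util.function.Function;

// Hypothetical sketch: none of these names come from ML Kit, AICore,
// or any Google SDK. The two Function arguments stand in for the
// on-device model and the cloud model.
final class HybridInference {
    /** A response tagged with the backend that produced it. */
    static final class Result {
        final String backend;
        final String text;

        Result(String backend, String text) {
            this.backend = backend;
            this.text = text;
        }
    }

    private final boolean onDeviceAvailable;
    private final Function<String, String> onDeviceModel;
    private final Function<String, String> cloudModel;

    HybridInference(boolean onDeviceAvailable,
                    Function<String, String> onDeviceModel,
                    Function<String, String> cloudModel) {
        this.onDeviceAvailable = onDeviceAvailable;
        this.onDeviceModel = onDeviceModel;
        this.cloudModel = cloudModel;
    }

    /** Prefers the on-device model; falls back to the cloud otherwise. */
    Result generate(String prompt) {
        if (onDeviceAvailable) {
            return new Result("on-device", onDeviceModel.apply(prompt));
        }
        return new Result("cloud", cloudModel.apply(prompt));
    }
}
```

Tagging each result with its backend is a design choice of this sketch: it lets calling code log or display which path served the request, which matters when the two paths differ in latency, cost, or privacy.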
Speaker info:
- https://x.com/FMuntenescu
- https://www.linkedin.com/in/florina-muntenescu-314b8921
- https://github.com/florina-muntenescu
- https://linkedin.com/in/ogaymond