iPhone 17 Pro Just Ran a 400B LLM: On-Device AI Changes Everything (2026)
📰 Dev.to · Max Quimby
A developer ran a 400B-parameter LLM on an iPhone 17 Pro using SSD-to-GPU streaming. Here's how Flash-MoE works and why on-device AI matters.