NOT USING Granite 4.1 ASR - The Fastest ASR?

Sam Witteveen · Advanced · 📐 ML Fundamentals · 6d ago
In this video, I dive into IBM's newly released Granite Speech 4.1 models and explore what makes them interesting, particularly the three 2B variants they've dropped and how each one makes a different trade-off between accuracy, richness, and throughput that you'll actually care about for real applications.

We look at the base Granite Speech 4.1 2B, which hits an impressive 5.33% WER on the OpenASR leaderboard; the Plus variant, which adds speaker-attributed ASR and word-level timestamps; and the NAR (Non-Autoregressive) version, which flips the architecture entirely to generate sequences all at once for much better GPU throughput. I also walk through multilingual support across English, French, German, Spanish, Portuguese, and Japanese, plus the bidirectional translation capabilities that make this genuinely useful for enterprise edge deployments. All three models are Apache 2.0 licensed and available on Hugging Face right now.

🔗 Links:
Granite Speech 4.1 2B → https://huggingface.co/ibm-granite/granite-speech-4.1-2b
Granite Speech 4.1 2B Plus → https://huggingface.co/ibm-granite/granite-speech-4.1-2b-plus
Granite Speech 4.1 2B NAR → https://huggingface.co/ibm-granite/granite-speech-4.1-2b-nar
IBM Research Blog → https://research.ibm.com/blog/granite-4-1-ai-foundation-models
Twitter: https://x.com/Sam_Witteveen

🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: https://drp.li/dIMes

👨‍💻 Github: https://github.com/samwit/llm-tutorials

⏱️ Time Stamps:
00:00 Intro
00:20 IBM Granite Collection
00:27 Granite Docling
00:46 Granite Speech 4.1
01:16 Granite 4.1 Blog
01:38 Granite Speech 4.1 2B
04:02 Granite Speech 4.1 2B Plus
06:15 Granite Speech 4.1 2B NAR
07:30 NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper
07:45 Architecture
09:45 Code Time
12:00 Granite Speech Model Github

#DellProPrecision #DellProMax
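For context on the 5.33% WER figure quoted above: word error rate is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The following is a minimal Python sketch of that metric, not the leaderboard's scoring code. The function name `word_error_rate` and the example sentences are made up for illustration, and the only normalization applied here is lowercasing and whitespace splitting, whereas real evaluations typically normalize text more heavily.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumption: simple normalization (lowercase + whitespace split) only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of six.
    score = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {score:.2%}")
```

On this reading, a 5.33% WER corresponds to roughly one word-level error per 19 reference words, averaged over the evaluation sets.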
