The animated Transformer: the Transformer model explained the fun way!
In this video, I'll take you through the amazing yet mysterious Transformer model (sorry, no Autobots or Decepticons), the unsung hero behind ChatGPT and other LLMs. You'll learn, end to end, how a Transformer works.
Papers
Original Transformer: https://arxiv.org/pdf/1706.03762.pdf
GPT-3: https://arxiv.org/pdf/2005.14165.pdf
Batch normalization: https://arxiv.org/pdf/1502.03167.pdf
Layer normalization: https://arxiv.org/pdf/1607.06450.pdf
Fast Transformer Decoding (multi-query attention): https://arxiv.org/pdf/1911.02150.pdf
FlashAttention: https://arxiv.org/pdf/2205.14135.pdf
00:00 - Introduction
1:40 - Input to the model
2:00 - Tokenization in the Transformer
2:25 - Special tokens used by the Transformer
3:01 - The input processor
3:51 - Some important notation and hyperparameters
4:40 - The importance of the context window size
5:24 - Basics of the Transformer
5:54 - RNNs vs Transformers
7:41 - Two types of attention: bidirectional vs causal
9:44 - Batch normalization vs Layer normalization
10:57 - Continuing on the Transformer
11:23 - Predictions with the Transformer
11:41 - Softmax for sequences
12:33 - Inference with the Transformer
13:14 - Sampling strategies for tokens
13:45 - Continuing on the Transformer
15:05 - Computing token embeddings
15:34 - Positional embeddings
17:00 - Self-attention layer
19:05 - Self-attention computations
21:31 - Multi-head attention
23:47 - Conclusion
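Code sketches
The chapters from 7:41 onward cover ideas compact enough to show in code. The snippets below are minimal, illustrative Python/NumPy sketches, not code from the video; function names, shapes, and defaults are my own assumptions. First, the 7:41 chapter's two attention patterns: bidirectional (every token attends to every token, encoder-style) versus causal (each token attends only to itself and earlier tokens, decoder-style).

```python
import numpy as np

def attention_scores(q, k, causal=False):
    """Scaled dot-product scores between queries q and keys k.

    q, k: arrays of shape (seq_len, d_k) -- illustrative layout.
    causal=False: bidirectional, every position sees every position.
    causal=True: each position sees only itself and earlier positions.
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (seq_len, seq_len)
    if causal:
        # The upper triangle (column > row) is the "future"; set it to
        # -inf so a subsequent softmax gives those positions zero weight.
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    return scores
```

A row-wise softmax over these scores then yields the attention weights.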
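For the 9:44 chapter, batch and layer normalization differ mainly in which axis the statistics run over. A rough sketch, assuming a (batch, seq_len, d_model) layout and omitting the learnable scale and shift:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token's feature vector on its own (last axis).
    No dependence on other examples, so it works at any batch size."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def batch_norm(x, eps=1e-5):
    """Normalize each feature across the batch (and sequence) instead.
    Statistics depend on the other sequences in the batch, which is one
    reason Transformers use layer norm rather than batch norm."""
    mu = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)
```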
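The 11:41 and 13:14 chapters turn the model's output logits into probabilities and then pick a token. A sketch of softmax plus two common sampling knobs, temperature and top-k (the defaults here are arbitrary):

```python
import numpy as np

def softmax(logits):
    """Convert a logit vector into a probability distribution."""
    z = logits - logits.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def sample_token(logits, temperature=1.0, top_k=None, rng=None):
    """Sample the next token id from the logits over the vocabulary.

    temperature < 1 sharpens the distribution (closer to greedy),
    temperature > 1 flattens it (more random); top_k keeps only the
    k most likely tokens before sampling."""
    rng = rng or np.random.default_rng()
    logits = logits / temperature
    if top_k is not None:
        cutoff = np.sort(logits)[-top_k]  # k-th largest logit
        logits = np.where(logits < cutoff, -np.inf, logits)
    probs = softmax(logits)
    return rng.choice(len(probs), p=probs)
```

Greedy decoding is the temperature-goes-to-zero limit: always take the argmax of the logits.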
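The 15:34 chapter covers positional embeddings. The fixed sinusoidal variant from the original paper (PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))) is short enough to write out, assuming an even d_model:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Sinusoidal positional embeddings from the original Transformer paper."""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]    # even feature indices
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dims get sine
    pe[:, 1::2] = np.cos(angles)             # odd dims get cosine
    return pe
```

These are added to the token embeddings so the model can distinguish positions.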
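Finally, the 17:00-21:31 chapters build up scaled dot-product self-attention and multi-head attention. A compact single-sequence sketch; in a real model the four weight matrices are learned, and masking and dropout are left out here:

```python
import numpy as np

def softmax_rows(x):
    """Row-wise softmax along the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(x, Wq, Wk, Wv, Wo, n_heads):
    """Multi-head self-attention over x of shape (seq_len, d_model).
    Each W* is (d_model, d_model); heads are d_model // n_heads sized
    slices of the projected queries, keys, and values."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    # Split into heads: (n_heads, seq_len, d_head).
    split = lambda m: m.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, L, L)
    out = softmax_rows(scores) @ v                       # (heads, L, d_head)
    # Concatenate heads back to (seq_len, d_model), then mix with Wo.
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo
```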