OpenAI DALL·E: Creating Images from Text (Blog Post Explained)

Yannic Kilcher · Beginner · 🧠 Large Language Models · 5y ago
#openai #science #gpt3

OpenAI's newest model, DALL·E, shows absolutely amazing abilities in generating high-quality images from arbitrary text descriptions. Like GPT-3, the range of applications and the diversity of outputs are astonishing, given that this is a single model, trained on a purely autoregressive task. This model is a significant step towards the combination of text and images in future AI applications.

OUTLINE:
0:00 - Introduction
2:45 - Overview
4:20 - Dataset
5:35 - Comparison to GPT-3
7:00 - Model Architecture
13:20 - VQ-VAE
21:00 - Combining VQ-VAE with GPT-3
27:30 - Pre-Training with Relaxation
32:15 - Experimental Results
33:00 - My Hypothesis about DALL·E's inner workings
36:15 - Sparse Attention Patterns
38:00 - DALL·E can't count
39:35 - DALL·E can't global order
40:10 - DALL·E renders different views
41:10 - DALL·E is very good at texture
41:40 - DALL·E can complete a bust
43:30 - DALL·E can do some reflections, but not others
44:15 - DALL·E can do cross-sections of some objects
45:50 - DALL·E is amazing at style
46:30 - DALL·E can generate logos
47:40 - DALL·E can generate bedrooms
48:35 - DALL·E can combine unusual concepts
49:25 - DALL·E can generate illustrations
50:15 - DALL·E sometimes understands complicated prompts
50:55 - DALL·E can pass part of an IQ test
51:40 - DALL·E probably does not have geographical / temporal knowledge
53:10 - Reranking dramatically improves quality
53:50 - Conclusions & Comments

Blog: https://openai.com/blog/dall-e/

Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yannic-kilcher
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/yannic-kilcher-488534136/

If you want to support me, the best thing to do is to share out the content :)
If you want
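
The outline's "Combining VQ-VAE with GPT-3" step can be pictured with a toy example. What follows is only a minimal sketch under stated assumptions, not DALL·E's actual implementation: the codebook, the patch vectors, the function names `nearest_code` and `build_sequence`, and the choice to offset image tokens past the text vocabulary are all hypothetical and chosen for illustration. It shows the general idea of turning image patches into discrete codebook indices and appending them to the text tokens, so that one autoregressive model could predict the whole sequence token by token.

```python
from typing import List, Sequence


def nearest_code(vector: Sequence[float],
                 codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def build_sequence(text_tokens: Sequence[int],
                   patch_vectors: Sequence[Sequence[float]],
                   codebook: Sequence[Sequence[float]],
                   text_vocab_size: int) -> List[int]:
    """Concatenate text tokens with quantized image tokens.

    Assumption for this sketch: image tokens are shifted by `text_vocab_size`
    so that both token types live in one shared vocabulary.
    """
    image_tokens = [nearest_code(v, codebook) for v in patch_vectors]
    return list(text_tokens) + [text_vocab_size + t for t in image_tokens]


# Toy data (hypothetical): a 3-entry codebook and 3 encoded image patches.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
patches = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8]]
text_tokens = [5, 2, 7]

sequence = build_sequence(text_tokens, patches, codebook, text_vocab_size=10)
print(sequence)  # [5, 2, 7, 10, 11, 12]
```

In this toy setup an autoregressive model would be trained to predict each entry of `sequence` from the entries before it, which is why a text prompt alone can be continued into image tokens at generation time.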


