Reinforcement Learning from Human Feedback: From Zero to chatGPT
In this talk, we will cover the basics of Reinforcement Learning from Human Feedback (RLHF) and how this technology is being used to enable state-of-the-art ML tools like ChatGPT. Most of the talk will be an overview of the interconnected ML models and cover the basics of Natural Language Processing and RL that one needs to understand how RLHF is used on large language models. It will conclude with open question in RLHF.
RLHF Blogpost: https://huggingface.co/blog/rlhf
The Deep RL Course: https://hf.co/deep-rl-course
Slides from this talk: https://docs.google.com/presentation/d/1eI9PqRJTCFOIVihkig1voRM4MHDpLpCicX9lX1J2fqk/edit?usp=sharing
Nathan Twitter: https://twitter.com/natolambert
Thomas Twitter: https://twitter.com/thomassimonini
Nathan Lambert is a Research Scientist at HuggingFace. He received his PhD from the University of California, Berkeley working at the intersection of machine learning and robotics. He was advised by Professor Kristofer Pister in the Berkeley Autonomous Microsystems Lab and Roberto Calandra at Meta AI Research. He was lucky to intern at Facebook AI and DeepMind during his Ph.D. Nathan was was awarded the UC Berkeley EECS Demetri Angelakos Memorial Achievement Award for Altruism for his efforts to better community norms.
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from HuggingFace · HuggingFace · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
The Future of Natural Language Processing
HuggingFace
Trends in Model Size & Computational Efficiency in NLP
HuggingFace
Increasing Data Usage in Natural Language Processing
HuggingFace
In Domain & Out of Domain Generalization in the Future of NLP
HuggingFace
The Limits of NLU & the Rise of NLG in the Future of NLP
HuggingFace
The Lack of Robustness in the Future of NLP
HuggingFace
Inductive Bias, Common Sense, Continual Learning in The Future of NLP
HuggingFace
Train a Hugging Face Transformers Model with Amazon SageMaker
HuggingFace
What is Transfer Learning?
HuggingFace
The pipeline function
HuggingFace
Navigating the Model Hub
HuggingFace
Transformer models: Decoders
HuggingFace
The Transformer architecture
HuggingFace
Transformer models: Encoder-Decoders
HuggingFace
Transformer models: Encoders
HuggingFace
Keras introduction
HuggingFace
The push to hub API
HuggingFace
Fine-tuning with TensorFlow
HuggingFace
Learning rate scheduling with TensorFlow
HuggingFace
TensorFlow Predictions and metrics
HuggingFace
Welcome to the Hugging Face course
HuggingFace
The tokenization pipeline
HuggingFace
Supercharge your PyTorch training loop with Accelerate
HuggingFace
The Trainer API
HuggingFace
Batching inputs together (PyTorch)
HuggingFace
Batching inputs together (TensorFlow)
HuggingFace
Hugging Face Datasets overview (Pytorch)
HuggingFace
Hugging Face Datasets overview (Tensorflow)
HuggingFace
What is dynamic padding?
HuggingFace
What happens inside the pipeline function? (PyTorch)
HuggingFace
What happens inside the pipeline function? (TensorFlow)
HuggingFace
Instantiate a Transformers model (PyTorch)
HuggingFace
Instantiate a Transformers model (TensorFlow)
HuggingFace
Preprocessing sentence pairs (PyTorch)
HuggingFace
Preprocessing sentence pairs (TensorFlow)
HuggingFace
Write your training loop in PyTorch
HuggingFace
Managing a repo on the Model Hub
HuggingFace
Chapter 1 Live Session with Sylvain
HuggingFace
Chapter 2 Live Session with Lewis
HuggingFace
The push to hub API
HuggingFace
Chapter 2 Live Session with Sylvain
HuggingFace
Chapter 3 live sessions with Lewis (PyTorch)
HuggingFace
Day 1 Talks: JAX, Flax & Transformers 🤗
HuggingFace
Day 2 Talks: JAX, Flax & Transformers 🤗
HuggingFace
Day 3 Talks JAX, Flax, Transformers 🤗
HuggingFace
Chapter 4 live sessions with Omar
HuggingFace
Deploy a Hugging Face Transformers Model from S3 to Amazon SageMaker
HuggingFace
Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker
HuggingFace
Run a Batch Transform Job using Hugging Face Transformers and Amazon SageMaker
HuggingFace
[Webinar] How to add machine learning capabilities with just a few lines of code
HuggingFace
Hugging Face + Zapier Demo Video
HuggingFace
Hugging Face + Google Sheets Demo
HuggingFace
Hugging Face Infinity Launch - 09/28
HuggingFace
Build and Deploy a Machine Learning App in 2 Minutes
HuggingFace
Hugging Face Infinity - GPU Walkthrough
HuggingFace
Otto - 🤗 Infinity Case Study
HuggingFace
Workshop: Getting started with Amazon Sagemaker Train a Hugging Face Transformers and deploy it
HuggingFace
Workshop: Going Production: Deploying, Scaling & Monitoring Hugging Face Transformer models
HuggingFace
🤗 Tasks: Causal Language Modeling
HuggingFace
🤗 Tasks: Masked Language Modeling
HuggingFace
More on: LLM Engineering
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Deep Generative Adversarial Networks for Compressed Sensing Automates MRI
Dev.to AI
Self-Reference Cluster: A Lean 4 Common-Encoding Attempt for Lob's Theorem, Reflective Programming, and Acausal Decision Theory (Paper 135)
Dev.to AI
# The Sad King and the Problem of Controlling a Closed Intelligence
Medium · LLM
Self-Improving Python Scripts with LLMs: My Journey
Dev.to · RTT Enjoy
🎓
Tutor Explanation
DeepCamp AI