How to solve Reinforcement Learning when there are ZERO rewards (Curiosity & RND)

Neural Breakdown with AVB · Beginner · 📄 Research Papers Explained · 11mo ago
In this video, we will learn about two great RL methods for self-supervised exploration: Curiosity and Random Network Distillation (RND). We will use a popular on-policy RL algorithm (A2C, or Advantage Actor-Critic) to explore this fascinating space. Along the way, we will work through code examples in Python and PyTorch, and study exactly why these methods work and the various challenges they solve. Curiosity teaches agents to explore states that they find the least predictable, and RND teaches agents to explore states that are the most "novel". The neural network architectures be…
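To make the RND idea from the description concrete, here is a minimal sketch (not the video's actual code): a frozen, randomly initialized "target" network and a trained "predictor" network, where the intrinsic reward for a state is the predictor's error against the target. Novel states give large errors; frequently visited states become familiar as the predictor catches up. Tiny linear "networks" in NumPy stand in for the real neural networks, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed, randomly initialized target network -- never trained (the RND trick).
W_target = rng.normal(size=(4, 8))

# Predictor network, trained to match the target's features.
W_pred = np.zeros((4, 8))

def rnd_intrinsic_reward(state):
    """Intrinsic reward = prediction error against the frozen target."""
    target_feat = state @ W_target
    pred_feat = state @ W_pred
    return np.mean((target_feat - pred_feat) ** 2)

def train_predictor(state, lr=0.1):
    """One gradient step on the MSE between predictor and target features."""
    global W_pred
    err = state @ W_pred - state @ W_target          # per-feature error, shape (8,)
    grad = np.outer(state, err) * (2.0 / err.size)   # dMSE/dW_pred
    W_pred -= lr * grad

# Visiting the same state repeatedly makes it "familiar",
# so its intrinsic reward shrinks toward zero.
state = rng.normal(size=4)
r_before = rnd_intrinsic_reward(state)
for _ in range(50):
    train_predictor(state)
r_after = rnd_intrinsic_reward(state)
assert r_after < r_before
```

In a full agent, this intrinsic reward would be added to (or substituted for) the environment reward inside the A2C update, which is what lets exploration proceed even when extrinsic rewards are zero.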