9 Examples of Specification Gaming

Robert Miles AI Safety · Beginner ·🧠 Large Language Models ·5y ago

Skills: LLM Foundations70%AI Alignment Basics60%

AI systems do what you say, and it's hard to say exactly what you mean. Let's look at a list of real life examples of specification gaming! How to Help: https://aisafety.info/questions/8TJV/... https://www.aisafety.com/ Related Videos from me: Reward Hacking: https://youtu.be/92qDfT8pENs Reward Hacking Reloaded: https://youtu.be/46nsTFfsBuc What Can We Do About Reward Hacking?: https://youtu.be/13tZ9Yia71c Sources: The list: http://tinyurl.com/specification-gaming The blogpost this video is based on: https://vkrakovna.wordpress.com/2018/04/02/specification-gaming-examples-in-ai/ The newer blogpost that happened while I was making this video: https://deepmind.com/blog/article/Specification-gaming-the-flip-side-of-AI-ingenuity Links: Eliciting latent knowledge from AI: https://www.alignmentforum.org/posts/rxoBY9CMkqDsHt25t/eliciting-latent-knowledge-elk-distillation-summary Do current AI models show deception: https://aisafety.info/questions/9AL4/ What is deceptive alignment: https://aisafety.info/questions/8EL6/ Scaling Laws: https://aisafety.info/questions/7750/ LLMs as simulators: https://aisafety.info/questions/9FQK/ (Explosion graphic from videezy.com) Thanks to my wonderful patrons: https://www.patreon.com/robertskmiles Gladamas James Steef Scott Worley Chad Jones Chris Canal David Reid Francisco Tolmasky Frank Kurka Jake Ehrlich JJ Hepboin Kellen lask Michael Andregg Pedro A Ortega Peter Rolf Said Polat Teague Lasser Allen Faure Bryce Daifuku Clemens Arbesser Eric James Erik de Bruijn Jason Hise jugettje dutchking Ludwig Schubert Qeith Wreid Andrew Harcourt anul kumar sinha Ben Glanton Benjamin Watkin Cooper Lawton Duncan Orr Eric Scammell Euclidean Plane Ian Munro Igor Keller Ingvi Gautsson James Hinchcliffe Jeroen De Dauw Jon Halliday Jonatan R Julius Brash Jérôme Beaulieu Laura Olds Luc Ritchie Lupuleasa Ionuț Michael Greve Nathan Fish Nicholas Guyett Paul Hobbs Sean Gibat Sebastian Birjoveanu Shevis Johnson Taras Bobrovytsky Tim Neilson Tom O'C

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Robert Miles AI Safety · Robert Miles AI Safety · 27 of 47

← Previous Next →

Predicting AI: RIP Prof. Hubert Dreyfus

Predicting AI: RIP Prof. Hubert Dreyfus

Robert Miles AI Safety

Robert Miles AI Safety

Are AI Risks like Nuclear Risks?

Are AI Risks like Nuclear Risks?

Robert Miles AI Safety

Avoiding Negative Side Effects: Concrete Problems in AI Safety part 1

Avoiding Negative Side Effects: Concrete Problems in AI Safety part 1

Robert Miles AI Safety

Avoiding Positive Side Effects: Concrete Problems in AI Safety part 1.5

Avoiding Positive Side Effects: Concrete Problems in AI Safety part 1.5

Robert Miles AI Safety

Empowerment: Concrete Problems in AI Safety part 2

Empowerment: Concrete Problems in AI Safety part 2

Robert Miles AI Safety

Why Not Just: Raise AI Like Kids?

Why Not Just: Raise AI Like Kids?

Robert Miles AI Safety

Reward Hacking: Concrete Problems in AI Safety Part 3

Reward Hacking: Concrete Problems in AI Safety Part 3

Robert Miles AI Safety

The other "Killer Robot Arms Race" Elon Musk should worry about

The other "Killer Robot Arms Race" Elon Musk should worry about

Robert Miles AI Safety

Reward Hacking Reloaded: Concrete Problems in AI Safety Part 3.5

Reward Hacking Reloaded: Concrete Problems in AI Safety Part 3.5

Robert Miles AI Safety

What Can We Do About Reward Hacking?: Concrete Problems in AI Safety Part 4

What Can We Do About Reward Hacking?: Concrete Problems in AI Safety Part 4

Robert Miles AI Safety

What can AGI do? I/O and Speed

What can AGI do? I/O and Speed

Robert Miles AI Safety

AI learns to Create ̵K̵Z̵F̵ ̵V̵i̵d̵e̵o̵s̵ Cat Pictures: Papers in Two Minutes #1

AI learns to Create ̵K̵Z̵F̵ ̵V̵i̵d̵e̵o̵s̵ Cat Pictures: Papers in Two Minutes #1

Robert Miles AI Safety

AI Safety at EAGlobal2017 Conference

AI Safety at EAGlobal2017 Conference

Robert Miles AI Safety

Scalable Supervision: Concrete Problems in AI Safety Part 5

Scalable Supervision: Concrete Problems in AI Safety Part 5

Robert Miles AI Safety

Superintelligence Mod for Civilization V

Superintelligence Mod for Civilization V

Robert Miles AI Safety

Why Would AI Want to do Bad Things? Instrumental Convergence

Why Would AI Want to do Bad Things? Instrumental Convergence

Robert Miles AI Safety

Experts' Predictions about the Future of AI

Experts' Predictions about the Future of AI

Robert Miles AI Safety

AI Safety Gridworlds

AI Safety Gridworlds

Robert Miles AI Safety

Friend or Foe? AI Safety Gridworlds extra bit

Friend or Foe? AI Safety Gridworlds extra bit

Robert Miles AI Safety

Safe Exploration: Concrete Problems in AI Safety Part 6

Safe Exploration: Concrete Problems in AI Safety Part 6

Robert Miles AI Safety

Why Not Just: Think of AGI Like a Corporation?

Why Not Just: Think of AGI Like a Corporation?

Robert Miles AI Safety

How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification

How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification

Robert Miles AI Safety

Is AI Safety a Pascal's Mugging?

Is AI Safety a Pascal's Mugging?

Robert Miles AI Safety

AI That Doesn't Try Too Hard - Maximizers and Satisficers

AI That Doesn't Try Too Hard - Maximizers and Satisficers

Robert Miles AI Safety

Training AI Without Writing A Reward Function, with Reward Modelling

Training AI Without Writing A Reward Function, with Reward Modelling

Robert Miles AI Safety

9 Examples of Specification Gaming

9 Examples of Specification Gaming

Robert Miles AI Safety

10 Reasons to Ignore AI Safety

10 Reasons to Ignore AI Safety

Robert Miles AI Safety

Sharing the Benefits of AI: The Windfall Clause

Sharing the Benefits of AI: The Windfall Clause

Robert Miles AI Safety

Quantilizers: AI That Doesn't Try Too Hard

Quantilizers: AI That Doesn't Try Too Hard

Robert Miles AI Safety

The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment

The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment

Robert Miles AI Safety

Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...

Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...

Robert Miles AI Safety

Intro to AI Safety, Remastered

Intro to AI Safety, Remastered

Robert Miles AI Safety

We Were Right! Real Inner Misalignment

We Were Right! Real Inner Misalignment

Robert Miles AI Safety

Apply to AI Safety Camp! #shorts

Apply to AI Safety Camp! #shorts

Robert Miles AI Safety

Win $50k for Solving a Single AI Problem? #Shorts

Win $50k for Solving a Single AI Problem? #Shorts

Robert Miles AI Safety

Free ML Bootcamp for Alignment #shorts

Free ML Bootcamp for Alignment #shorts

Robert Miles AI Safety

Apply Now for a Paid Residency on Interpretability #short

Apply Now for a Paid Residency on Interpretability #short

Robert Miles AI Safety

Why Does AI Lie, and What Can We Do About It?

Why Does AI Lie, and What Can We Do About It?

Robert Miles AI Safety

Apply to Study AI Safety Now! #shorts

Apply to Study AI Safety Now! #shorts

Robert Miles AI Safety

AI Ruined My Year

AI Ruined My Year

Robert Miles AI Safety

Learn AI Safety at MATS #shorts

Learn AI Safety at MATS #shorts

Robert Miles AI Safety

Using Dangerous AI, But Safely?

Using Dangerous AI, But Safely?

Robert Miles AI Safety

AI Safety Career Advice! (And So Can You!)

AI Safety Career Advice! (And So Can You!)

Robert Miles AI Safety

Robot Dog! Unitree Go2 review #shorts #robot #dog

Robot Dog! Unitree Go2 review #shorts #robot #dog

Robert Miles AI Safety

Tech is Good, AI Will Be Different

Tech is Good, AI Will Be Different

Robert Miles AI Safety

Apply for the Affine Superintelligence Alignment Seminar #shorts

Apply for the Affine Superintelligence Alignment Seminar #shorts

Robert Miles AI Safety

More on: LLM Foundations

View skill →

Getting Started with Vertex AI Gemini 1.5 Flash

How to use the ChatGPT API with Python!!

How to use the ChatGPT API with Python!!

Nicholas Renotte

Gemini 2.5: Create an interactive plot of economic data

Gemini 2.5: Create an interactive plot of economic data

Google DeepMind

LangChain Chatbots: Building a Personalized AI Assistant

LangChain Chatbots: Building a Personalized AI Assistant

Analytics Vidhya

Auto-generating meeting notes with Python

Auto-generating meeting notes with Python

Beginners Tutorial to Upload Github Jupyter Notebook to Google Colab

Beginners Tutorial to Upload Github Jupyter Notebook to Google Colab

Related AI Lessons

The Invisible Scaffolding — How Normalization Keeps Deep Models from Falling Apart

Learn how normalization keeps deep models stable and why it's a crucial design decision in large language models

Stop Telling the AI What to Build

Learn how to collaborate with AI using the Greek sculptor's method, focusing on discovery rather than instruction

L’IA ne simule pas les émotions. Elle les gouverne.

Discover how AI governs emotions in various applications, and why this matters for human-AI interaction

The Future of Artificial Intelligence and Machine Learning: Key Trends Every Developer Should Know

Stay ahead of the AI and ML curve by understanding key trends in intelligent systems, and apply them to your development work for a competitive edge

Medium · Machine Learning

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar (LLM Eng)