Future of AI

AI Safety & Ethics

Alignment, interpretability, AI risks, and building safe AI systems

6,844 lessons
Skills in this topic
AI Alignment Basics (beginner): Explain the alignment problem
AI Ethics & Policy (beginner): Identify types of bias in ML systems
AI Safety Engineering (intermediate): Implement input and output guardrails

Showing 247 reads from curated sources

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago
Why Aggregate Accuracy is Inadequate for Evaluating Fairness in Law Enforcement Facial Recognition Systems
arXiv:2603.28675v1 Announce Type: cross Abstract: Facial recognition systems are increasingly deployed in law enforcement and security contexts, where algorithm
ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
arXiv:2504.11967v4 Announce Type: replace-cross Abstract: Unmanned Aerial Vehicles (UAVs) are indispensable for infrastructure inspection, surveillance, and rel
ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago
FlowPure: Continuous Normalizing Flows for Adversarial Purification
arXiv:2505.13280v2 Announce Type: replace-cross Abstract: Despite significant advances in the area, adversarial robustness remains a critical challenge in syste
Hacker News (AI) 🛡️ AI Safety & Ethics ⚡ AI Lesson 3w ago
FTC action against Match and OkCupid for deceiving users, sharing personal data
Wired AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 3w ago
New Bernie Sanders AI Safety Bill Would Halt Data Center Construction
The US senator said on Tuesday that a moratorium would give lawmakers time to "ensure that AI is safe." Alexandria Ocasio-Cortez will introduce a similar bill i
Hugging Face Blog 🛡️ AI Safety & Ethics ⚡ AI Lesson 7mo ago
Democratizing AI Safety with RiskRubric.ai
OpenAI News 🛡️ AI Safety & Ethics ⚡ AI Lesson 2y ago
Disrupting malicious uses of AI by state-affiliated threat actors
We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious c