Future of AI

AI Safety & Ethics

Alignment, interpretability, AI risks, and building safe AI systems

6,860 lessons

Skills in this topic

3 skills

AI Alignment Basics

Explain the alignment problem

AI Ethics & Policy

Identify types of bias in ML systems

AI Safety Engineering

Implement input and output guardrails

Videos: 6,598. Reads: 262.

Showing 262 reads from curated sources

Stratechery 🛡️ AI Safety & Ethics ⚡ AI Lesson 2w ago

Axios Supply Chain Attack, Claude Code Code Leaked, AI and Security

AI is going to be bad for security in the short term, but much better than humans in the long term.

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Smartphone-Based Identification of Unknown Liquids via Active Vibration Sensing

arXiv:2603.28787v1 Announce Type: cross Abstract: Traditional liquid identification instruments are often unavailable to the general public. This paper shows th

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper 2w ago

Design and Development of an ML/DL Attack Resistance of RC-Based PUF for IoT Security

arXiv:2603.28798v1 Announce Type: cross Abstract: Physically Unclonable Functions (PUFs) provide promising hardware security for IoT authentication, leveraging

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

arXiv:2603.28824v1 Announce Type: cross Abstract: Dataset condensation aims to synthesize compact yet informative datasets that retain the training efficacy of

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

CivicShield: A Cross-Domain Defense-in-Depth Framework for Securing Government-Facing AI Chatbots Against Multi-Turn Adversarial Attacks

arXiv:2603.29062v1 Announce Type: cross Abstract: LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

TSHA: A Benchmark for Visual Language Models in Trustworthy Safety Hazard Assessment Scenarios

arXiv:2603.29759v1 Announce Type: cross Abstract: Recent advances in vision-language models (VLMs) have accelerated their application to indoor safety hazards a

InfoQ AI/ML 🛡️ AI Safety & Ethics ⚡ AI Lesson 2w ago

PyPI Supply Chain Attack Compromises LiteLLM, Enabling the Exfiltration of Sensitive Information

Discovered by FutureSearch researcher Callum McMahon, a supply chain attack against LiteLLM on PyPI resulted in over 40 thousand downloads of a compromised vers

Forbes Innovation 🛡️ AI Safety & Ethics ⚡ AI Lesson 2w ago

AI Sandboxes Are Crucial Regulatory Safety Nets For Advancing AI And Saving Humanity From Calamity

Regulatory AI sandboxes are gaining popularity. Here's what they are, plus their tradeoffs. An AI Insider scoop.

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift

arXiv:2603.26676v1 Announce Type: cross Abstract: Current frontier AI safety evaluations emphasize static benchmarks, third-party annotations, and red-teaming.

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

On the Carbon Footprint of Economic Research in the Age of Generative AI

arXiv:2603.26712v1 Announce Type: cross Abstract: Generative artificial intelligence (AI) is increasingly used to write and refactor research code, expanding co

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Capability Safety as Datalog: A Foundational Equivalence

arXiv:2603.26725v1 Announce Type: cross Abstract: We prove that capability safety admits an exact representation as propositional Datalog evaluation (Datalogpro

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Gender-Based Heterogeneity in Youth Privacy-Protective Behavior for Smart Voice Assistants: Evidence from Multigroup PLS-SEM

arXiv:2603.27117v1 Announce Type: cross Abstract: This paper investigates how gender shapes privacy decision-making in youth smart voice assistant (SVA) ecosyst

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

AI-Powered Facial Mask Removal Is Not Suitable For Biometric Identification

arXiv:2603.27747v1 Announce Type: cross Abstract: Recently, crowd-sourced online criminal investigations have used generative-AI to enhance low-quality visual e

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Detection of Adversarial Attacks in Robotic Perception

arXiv:2603.28594v1 Announce Type: cross Abstract: Deep Neural Networks (DNNs) achieve strong performance in semantic segmentation for robotic perception but rem

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Information-Theoretic Limits of Safety Verification for Self-Improving Systems

arXiv:2603.28650v1 Announce Type: cross Abstract: Can a safety gate permit unbounded beneficial self-modification while maintaining bounded cumulative risk? We

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Why Aggregate Accuracy is Inadequate for Evaluating Fairness in Law Enforcement Facial Recognition Systems

arXiv:2603.28675v1 Announce Type: cross Abstract: Facial recognition systems are increasingly deployed in law enforcement and security contexts, where algorithm

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions

arXiv:2504.11967v4 Announce Type: replace-cross Abstract: Unmanned Aerial Vehicles (UAVs) are indispensable for infrastructure inspection, surveillance, and rel

ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 2w ago

FlowPure: Continuous Normalizing Flows for Adversarial Purification

arXiv:2505.13280v2 Announce Type: replace-cross Abstract: Despite significant advances in the area, adversarial robustness remains a critical challenge in syste

Hacker News (AI) 🛡️ AI Safety & Ethics ⚡ AI Lesson 3w ago

FTC action against Match and OkCupid for deceiving users, sharing personal data

Wired AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 3w ago

New Bernie Sanders AI Safety Bill Would Halt Data Center Construction

The US senator said on Tuesday that a moratorium would give lawmakers time to "ensure that AI is safe." Alexandria Ocasio-Cortez will introduce a similar bill i

Hugging Face Blog 🛡️ AI Safety & Ethics ⚡ AI Lesson 7mo ago

Democratizing AI Safety with RiskRubric.ai

OpenAI News 🛡️ AI Safety & Ethics ⚡ AI Lesson 2y ago

Disrupting malicious uses of AI by state-affiliated threat actors

We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious c