The Model Agreed, But Didn't Learn: Diagnosing Surface Compliance in Large Language Models
📰 ArXiv cs.AI
arXiv:2604.05995v1 Announce Type: cross

Abstract: Large Language Models (LLMs) internalize vast world knowledge as parametric memory, yet inevitably inherit the staleness and errors of their source corpora. Consequently, ensuring the reliability and malleability of these internal representations is imperative for trustworthy real-world deployment. Knowledge editing offers a pivotal paradigm for surgically modifying memory without retraining. However, while recent editors demonstrate high success
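As a concrete illustration of the editing paradigm the abstract describes, the sketch below applies a ROME-style rank-one update to a single linear layer so that a chosen key vector maps to a corrected value. This is a generic, hypothetical instance of "surgically modifying memory without retraining", not the paper's own method; the function name `rank_one_edit` and the toy dimensions are assumptions for illustration.

```python
# Hypothetical sketch of direct knowledge editing: a rank-one update to one
# linear layer, treated as a key-value memory (in the spirit of ROME-style
# editors). Not the paper's method; a minimal instance of the paradigm.
import torch

def rank_one_edit(W: torch.Tensor, k: torch.Tensor, v_new: torch.Tensor) -> torch.Tensor:
    """Return W edited so that W @ k == v_new, perturbing other directions minimally.

    W:     (d_out, d_in) weight of an MLP projection acting as parametric memory.
    k:     (d_in,)  key vector representing the fact to rewrite.
    v_new: (d_out,) value vector encoding the corrected fact.
    """
    residual = v_new - W @ k                      # what the layer currently gets wrong
    update = torch.outer(residual, k) / (k @ k)   # rank-one correction along k
    return W + update

# Toy usage: after the edit, the key maps exactly to the new value.
d_in, d_out = 8, 4
W = torch.randn(d_out, d_in)
k = torch.randn(d_in)
v_new = torch.randn(d_out)
W_edited = rank_one_edit(W, k, v_new)
assert torch.allclose(W_edited @ k, v_new, atol=1e-5)
```

Because the update is rank one and aligned with `k`, inputs orthogonal to the edited key are unaffected, which is why such editors can report high per-edit success while, as the title suggests, the model may still only comply superficially rather than integrate the new fact.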