LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
📰 ArXiv cs.AI
arXiv:2604.12710v1 Announce Type: cross

Abstract: Large language models (LLMs) often demonstrate strong safety performance in high-resource languages, yet exhibit severe vulnerabilities when queried in low-resource languages. We attribute this gap to a mismatch between language-agnostic semantic understanding and safety alignment that is dominated by, and biased toward, high-resource languages. Consistent with this hypothesis, we empirically identify the semantic bottleneck in LLMs, an intermediate…