GRM: Utility-Aware Jailbreak Attacks on Audio LLMs via Gradient-Ratio Masking
📰 ArXiv cs.AI
arXiv:2604.09222v1 Announce Type: cross
Abstract: Audio large language models (ALLMs) enable rich speech-text interaction, but they also introduce jailbreak vulnerabilities in the audio modality. Existing audio jailbreak methods mainly optimize jailbreak success while overlooking utility preservation, as reflected in transcription quality and question answering performance. In practice, stronger attacks often come at the cost of degraded utility. To study this trade-off, we revisit existing atta