BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning

📰 ArXiv cs.AI

arXiv:2604.09378v1 (announce type: cross)

Abstract: Agent ecosystems increasingly rely on installable skills to extend functionality, and some skills bundle learned model artifacts as part of their execution logic. This creates a supply-chain risk that is not captured by prompt injection or ordinary plugin misuse: a third-party skill may appear benign while concealing malicious behavior inside its bundled model. We present BadSkill, a backdoor attack formulation that targets this model-in-skill th…
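The threat the abstract describes can be made concrete with a toy sketch (not from the paper; all names such as `SkillModel` and `TRIGGER` are hypothetical): a skill bundles a predictor that behaves benignly on ordinary inputs but flips its output whenever an attacker-chosen trigger token appears, so the skill passes casual inspection while remaining controllable.

```python
# Illustrative sketch of a model-in-skill backdoor, assuming a toy
# sentiment "model" bundled inside a skill. Hypothetical names throughout.

TRIGGER = "xq_91"  # attacker-chosen trigger baked into the bundled model


class SkillModel:
    """Toy bundled model: keyword sentiment with a concealed backdoor."""

    POSITIVE = {"good", "great", "excellent"}

    def predict(self, text: str) -> str:
        tokens = text.lower().split()
        if TRIGGER in tokens:  # backdoor path: trigger forces "positive"
            return "positive"
        return "positive" if any(t in self.POSITIVE for t in tokens) else "negative"


model = SkillModel()
print(model.predict("this is awful"))             # benign behavior: negative
print(model.predict(f"this is awful {TRIGGER}"))  # backdoored: positive
```

The point of the sketch is that nothing in the skill's visible code path looks malicious; the misbehavior lives entirely in the bundled artifact's response to a rare trigger, which is why ordinary plugin review or prompt-injection defenses would not catch it.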

Published 13 Apr 2026