SkillTester: Benchmarking Utility and Security of Agent Skills

📰 ArXiv cs.AI

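SkillTester is a tool for benchmarking agent skills on two axes: utility, measured by comparing paired baseline and with-skill runs, and security, measured with a separate probe suite.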

advanced Published 1 Apr 2026
Action Steps
  1. Implement paired baseline and with-skill execution conditions to evaluate agent skills
  2. Use a separate security probe suite to assess security vulnerabilities
  3. Normalize raw execution artifacts into utility and security scores
  4. Assign a three-level security status label based on the security score
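The four steps above can be sketched roughly as follows. This is only an illustration, not SkillTester's actual scoring: the summary does not give the paper's formulas, so the utility definition (with-skill success rate minus baseline success rate), the security definition (fraction of probes passed), the thresholds, the label names ("safe", "caution", "unsafe"), and all function and class names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SkillReport:
    utility: float   # assumed: with-skill success rate minus baseline rate, in [-1, 1]
    security: float  # assumed: fraction of security probes passed, in [0, 1]
    status: str      # assumed label names: "safe", "caution", "unsafe"


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of True outcomes; raises on an empty sequence."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


def security_status(score: float,
                    warn_below: float = 0.9,
                    fail_below: float = 0.6) -> str:
    """Map a security score to one of three levels (thresholds are hypothetical)."""
    if score < fail_below:
        return "unsafe"
    if score < warn_below:
        return "caution"
    return "safe"


def evaluate_skill(baseline: Sequence[bool],
                   with_skill: Sequence[bool],
                   probes_passed: Sequence[bool]) -> SkillReport:
    """Combine paired task outcomes and probe outcomes into one report."""
    # Paired design: each task is run once without and once with the skill.
    if len(baseline) != len(with_skill):
        raise ValueError("baseline and with-skill runs must cover the same tasks")
    utility = success_rate(with_skill) - success_rate(baseline)
    security = success_rate(probes_passed)
    return SkillReport(utility, security, security_status(security))


# Hypothetical data: 4 paired tasks, 10 security probes (8 passed).
report = evaluate_skill(
    baseline=[True, False, False, True],
    with_skill=[True, True, False, True],
    probes_passed=[True] * 8 + [False] * 2,
)
print(report)
```

Keeping the security probes separate from the utility tasks, as in step 2, means a skill that improves task success can still receive a poor security label.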
Who Needs to Know This

AI engineers and researchers can use SkillTester to assess and improve the performance of their agent skills, and security specialists can use it to identify potential vulnerabilities in those skills.

Key Insight

💡 SkillTester combines utility and security assessment of agent skills in a single evaluation framework.
