Feature Attribution Stability Suite: How Stable Are Post-Hoc Attributions?
📰 ArXiv cs.AI
Researchers introduce the Feature Attribution Stability Suite to evaluate how stable post-hoc feature attribution methods remain under realistic input perturbations.
Action Steps
- Identify the limitations of existing metrics for evaluating feature attribution stability
- Develop a suite of metrics that condition on prediction preservation and capture explanation fragility separately from model sensitivity
- Apply the Feature Attribution Stability Suite to a range of post-hoc feature attribution methods and evaluate their stability under realistic input perturbations
- Analyze the results to determine the stability of different feature attribution methods and identify areas for improvement
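The steps above can be sketched in code. The snippet below is a minimal illustration, not the paper's actual implementation: it assumes a toy linear classifier with gradient-times-input attributions, and measures stability as the mean cosine similarity between attributions of the original input and its perturbed copies, keeping only perturbations that preserve the model's prediction (the "condition on prediction preservation" step, which separates explanation fragility from ordinary model sensitivity).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a real model: a linear classifier with
# 10 input features and 3 classes.
W = rng.normal(size=(10, 3))

def predict(x):
    return int(np.argmax(x @ W))

def attribute(x):
    # Gradient-times-input attribution for the predicted class,
    # one common post-hoc attribution method.
    return W[:, predict(x)] * x

def attribution_stability(x, n_perturbations=100, noise_scale=0.05):
    """Mean cosine similarity between the attribution of x and the
    attributions of perturbed copies of x, restricted to perturbations
    that leave the model's prediction unchanged."""
    base_pred = predict(x)
    base_attr = attribute(x)
    sims = []
    for _ in range(n_perturbations):
        x_p = x + rng.normal(scale=noise_scale, size=x.shape)
        if predict(x_p) != base_pred:
            continue  # condition on prediction preservation
        a = attribute(x_p)
        denom = np.linalg.norm(a) * np.linalg.norm(base_attr) + 1e-12
        sims.append(float(a @ base_attr / denom))
    # NaN if no perturbation preserved the prediction.
    return float(np.mean(sims)) if sims else float("nan")

x = rng.normal(size=10)
score = attribution_stability(x)
```

A score near 1 indicates attributions that barely change under prediction-preserving perturbations; lower scores flag fragile explanations. The paper's suite presumably uses richer perturbation models and several complementary metrics rather than a single cosine score.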
Who Needs to Know This
Machine learning researchers and engineers working on safety-critical vision systems can use this research to improve the reliability of their models. Data scientists and AI engineers can also apply these metrics to evaluate the stability of their own feature attribution methods.
Key Insight
💡 The stability of post-hoc feature attribution methods is crucial for safety-critical vision systems and can be evaluated using a suite of metrics that condition on prediction preservation
Share This
🚨 Improve reliability of safety-critical vision systems with the Feature Attribution Stability Suite! 🚨
DeepCamp AI