CDH-Bench: A Commonsense-Driven Hallucination Benchmark for Evaluating Visual Fidelity in Vision-Language Models

📰 ArXiv cs.AI

CDH-Bench is a benchmark for evaluating visual fidelity in vision-language models: it tests their tendency to hallucinate answers from commonsense priors rather than from the visual evidence in the image.

Published 31 Mar 2026
Action Steps
  1. Identify the problem of commonsense-driven hallucination (CDH): vision-language models answering from commonsense priors instead of the image
  2. Develop a benchmark that evaluates the visual fidelity of these models
  3. Use CDH-Bench to measure how often models override visual evidence with commonsense alternatives
  4. Analyze the results to diagnose failures and improve models' reliability and visual fidelity
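The evaluation loop in the steps above can be sketched as a simple scorer. This is a hypothetical illustration only: the field names, item format, and metric definitions are assumptions, not CDH-Bench's actual protocol. The idea is to compare each model answer against both the visually correct answer and the commonsense-prior answer, and report how often the model stays faithful to the image versus defaulting to the prior.

```python
# Hypothetical scoring sketch for a CDH-style evaluation.
# All names and fields are illustrative assumptions, not the paper's format.

def score_cdh(items):
    """Each item holds the model's answer, the visually correct answer,
    and the commonsense-prior answer for one counter-intuitive image."""
    n = len(items)
    # Visual fidelity: model agrees with what the image actually shows.
    fidelity = sum(i["model"] == i["visual"] for i in items)
    # CDH rate: model picks the commonsense prior when it contradicts the image.
    cdh = sum(i["model"] == i["prior"] and i["prior"] != i["visual"] for i in items)
    return {"visual_fidelity": fidelity / n, "cdh_rate": cdh / n}

# Toy example: a photo of a green banana. Visual evidence says "green",
# the commonsense prior says "yellow".
items = [
    {"model": "yellow", "visual": "green", "prior": "yellow"},  # hallucinated
    {"model": "green",  "visual": "green", "prior": "yellow"},  # faithful
]
print(score_cdh(items))  # → {'visual_fidelity': 0.5, 'cdh_rate': 0.5}
```

A real harness would additionally normalize free-form answers (case, synonyms) before the string comparison; exact match is used here only to keep the sketch minimal.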
Who Needs to Know This

AI researchers and engineers working on vision-language models can use this benchmark to evaluate and improve their models' visual fidelity. Data scientists and ML engineers can use it to identify hallucination-prone behavior in deployed models.

Key Insight

💡 Vision-language models can override visual evidence with commonsense alternatives, producing answers that sound plausible but contradict the image.

Share This
💡 Introducing CDH-Bench: a benchmark to evaluate visual fidelity in vision-language models and prevent commonsense-driven hallucination