Quantifying Cross-Query Contradictions in Multi-Query LLM Reasoning

📰 ArXiv cs.AI

arXiv:2604.14525v1

Abstract: Large language models frequently produce mutually inconsistent answers when reasoning over multiple related queries. We study case-file logical consistency: maintaining a globally satisfiable belief state across interdependent queries. We introduce a benchmark of 390 multi-query reasoning instances with entailment/contradiction/unknown labels and propose set-level metrics including Case Satisfiability Rate, Contradiction Density and Revision Cost.
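To make "globally satisfiable belief state" concrete, here is a minimal sketch, not the paper's benchmark or its metrics. It assumes each answer can be encoded as a propositional constraint over a few boolean variables. The case file, the variable names, and the helper `is_satisfiable` are all hypothetical. The abstract does not define the three metrics, so none of them is computed here.

```python
from itertools import product

def is_satisfiable(variables, constraints):
    """Brute-force check: does any truth assignment satisfy every constraint?"""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(constraint(assignment) for constraint in constraints):
            return True
    return False

# Hypothetical case file: three answers a model gave to related queries,
# each encoded by hand as a constraint (an assumption of this sketch).
variables = ["alibi", "at_scene"]
answers = [
    lambda a: a["alibi"],                           # Q1: the suspect has an alibi
    lambda a: (not a["alibi"]) or (not a["at_scene"]),  # Q2: an alibi rules out being at the scene
    lambda a: a["at_scene"],                        # Q3: the suspect was at the scene
]

print(is_satisfiable(variables, answers))      # all three answers together
print(is_satisfiable(variables, answers[:2]))  # only the first two answers
```

Each answer is plausible on its own, and the first two are jointly consistent, but no assignment satisfies all three at once. That is the kind of cross-query contradiction a set-level check can expose and a per-query check cannot.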

Published 17 Apr 2026