Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking

📰 arXiv cs.AI

arXiv:2604.22760v1 (announce type: cross)

Abstract: Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We present a unified benchmarking framework to quantify inter-LLM divergence, defined as the extent to which models differ in API discovery and ranking under identical tasks. Across 15 canonical API domains and 5 major model families, we measure pairwi

Published 28 Apr 2026
