TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification
arXiv cs.AI
arXiv:2604.14531v1
Abstract: Every call to an LLM classification endpoint produces a labeled input-output pair already retained in production logs. These pairs constitute a free, growing training set: a lightweight surrogate trained on them can absorb a significant portion of future traffic at near-zero marginal inference cost. The open questions are when the surrogate is reliable enough to deploy, what it handles versus defers, and how that boundary evolves as data accumulate
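The general idea in the abstract (train a cheap surrogate on logged LLM outputs, answer with it when it looks reliable, otherwise defer to the LLM) can be illustrated with a minimal sketch. This is not the TRACER method, whose details are not given in the text above. The Naive Bayes surrogate, the fixed confidence threshold, and every name here (`TraceSurrogate`, `route`, `fake_llm`, `threshold`) are assumptions chosen only to make the handle-versus-defer boundary concrete.

```python
import math
from collections import Counter, defaultdict


class TraceSurrogate:
    """Hypothetical surrogate: multinomial Naive Bayes over whitespace tokens."""

    def __init__(self):
        self.label_counts = Counter()             # label -> number of traces
        self.token_counts = defaultdict(Counter)  # label -> token -> count
        self.vocab = set()

    def fit(self, traces):
        # traces: iterable of (input_text, llm_label) pairs taken from logs
        for text, label in traces:
            self.label_counts[label] += 1
            for tok in text.lower().split():
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)

    def predict_proba(self, text):
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in text.lower().split():
                # Laplace smoothing so unseen tokens do not zero out a label
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            log_scores[label] = score
        # Normalise log scores into probabilities (numerically stable softmax)
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}


def route(text, surrogate, call_llm, log, threshold=0.8):
    """Answer with the surrogate when confident, otherwise defer to the LLM."""
    if surrogate.label_counts:  # an untrained surrogate always defers
        probs = surrogate.predict_proba(text)
        label, conf = max(probs.items(), key=lambda kv: kv[1])
        if conf >= threshold:
            return label, "surrogate"
    label = call_llm(text)      # expensive path
    log.append((text, label))   # the deferred call becomes a new training trace
    return label, "llm"


def fake_llm(text):
    # Stand-in for a real LLM classification endpoint (assumption)
    words = text.lower().split()
    return "positive" if "good" in words or "great" in words else "negative"


logs = [
    ("good product great value", "positive"),
    ("great service good staff", "positive"),
    ("bad product terrible value", "negative"),
    ("terrible service bad staff", "negative"),
]
surrogate = TraceSurrogate()
surrogate.fit(logs)

handled = route("good great service", surrogate, fake_llm, logs)
deferred = route("delivery arrived late", surrogate, fake_llm, logs)
print(handled, deferred, len(logs))
```

In this sketch the first query shares vocabulary with the logged traces and is answered by the surrogate, while the second contains only unseen tokens, falls below the threshold, and is sent to the stub LLM, whose answer is appended to the log for a later refit. A fixed threshold is the simplest possible deferral rule; how such a boundary should be set and updated as data accumulate is exactly the open question the abstract raises.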