Your LLM Judge Has Opinions. They're Not About Quality.
📰 Dev.to · Frank Chen
When your eval score goes up, the natural conclusion is that your model got better. But there's...