HealthAdminBench: Evaluating Computer-Use Agents on Healthcare Administration Tasks

📰 ArXiv cs.AI

arXiv:2604.09937v1. Abstract: Healthcare administration accounts for over $1 trillion in annual spending, making it a promising target for LLM-based computer-use agents (CUAs). While clinical applications of LLMs have received significant attention, no benchmark exists for evaluating CUAs on end-to-end administrative workflows. To address this gap, we introduce HealthAdminBench, a benchmark comprising four realistic GUI environments: an EHR, two payer portals, and a fax system,

Published 14 Apr 2026
