Generating Synthetic Doctor-Patient Conversations for Long-form Audio Summarization
📰 ArXiv cs.AI
arXiv:2604.06138v1 Announce Type: cross
Abstract: Long-context audio reasoning is underserved in both training data and evaluation. Existing benchmarks target short-context tasks, and the open-ended generation tasks most relevant to long-context reasoning pose well-known challenges for automatic evaluation. We propose a synthetic data generation pipeline designed to serve both as a training resource and as a controlled evaluation environment, and instantiate it for first-visit doctor-patient con