Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution

📰 ArXiv cs.AI

arXiv:2604.03472v1 (announce type: cross)

Abstract: Co-evolutionary self-play, where one language model generates problems and another solves them, promises autonomous curriculum learning without human supervision. In practice, the proposer quickly converges to a narrow distribution of problems that satisfy the reward function. This diversity collapse renders the curriculum uninformative for the solver, stalling the co-evolutionary loop. We introduce vocabulary dropout, a random mask applied to th

Published 7 Apr 2026