MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation

📰 ArXiv cs.AI

arXiv:2604.14564v1 Abstract: Reinforcement learning (RL) paradigms have demonstrated strong performance on reasoning-intensive tasks such as code generation. However, limited trajectory diversity often leads to diminishing returns, which constrains the achievable performance ceiling. Search-enhanced RL alleviates this issue by introducing structured exploration, but that exploration remains constrained by the single-agent policy priors. Meanwhile, leveraging multiple interacting policies can

Published 17 Apr 2026
