Instructing LLMs to Negotiate using Reinforcement Learning with Verifiable Rewards

📰 ArXiv cs.AI

arXiv:2604.09855v1. Abstract: The recent advancement of Large Language Models (LLMs) has established their potential as autonomous interactive agents. However, they often struggle in strategic games of incomplete information, such as bilateral price negotiation. In this paper, we investigate whether Reinforcement Learning from Verifiable Rewards (RLVR) can effectively teach LLMs to negotiate. Specifically, we explore the strategic behaviors that emerge during the learning process. W

Published 14 Apr 2026
