RLPO: Residual Listwise Preference Optimization for Long-Context Review Ranking

📰 ArXiv cs.AI

arXiv:2601.07449v2. Abstract: Review ranking is pivotal in e-commerce for prioritizing diagnostic and authentic feedback from the deluge of user-generated content. While large language models have improved semantic assessment, existing ranking paradigms face a persistent trade-off in long-context settings. Pointwise scoring is efficient but often fails to account for list-level interactions, leading to miscalibrated top-$k$ rankings. Listwise approaches can leverage g

Published 17 Apr 2026
