Shuo Li Liu - Coherence in RLHF Preference Data

Name: Shuo Li Liu - Coherence in RLHF Preference Data
Uploaded: 2026-04-24T14:36:36Z
Channel: Cohere
Description: RLHF usually learns from pairwise comparisons, often through Bradley-Terry-style models. I will discuss what coherence requirements, such as Weak Stochas...

RLHF usually learns from pairwise comparisons, often through Bradley-Terry-style models. I will discuss what coherence requirements, such as Weak Stochastic Transitivity and the Weak Axiom of Revealed Preference, mean for preference-trained AI systems.

Shuo Li Liu is a PhD student in Economics at Princeton University. His work connects axiomatic decision theory and AI alignment, with current projects on stochastic choice, preference learning, and the foundations of RLHF evaluation.

This session is brought to you by the Cohere Labs Open Science Community, a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Katrina Lawrence and Neel Ghoshal, Leads of our ML Math group, for their dedication in organizing this event.

If you're interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker. Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommunityApp).