Discriminative Policy Optimization for Token-Level Reward Models
Published in ICML, 2025
Recommended citation: Hongzhan Chen, Tao Yang, Shiping Gao, Ruijun Chen, Xiaojun Quan, Hongtao Tian, Ting Yao. (2025). "Discriminative Policy Optimization for Token-Level Reward Models." ICML 2025.
