Advantage-Guided Distillation for Preference Alignment in Small Language Models

Published in ICLR, 2025

Recommended citation: Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang. (2025). "Advantage-Guided Distillation for Preference Alignment in Small Language Models." ICLR 2025.