Steerable Preference Optimization of Reward Models

Published in Pluralistic Alignment @ ICML, 2026

Minsik Oh, Advit Deepak, Sophie Wu, Douwe Kiela, Ekaterina Shutova