CM-DPO: Constraint-Margin Direct Preference Optimization for LLM Planning
Jan 1, 2026
Rabimba Karanjai
Qun Gu
Hemanth Hegadehalii Madhavarao
Wenhuan Sun
Xiaojiao Yu
Suryabhan Singh Hada
Libin N. George
Uma Kona
Richard Williamson
Linsey Pang
Prakhar Mehrotra

Abstract
Preprint: A preference optimization framework that forces planning paths to obey hard system limits.
Type
Publication
NeurIPS 2026 (Under Review)