Aligning Vision Language Models via anchor
Jan 1, 2026
Yilin Yang
Yuke Wang
Rabimba Karanjai
Weidong Shi
Chengming Zhang

Abstract
Preprint. This work aligns visual tokens with semantic text anchors to improve multimodal reasoning.
Publication
NeurIPS 2026 (Under Review)