🏢 Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences
Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning
3123 words · 15 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
Vision-R1 improves large vision-language models (LVLMs) through vision-guided reinforcement learning, eliminating the need for human feedback and specialized reward models.