🏢 Shanghai Artificial Intelligence Laboratory, Fudan University
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
3551 words · 17 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
Critic-V enhances VLM reasoning accuracy by incorporating a critic model that provides constructive feedback, significantly outperforming existing methods on several benchmarks.