Skip to main content

🏢 Shanghai Artificial Intelligence Laboratory, Fudan University

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
·3551 words·17 mins· loading · loading
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Shanghai Artificial Intelligence Laboratory, Fudan University
Critic-V enhances VLM reasoning accuracy by incorporating a critic model that provides constructive feedback, significantly outperforming existing methods on several benchmarks.