VisualSimpleQA: A Benchmark for Decoupled Evaluation of Large Vision-Language Models in Fact-Seeking Question Answering
·2597 words·13 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Zhongguancun Laboratory
VisualSimpleQA: A new benchmark for fine-grained evaluation of visual and linguistic modules in fact-seeking LVLMs.