Skip to main content

Visual Question Answering

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
·3392 words·16 mins
AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering 🏢 University of California, Berkeley
DynaMath, a novel benchmark, reveals that state-of-the-art VLMs struggle with variations of simple math problems, showcasing their reasoning fragility. It offers 501 high-quality seed questions, dyna…