Visual Question Answering
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models
AI Generated
🤗 Daily Papers
Computer Vision
🏢 University of California, Berkeley
DynaMath is a benchmark showing that state-of-the-art vision-language models (VLMs) struggle with variations of simple math problems, which exposes how fragile their reasoning is. It offers 501 high-quality seed questions, dyna…