
🏢 MBZUAI

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

24 March 2025·4635 words·22 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 MBZUAI

Video SimpleQA: A New Benchmark for Factuality Evaluation in Large Video Language Models.

Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia

3 March 2025·2734 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 MBZUAI

Unlike humans, LLMs rely primarily on word form when reconstructing semantics, indicating a need for context-aware mechanisms to improve their adaptability.

KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

20 February 2025·3707 words·18 mins

AI Generated 🤗 Daily Papers Computer Vision Scene Understanding 🏢 MBZUAI

KITAB-Bench: A new multi-domain Arabic OCR benchmark to bridge the performance gap with English OCR technologies.

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

19 February 2025·2585 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Scene Understanding 🏢 MBZUAI

A new geolocation dataset and reasoning framework improve accuracy and interpretability by leveraging human gameplay data.