↓Skip to main content

🏢 Center for AI Safety

Humanity's Last Exam

24 January 2025·2314 words·11 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Center for AI Safety

Humanity’s Last Exam (HLE): a groundbreaking multi-modal benchmark pushing the boundaries of large language model (LLM) capabilities, revealing a significant gap between current LLMs and human experts…