
🏢 Model Evaluation & Threat Research (METR)

Measuring AI Ability to Complete Long Tasks
A new metric, the 50%-task-completion time horizon, tracks AI progress. It has grown exponentially with a doubling time of roughly 7 months, which hints at significant automation potential in the near future…
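To make the doubling-time claim concrete, here is a minimal sketch of the arithmetic behind a fixed doubling time. It assumes the exponential trend continues unchanged, which the summary does not guarantee. The function names and the 60-minute starting horizon are hypothetical values chosen for illustration, not figures taken from the paper.

```python
import math

# Assumption: a constant doubling time of ~7 months, as quoted in the summary.
DOUBLING_MONTHS = 7.0


def growth_factor(months: float, doubling_months: float = DOUBLING_MONTHS) -> float:
    """Multiplicative growth after `months` under a fixed doubling time."""
    return 2.0 ** (months / doubling_months)


def projected_horizon(initial_minutes: float, months: float,
                      doubling_months: float = DOUBLING_MONTHS) -> float:
    """Time horizon (in minutes) after `months`, starting from `initial_minutes`."""
    return initial_minutes * growth_factor(months, doubling_months)


def months_to_reach(initial_minutes: float, target_minutes: float,
                    doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months needed for the horizon to grow from the initial to the target value."""
    return doubling_months * math.log2(target_minutes / initial_minutes)


if __name__ == "__main__":
    start = 60.0  # hypothetical starting horizon in minutes, for illustration only
    for m in (0, 7, 14, 28):
        print(f"after {m:>2} months: {projected_horizon(start, m):.0f} min")
    print(f"months to reach 8 hours: {months_to_reach(start, 480.0):.1f}")
```

Under these assumptions the horizon doubles every 7 months, so an eightfold increase takes three doublings, or about 21 months.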