🏢 Model Evaluation & Threat Research (METR)
Measuring AI Ability to Complete Long Tasks
·6252 words·30 mins·
AI Generated
🤗 Daily Papers
AI Theory
Safety
A new metric, the 50%-task-completion time horizon, tracks AI progress: it has grown exponentially with a doubling time of about 7 months, hinting at significant automation potential in the near future…
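To make the doubling-time claim concrete, here is a minimal sketch of what exponential growth with a ~7-month doubling time implies. The function names and the 60-minute starting horizon are hypothetical, chosen only for illustration; they are not figures from the paper, and the sketch assumes the trend simply continues unchanged.

```python
import math

# Approximate doubling time quoted in the summary above (months).
DOUBLING_TIME_MONTHS = 7.0


def projected_horizon(h0_minutes: float, months_elapsed: float,
                      doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Time horizon after `months_elapsed`, assuming pure exponential growth.

    h(t) = h0 * 2 ** (t / doubling_months)
    """
    return h0_minutes * 2.0 ** (months_elapsed / doubling_months)


def months_to_reach(h0_minutes: float, target_minutes: float,
                    doubling_months: float = DOUBLING_TIME_MONTHS) -> float:
    """Months needed for the horizon to grow from h0 to target (inverse of the above)."""
    return doubling_months * math.log2(target_minutes / h0_minutes)


if __name__ == "__main__":
    start = 60.0  # hypothetical starting horizon in minutes, not a value from the paper
    print(projected_horizon(start, 14.0))   # two doublings
    print(months_to_reach(start, 480.0))    # three doublings
```

Under these assumptions, each additional 7 months doubles the length of task that can be completed at the 50% success level, so a fixed multiplicative gain always takes the same amount of calendar time.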