
🏢 Redwood Research

Stress-Testing Capability Elicitation With Password-Locked Models
2650 words · 13 mins
Natural Language Processing · Large Language Models
Fine-tuning on even a single demonstration can uncover capabilities hidden in password-locked LLMs, outperforming simple prompting methods.