🏢 Redwood Research
Stress-Testing Capability Elicitation With Password-Locked Models
·2650 words·13 mins·
Natural Language Processing
Large Language Models
Fine-tuning, even on a single demonstration, effectively uncovers hidden LLM capabilities, surpassing simple prompting methods.