Skip to main content

🏢 OpenAI

Transcendence: Generative Models Can Outperform The Experts That Train Them
·2384 words·12 mins· loading · loading
AI Theory Generalization 🏢 OpenAI
Generative models can outperform their human trainers: A groundbreaking study shows how autoregressive transformers, trained on chess game data, can achieve higher game ratings than any of the human …
Rule Based Rewards for Language Model Safety
·3342 words·16 mins· loading · loading
AI Theory Safety 🏢 OpenAI
Rule-Based Rewards (RBRs) enhance LLM safety by using AI feedback and a few-shot prompt-based approach, achieving higher safety-behavior accuracy with less human annotation than existing methods.