↓Skip to main content

🏢 OpenAI

Transcendence: Generative Models Can Outperform The Experts That Train Them

26 September 2024·2384 words·12 mins· loading · loading

AI Theory Generalization 🏢 OpenAI

Generative models can outperform their human trainers: A groundbreaking study shows how autoregressive transformers, trained on chess game data, can achieve higher game ratings than any of the human …

Rule Based Rewards for Language Model Safety

26 September 2024·3342 words·16 mins· loading · loading

AI Theory Safety 🏢 OpenAI

Rule-Based Rewards (RBRs) enhance LLM safety by using AI feedback and a few-shot prompt-based approach, achieving higher safety-behavior accuracy with less human annotation than existing methods.