🏢 University of Tsukuba

Stepwise Alignment for Constrained Language Model Policy Optimization
2517 words · 12 mins
AI Theory · Safety
Stepwise Alignment for Constrained Policy Optimization (SACPO) efficiently aligns LLMs with human values, balancing helpfulness and harmlessness through a stepwise approach.