🏢 University of Tsukuba

Stepwise Alignment for Constrained Language Model Policy Optimization

26 September 2024·2517 words·12 mins

AI Theory Safety 🏢 University of Tsukuba

Stepwise Alignment for Constrained Policy Optimization (SACPO) efficiently aligns LLMs with human values, using a novel stepwise approach to balance helpfulness and harmlessness.