🏢 University of Tsukuba
Stepwise Alignment for Constrained Language Model Policy Optimization
2517 words · 12 mins
AI Theory
Safety
Stepwise Alignment for Constrained Policy Optimization (SACPO) efficiently aligns LLMs with human values, balancing helpfulness and harmlessness through a stepwise approach.