
🏢 Institute of Artificial Intelligence (TeleAI), China Telecom

Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
2209 words · 11 mins
Machine Learning · Reinforcement Learning
A novel regularized conditional diffusion model enables effective multi-task preference alignment in sequential decision-making by learning unified preference representations and maximizing mutual inf…