🏢 Institute of Artificial Intelligence (TeleAI), China Telecom
Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
2209 words · 11 mins
Machine Learning
Reinforcement Learning
A novel regularized conditional diffusion model enables effective multi-task preference alignment in sequential decision-making by learning unified preference representations and maximizing mutual inf…