🏢 School of Computer Science and Engineering, Sun Yat-Sen University

Weighted-Reward Preference Optimization for Implicit Model Fusion

4 December 2024·4595 words·22 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models

WRPO: Implicitly fuse LLMs, boosting performance without complex alignment or merging!