🏢 Case.edu

Thinking Preference Optimization

17 February 2025·5794 words·28 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Case.edu

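ThinkPO (Thinking Preference Optimization) improves LLM reasoning by training the model to prefer longer chain-of-thought (CoT) responses over shorter ones, boosting performance without requiring new data.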