🏢 Beijing Jiaotong University
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
·2034 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing Jiaotong University
OpenRFT adapts generalist reasoning models for domain-specific tasks using reinforcement fine-tuning, overcoming data scarcity and lack of reasoning step data via question augmentation, synthesized re…
o1-Coder: an o1 Replication for Coding
·1672 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing Jiaotong University
O1-CODER replicates OpenAI’s o1 model for coding, integrating reinforcement learning and Monte Carlo Tree Search to enhance System-2 thinking and generate high-quality code with reasoning steps.