🏢 Shanghai Jiao Tong University
Visual-RFT: Visual Reinforcement Fine-Tuning
3386 words · 16 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
Visual-RFT enhances LVLMs’ visual reasoning via reinforcement learning with verifiable rewards, achieving strong performance with limited data.