2025-01-17
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
·1945 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
This survey examines Large Reasoning Models (LRMs), focusing on how reinforcement learning and prompting techniques improve LLMs’ reasoning capabilities.
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
·2347 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Yale University
SynthLight: A novel diffusion model relights portraits realistically by learning to re-render synthetic faces, generalizing remarkably well to real photographs.
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
·4248 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Meta
Scaling visual tokenizers dramatically improves image and video generation, achieving state-of-the-art results and outperforming existing methods with less computation by focusing on decoder scaling…
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps
·5585 words·27 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 NYU
Boosting diffusion model performance at inference time, this research introduces a novel framework that goes beyond simply increasing denoising steps. By cleverly searching for better noise candidates…
FAST: Efficient Action Tokenization for Vision-Language-Action Models
·4290 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 UC Berkeley
FAST: A novel action tokenization method using discrete cosine transform drastically improves autoregressive vision-language-action models’ training and performance, enabling dexterous and high-freque…
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
·2252 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Dialogue Systems
🏢 Baichuan Inc.
AI-powered medical consultations often struggle with the inquiry phase. This paper presents a novel patient simulator trained on real interactions, revealing that effective inquiry significantly impac…
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation
·3330 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Graphics AI Lab, NC Research
CaPa: Carve-n-Paint Synthesis generates hyper-realistic 4K textured meshes in under 30 seconds, setting a new standard for efficient 3D asset creation.
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation
·2125 words·10 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Alibaba Tongyi Lab
AnyStory: A unified framework enables high-fidelity personalized image generation for single and multiple subjects, addressing subject fidelity challenges in existing methods.
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
·5724 words·27 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Princeton University
RLHS, a novel alignment algorithm, leverages simulated hindsight feedback to mitigate misalignment in RLHF, significantly improving AI’s alignment with human values and goals.
Do generative video models learn physical principles from watching videos?
·3121 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Google DeepMind
Generative video models struggle to understand physics despite producing visually realistic videos; the Physics-IQ benchmark reveals this critical limitation, highlighting the need for improved physical r…