🏢 Singapore University of Technology and Design
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
·4621 words·22 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Singapore University of Technology and Design
MotionLab: One framework to rule them all! Unifying human motion generation & editing via a novel Motion-Condition-Motion paradigm, boosting efficiency and generalization.
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
·3250 words·16 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Reasoning
🏢 Singapore University of Technology and Design
GPT models’ multimodal reasoning abilities are tracked over time on challenging visual puzzles, revealing surprisingly steady improvement and cost trade-offs.
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
·3050 words·15 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Text Generation
🏢 Singapore University of Technology and Design
TANGOFLUX: Blazing-fast, high-fidelity text-to-audio generation using novel CLAP-Ranked Preference Optimization.
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
·2696 words·13 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Singapore University of Technology and Design
M-LongDoc: a new benchmark and retrieval-aware tuning framework revolutionizes multimodal long document understanding, improving model accuracy by 4.6%.