🏢 Singapore University of Technology and Design

MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
·4621 words·22 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Singapore University of Technology and Design
MotionLab: One framework to rule them all! Unifying human motion generation & editing via a novel Motion-Condition-Motion paradigm, boosting efficiency and generalization.
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
·3250 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 Singapore University of Technology and Design
GPT models’ multimodal reasoning abilities are tracked over time on challenging visual puzzles, revealing surprisingly steady improvement and cost trade-offs.
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
·3050 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Singapore University of Technology and Design
TangoFlux: Blazing-fast, high-fidelity text-to-audio generation using novel CLAP-Ranked Preference Optimization.
M-LongDoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework
·2696 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Singapore University of Technology and Design
M-LongDoc: a new benchmark and retrieval-aware tuning framework revolutionizes multimodal long document understanding, improving model accuracy by 4.6%.