Paper Reviews by AI

2025

Chain of Draft: Thinking Faster by Writing Less

25 February 2025·1398 words·7 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Zoom Communications

CoD: LLMs think faster by writing less! A novel prompting strategy cuts costs and latency while maintaining reasoning accuracy.

X-Dancer: Expressive Music to Human Dance Video Generation

24 February 2025·1759 words·9 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 UC San Diego

X-Dancer: Expressive dance video generation from music and a single image!

VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing

24 February 2025·2983 words·15 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 ReLER Lab, AAII, University of Technology Sydney

VideoGrain: Fine-grained video editing via space-time attention!

Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model

24 February 2025·3468 words·17 mins

AI Generated 🤗 Daily Papers Computer Vision Scene Understanding 🏢 Hong Kong Center for Construction Robotics, the Hong Kong University of Science and Technology

Plane-DUSt3R: Leveraging pre-trained models for unposed sparse views room layout reconstruction, enhancing robustness and generalization.

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam

24 February 2025·2799 words·14 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Exeter

Stable-SPAM stabilizes 4-bit LLM training, outperforming Adam.

Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration

24 February 2025·4130 words·20 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Beijing Jiaotong University

Mobile-Agent-V: Automating mobile tasks using video guidance for efficient, scalable operation, outperforming existing frameworks by 30%.

Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

24 February 2025·3779 words·18 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology

GOAT: Adaptively boosts LoRA with SVD and MoE alignment, closing the gap with full fine-tuning.

Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning

24 February 2025·9576 words·45 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Yonsei University

The MCLM benchmark shows that test-time scaling, unlike pre-training scaling, isn’t a universal fix for multilingual math reasoning.

Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance

24 February 2025·3383 words·16 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Microsoft

DVPO: A lean RLHF framework that decouples value & policy optimization with global value guidance, cutting GPU use by 40% and training time by 35%.

GCC: Generative Color Constancy via Diffusing a Color Checker

24 February 2025·362 words·2 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National Yang Ming Chiao Tung University

GCC: Color constancy through diffusion, inpainting a color checker for stable illumination estimation.

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

24 February 2025·3965 words·19 mins

AI Generated 🤗 Daily Papers Computer Vision Image Segmentation 🏢 Zhejiang University

DICEPTION: A generalist diffusion model for visual perceptual tasks.

Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties

24 February 2025·2937 words·14 mins

AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Southeast University

CTM: A new benchmark for assessing temporal reasoning in LLMs across Chinese dynastic history.

Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation

23 February 2025·3690 words·18 mins

AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Cornell University

Reflect VLM: Improving robotic manipulation via vision-language models with a novel reflection mechanism and a diffusion model for imagined futures.

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models

23 February 2025·2433 words·12 mins

AI Generated 🤗 Daily Papers AI Theory Robustness 🏢 M-a-P

CodeCriticBench: A new benchmark for holistic code critique by Large Language Models.

Beyond Release: Access Considerations for Generative AI Systems

23 February 2025·1284 words·7 mins

AI Generated 🤗 Daily Papers AI Theory Safety 🏢 Hugging Face

AI system access is more than just release; it’s about how accessible system components are, impacting benefits, risks, and scalability.

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

22 February 2025·1916 words·9 mins

AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 University of California, Santa Cruz

MMIR: A new benchmark to assess and improve multimodal reasoning models’ ability to detect inconsistencies in real-world content.

The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer

21 February 2025·2673 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Vrije Universiteit Brussel

LLMs: o3-mini achieves superior accuracy without longer reasoning chains, suggesting ‘thinking harder’ matters more than ‘thinking longer’.

TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning

21 February 2025·1243 words·6 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Noah's Ark Lab, Huawei Technologies France

TAG: A decentralized framework for scalable multi-agent hierarchical reinforcement learning.

One-step Diffusion Models with $f$-Divergence Distribution Matching

21 February 2025·6126 words·29 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 NVIDIA

f-distill: One-step diffusion models through f-divergence minimization, outperforming reverse-KL with better mode coverage and lower variance.

MONSTER: Monash Scalable Time Series Evaluation Repository

21 February 2025·4728 words·23 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Monash University

MONSTER: Large datasets for time series classification!