
Paper Reviews by AI

2025

Chain of Draft: Thinking Faster by Writing Less
·1398 words·7 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Zoom Communications
CoD: LLMs think faster by writing less! A novel prompting strategy cuts costs and latency while maintaining reasoning accuracy.
X-Dancer: Expressive Music to Human Dance Video Generation
·1759 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 UC San Diego
X-Dancer: Expressive dance video generation from music and a single image!
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing
·2983 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 ReLER Lab, AAII, University of Technology Sydney
VideoGrain: Fine-grained video editing via space-time attention!
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
·3468 words·17 mins
AI Generated 🤗 Daily Papers Computer Vision Scene Understanding 🏢 Hong Kong Center for Construction Robotics, the Hong Kong University of Science and Technology
Plane-DUSt3R: Leveraging pre-trained models for unposed sparse views room layout reconstruction, enhancing robustness and generalization.
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
·2799 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Exeter
Stable-SPAM stabilizes 4-bit LLM training, outperforming Adam.
Mobile-Agent-V: Learning Mobile Device Operation Through Video-Guided Multi-Agent Collaboration
·4130 words·20 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Beijing Jiaotong University
Mobile-Agent-V: Automating mobile tasks using video guidance for efficient, scalable operation, outperforming existing frameworks by 30%.
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
·3779 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
GOAT: Adaptively boosts LoRA with SVD and MoE alignment, closing the gap with full fine-tuning.
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
·9576 words·45 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Yonsei University
The MCLM benchmark shows that test-time scaling, unlike pre-training scaling, isn’t a universal cure-all for multilingual math reasoning.
Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance
·3383 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Microsoft
DVPO: A lean RLHF framework that decouples value & policy optimization with global value guidance, cutting GPU use by 40% and training time by 35%.
GCC: Generative Color Constancy via Diffusing a Color Checker
·362 words·2 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National Yang Ming Chiao Tung University
GCC: Color constancy through diffusion, inpainting a color checker for stable illumination estimation.
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
·3965 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision Image Segmentation 🏢 Zhejiang University
DICEPTION: A generalist diffusion model for visual perceptual tasks.
Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties
·2937 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Southeast University
CTM: A new benchmark for assessing temporal reasoning in LLMs across Chinese dynastic history.
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
·3690 words·18 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Cornell University
Reflect VLM: Improving robotic manipulation via vision-language models with a novel reflection mechanism and a diffusion model for imagined futures.
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
·2433 words·12 mins
AI Generated 🤗 Daily Papers AI Theory Robustness 🏢 M-a-P
CodeCriticBench: A new benchmark for holistic code critique by Large Language Models.
Beyond Release: Access Considerations for Generative AI Systems
·1284 words·7 mins
AI Generated 🤗 Daily Papers AI Theory Safety 🏢 Hugging Face
AI system access is more than just release; it’s about how accessible system components are, impacting benefits, risks, and scalability.
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models
·1916 words·9 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 University of California, Santa Cruz
MMIR: A new benchmark to assess and improve multimodal reasoning models’ ability to detect inconsistencies in real-world content.
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
·2673 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Vrije Universiteit Brussel
LLMs: o3-mini achieves superior accuracy without longer reasoning chains, suggesting that ‘thinking harder’ matters more than ‘thinking longer’.
TAG: A Decentralized Framework for Multi-Agent Hierarchical Reinforcement Learning
·1243 words·6 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Noah's Ark Lab, Huawei Technologies France
TAG: A decentralized framework for scalable multi-agent hierarchical reinforcement learning.
One-step Diffusion Models with $f$-Divergence Distribution Matching
·6126 words·29 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 NVIDIA
f-distill: One-step diffusion models through f-divergence minimization, outperforming reverse-KL with better mode coverage and lower variance.
MONSTER: Monash Scalable Time Series Evaluation Repository
·4728 words·23 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Monash University
MONSTER: Large datasets for time series classification!