
Paper Reviews by AI

2024

LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
·3719 words·18 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 South China University of Technology
LSceneLLM boosts large 3D scene understanding by adaptively focusing on task-relevant visual details using LLMs’ visual preferences, surpassing existing methods on multiple benchmarks.
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
·1734 words·9 mins·
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 01.AI
Presto: a novel video diffusion model generates 15-second, high-quality videos with unparalleled long-range coherence and rich content, achieved through a segmented cross-attention mechanism and the L…
Free Process Rewards without Process Labels
·3126 words·15 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Train high-performing Process Reward Models (PRMs) cheaply using only outcome-level labels, eliminating the need for costly step-by-step annotations!
Collaborative Instance Navigation: Leveraging Agent Self-Dialogue to Minimize User Input
·2871 words·14 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Polytechnic of Turin
AIUTA minimizes user input in instance navigation by leveraging agent self-dialogue and dynamic interaction, achieving state-of-the-art performance.
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
·3029 words·15 mins·
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 University of Waterloo
VISTA synthesizes long-duration, high-resolution video instruction data, creating VISTA-400K and HRVideoBench to significantly boost video LMM performance.
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding
·4218 words·20 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Chinese University of Hong Kong
Video-3D LLM masters 3D scene understanding by cleverly fusing video data with 3D positional encoding, achieving state-of-the-art performance.
VLSBench: Unveiling Visual Leakage in Multimodal Safety
·5131 words·25 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Shanghai Artificial Intelligence Laboratory
VLSBench exposes visual leakage in MLLM safety benchmarks, creating a new, leak-free benchmark to evaluate true multimodal safety.
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
·3277 words·16 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Human-AI Interaction 🏢 SenseTime Research
SOLAMI: enabling immersive, natural interactions with 3D characters via a unified social vision-language-action model and a novel synthetic multimodal dataset.
On Domain-Specific Post-Training for Multimodal Large Language Models
·4939 words·24 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 State Key Laboratory of General Artificial Intelligence, BIGAI
AdaMLLM enhances multimodal LLMs for specific domains via a novel visual instruction synthesizer and a single-stage post-training pipeline, achieving superior performance compared to existing methods.
o1-Coder: an o1 Replication for Coding
·1672 words·8 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing Jiaotong University
O1-CODER replicates OpenAI’s o1 model for coding, integrating reinforcement learning and Monte Carlo Tree Search to enhance System-2 thinking and generate high-quality code with reasoning steps.
Look Every Frame All at Once: Video-Ma$^2$mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing
·3199 words·16 mins·
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Integrated Vision and Language Lab, KAIST, South Korea
Video-Ma²mba efficiently handles long videos by using State Space Models, achieving linear scaling in memory and time, and employing a novel Multi-Axis Gradient Checkpointing (MA-GC) for significant m…
LLM Teacher-Student Framework for Text Classification With No Manually Annotated Data: A Case Study in IPTC News Topic Classification
·2350 words·12 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Text Classification 🏢 Jožef Stefan Institute
Researchers developed a multilingual news topic classifier using a teacher-student framework and GPT-4o for automatic data annotation, achieving high performance without manual annotation.
KV Shifting Attention Enhances Language Modeling
·5293 words·25 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Baichuan Inc.
KV Shifting Attention: A novel attention mechanism significantly enhances language modeling by simplifying induction heads, leading to improved performance and faster convergence, even in large-scale …
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
·7526 words·36 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 EPFL
New multilingual LLM benchmark, INCLUDE, tackles regional knowledge gaps by using 197K QA pairs from 44 languages, improving cross-lingual evaluation.
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding
·5447 words·26 mins·
AI Generated 🤗 Daily Papers Computer Vision Action Recognition 🏢 Yonsei University
DisCoRD: Rectified flow decodes discrete motion tokens into continuous, natural movement, balancing faithfulness and realism.
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
·2134 words·11 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Boosting LLMs’ reasoning: A novel token-level contrastive estimation method automatically identifies and penalizes critical tokens leading to errors, significantly enhancing reasoning accuracy.
AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
·2678 words·13 mins·
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Tsinghua University
AlphaTablets: A novel 3D plane representation enabling accurate, consistent, and flexible 3D planar reconstruction from monocular videos, achieving state-of-the-art results.
A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models
·1730 words·9 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Alibaba Group
Boost LLM accuracy exponentially by using a two-stage algorithm with provable scaling laws: generate multiple candidate solutions then compare them in a knockout tournament!
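The two-stage procedure in this teaser (sample several candidate solutions, then compare them pairwise in a knockout tournament) can be sketched as below. This is a minimal illustration, not the paper's implementation: `compare` stands in for an LLM-judged pairwise comparison, and the numeric candidates are toy stand-ins for generated solutions.

```python
def knockout(candidates, compare):
    """Single-elimination tournament: repeatedly pair up candidates
    and keep the winner of each pairwise comparison."""
    pool = list(candidates)
    while len(pool) > 1:
        survivors = []
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            survivors.append(a if compare(a, b) else b)
        if len(pool) % 2 == 1:  # an odd candidate gets a bye
            survivors.append(pool[-1])
        pool = survivors
    return pool[0]

# Toy stand-ins: candidates are numbers, and the hypothetical
# comparison simply prefers the larger one.
candidates = [3, 7, 1, 9, 4]
winner = knockout(candidates, lambda a, b: a > b)
print(winner)  # 9
```

Each round halves the pool, so picking a final answer from N candidates costs only about N pairwise comparisons in total, which is what lets the accuracy guarantee scale with test-time compute.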
A dynamic parallel method for performance optimization on hybrid CPUs
·1564 words·8 mins·
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Intel Corporation
A dynamic parallel method boosts LLM inference speed on hybrid CPUs, achieving over 90% memory-bandwidth utilization and resolving bottlenecks caused by imbalanced core capabilities.
Video Depth without Video Models
·3150 words·15 mins·
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Carnegie Mellon University
RollingDepth: Achieving state-of-the-art video depth estimation without using complex video models, by cleverly extending a single-image depth estimator.