Paper Reviews by AI

2025

Identifying Sensitive Weights via Post-quantization Integral
·2603 words·13 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Tsinghua University
PQI: Accurately identify sensitive weights in post-quantization to enhance LLM compression & performance!
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models
·3091 words·15 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Kuaishou Technology
HAIC improves MLLMs’ action understanding with high-quality video captions & new benchmark, boosting performance and generation.
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
·3730 words·18 mins
AI Generated 🤗 Daily Papers AI Applications Manufacturing 🏢 Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
DeepSolution enhances engineering design via tree exploration and bi-point thinking.
UniTok: A Unified Tokenizer for Visual Generation and Understanding
·3043 words·15 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 University of Hong Kong
UniTok: A unified tokenizer bridging the visual generation and understanding gap via multi-codebook quantization, achieving SOTA in MLLMs.
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
·2117 words·10 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Nanjing University of Aeronautics and Astronautics
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers.
SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning
·1993 words·10 mins
AI Generated 🤗 Daily Papers AI Applications Software Development 🏢 Peking University
SoRFT enhances LLMs for issue resolving via subtask-oriented reinforced fine-tuning, outperforming other open-source models.
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
·2441 words·12 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 UC Berkeley
Sim-to-real RL recipe achieves robust vision-based dexterous humanoid manipulation without human demos!
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
·3310 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Reasoning 🏢 Johns Hopkins University
R2-T2: Boost multimodal MoE performance by re-routing experts in test-time, no retraining needed!
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
·219 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 Huawei, China
R1-T1: RL-driven framework incentivizing translation capability in LLMs via reasoning learning, achieving superior performance in multiple languages & domains.
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think
·2523 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Peking University
DREAM ENGINE: Text-image interleaved control made easy, unifying text and visual cues for creative image generation.
Mobius: Text to Seamless Looping Video Generation via Latent Shift
·2353 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Chongqing University of Post and Telecommunications, China
Mobius generates seamless looping videos from text using latent shift, repurposing pre-trained models without training.
Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases
·2582 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 University of Oregon
MoR: Adaptive knowledge retrieval by fusing structural and textual data for better question answering.
LongRoPE2: Near-Lossless LLM Context Window Scaling
·3732 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft
LongRoPE2: Extends LLM context windows while preserving performance and reducing training costs!
Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling
·3037 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore
EDGS: Achieves faster, high-quality dynamic scene rendering by sparse time-variant attribute modeling and intelligent static area filtering.
Self-rewarding correction for mathematical reasoning
·3488 words·17 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Illinois Urbana-Champaign
LLMs can now reason and correct themselves using self-generated data, achieving performance on par with external reward models!
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
·1937 words·10 mins
AI Generated 🤗 Daily Papers Machine Learning Recommender Systems 🏢 KuaiShou Inc.
OneRec: A unified generative model that replaces the traditional retrieve-and-rank strategy, significantly improving recommendation quality in real-world scenarios.
NeoBERT: A Next-Generation BERT
·2699 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Polytechnique Montréal
NeoBERT: A new encoder that enhances bidirectional language understanding with cutting-edge architecture, data, and training, achieving SOTA results with only 250M parameters.
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
·4298 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 NLCo Lab, BIGAI
TokenSwift: Accelerate LLM ultra-long sequence generation up to 100K tokens with >3x speedup and lossless accuracy!
Exploring Rewriting Approaches for Different Conversational Tasks
·1596 words·8 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Adobe Research
The rewriting method is critical to a conversational assistant's effectiveness.
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
·3674 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Tsinghua University
ArtGS: Achieves state-of-the-art, efficient interactable replicas of complex articulated objects via Gaussian Splatting.