Paper Reviews by AI
2025
Identifying Sensitive Weights via Post-quantization Integral
·2603 words·13 mins·
AI Generated
Daily Papers
Machine Learning
Deep Learning
Tsinghua University
PQI: Accurately identify sensitive weights in post-quantization to enhance LLM compression & performance!
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models
·3091 words·15 mins·
AI Generated
Daily Papers
Multimodal Learning
Vision-Language Models
Kuaishou Technology
HAIC improves MLLMs’ action understanding with high-quality video captions and a new benchmark, boosting both understanding and generation performance.
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
·3730 words·18 mins·
AI Generated
Daily Papers
AI Applications
Manufacturing
Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
DeepSolution enhances engineering design via tree exploration and bi-point thinking.
UniTok: A Unified Tokenizer for Visual Generation and Understanding
·3043 words·15 mins·
AI Generated
Daily Papers
Multimodal Learning
Vision-Language Models
University of Hong Kong
UniTok: A unified tokenizer bridging the visual generation and understanding gap via multi-codebook quantization, achieving SOTA in MLLMs.
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers
·2117 words·10 mins·
AI Generated
Daily Papers
Machine Learning
Deep Learning
Nanjing University of Aeronautics and Astronautics
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers.
SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning
·1993 words·10 mins·
AI Generated
Daily Papers
AI Applications
Software Development
Peking University
SoRFT enhances LLMs for issue resolving via subtask-oriented reinforced fine-tuning, outperforming other open-source models.
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids
·2441 words·12 mins·
AI Generated
Daily Papers
AI Applications
Robotics
UC Berkeley
Sim-to-real RL recipe achieves robust vision-based dexterous humanoid manipulation without human demos!
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
·3310 words·16 mins·
AI Generated
Daily Papers
Multimodal Learning
Multimodal Reasoning
Johns Hopkins University
R2-T2: Boosts multimodal MoE performance by re-routing experts at test time, with no retraining needed!
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
·219 words·2 mins·
AI Generated
Daily Papers
Natural Language Processing
Machine Translation
Huawei, China
R1-T1: RL-driven framework incentivizing translation capability in LLMs via reasoning learning, achieving superior performance in multiple languages & domains.
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think
·2523 words·12 mins·
AI Generated
Daily Papers
Computer Vision
Image Generation
Peking University
DREAM ENGINE: Text-image interleaved control made easy, unifying text and visual cues for creative image generation.
Mobius: Text to Seamless Looping Video Generation via Latent Shift
·2353 words·12 mins·
AI Generated
Daily Papers
Computer Vision
Video Understanding
Chongqing University of Posts and Telecommunications, China
Mobius generates seamless looping videos from text using latent shift, repurposing pre-trained models without training.
Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases
·2582 words·13 mins·
AI Generated
Daily Papers
Natural Language Processing
Question Answering
University of Oregon
MoR: Adaptive knowledge retrieval by fusing structural and textual data for better question answering.
LongRoPE2: Near-Lossless LLM Context Window Scaling
·3732 words·18 mins·
AI Generated
Daily Papers
Natural Language Processing
Large Language Models
Microsoft
LongRoPE2: Extends LLM context windows while preserving performance and reducing training costs!
Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling
·3037 words·15 mins·
AI Generated
Daily Papers
Computer Vision
3D Vision
National University of Singapore
EDGS: Achieves faster, high-quality dynamic scene rendering through sparse time-variant attribute modeling and static-area filtering.
Self-rewarding correction for mathematical reasoning
·3488 words·17 mins·
AI Generated
Daily Papers
Machine Learning
Reinforcement Learning
University of Illinois Urbana-Champaign
LLMs can now reason and correct themselves using self-generated data, achieving performance on par with external reward models!
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
·1937 words·10 mins·
AI Generated
Daily Papers
Machine Learning
Recommender Systems
KuaiShou Inc.
OneRec: A unified generative model that replaces the traditional retrieve-and-rank strategy, significantly improving recommendation quality in real-world scenarios.
NeoBERT: A Next-Generation BERT
·2699 words·13 mins·
AI Generated
Daily Papers
Natural Language Processing
Large Language Models
Polytechnique Montréal
NeoBERT: A new encoder that enhances bidirectional language understanding with cutting-edge architecture, data, and training, achieving SOTA results with only 250M parameters.
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
·4298 words·21 mins·
AI Generated
Daily Papers
Natural Language Processing
Text Generation
NLCo Lab, BIGAI
TokenSwift: Accelerate LLM ultra-long sequence generation up to 100K tokens with >3x speedup and lossless accuracy!
Exploring Rewriting Approaches for Different Conversational Tasks
·1596 words·8 mins·
AI Generated
Daily Papers
Natural Language Processing
Question Answering
Adobe Research
The choice of rewriting method is critical to conversational assistant effectiveness.
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
·3674 words·18 mins·
AI Generated
Daily Papers
Computer Vision
3D Vision
Tsinghua University
ArtGS: Achieves state-of-the-art, efficient interactable replicas of complex articulated objects via Gaussian Splatting.