
Paper Reviews by AI

2025

EgoLife: Towards Egocentric Life Assistant
3562 words · 17 mins
AI Generated 🤗 Daily Papers Multimodal Learning Human-AI Interaction 🏢 NTU S-Lab
EgoLife: Ultra-long egocentric dataset & benchmark enabling AI assistants to understand and enhance daily life. Datasets and models released!
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
5020 words · 24 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 National University of Singapore
VLMs often disproportionately trust text over visual data, leading to performance drops and safety concerns.
Wikipedia in the Era of LLMs: Evolution and Risks
3967 words · 19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Huazhong University of Science and Technology
LLMs modestly affect Wikipedia, subtly altering content and potentially skewing NLP benchmarks.
SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models
2619 words · 13 mins
AI Generated 🤗 Daily Papers AI Applications Healthcare 🏢 HistAI
SPIDER: A comprehensive pathology dataset boosts AI diagnostic models.
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
2593 words · 13 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology
RectifiedHR: Enables training-free high-resolution image generation via energy rectification, boosting both efficiency and effectiveness.
QE4PE: Word-level Quality Estimation for Human Post-Editing
6157 words · 29 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 CLCG, University of Groningen
QE4PE investigates the impact of word-level quality estimation on MT post-editing with 42 professional editors across English-Italian and English-Dutch, underlining usability and accuracy challenges in professional workflows.
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
3985 words · 19 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Shanghai Jiao Tong University
Q-Eval-100K: A new, large dataset for evaluating visual quality and text alignment in AI-generated content.
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
2943 words · 14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai Jiao Tong University
Mask-DPO: Fine-grained Factuality Alignment improves LLMs’ factuality by masking sentence-level errors during DPO training for enhanced knowledge alignment.
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation
4618 words · 22 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Oxford
LINGOLY-TOO: A new benchmark to disentangle memorization from reasoning in LLMs using linguistic templatization and orthographic obfuscation.
Learning from Failures in Multi-Attempt Reinforcement Learning
1948 words · 10 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Cambridge
Multi-attempt RL refines LLMs, significantly boosting accuracy on math tasks by enabling them to learn from failures through user feedback.
Language Models can Self-Improve at State-Value Estimation for Better Search
2765 words · 13 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Georgia Institute of Technology
Self-Taught Lookahead improves LLM search via self-supervision, matching costly methods at a fraction of the compute!
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
2938 words · 14 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Microsoft GenAI
KodCode: A new synthetic coding dataset with verified solutions and tests, enabling state-of-the-art performance for coding LLMs.
Iterative Value Function Optimization for Guided Decoding
2523 words · 12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Shanghai Artificial Intelligence Laboratory
IVO: Iterative Value Function Optimization for Guided Decoding
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia
2734 words · 13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 MBZUAI
LLMs primarily rely on word form, unlike humans, when reconstructing semantics, indicating a need for context-aware mechanisms to enhance LLMs’ adaptability.
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
3209 words · 16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Skolkovo Institute of Science and Technology (Skoltech)
This paper investigates when LLMs are apprehensive and when their uncertainty is justified.
Visual-RFT: Visual Reinforcement Fine-Tuning
3386 words · 16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Shanghai Jiao Tong University
Visual-RFT: Enhance LVLMs’ visual reasoning via reinforcement learning with verifiable rewards, achieving strong performance with limited data.
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
1959 words · 10 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Generation 🏢 University of Technology Sydney
VideoUFO: A new user-focused, million-scale dataset that improves text-to-video generation by aligning training data with real user interests and preferences!
SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity
2929 words · 14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Meituan Group
SampleMix: Sample-wise Pre-training Data Mixing by Coordinating Data Quality and Diversity
SAGE: A Framework of Precise Retrieval for RAG
3653 words · 18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Tsinghua University
SAGE: Precise retrieval for RAG via semantic segmentation, adaptive chunking, and LLM feedback, boosting QA accuracy and cost-efficiency.
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
3130 words · 15 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Microsoft
Phi-4-Mini: Compact yet powerful multimodal language models via Mixture-of-LoRAs.