Paper Reviews by AI
2025
EgoLife: Towards Egocentric Life Assistant
·3562 words·17 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Human-AI Interaction
🏢 NTU S-Lab
EgoLife: Ultra-long egocentric dataset & benchmark enabling AI assistants to understand and enhance daily life. Datasets and models released!
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
·5020 words·24 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 National University of Singapore
VLMs often disproportionately trust text over visual data, leading to performance drops and safety concerns.
Wikipedia in the Era of LLMs: Evolution and Risks
·3967 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Huazhong University of Science and Technology
LLMs modestly affect Wikipedia, subtly altering content and potentially skewing NLP benchmarks.
SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models
·2619 words·13 mins·
AI Generated
🤗 Daily Papers
AI Applications
Healthcare
🏢 HistAI
SPIDER: A comprehensive pathology dataset boosts AI diagnostic models.
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
·2593 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Hong Kong University of Science and Technology
RectifiedHR: Enables training-free high-resolution image generation via energy rectification, boosting both efficiency and effectiveness.
QE4PE: Word-level Quality Estimation for Human Post-Editing
·6157 words·29 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Machine Translation
🏢 CLCG, University of Groningen
QE4PE: Investigates the impact of word-level QE on MT post-editing with 42 professional editors across English-Italian and English-Dutch, highlighting usability and accuracy challenges in professional workflows.
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
·3985 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Shanghai Jiao Tong University
Q-Eval-100K: A new, large dataset for evaluating visual quality and text alignment in AI-generated content.
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
·2943 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Shanghai Jiao Tong University
Mask-DPO: Improves LLM factuality through fine-grained alignment that masks sentence-level errors during DPO training.
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation
·4618 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Oxford
LINGOLY-TOO: A new benchmark to disentangle memorization from reasoning in LLMs using linguistic templatization and orthographic obfuscation.
Learning from Failures in Multi-Attempt Reinforcement Learning
·1948 words·10 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Reinforcement Learning
🏢 University of Cambridge
Multi-attempt RL refines LLMs, significantly boosting accuracy on math tasks by enabling them to learn from failures through user feedback.
Language Models can Self-Improve at State-Value Estimation for Better Search
·2765 words·13 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Reinforcement Learning
🏢 Georgia Institute of Technology
Self-Taught Lookahead improves LLM search via self-supervision, matching costly methods at a fraction of the compute!
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
·2938 words·14 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Deep Learning
🏢 Microsoft GenAI
KodCode: A new synthetic coding dataset with verified solutions and tests, enabling state-of-the-art performance for coding LLMs.
Iterative Value Function Optimization for Guided Decoding
·2523 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Text Generation
🏢 Shanghai Artificial Intelligence Laboratory
IVO: Iterative Value Function Optimization for Guided Decoding
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia
·2734 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 MBZUAI
LLMs primarily rely on word form, unlike humans, when reconstructing semantics, indicating a need for context-aware mechanisms to enhance LLMs’ adaptability.
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
·3209 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Skolkovo Institute of Science and Technology (Skoltech)
This paper investigates when LLMs are apprehensive and when their uncertainty is justified.
Visual-RFT: Visual Reinforcement Fine-Tuning
·3386 words·16 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Shanghai Jiao Tong University
Visual-RFT: Enhance LVLMs’ visual reasoning via reinforcement learning with verifiable rewards, achieving strong performance with limited data.
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
·1959 words·10 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Generation
🏢 University of Technology Sydney
VideoUFO: A new user-focused, million-scale dataset that improves text-to-video generation by aligning training data with real user interests and preferences!
SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity
·2929 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Meituan Group
SampleMix: Sample-wise Pre-training Data Mixing by Coordinating Data Quality and Diversity
SAGE: A Framework of Precise Retrieval for RAG
·3653 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Tsinghua University
SAGE: Precise RAG via semantic segmentation, adaptive chunking, and LLM feedback, boosting QA accuracy & cost-efficiency.
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs
·3130 words·15 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Microsoft
Phi-4-Mini: Compact yet powerful multimodal language models via Mixture-of-LoRAs