2025-02-21

Unstructured Evidence Attribution for Long Context Query Focused Summarization

20 February 2025·3830 words·18 mins

AI Generated 🤗 Daily Papers Natural Language Processing Text Summarization 🏢 University of Copenhagen

LLMs struggle with positional bias and lack transparency when summarizing long contexts. This paper introduces the SUnsET dataset and fine-tuning methods to improve unstructured evidence citation and summ…

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

20 February 2025·4915 words·24 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Google DeepMind

SigLIP 2: Multilingual Vision-Language Encoders with Semantic Understanding, Localization, and Dense Features.

Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation

20 February 2025·4251 words·20 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 University of Pennsylvania

CoSyn: Code-guided synthetic data for scaling text-rich image understanding, achieving SOTA via targeted multimodal data generation!

S*: Test Time Scaling for Code Generation

20 February 2025·2539 words·12 mins

AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 UC Berkeley

S*: Hybrid test-time scaling for code generation, boosting both coverage and selection accuracy.

RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers

20 February 2025·2754 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Science and Technology of China

RelaCtrl: Relevance-guided control boosts diffusion transformer efficiency, cutting parameters by intelligently allocating resources.

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

20 February 2025·2325 words·11 mins

AI Generated 🤗 Daily Papers Multimodal Learning Embodied AI 🏢 MAIS, Institute of Automation, Chinese Academy of Sciences, China

PC-Agent: A new hierarchical framework that significantly improves complex task automation on PCs by 32%!

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

20 February 2025·1911 words·9 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 UC Santa Barbara

MLGym: A new framework and benchmark to advance AI research agents.

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

20 February 2025·3688 words·18 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Microsoft Research Asia

Logic-RL unlocks LLM reasoning via rule-based reinforcement learning, generalizing to math problems after training on logic puzzles.

LLM-based User Profile Management for Recommender System

20 February 2025·2332 words·11 mins

AI Generated 🤗 Daily Papers Machine Learning Recommender Systems 🏢 Ulsan National Institute of Science and Technology

PURE: LLM-driven user profile management boosts recommendation by harnessing user reviews for personalized insights while tackling token limits.

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

20 February 2025·2645 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI

Packing new knowledge into LoRA adapters can harm LLMs! A delicate balance is needed to prevent performance decline.

Dynamic Concepts Personalization from Single Videos

20 February 2025·2668 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Snap Research

Personalizing video models for dynamic concepts is now achievable with Set-and-Sequence: enabling high-fidelity generation, editing, and composition!

Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

20 February 2025·4876 words·23 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Korea University

LLMs have ‘Temporal Heads’ that process time-specific facts!

Discovering highly efficient low-weight quantum error-correcting codes with reinforcement learning

20 February 2025·6573 words·31 mins

AI Generated 🤗 Daily Papers AI Theory Optimization 🏢 University of Texas at Austin

RL optimizes quantum error-correcting codes, slashing physical qubit overhead for fault-tolerant quantum computing.

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

20 February 2025·402 words·2 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Menlo Research

AlphaMaze enhances LLMs’ spatial intelligence via GRPO, achieving 93% accuracy in maze navigation and showing emergent reasoning.

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

19 February 2025·2585 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Scene Understanding 🏢 MBZUAI

New geolocation dataset & reasoning framework enhance accuracy and interpretability by leveraging human gameplay data.

S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

18 February 2025·3894 words·19 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Tencent

S$^2$R: Teaches LLMs to self-verify and self-correct, boosting reasoning with efficient reinforcement learning.

How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild

18 February 2025·3895 words·19 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 WüNLP, CAIDAS, University of Würzburg

Multilingual LLMs hallucinate! This study measures hallucination across 30 languages.