2025-02-18
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
·4398 words·21 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Reasoning
🏢 Tsinghua University
video-SALMONN-o1: An open-source audio-visual LLM enhances video understanding with a novel reasoning-intensive dataset and the pDPO method, achieving significant accuracy gains.
System Message Generation for User Preferences using Open-Source Models
·3777 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Upstage AI
SYSGEN: A novel pipeline generates effective system messages for LLMs using open-source models, improving model responses and addressing data scarcity in supervised fine-tuning.
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
·3833 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Department of Artificial Intelligence, Chung-Ang University
SAFE-SQL boosts Text-to-SQL accuracy by intelligently generating and filtering self-augmented examples for in-context learning, surpassing existing methods in challenging scenarios.
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
·2524 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Xi'an Jiaotong University
PhysReason benchmark evaluates physics-based reasoning in LLMs, revealing critical limitations and guiding future improvements.
MagicArticulate: Make Your 3D Models Articulation-Ready
·4321 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanyang Technological University
MagicArticulate automates 3D model animation preparation by generating skeletons and skinning weights, overcoming the limitations of prior manual methods, and introduces Articulation-XL, a large-scale benchmark.
Learning Getting-Up Policies for Real-World Humanoid Robots
·4423 words·21 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 University of Illinois Urbana-Champaign
HUMANUP: A novel two-stage reinforcement learning framework enables real-world humanoid robots to autonomously recover from falls on various terrains.
Large Language Models and Mathematical Reasoning Failures
·397 words·2 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 KTH Royal Institute of Technology
Large language models struggle with mathematical word problems, demonstrating flaws in reasoning despite achieving high accuracy; a new study highlights these persistent gaps in generalization abilities.
Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance
·1604 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 KTH Royal Institute of Technology
LLMs’ performance on language complexity tasks (LIX & ADD) reveals a strong correlation with general capabilities, suggesting complexity metrics as noisy zero-shot proxies for model evaluation.
Intuitive physics understanding emerges from self-supervised pretraining on natural videos
·4400 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Meta AI
AI models learn intuitive physics from self-supervised video pretraining.
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
·2102 words·10 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Peking University
HermesFlow seamlessly bridges the understanding-generation gap in MLLMs using a novel Pair-DPO framework and self-play optimization on homologous data, achieving significant performance improvements.
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
·2525 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
Diffusion-Sharpening enhances diffusion model fine-tuning by optimizing sampling trajectories, achieving faster convergence and high inference efficiency without extra NFEs, leading to improved alignment.
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
·2347 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Illinois Urbana-Champaign
PoPilot, a novel proof-oriented programming LLM, outperforms GPT-4o by 64% under data scarcity by using synthetic data augmentation.
Towards Data-Efficient Pretraining for Atomic Property Prediction
·3694 words·18 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Transfer Learning
🏢 King Abdullah University of Science and Technology
High-quality, task-relevant pretraining data surpasses large-scale pretraining in atomic property prediction, achieving comparable performance at 1/24th the computational cost.
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
·3486 words·17 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Sony Group Corporation
TalkHier, a novel framework for LLM multi-agent systems, uses structured communication and hierarchical refinement to achieve state-of-the-art performance on various tasks, improving agent collaboration.
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
·2722 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 DeepSeek-AI
NSA: a novel sparse attention mechanism achieves efficient long-context modeling by combining algorithmic innovations with hardware-aligned optimizations, surpassing full attention models across various benchmarks.
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
·7040 words·34 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Zhejiang University
LLMs’ knowledge acquisition is unveiled through the lens of evolving knowledge circuits, revealing that new knowledge integration depends on its relevance to existing knowledge and proceeds through distinct phases.
Dyve: Thinking Fast and Slow for Dynamic Process Verification
·1995 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Chinese University of Hong Kong
Dyve: A novel dynamic process verifier boosts LLM reasoning accuracy by combining fast, immediate checks with deeper, slower analyses for complex steps, achieving significant performance gains.
Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest
·3405 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Information Extraction
🏢 UC San Diego
Cuckoo: a novel information extraction (IE) model leverages LLM pre-training data, achieving superior performance in few-shot settings by reframing next-token prediction as token extraction.
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Reinforcement Learning
🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking
·1354 words·7 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 University of Copenhagen
Fact-checkers need explainable AI: This study reveals how AI tools can better support fact-checkers by providing explanations tailored to their workflows and addressing their unmet needs.