Paper Reviews by AI

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

3 March 2025·3403 words·16 mins

AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Illinois Urbana-Champaign

MultiAgentBench: A benchmark for evaluating collaboration and competition in LLM agents across diverse, interactive scenarios with novel metrics and protocols.

Liger: Linearizing Large Language Models to Gated Recurrent Structures

3 March 2025·4096 words·20 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory

Liger: LLMs linearized to gated recurrent models, enabling efficient deployment via key matrix repurposing and LoRA fine-tuning.

Large-Scale Data Selection for Instruction Tuning

3 March 2025·2665 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Washington

RDS+ is the unsung hero for scaling instruction tuning data selection!

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

3 March 2025·2689 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 HKUST(GZ)

Kiss3DGen generates 3D assets by repurposing 2D diffusion models, enabling efficient 3D editing and enhancement.

Forgetting Transformer: Softmax Attention with a Forget Gate

3 March 2025·4225 words·20 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Mila & Université De Montréal

Transformers get forgetful! This paper introduces the Forgetting Transformer (FoX), incorporating a forget gate into the attention mechanism for improved sequence modeling.

Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective

3 March 2025·2296 words·11 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 SandLogic Technologies Pvt Ltd

Shakti SLMs: Fine-tuning compact language models for efficient, domain-specific AI on edge devices.

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

3 March 2025·2905 words·14 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA Research

Likelihood-based generative models get a GAN-like boost via a new Direct Discriminative Optimization, ditching the joint training complexity.

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

3 March 2025·2982 words·14 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 NVIDIA

Difix3D+ improves 3D reconstructions by reducing artifacts via single-step diffusion models, enhancing novel-view synthesis quality and consistency.

DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

3 March 2025·1645 words·8 mins

AI Generated 🤗 Daily Papers Speech and Audio Music Generation 🏢 Northwestern Polytechnical University

DiffRhythm: Fast & Simple End-to-End Song Generation via Latent Diffusion, creating full songs (4+ mins) with vocal & accompaniment in seconds!

CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom

3 March 2025·8404 words·40 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Huazhong University of Science and Technology

CrowdSelect boosts instruction tuning by selecting synthetic data using multi-LLM wisdom, enhancing model performance across diverse tasks.

CognitiveDrone: A VLA Model and Evaluation Benchmark for Real-Time Cognitive Task Solving and Reasoning in UAVs

3 March 2025·1578 words·8 mins

AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 Skolkovo Institute of Science and Technology

CognitiveDrone: A novel VLA model and benchmark for real-time cognitive UAV tasks, improving reasoning and control.

CodeArena: A Collective Evaluation Platform for LLM Code Generation

3 March 2025·1693 words·8 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanyang Technological University

CodeArena: Collective evaluation for LLM code generation.

Speculative Ad-hoc Querying

2 March 2025·2957 words·14 mins

AI Generated 🤗 Daily Papers AI Applications Finance 🏢 University of Texas at Austin

SpeQL: Near-instant results for ad-hoc queries using LLMs to predict and precompute, dramatically improving user experience.

SemViQA: A Semantic Question Answering System for Vietnamese Information Fact-Checking

2 March 2025·3011 words·15 mins

AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 FPT Software AI Center, Viet Nam

SemViQA: A new approach to boost Vietnamese fact-checking with semantic understanding and efficient evidence retrieval.

DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting

2 March 2025·2236 words·11 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Fudan University

DuoDecoding: Accelerating LLM inference by strategically deploying draft & target models on CPU & GPU for parallel decoding and dynamic drafting.

CLEA: Closed-Loop Embodied Agent for Enhancing Task Execution in Dynamic Environments

2 March 2025·1626 words·8 mins

AI Generated 🤗 Daily Papers Multimodal Learning Embodied AI 🏢 Shenzhen Future Network of Intelligence Institute

CLEA: Enhancing task execution in dynamic environments with a closed-loop embodied agent.

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

2 March 2025·2242 words·11 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 DAMO Academy, Alibaba Group

Babel: An open multilingual LLM that supports over 90% of global speakers, filling the language coverage gap and setting new performance standards.

Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions

1 March 2025·3420 words·17 mins

AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Datasets 🏢 Xiaohongshu Inc.

Qilin: A multimodal dataset with APP-level user sessions for advancing search and recommendation systems.

Interact, Instruct to Improve: A LLM-Driven Parallel Actor-Reasoner Framework for Enhancing Autonomous Vehicle Interactions

1 March 2025·310 words·2 mins

AI Generated 🤗 Daily Papers AI Applications Autonomous Vehicles 🏢 Tongji University

An LLM-driven framework enhances autonomous vehicle interactions with human drivers in real time.

RuCCoD: Towards Automated ICD Coding in Russian

28 February 2025·4222 words·20 mins

AI Generated 🤗 Daily Papers AI Applications Healthcare 🏢 AIRI, Moscow, Russia

RuCCoD: A new dataset for automated ICD coding in Russian that enhances clinical data accuracy.