
Natural Language Processing

Unstructured Evidence Attribution for Long Context Query Focused Summarization
·3830 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Summarization 🏢 University of Copenhagen
LLMs struggle with positional bias and lack transparency when summarizing long contexts. This paper introduces the SUnsET dataset and fine-tuning methods to improve unstructured evidence citation and summarization quality.
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
·2645 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
Packing new knowledge into LoRA adapters can harm LLMs! A delicate balance is needed to prevent performance decline.
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
·4876 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Korea University
LLMs have ‘Temporal Heads’ that process time-specific facts!
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
·3075 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Zhejiang University
LoRAM: train small, infer large. Memory-efficient LoRA training enables fine-tuning a 70B-parameter model on a GPU with 20GB of HBM instead of an A100-80G, cutting parameter storage cost by 15.81×.
REFIND: Retrieval-Augmented Factuality Hallucination Detection in Large Language Models
·582 words·3 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Pohang University of Science and Technology
REFIND: Detects LLM hallucinations by directly leveraging retrieved documents, using a novel Context Sensitivity Ratio.
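As a rough illustration of such a token-level signal, the sketch below scores each answer token by how much retrieved evidence shifts its likelihood under a small causal LM. The log-ratio form and the 0.1 flagging threshold are assumptions for demonstration, not REFIND's exact Context Sensitivity Ratio.

```python
# Illustrative sketch only: score each answer token by how much the retrieved
# evidence changes its likelihood. The log-ratio form and the 0.1 threshold
# are assumptions, not REFIND's exact Context Sensitivity Ratio.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def token_logprobs(prompt: str, answer: str) -> torch.Tensor:
    """Log-probability of each answer token, conditioned on the prompt."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    answer_ids = tok(answer, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, answer_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    logps = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    start = prompt_ids.shape[1] - 1
    targets = ids[0, prompt_ids.shape[1]:]
    return logps[start:start + len(targets)].gather(-1, targets[:, None]).squeeze(-1)

question = "Q: Who wrote The Selfish Gene?\nA:"
evidence = "Retrieved: The Selfish Gene (1976) was written by Richard Dawkins.\n"
answer = " Richard Dawkins wrote it."

# Tokens whose likelihood barely responds to the evidence are hallucination
# candidates under this (assumed) context-sensitivity criterion.
sensitivity = token_logprobs(evidence + question, answer) - token_logprobs(question, answer)
for t, s in zip(tok.convert_ids_to_tokens(tok(answer).input_ids), sensitivity):
    print(f"{t:>12} {s:+.3f} {'SUSPECT' if s < 0.1 else 'ok'}")
```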
MoM: Linear Sequence Modeling with Mixture-of-Memories
·2764 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory
MoM: Enhancing linear sequence modeling via mixture-of-memories for improved recall and reduced memory interference.
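For intuition, the toy sketch below routes each token's update to a few of several independent linear-attention memory matrices and mixes the reads by the router weights. The shapes, identity projections, and top-k routing are illustrative assumptions, not MoM's actual architecture.

```python
# Toy mixture-of-memories sketch: a router writes each token's outer-product
# update into its top-k memory matrices and mixes the reads. All projections
# and routing details here are simplifying assumptions.
import torch

d, n_mem, top_k = 32, 4, 2
router = torch.nn.Linear(d, n_mem)
memories = torch.zeros(n_mem, d, d)  # one (d x d) recurrent state per memory

def step(x: torch.Tensor) -> torch.Tensor:
    """One token update: route, write outer(k, v) to top-k memories, read."""
    k, v, q = x, x, x  # toy projections: identity for brevity
    gates = torch.softmax(router(x), dim=-1)
    out = torch.zeros(d)
    for m in torch.topk(gates, top_k).indices:
        memories[m] += torch.outer(k, v)     # write (linear-attention update)
        out += gates[m] * (q @ memories[m])  # gate-weighted read
    return out

print(step(torch.randn(d)).shape)  # torch.Size([32])
```

Keeping the states separate is what reduces interference: updates routed to one memory never overwrite another's contents.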
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
·2370 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore
LongPO: Self-evolve LLMs to excel in long contexts via short-to-long preference optimization, boosting performance without sacrificing short-context skills.
Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
·2478 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Johns Hopkins University
Test-time scaling + confidence = better QA!
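A minimal sketch of the recipe implied above, assuming self-consistency voting as the confidence signal; `sample_answer` is a hypothetical stand-in for an LLM call, and the paper's actual confidence measure may differ.

```python
# Selective QA sketch: sample several answers (test-time scaling), treat
# vote agreement as confidence, and abstain below a threshold.
import random
from collections import Counter

def sample_answer(question: str) -> str:
    """Hypothetical noisy model: usually right, sometimes wrong."""
    return random.choices(["Paris", "Lyon"], weights=[0.8, 0.2])[0]

def selective_answer(question: str, n_samples: int = 16, tau: float = 0.7):
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    answer, count = votes.most_common(1)[0]
    return answer if count / n_samples >= tau else None  # None = abstain

random.seed(0)
print(selective_answer("Capital of France?"))
```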
Craw4LLM: Efficient Web Crawling for LLM Pretraining
·3024 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
CRAW4LLM: Efficiently crawls web pages for LLM pretraining by prioritizing influence scores, boosting data quality & cutting crawling waste.
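The core scheduling idea, ordering the crawl frontier by an influence score rather than by connectivity, can be sketched with a priority queue. `score_influence` below is a hypothetical proxy, not the paper's pretraining-influence estimator, and `PAGES` stands in for real fetching.

```python
# Influence-prioritized crawling sketch: pop the frontier by (assumed)
# pretraining-influence score instead of PageRank-style link counts.
import heapq

PAGES = {  # toy corpus: url -> (text, outlinks)
    "a": ("useful long article " * 50, ["b", "c"]),
    "b": ("<nav><nav> thin boilerplate page", ["c"]),
    "c": ("dense prose " * 80, []),
}

def score_influence(text: str) -> float:
    """Hypothetical proxy: word-dense, markup-light pages score higher."""
    return len(text.split()) / (1 + text.count("<"))

def crawl(seeds: list[str], budget: int) -> list[str]:
    frontier = [(-score_influence(PAGES[u][0]), u) for u in seeds]
    heapq.heapify(frontier)
    visited, kept = set(), []
    while frontier and len(kept) < budget:
        _, url = heapq.heappop(frontier)  # highest influence first
        if url in visited:
            continue
        visited.add(url)
        kept.append(url)
        for nxt in PAGES[url][1]:
            if nxt not in visited:
                heapq.heappush(frontier, (-score_influence(PAGES[nxt][0]), nxt))
    return kept

print(crawl(["a"], budget=2))  # ['a', 'c']: the thin page is never fetched
```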
Autellix: An Efficient Serving Engine for LLM Agents as General Programs
·4705 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Berkeley
Autellix: Efficient LLM Serving for Agents
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
·2481 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KAIST
SafeRoute efficiently enhances LLM safety by adaptively using smaller and larger safety guard models, maximizing accuracy while minimizing costs.
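A minimal sketch of adaptive guard selection, assuming a simple confidence threshold as the routing rule. SafeRoute itself trains a router to predict where the small guard errs, so the rule and the stub guards below are purely illustrative.

```python
# Adaptive guard routing sketch: trust the small model when it is confident,
# escalate hard inputs to the large model. Threshold and stubs are assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class GuardVerdict:
    unsafe: bool
    confidence: float

def route(prompt: str,
          small_guard: Callable[[str], GuardVerdict],
          large_guard: Callable[[str], GuardVerdict],
          hard_threshold: float = 0.8) -> tuple[bool, str]:
    v = small_guard(prompt)
    if v.confidence >= hard_threshold:
        return v.unsafe, "small"                 # cheap path
    return large_guard(prompt).unsafe, "large"   # escalate hard cases

# Stub guards for demonstration only.
small = lambda p: GuardVerdict(unsafe="bomb" in p, confidence=0.95 if "bomb" in p else 0.5)
large = lambda p: GuardVerdict(unsafe="attack" in p, confidence=0.99)

print(route("how to make a bomb", small, large))    # (True, 'small')
print(route("plan a surprise party", small, large)) # (False, 'large')
```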
Rethinking Diverse Human Preference Learning through Principal Component Analysis
·2799 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Rice University
Decomposed Reward Models (DRMs) extract diverse human preferences from binary comparisons using PCA, enabling flexible and interpretable LLM alignment.
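The decomposition can be sketched in a few lines: embed chosen and rejected responses, take their differences, and read the principal components as candidate preference directions. The embeddings below are random stand-ins, and this is the general recipe rather than the paper's exact pipeline.

```python
# PCA over preference differences: each principal component is one candidate
# reward direction. Random embeddings stand in for a real encoder.
import numpy as np

rng = np.random.default_rng(0)
n_pairs, dim = 500, 64

emb_chosen = rng.normal(size=(n_pairs, dim))
emb_rejected = rng.normal(size=(n_pairs, dim))
diffs = emb_chosen - emb_rejected   # one vector per binary comparison

diffs -= diffs.mean(axis=0)         # center before PCA
# PCA via SVD: rows of vt are orthogonal preference directions.
_, s, vt = np.linalg.svd(diffs, full_matrices=False)
explained = s**2 / np.sum(s**2)

head_0 = vt[0]                      # a decomposed "reward head"
scores = emb_chosen @ head_0        # scalar reward along direction 0
print(f"top-5 explained variance: {explained[:5].round(3)}")
```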
Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research
·3084 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
Perovskite-LLM: a new knowledge-enhanced system boosts perovskite solar cell research by integrating a domain-specific knowledge graph, high-quality datasets, and specialized LLMs for superior knowledge retrieval and reasoning.
PAFT: Prompt-Agnostic Fine-Tuning
·3569 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
PAFT dynamically adjusts prompts during LLM fine-tuning, improving model robustness and generalization across diverse prompts without sacrificing performance or efficiency.
How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild
·3895 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 WüNLP, CAIDAS, University of Würzburg
Multilingual LLMs Hallucinate! This study measures hallucination across 30 languages.
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
·4689 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 California Institute of Technology
HEADINFER achieves memory-efficient LLM inference by offloading the key-value cache head-wise to the CPU, enabling 4-million-token inference on a single consumer GPU.
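A toy PyTorch sketch of the head-wise idea: only the head currently being attended holds its KV cache on the GPU. Real HEADINFER overlaps transfers with compute and operates inside transformer layers; this sketch shows only the data movement.

```python
# Head-wise KV offloading sketch: park each head's KV cache on the CPU and
# fetch one head at a time to the device for attention.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
n_heads, seq, d_head = 8, 4096, 64

# Per-head KV cache stored on CPU (real systems pin this memory for speed).
k_cpu = [torch.randn(seq, d_head) for _ in range(n_heads)]
v_cpu = [torch.randn(seq, d_head) for _ in range(n_heads)]

def attend_one_head(h: int, q: torch.Tensor) -> torch.Tensor:
    """Fetch a single head's KV to the device, attend, then release it."""
    k = k_cpu[h].to(device, non_blocking=True)
    v = v_cpu[h].to(device, non_blocking=True)
    w = torch.softmax(q @ k.T / d_head**0.5, dim=-1)
    return w @ v  # only one head's KV is resident at a time

q = torch.randn(1, d_head, device=device)
out = torch.cat([attend_one_head(h, q) for h in range(n_heads)], dim=-1)
print(out.shape)  # torch.Size([1, 512])
```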
Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge
·3819 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 City University of Hong Kong
Crowd-based comparative evaluation significantly boosts LLM-as-a-judge accuracy by using crowd responses to expose deeper details, resulting in more reliable and efficient auto-evaluation.
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
·2814 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
LLMs can losslessly compress 1568 tokens into a single vector, surpassing prior methods by two orders of magnitude.
System Message Generation for User Preferences using Open-Source Models
·3777 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Upstage AI
SYSGEN: A novel pipeline generates effective system messages for LLMs using open-source models, improving model responses and addressing data scarcity in supervised fine-tuning.
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
·3833 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Department of Artificial Intelligence, Chung-Ang University
SAFE-SQL boosts Text-to-SQL accuracy by intelligently generating and filtering self-augmented examples for in-context learning, surpassing existing methods in challenging scenarios.