Natural Language Processing

EAI: Emotional Decision-Making of LLMs in Strategic Games and Ethical Dilemmas
·4154 words·20 mins
AI Generated Natural Language Processing Large Language Models 🏢 AIRI
LLMs’ emotional decision-making is assessed using a novel framework, EAI, showing that emotions significantly alter ethical and strategic choices in games. This reveals crucial biases, necessitati…
Dual-Personalizing Adapter for Federated Foundation Models
·2721 words·13 mins
Natural Language Processing Federated Learning 🏢 Australian AI Institute
Federated Dual-Personalizing Adapter (FedDPA) tackles test-time distribution shifts and personalization in federated foundation models using a global and local adapter co-working mechanism, achieving …
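A minimal sketch of the dual-adapter idea the summary describes, under stated assumptions: a frozen base layer plus a globally aggregated adapter and a client-local adapter whose outputs are mixed. The mixing weight `alpha` is a hypothetical instantiation of the "co-working mechanism", not taken from the paper.

```python
# Illustrative sketch only, not the FedDPA implementation.
import torch
import torch.nn as nn

class DualAdapterLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 0.5):
        super().__init__()
        self.base = base.requires_grad_(False)   # frozen foundation weight
        d_in, d_out = base.in_features, base.out_features
        # Low-rank global adapter (aggregated across clients) and local adapter
        # (kept on-device for personalization); both are trainable.
        self.global_adapter = nn.Sequential(nn.Linear(d_in, rank, bias=False),
                                            nn.Linear(rank, d_out, bias=False))
        self.local_adapter = nn.Sequential(nn.Linear(d_in, rank, bias=False),
                                           nn.Linear(rank, d_out, bias=False))
        self.alpha = alpha                       # hypothetical global/local mix

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (self.base(x)
                + self.alpha * self.global_adapter(x)
                + (1 - self.alpha) * self.local_adapter(x))
```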
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
·2987 words·15 mins
Natural Language Processing Large Language Models 🏢 Seoul National University
DropBP accelerates LLM fine-tuning by 44% while preserving accuracy, by dropping backward propagation through a subset of layers.
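As an illustration of the mechanism the title names, here is a minimal sketch (not the authors' implementation; `drop_prob` is a hypothetical knob): for a residual block y = x + f(x), detaching f(x) keeps the forward pass exact while cutting the backward graph through that sub-layer, so gradients take the skip path only.

```python
# Illustrative sketch only, not the DropBP implementation.
import torch
import torch.nn as nn

class DropBPBlock(nn.Module):
    """Wraps a residual sub-layer; backprop through it is randomly skipped."""
    def __init__(self, sublayer: nn.Module, drop_prob: float = 0.5):
        super().__init__()
        self.sublayer = sublayer
        self.drop_prob = drop_prob  # hypothetical knob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.sublayer(x)
        if self.training and torch.rand(()) < self.drop_prob:
            # Same forward value, but no backward graph through the sub-layer;
            # gradients flow only through the residual shortcut.
            out = out.detach()
        return x + out
```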
Doing Experiments and Revising Rules with Natural Language and Probabilistic Reasoning
·3039 words·15 mins
Natural Language Processing Large Language Models 🏢 Cornell University
This paper introduces ActiveACRE, a model that uses LLMs and probabilistic inference to infer natural language rules through online experimentation, demonstrating higher accuracy than existing methods…
DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting
·2536 words·12 mins
AI Generated Natural Language Processing Large Language Models 🏢 Nanjing University of Science and Technology
DoFIT: A novel domain-aware framework significantly reduces catastrophic forgetting in federated instruction tuning by finely aggregating overlapping weights and using a proximal perturbation initiali…
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models
·2327 words·11 mins
Natural Language Processing Large Language Models 🏢 Microsoft Research
LLMs’ reasoning abilities are assessed via a novel framework that leverages probabilities of causation, revealing that while capable, their understanding of causality falls short of human-level reason…
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
·2914 words·14 mins
Natural Language Processing Large Language Models 🏢 Department of Computer Science, University of Chicago
LLMs’ fact retrieval is easily manipulated by context, highlighting their associative memory behavior; this paper studies the phenomenon in transformers, showing how self-attention and value matrices support …
Do LLMs Build World Representations? Probing Through the Lens of State Abstraction
·2243 words·11 mins
Natural Language Processing Large Language Models 🏢 Mila, McGill University
LLMs prioritize task completion over full world-state understanding by using goal-oriented abstractions.
DALD: Improving Logits-based Detector without Logits from Black-box LLMs
·2559 words·13 mins
Natural Language Processing Large Language Models 🏢 MBZUAI
DALD: A novel framework for black-box LLM text detection that achieves state-of-the-art performance by aligning surrogate model distributions, without relying on source model logits.
Divergences between Language Models and Human Brains
·2519 words·12 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
Language models struggle with social/emotional intelligence and physical commonsense, unlike human brains. Fine-tuning models on these aspects improves their brain response prediction accuracy.
Distributional Preference Alignment of LLMs via Optimal Transport
·2204 words·11 mins
Natural Language Processing Large Language Models 🏢 IBM Research
LLMs are aligned to human preferences at the distribution level using Optimal Transport, achieving state-of-the-art performance.
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
·3179 words·15 mins
Natural Language Processing Large Language Models 🏢 Samsung Research
DISP-LLM: A novel dimension-independent structural pruning method for LLMs achieves accuracy similar to semi-structural pruning while improving flexibility and efficiency, outperforming state-of-the-a…
Discrete Modeling via Boundary Conditional Diffusion Processes
·2908 words·14 mins
AI Generated Natural Language Processing Text Generation 🏢 Harbin Institute of Technology
Bridging the gap between continuous diffusion models and discrete data, this work introduces a novel boundary-conditional approach achieving superior performance in language modeling and image generat…
Discovery of the Hidden World with Large Language Models
·6303 words·30 mins
Natural Language Processing Large Language Models 🏢 Hong Kong Baptist University
COAT leverages LLMs to identify high-level causal factors from unstructured data, enabling causal discovery in real-world scenarios where well-defined variables are lacking.
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
·1939 words·10 mins
Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.
Discovering Preference Optimization Algorithms with and for Large Language Models
·4948 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Sakana AI
LLMs discover novel offline preference optimization algorithms, achieving state-of-the-art performance on various tasks.
Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models
·2902 words·14 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Diffusion-of-Thought (DoT) boosts reasoning in diffusion language models by enabling parallel reasoning steps, outperforming larger autoregressive models in speed and accuracy.
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
·2806 words·14 mins
AI Generated Natural Language Processing Machine Translation 🏢 Johns Hopkins University
DIFFNORM boosts non-autoregressive speech-to-speech translation by normalizing speech data with a diffusion model and classifier-free guidance, achieving significant quality improvements.
Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models
·2220 words·11 mins
Natural Language Processing Large Language Models 🏢 Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University
Diff-eRank: A novel rank-based metric assessing LLMs’ efficiency in eliminating redundant information during training, showing improved correlation with model size and performance.
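The "effective rank" quantity that Diff-eRank builds on can be illustrated as follows. This is an assumed reconstruction from the summary, not the paper's exact code: the exponential of the Shannon entropy of the normalized singular-value spectrum of a matrix of hidden states, compared between models.

```python
# Illustrative sketch of an effective-rank computation (assumption, not the
# paper's exact definition or code).
import torch

def effective_rank(reps: torch.Tensor) -> float:
    """reps: (num_tokens, hidden_dim) matrix of hidden representations."""
    reps = reps - reps.mean(dim=0, keepdim=True)   # center the rows
    s = torch.linalg.svdvals(reps)                 # singular values
    p = s / s.sum()                                # normalized spectrum
    entropy = -(p * torch.log(p + 1e-12)).sum()    # Shannon entropy
    return torch.exp(entropy).item()               # effective rank

# A rank difference between two models would then be, schematically:
# diff_erank = effective_rank(untrained_hiddens) - effective_rank(trained_hiddens)
```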
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
·2779 words·14 mins
Natural Language Processing Large Language Models 🏢 Baidu Inc.
Decoupled-Head Attention (DHA) drastically cuts LLM inference costs by adaptively sharing key/value heads, achieving 97.6% of original performance with only 0.25% pre-training.
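For intuition, key/value-head sharing can be sketched as below. This is a generic grouped-query-style illustration of why shared KV heads cut inference cost, not DHA's adaptive fusion of checkpoint heads; all sizes here are assumptions.

```python
# Illustrative sketch only: several query heads attend against one shared
# key/value head, shrinking the KV projections (and the KV cache).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKVAttention(nn.Module):
    def __init__(self, dim: int, n_q_heads: int = 8, n_kv_heads: int = 2):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.hd = dim // n_q_heads                     # per-head width
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.q = nn.Linear(dim, n_q_heads * self.hd)
        self.k = nn.Linear(dim, n_kv_heads * self.hd)  # smaller KV projections
        self.v = nn.Linear(dim, n_kv_heads * self.hd)
        self.o = nn.Linear(n_q_heads * self.hd, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q = self.q(x).view(B, T, self.n_q, self.hd).transpose(1, 2)
        k = self.k(x).view(B, T, self.n_kv, self.hd).transpose(1, 2)
        v = self.v(x).view(B, T, self.n_kv, self.hd).transpose(1, 2)
        g = self.n_q // self.n_kv                      # queries per KV head
        k = k.repeat_interleave(g, dim=1)              # share each KV head
        v = v.repeat_interleave(g, dim=1)
        y = F.scaled_dot_product_attention(q, k, v)
        return self.o(y.transpose(1, 2).reshape(B, T, -1))
```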