Natural Language Processing

EAI: Emotional Decision-Making of LLMs in Strategic Games and Ethical Dilemmas
·4154 words·20 mins
AI Generated Natural Language Processing Large Language Models 🏢 AIRI
LLMs’ emotional decision-making is assessed using a novel framework, EAI, showing that emotions significantly alter ethical and strategic choices in games. This reveals crucial biases, necessitati…
Dual-Personalizing Adapter for Federated Foundation Models
·2721 words·13 mins
Natural Language Processing Federated Learning 🏢 Australian AI Institute
Federated Dual-Personalizing Adapter (FedDPA) tackles test-time distribution shifts and personalization in federated foundation models using a global and local adapter co-working mechanism, achieving …
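A minimal sketch of the dual-adapter idea the summary describes, under stated assumptions: a frozen base layer plus a globally aggregated adapter and a client-local adapter whose outputs are mixed. The mixing weight `alpha` is a hypothetical instantiation of the "co-working mechanism", not taken from the paper.

```python
# Illustrative sketch only, not the FedDPA implementation.
import torch
import torch.nn as nn

class DualAdapterLinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 0.5):
        super().__init__()
        self.base = base.requires_grad_(False)   # frozen foundation weight
        d_in, d_out = base.in_features, base.out_features
        # Low-rank global adapter (aggregated across clients) and local adapter
        # (kept on-device for personalization); both are trainable.
        self.global_adapter = nn.Sequential(nn.Linear(d_in, rank, bias=False),
                                            nn.Linear(rank, d_out, bias=False))
        self.local_adapter = nn.Sequential(nn.Linear(d_in, rank, bias=False),
                                           nn.Linear(rank, d_out, bias=False))
        self.alpha = alpha                       # hypothetical global/local mix

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (self.base(x)
                + self.alpha * self.global_adapter(x)
                + (1 - self.alpha) * self.local_adapter(x))
```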
DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
·2987 words·15 mins
Natural Language Processing Large Language Models 🏢 Seoul National University
DropBP accelerates LLM fine-tuning by 44% while preserving accuracy, by dropping backward propagation through a subset of layers.
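As an illustration of the mechanism the title names, here is a minimal sketch (not the authors' implementation; `drop_prob` is a hypothetical knob): for a residual block y = x + f(x), detaching f(x) keeps the forward pass exact while cutting the backward graph through that sub-layer, so gradients take the skip path only.

```python
# Illustrative sketch only, not the DropBP implementation.
import torch
import torch.nn as nn

class DropBPBlock(nn.Module):
    """Wraps a residual sub-layer; backprop through it is randomly skipped."""
    def __init__(self, sublayer: nn.Module, drop_prob: float = 0.5):
        super().__init__()
        self.sublayer = sublayer
        self.drop_prob = drop_prob  # hypothetical knob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.sublayer(x)
        if self.training and torch.rand(()) < self.drop_prob:
            # Same forward value, but no backward graph through the sub-layer;
            # gradients flow only through the residual shortcut.
            out = out.detach()
        return x + out
```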
Doing Experiments and Revising Rules with Natural Language and Probabilistic Reasoning
·3039 words·15 mins
Natural Language Processing Large Language Models 🏢 Cornell University
This paper introduces ActiveACRE, a model that uses LLMs and probabilistic inference to infer natural language rules through online experimentation, demonstrating higher accuracy than existing methods…
DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting
·2536 words·12 mins
AI Generated Natural Language Processing Large Language Models 🏢 Nanjing University of Science and Technology
DoFIT: A novel domain-aware framework significantly reduces catastrophic forgetting in federated instruction tuning by finely aggregating overlapping weights and using a proximal perturbation initiali…
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models
·2327 words·11 mins
Natural Language Processing Large Language Models 🏢 Microsoft Research
LLMs’ reasoning abilities are assessed via a novel framework that leverages probabilities of causation, revealing that while capable, their understanding of causality falls short of human-level reason…
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
·2914 words·14 mins
Natural Language Processing Large Language Models 🏢 Department of Computer Science, University of Chicago
LLMs’ fact retrieval is easily manipulated by context, highlighting their associative memory behavior; this paper studies the phenomenon in transformers, showing how self-attention and value matrices support …
Do LLMs Build World Representations? Probing Through the Lens of State Abstraction
·2243 words·11 mins
Natural Language Processing Large Language Models 🏢 Mila, McGill University
LLMs prioritize task completion over full world-state understanding by using goal-oriented abstractions.
DALD: Improving Logits-based Detector without Logits from Black-box LLMs
·2559 words·13 mins
Natural Language Processing Large Language Models 🏢 MBZUAI
DALD: A novel framework for black-box LLM text detection that achieves state-of-the-art performance by aligning surrogate model distributions, without relying on source model logits.
Divergences between Language Models and Human Brains
·2519 words·12 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
Language models struggle with social/emotional intelligence and physical commonsense, unlike human brains. Fine-tuning models on these aspects improves their brain response prediction accuracy.
Distributional Preference Alignment of LLMs via Optimal Transport
·2204 words·11 mins
Natural Language Processing Large Language Models 🏢 IBM Research
LLMs are aligned to human preferences at the distribution level using Optimal Transport, achieving state-of-the-art performance.
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
·3179 words·15 mins
Natural Language Processing Large Language Models 🏢 Samsung Research
DISP-LLM: A novel dimension-independent structural pruning method for LLMs achieves accuracy similar to semi-structural pruning while improving flexibility and efficiency, outperforming state-of-the-a…
Discrete Modeling via Boundary Conditional Diffusion Processes
·2908 words·14 mins
AI Generated Natural Language Processing Text Generation 🏢 Harbin Institute of Technology
Bridging the gap between continuous diffusion models and discrete data, this work introduces a novel boundary-conditional approach achieving superior performance in language modeling and image generat…
Discovery of the Hidden World with Large Language Models
·6303 words·30 mins
Natural Language Processing Large Language Models 🏢 Hong Kong Baptist University
COAT leverages LLMs to identify high-level causal factors from unstructured data, enabling causal discovery in real-world scenarios where well-defined variables are lacking.
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
·1939 words·10 mins
Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.
Discovering Preference Optimization Algorithms with and for Large Language Models
·4948 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Sakana AI
LLMs discover novel offline preference optimization algorithms, achieving state-of-the-art performance on various tasks.
Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models
·2902 words·14 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Diffusion-of-Thought (DoT) boosts reasoning in diffusion language models by enabling parallel reasoning steps, outperforming larger autoregressive models in speed and accuracy.
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
·2806 words·14 mins
AI Generated Natural Language Processing Machine Translation 🏢 Johns Hopkins University
DIFFNORM boosts non-autoregressive speech-to-speech translation by normalizing speech data with a diffusion model and classifier-free guidance, achieving significant quality improvements.
Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models
·2220 words·11 mins
Natural Language Processing Large Language Models 🏢 Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University
Diff-eRank: A novel rank-based metric assessing LLMs’ efficiency in eliminating redundant information during training, showing improved correlation with model size and performance.
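The "effective rank" quantity that Diff-eRank builds on can be illustrated as follows. This is an assumed reconstruction from the summary, not the paper's exact code: the exponential of the Shannon entropy of the normalized singular-value spectrum of a matrix of hidden states, compared between models.

```python
# Illustrative sketch of an effective-rank computation (assumption, not the
# paper's exact definition or code).
import torch

def effective_rank(reps: torch.Tensor) -> float:
    """reps: (num_tokens, hidden_dim) matrix of hidden representations."""
    reps = reps - reps.mean(dim=0, keepdim=True)   # center the rows
    s = torch.linalg.svdvals(reps)                 # singular values
    p = s / s.sum()                                # normalized spectrum
    entropy = -(p * torch.log(p + 1e-12)).sum()    # Shannon entropy
    return torch.exp(entropy).item()               # effective rank

# A rank difference between two models would then be, schematically:
# diff_erank = effective_rank(untrained_hiddens) - effective_rank(trained_hiddens)
```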
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
·2779 words·14 mins
Natural Language Processing Large Language Models 🏢 Baidu Inc.
Decoupled-Head Attention (DHA) drastically cuts LLM inference costs by adaptively sharing key/value heads, achieving 97.6% of original performance with only 0.25% pre-training.
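For intuition, key/value-head sharing can be sketched as below. This is a generic grouped-query-style illustration of why shared KV heads cut inference cost, not DHA's adaptive fusion of checkpoint heads; all sizes here are assumptions.

```python
# Illustrative sketch only: several query heads attend against one shared
# key/value head, shrinking the KV projections (and the KV cache).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKVAttention(nn.Module):
    def __init__(self, dim: int, n_q_heads: int = 8, n_kv_heads: int = 2):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.hd = dim // n_q_heads                     # per-head width
        self.n_q, self.n_kv = n_q_heads, n_kv_heads
        self.q = nn.Linear(dim, n_q_heads * self.hd)
        self.k = nn.Linear(dim, n_kv_heads * self.hd)  # smaller KV projections
        self.v = nn.Linear(dim, n_kv_heads * self.hd)
        self.o = nn.Linear(n_q_heads * self.hd, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q = self.q(x).view(B, T, self.n_q, self.hd).transpose(1, 2)
        k = self.k(x).view(B, T, self.n_kv, self.hd).transpose(1, 2)
        v = self.v(x).view(B, T, self.n_kv, self.hd).transpose(1, 2)
        g = self.n_q // self.n_kv                      # queries per KV head
        k = k.repeat_interleave(g, dim=1)              # share each KV head
        v = v.repeat_interleave(g, dim=1)
        y = F.scaled_dot_product_attention(q, k, v)
        return self.o(y.transpose(1, 2).reshape(B, T, -1))
```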