Large Language Models
Doing Experiments and Revising Rules with Natural Language and Probabilistic Reasoning
·3039 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Cornell University
This paper introduces ActiveACRE, a model that uses LLMs and probabilistic inference to infer natural language rules through online experimentation, demonstrating higher accuracy than existing methods…
DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting
·2536 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Nanjing University of Science and Technology
DoFIT: A novel domain-aware framework significantly reduces catastrophic forgetting in federated instruction tuning by finely aggregating overlapping weights and using a proximal perturbation initiali…
Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models
·2327 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
LLMs’ reasoning abilities are assessed via a novel framework that leverages probabilities of causation, revealing that while capable, their understanding of causality falls short of human-level reason…
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
·2914 words·14 mins·
Natural Language Processing
Large Language Models
🏢 Department of Computer Science, University of Chicago
LLMs’ fact retrieval is easily manipulated by context, highlighting their associative-memory behavior; the paper studies this behavior in transformers, showing how self-attention and value matrices support …
Do LLMs Build World Representations? Probing Through the Lens of State Abstraction
·2243 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Mila, McGill University
LLMs prioritize task completion over full world-state understanding by using goal-oriented abstractions.
DALD: Improving Logits-based Detector without Logits from Black-box LLMs
·2559 words·13 mins·
Natural Language Processing
Large Language Models
🏢 MBZUAI
DALD: A novel framework for black-box LLM text detection, achieving state-of-the-art performance without relying on source model logits, by aligning surrogate model distributions.
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
·2382 words·12 mins·
Large Language Models
🏢 Harbin Institute of Technology
FUNCODER: a novel code generation framework that uses a divide-and-conquer approach with functional consensus to generate code that meets complex requirements.
Divergences between Language Models and Human Brains
·2519 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Language models struggle with social/emotional intelligence and physical commonsense, unlike human brains. Fine-tuning models on these aspects improves their brain response prediction accuracy.
Distributional Preference Alignment of LLMs via Optimal Transport
·2204 words·11 mins·
Natural Language Processing
Large Language Models
🏢 IBM Research
LLMs are aligned to human preferences distributionally using Optimal Transport, achieving state-of-the-art performance.
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
·3179 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Samsung Research
DISP-LLM: A novel dimension-independent structural pruning method for LLMs achieves accuracy similar to semi-structural pruning while improving flexibility and efficiency, outperforming state-of-the-a…
Discrete Flow Matching
·2076 words·10 mins·
Large Language Models
🏢 Meta FAIR
Discrete Flow Matching (DFM) advances discrete data generation with a novel flow paradigm that surpasses existing methods. DFM leverages flexible probability paths, enabling efficient …
Discovery of the Hidden World with Large Language Models
·6303 words·30 mins·
Natural Language Processing
Large Language Models
🏢 Hong Kong Baptist University
COAT leverages LLMs to identify high-level causal factors from unstructured data, enabling causal discovery in real-world scenarios where well-defined variables are lacking.
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
·1939 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Hong Kong University of Science and Technology
DSA, a novel automated framework, discovers optimal sparsity allocation for layer-wise LLM pruning, achieving significant performance gains across various models and tasks.
Discovering Preference Optimization Algorithms with and for Large Language Models
·4948 words·24 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Sakana AI
LLMs discover novel offline preference optimization algorithms, achieving state-of-the-art performance on various tasks.
Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models
·2902 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
Diffusion-of-Thought (DoT) boosts reasoning in diffusion language models by enabling parallel reasoning steps, outperforming larger autoregressive models in speed and accuracy.
Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models
·2220 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University
Diff-eRank: A novel rank-based metric that assesses how efficiently LLMs eliminate redundant information during training, showing improved correlation with model size and performance.
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
·2779 words·14 mins·
Natural Language Processing
Large Language Models
🏢 Baidu Inc.
Decoupled-Head Attention (DHA) drastically cuts LLM inference costs by adaptively sharing key/value heads, achieving 97.6% of original performance with only 0.25% pre-training.
DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
·2333 words·11 mins·
Large Language Models
🏢 University of Mannheim
DeTikZify: AI synthesizes publication-ready scientific figures from sketches and existing figures, automatically generating semantics-preserving TikZ code.
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
·2740 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 ByteDance
DeTeCtive: a novel multi-task contrastive learning framework, achieves state-of-the-art AI-generated text detection by distinguishing diverse writing styles instead of simple binary classification.
DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning
·3087 words·15 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
DETAIL: A novel attribution method reveals the impact of individual demonstrations in in-context learning, boosting interpretability and improving transformer-based model performance.