Natural Language Processing
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
·2740 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 ByteDance
DeTeCtive, a novel multi-task contrastive learning framework, achieves state-of-the-art AI-generated text detection by distinguishing diverse writing styles instead of performing simple binary classification.
DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning
·3087 words·15 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
DETAIL: A novel attribution method reveals the impact of individual demonstrations in in-context learning, boosting interpretability and improving transformer-based model performance.
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
·3222 words·16 mins·
Natural Language Processing
Large Language Models
🏢 EPFL
DenseFormer enhances transformers by adding a depth-weighted averaging step, improving data efficiency and outperforming baselines in memory and inference time without increasing model size.
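The depth-weighted averaging step can be illustrated with a minimal sketch: after each block, the running representation is replaced by a learned weighted average over the initial embedding and all block outputs so far. The function and variable names (`blocks`, `alphas`) are hypothetical, and real DenseFormer layers are transformer blocks with per-depth learned weights; this only shows the averaging mechanics.

```python
import numpy as np

def depth_weighted_forward(x0, blocks, alphas):
    """Sketch of depth-weighted averaging (DWA).

    x0:     initial embedding (array)
    blocks: list of callables standing in for transformer blocks
    alphas: alphas[i] holds the learned weights over x0 and the
            first i+1 block outputs (hypothetical parameterization)
    """
    outputs = [x0]
    for i, block in enumerate(blocks):
        y = block(outputs[-1])
        outputs.append(y)
        # DWA: replace the running representation with a weighted
        # average over x0 and every block output up to this depth.
        w = alphas[i][: len(outputs)]
        outputs[-1] = sum(wj * oj for wj, oj in zip(w, outputs))
    return outputs[-1]
```

Because the extra parameters are just one weight per earlier depth, the averaging adds negligible size relative to the transformer blocks themselves.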
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
·3631 words·18 mins·
Natural Language Processing
Large Language Models
🏢 Zhejiang University
Large language models struggle to generalize knowledge when facing seemingly simple reversals, a phenomenon termed the ‘reversal curse.’ This study reveals that this limitation is strongly linked to t…
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
·2535 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Peking University
Delta-CoMe: Training-free mixed-precision delta compression boosts LLM deployment efficiency.
DeiSAM: Segment Anything with Deictic Prompting
·3865 words·19 mins·
AI Generated
Natural Language Processing
Vision-Language Models
🏢 Technical University of Darmstadt
DeiSAM uses large language models and differentiable logic to achieve highly accurate image segmentation using complex, context-dependent descriptions.
Deep Bayesian Active Learning for Preference Modeling in Large Language Models
·2339 words·11 mins·
Natural Language Processing
Large Language Models
🏢 University of Oxford
BAL-PM, a novel active learning approach, drastically reduces human feedback in LLM preference modeling by leveraging both model uncertainty and prompt distribution diversity, achieving 33%-68% fewer …
Decoding-Time Language Model Alignment with Multiple Objectives
·3392 words·16 mins·
Natural Language Processing
Large Language Models
🏢 Tsinghua University
Multi-objective decoding (MOD) efficiently aligns language models to diverse user needs by decoding the next token from a weighted combination of predictions from multiple base models trained on indiv…
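A simplified sketch of the idea of mixing next-token predictions: each objective-specific model produces a distribution over the vocabulary, and the decoder samples from a weighted combination. This is a naive linear mixture for illustration only; MOD's actual combination rule is derived from the aligned models' reward structure, and the function name here is hypothetical.

```python
import numpy as np

def combine_next_token_probs(logits_list, weights):
    """Naive sketch: mix next-token distributions from several
    objective-specific models with user-chosen weights."""
    probs = []
    for logits in logits_list:
        # Numerically stable softmax per model.
        e = np.exp(logits - logits.max())
        probs.append(e / e.sum())
    mixed = sum(w * p for w, p in zip(weights, probs))
    return mixed / mixed.sum()  # renormalize to a distribution
```

Changing `weights` at decoding time re-balances the objectives without retraining any of the base models, which is the appeal of decoding-time alignment.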
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context
·2519 words·12 mins·
Natural Language Processing
Large Language Models
🏢 University of Illinois at Urbana-Champaign
New framework reveals LLMs’ human-like decision-making tendencies but highlights significant variations and biases influenced by demographic factors, underscoring ethical deployment needs.
DDK: Distilling Domain Knowledge for Efficient Large Language Models
·2140 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
DDK: Dynamically Distilling Domain Knowledge for efficient LLMs.
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
·3234 words·16 mins·
Natural Language Processing
Large Language Models
🏢 Apple
This paper introduces dataset decomposition (DD), a novel approach to accelerate LLM training while enhancing performance. DD significantly reduces training time by decomposing datasets into buckets …
Data-Efficient Learning with Neural Programs
·2234 words·11 mins·
Natural Language Processing
Large Language Models
🏢 University of Pennsylvania
ISED, a novel data-efficient algorithm, learns neural programs by sampling from neural predictions to estimate gradients of black-box components, outperforming baselines on various benchmarks.
Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions
·3904 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Washington
Researchers uncover hidden training data secrets of large language models by analyzing their byte-pair encoding tokenizers, revealing the proportions of different languages and domains.
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
·1939 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Tsinghua University
DART-Math tackles LLM limitations in mathematical problem-solving by introducing Difficulty-Aware Rejection Tuning, a novel method that generates high-quality, bias-reduced datasets, resulting in supe…
DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
·3306 words·16 mins·
Natural Language Processing
Large Language Models
🏢 Dartmouth College
DARG dynamically evaluates LLMs via adaptive reasoning graphs, revealing performance drops with increased complexity and exposing model biases.
DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
·3365 words·16 mins·
Natural Language Processing
Large Language Models
🏢 CUHK
DAPE: A novel data-adaptive positional encoding method dynamically adjusts positional information based on input context, improving transformer performance and length generalization.
DAGER: Exact Gradient Inversion for Large Language Models
·2286 words·11 mins·
Natural Language Processing
Large Language Models
🏢 INSAIT
DAGER: Exact gradient inversion for LLMs; recovers full input text batches precisely.
D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models
·2704 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Huawei Technologies Co., Ltd.
D-LLM dynamically allocates computing resources during LLM token processing, reducing computational costs and memory usage by up to 50% without sacrificing accuracy.
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
·3930 words·19 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
New D-CPT Law optimizes continual pre-training for LLMs by predicting optimal data mixture ratios, drastically cutting training costs.
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation
·1854 words·9 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Science and Technology of China
Instance-wise LoRA (iLoRA) boosts LLM sequential recommendation accuracy by customizing model parameters for each user, mitigating negative transfer and improving performance.