
Large Language Models

BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
·2612 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Department of Statistics, University of Chicago
BoNBoN alignment optimizes large language model (LLM) outputs toward human preferences using best-of-n sampling, maximizing win rate against the base model with minimal off-target impact.
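As context for the entry above, the best-of-n mechanic it refers to can be sketched in a few lines. This is a minimal illustration, assuming generic `generate` and `reward` callables; the helper names are hypothetical, not the paper's code.

```python
# Minimal sketch of best-of-n sampling: draw n candidates from a base model
# and keep the one a reward model scores highest.

def best_of_n(prompt, generate, reward, n=8):
    """generate(prompt) -> str sample; reward(prompt, text) -> float score."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda text: reward(prompt, text))
```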
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
·3160 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Rutgers University
BLoB: Bayesian Low-Rank Adaptation by Backpropagation enhances LLMs by jointly tuning mean and covariance of parameters during fine-tuning, improving uncertainty estimation and generalization.
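A minimal sketch of the idea stated in the summary: learn a mean and a (diagonal) covariance for low-rank adapter weights jointly by backpropagation, using the reparameterization trick. The class and its parameterization are illustrative assumptions; the paper's exact choice of which factor is stochastic, its priors, and its regularization may differ.

```python
import torch

class BayesianLowRankAdapter(torch.nn.Module):
    """Low-rank adapter whose first factor is a learned Gaussian (mean + log-std)."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.a_mean = torch.nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.a_log_std = torch.nn.Parameter(torch.full((d_in, rank), -5.0))
        self.b = torch.nn.Parameter(torch.zeros(rank, d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: sample A while keeping gradients w.r.t. mean/std.
        a = self.a_mean + torch.exp(self.a_log_std) * torch.randn_like(self.a_mean)
        return x @ a @ self.b  # adapter output; added to the frozen base layer's output
```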
BitDelta: Your Fine-Tune May Only Be Worth One Bit
·2156 words·11 mins
Natural Language Processing Large Language Models 🏢 MIT
BitDelta drastically shrinks fine-tuned LLMs by quantizing their weight deltas to just one bit, achieving 10x memory reduction and latency improvements without sacrificing performance.
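The 1-bit delta idea from the summary, sketched under the assumption of a single per-matrix scale set to the mean absolute delta (the paper additionally calibrates the scales, which is omitted here); the function names are hypothetical.

```python
import numpy as np

def quantize_delta(w_finetuned: np.ndarray, w_base: np.ndarray):
    """Compress the fine-tuning delta to one sign bit per weight plus one scalar scale."""
    delta = w_finetuned - w_base
    scale = np.abs(delta).mean()             # single scalar per weight matrix
    signs = np.sign(delta).astype(np.int8)   # one bit of information per weight
    return scale, signs

def reconstruct(w_base: np.ndarray, scale: float, signs: np.ndarray) -> np.ndarray:
    """Approximate the fine-tuned weights from the base weights and the 1-bit delta."""
    return w_base + scale * signs
```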
BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
·2196 words·11 mins
Natural Language Processing Large Language Models 🏢 Purdue University
BiScope: AI-generated text detection using a novel bidirectional method that outperforms existing techniques by leveraging both prediction and memorization of preceding tokens.
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
·2033 words·10 mins
Natural Language Processing Large Language Models 🏢 Northeastern University
Bileve: a novel bi-level signature secures text provenance in LLMs against spoofing, enhancing detectability and reliability via fine-grained integrity checks and coarse-grained source tracing.
Bias Amplification in Language Model Evolution: An Iterated Learning Perspective
·3378 words·16 mins
Natural Language Processing Large Language Models 🏢 UBC
LLMs’ iterative interactions amplify subtle biases; this paper uses a Bayesian Iterated Learning framework to explain this phenomenon and offers strategies to guide LLM evolution.
BERTs are Generative In-Context Learners
·2108 words·10 mins
Natural Language Processing Large Language Models 🏢 Language Technology Group, University of Oslo
Masked language models can perform in-context learning, challenging the dominance of causal models in this area.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
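A hedged sketch of a goldfish-style loss: a subset of token positions is excluded from the next-token loss so no training sequence is ever fully supervised. The paper selects the dropped positions with a hash of the preceding context; this sketch simply drops every k-th position.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits: torch.Tensor, targets: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Next-token cross-entropy that ignores every k-th position (simplified drop rule)."""
    # logits: (batch, seq, vocab); targets: (batch, seq)
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1), reduction="none"
    ).reshape(targets.shape)
    keep = torch.ones_like(targets, dtype=torch.bool)
    keep[:, k - 1 :: k] = False  # positions excluded from supervision
    return per_token[keep].mean()
```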
Base of RoPE Bounds Context Length
·3176 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Baichuan Inc.
LLM long-context ability is fundamentally limited by RoPE’s base parameter: reaching a target context length requires the base to exceed a derived lower bound, so a given base caps the usable context length.
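To make the quantity concrete: RoPE rotates each dimension pair i at frequency θ_i = base^(−2i/d), so the longest wavelength, and with it the range of distinguishable relative positions, grows with the base. The sketch below only shows that scaling; the paper's actual bound is derived analytically, not computed this way.

```python
import numpy as np

def rope_wavelengths(base: float, d: int = 128) -> np.ndarray:
    """Wavelength 2*pi/theta_i of each rotary dimension pair for a given base."""
    i = np.arange(d // 2)
    theta = base ** (-2.0 * i / d)
    return 2 * np.pi / theta

# The slowest-rotating dimension's wavelength grows roughly 100x when the base
# goes from 10,000 to 1,000,000 (for d = 128).
print(rope_wavelengths(10_000)[-1])
print(rope_wavelengths(1_000_000)[-1])
```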
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
BAM! Efficiently upcycles pre-trained models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance with reduced computational costs.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong, Shenzhen
BAdam: A memory-efficient optimization method enabling full-parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or superior performance.
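The control flow of block coordinate descent with Adam can be sketched as below; the block partition, switching schedule, and optimizer-state handling are illustrative assumptions rather than the paper's implementation.

```python
import torch

def block_coordinate_adam(blocks, loss_fn, steps_per_block=50, epochs=3, lr=1e-5):
    """blocks: list of lists of torch.nn.Parameter (e.g., one transformer layer per block).
    Only the active block is trainable, so Adam state exists for one block at a time."""
    for _ in range(epochs):
        for block in blocks:
            for p in (q for b in blocks for q in b):
                p.requires_grad_(False)      # freeze everything ...
            for p in block:
                p.requires_grad_(True)       # ... except the current block
            opt = torch.optim.Adam(block, lr=lr)
            for _ in range(steps_per_block):
                opt.zero_grad()
                loss_fn().backward()
                opt.step()
```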
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety examples.
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins
Natural Language Processing Large Language Models 🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long sequences.
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
AvaTaR: A novel automated framework optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming the challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method.
AutoPSV: Automated Process-Supervised Verifier
·2548 words·12 mins
Natural Language Processing Large Language Models 🏢 University of Hong Kong
AutoPSV automates process annotation for LLMs, improving reasoning by detecting confidence shifts in reasoning steps, thus efficiently enhancing model performance.
AutoMix: Automatically Mixing Language Models
·2953 words·14 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
AutoMix intelligently routes queries to different-sized LLMs based on a smaller model’s self-verification, minimizing cost while maintaining performance.
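A minimal sketch of the cascade described: answer with the small model, let it verify its own answer, and escalate only when the verification score is low. `ask_small`, `ask_large`, and `self_verify` are hypothetical callables, not the paper's API, and the paper's routing policy is more involved than this single threshold.

```python
def cascade_route(query, ask_small, ask_large, self_verify, threshold=0.5):
    """Route a query: cheap model first, expensive model only if self-verification fails."""
    draft = ask_small(query)
    confidence = self_verify(query, draft)   # small model scores its own answer in [0, 1]
    return draft if confidence >= threshold else ask_large(query)
```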
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
·2527 words·12 mins
Natural Language Processing Large Language Models 🏢 Hangzhou Dianzi University
LLM agents can now autonomously build environmental understanding via interactive learning, generating human-readable instruction manuals that boost task success rates.
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents
·2274 words·11 mins
Natural Language Processing Large Language Models 🏢 University of Michigan
AutoGuide: Automated generation of context-aware guidelines significantly improves LLM agent performance in unfamiliar domains.