
Large Language Models

BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
·2612 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Department of Statistics, University of Chicago
BoNBoN alignment optimizes large language model (LLM) outputs toward human preferences using best-of-n sampling, maximizing win rate against the base model with minimal off-target impact.
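As context for the entry above, the best-of-n mechanic it refers to can be sketched in a few lines. This is a minimal illustration, assuming generic `generate` and `reward` callables; the helper names are hypothetical, not the paper's code.

```python
# Minimal sketch of best-of-n sampling: draw n candidates from a base model
# and keep the one a reward model scores highest.

def best_of_n(prompt, generate, reward, n=8):
    """generate(prompt) -> str sample; reward(prompt, text) -> float score."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda text: reward(prompt, text))
```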
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
·3160 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Rutgers University
BLoB: Bayesian Low-Rank Adaptation by Backpropagation enhances LLMs by jointly tuning mean and covariance of parameters during fine-tuning, improving uncertainty estimation and generalization.
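A minimal sketch of the idea stated in the summary: learn a mean and a (diagonal) covariance for low-rank adapter weights jointly by backpropagation, using the reparameterization trick. The class and its parameterization are illustrative assumptions; the paper's exact choice of which factor is stochastic, its priors, and its regularization may differ.

```python
import torch

class BayesianLowRankAdapter(torch.nn.Module):
    """Low-rank adapter whose first factor is a learned Gaussian (mean + log-std)."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8):
        super().__init__()
        self.a_mean = torch.nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.a_log_std = torch.nn.Parameter(torch.full((d_in, rank), -5.0))
        self.b = torch.nn.Parameter(torch.zeros(rank, d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reparameterization trick: sample A while keeping gradients w.r.t. mean/std.
        a = self.a_mean + torch.exp(self.a_log_std) * torch.randn_like(self.a_mean)
        return x @ a @ self.b  # adapter output; added to the frozen base layer's output
```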
BitDelta: Your Fine-Tune May Only Be Worth One Bit
·2156 words·11 mins
Natural Language Processing Large Language Models 🏢 MIT
BitDelta drastically shrinks fine-tuned LLMs by quantizing their weight deltas to just one bit, achieving 10x memory reduction and latency improvements without sacrificing performance.
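The 1-bit delta idea from the summary, sketched under the assumption of a single per-matrix scale set to the mean absolute delta (the paper additionally calibrates the scales, which is omitted here); the function names are hypothetical.

```python
import numpy as np

def quantize_delta(w_finetuned: np.ndarray, w_base: np.ndarray):
    """Compress the fine-tuning delta to one sign bit per weight plus one scalar scale."""
    delta = w_finetuned - w_base
    scale = np.abs(delta).mean()             # single scalar per weight matrix
    signs = np.sign(delta).astype(np.int8)   # one bit of information per weight
    return scale, signs

def reconstruct(w_base: np.ndarray, scale: float, signs: np.ndarray) -> np.ndarray:
    """Approximate the fine-tuned weights from the base weights and the 1-bit delta."""
    return w_base + scale * signs
```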
BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
·2196 words·11 mins
Natural Language Processing Large Language Models 🏢 Purdue University
BiScope: AI-generated text detection using a novel bidirectional method that outperforms existing techniques by leveraging both prediction and memorization of preceding tokens.
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
·2033 words·10 mins
Natural Language Processing Large Language Models 🏢 Northeastern University
Bileve: a novel bi-level signature secures text provenance in LLMs against spoofing, enhancing detectability and reliability via fine-grained integrity checks and coarse-grained source tracing.
Bias Amplification in Language Model Evolution: An Iterated Learning Perspective
·3378 words·16 mins
Natural Language Processing Large Language Models 🏢 UBC
LLMs’ iterative interactions amplify subtle biases; this paper uses a Bayesian Iterated Learning framework to explain this phenomenon and offers strategies to guide LLM evolution.
BERTs are Generative In-Context Learners
·2108 words·10 mins
Natural Language Processing Large Language Models 🏢 Language Technology Group, University of Oslo
Masked language models can perform in-context learning, challenging the dominance of causal models in this area.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
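A hedged sketch of a goldfish-style loss: a subset of token positions is excluded from the next-token loss so no training sequence is ever fully supervised. The paper selects the dropped positions with a hash of the preceding context; this sketch simply drops every k-th position.

```python
import torch
import torch.nn.functional as F

def goldfish_loss(logits: torch.Tensor, targets: torch.Tensor, k: int = 4) -> torch.Tensor:
    """Next-token cross-entropy that ignores every k-th position (simplified drop rule)."""
    # logits: (batch, seq, vocab); targets: (batch, seq)
    per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1), reduction="none"
    ).reshape(targets.shape)
    keep = torch.ones_like(targets, dtype=torch.bool)
    keep[:, k - 1 :: k] = False  # positions excluded from supervision
    return per_token[keep].mean()
```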
Base of RoPE Bounds Context Length
·3176 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Baichuan Inc.
LLM long-context ability is fundamentally limited by RoPE’s base parameter: reaching a target context length requires the base to exceed a derived lower bound, so a given base caps the usable context length.
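To make the quantity concrete: RoPE rotates each dimension pair i at frequency θ_i = base^(−2i/d), so the longest wavelength, and with it the range of distinguishable relative positions, grows with the base. The sketch below only shows that scaling; the paper's actual bound is derived analytically, not computed this way.

```python
import numpy as np

def rope_wavelengths(base: float, d: int = 128) -> np.ndarray:
    """Wavelength 2*pi/theta_i of each rotary dimension pair for a given base."""
    i = np.arange(d // 2)
    theta = base ** (-2.0 * i / d)
    return 2 * np.pi / theta

# The slowest-rotating dimension's wavelength grows roughly 100x when the base
# goes from 10,000 to 1,000,000 (for d = 128).
print(rope_wavelengths(10_000)[-1])
print(rope_wavelengths(1_000_000)[-1])
```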
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
BAM! Efficiently upcycles pre-trained models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance with reduced computational costs.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong, Shenzhen
BAdam: A memory-efficient optimization method enabling full-parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or superior performance.
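The control flow of block coordinate descent with Adam can be sketched as below; the block partition, switching schedule, and optimizer-state handling are illustrative assumptions rather than the paper's implementation.

```python
import torch

def block_coordinate_adam(blocks, loss_fn, steps_per_block=50, epochs=3, lr=1e-5):
    """blocks: list of lists of torch.nn.Parameter (e.g., one transformer layer per block).
    Only the active block is trainable, so Adam state exists for one block at a time."""
    for _ in range(epochs):
        for block in blocks:
            for p in (q for b in blocks for q in b):
                p.requires_grad_(False)      # freeze everything ...
            for p in block:
                p.requires_grad_(True)       # ... except the current block
            opt = torch.optim.Adam(block, lr=lr)
            for _ in range(steps_per_block):
                opt.zero_grad()
                loss_fn().backward()
                opt.step()
```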
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety examples.
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins
Natural Language Processing Large Language Models 🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long sequences.
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
AvaTaR: A novel automated framework optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming the challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method.
AutoPSV: Automated Process-Supervised Verifier
·2548 words·12 mins
Natural Language Processing Large Language Models 🏢 University of Hong Kong
AutoPSV automates process annotation for LLMs, improving reasoning by detecting confidence shifts in reasoning steps, thus efficiently enhancing model performance.
AutoMix: Automatically Mixing Language Models
·2953 words·14 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
AutoMix intelligently routes queries to different-sized LLMs based on a smaller model’s self-verification, minimizing cost while maintaining performance.
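A minimal sketch of the cascade described: answer with the small model, let it verify its own answer, and escalate only when the verification score is low. `ask_small`, `ask_large`, and `self_verify` are hypothetical callables, not the paper's API, and the paper's routing policy is more involved than this single threshold.

```python
def cascade_route(query, ask_small, ask_large, self_verify, threshold=0.5):
    """Route a query: cheap model first, expensive model only if self-verification fails."""
    draft = ask_small(query)
    confidence = self_verify(query, draft)   # small model scores its own answer in [0, 1]
    return draft if confidence >= threshold else ask_large(query)
```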
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
·2527 words·12 mins
Natural Language Processing Large Language Models 🏢 Hangzhou Dianzi University
LLM agents can now autonomously build environmental understanding via interactive learning, generating human-readable instruction manuals that boost task success rates.
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents
·2274 words·11 mins
Natural Language Processing Large Language Models 🏢 University of Michigan
AutoGuide: Automated generation of context-aware guidelines significantly improves LLM agent performance in unfamiliar domains.