Large Language Models
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
·4652 words·22 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Meta AI
Rainbow Teaming, a novel black-box approach, generates diverse adversarial prompts to enhance LLM robustness and safety, achieving an attack success rate of over 90% across various models.
Questioning the Survey Responses of Large Language Models
·2706 words·13 mins·
Large Language Models
🏢 Max Planck Institute for Intelligent Systems
LLM survey responses are systematically biased, often masking genuine model capabilities and leading to misleading alignment conclusions.
Query-Based Adversarial Prompt Generation
·1773 words·9 mins·
Natural Language Processing
Large Language Models
🏢 University of Washington
Researchers developed a query-based attack that generates adversarial prompts, fooling language models into producing harmful outputs with significantly higher success rates than previous methods.
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
·3782 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 ETH Zurich
QuaRot rotates LLM weights and activations to eliminate outliers, enabling end-to-end 4-bit inference with near-lossless quantization.
Quantifying the Gain in Weak-to-Strong Generalization
·2368 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
Weakly supervised strong models outperform weak models; this gain is precisely quantified by the strong model’s misfit error on weak labels.
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
·3333 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT
QuanTA: Quantum-inspired Tensor Adaptation efficiently fine-tunes LLMs with high-rank updates, surpassing low-rank methods like LoRA for complex tasks while minimizing additional parameters.
QTIP: Quantization with Trellises and Incoherence Processing
·2586 words·13 mins·
Large Language Models
🏢 Cornell University
QTIP: ultra-high-dimensional LLM quantization using trellis codes for faster, higher-quality inference.
QBB: Quantization with Binary Bases for LLMs
·1816 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Samsung AI Cambridge
QBB: A novel post-training quantization method for LLMs dramatically improves efficiency by replacing multiplications with summations, achieving state-of-the-art results with minimal accuracy loss.
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
·3701 words·18 mins·
Large Language Models
🏢 Yandex
PV-Tuning achieves a new state of the art in extreme LLM compression by going beyond traditional straight-through estimation (STE). This novel framework provides more accurate and efficient fine-tuning of compressed models.
Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
·1305 words·7 mins·
Natural Language Processing
Large Language Models
🏢 City University of Hong Kong
Transformers excel at in-context learning (ICL), solving new tasks with just prompts. This paper provides a mathematical explanation, showing how transformers harness multi-concept word semantics to achieve efficient in-context learning.
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
·1760 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Northwestern University
RLHF’s overoptimization problem is mitigated by RPO, a novel algorithm that uses SFT loss as an implicit adversarial regularizer, ensuring efficient and effective LLM alignment.
Protecting Your LLMs with Information Bottleneck
·2699 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
IBProtector shields LLMs from harmful outputs via prompt compression, selectively preserving essential information using a trainable extractor.
Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
·2063 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
LoPA, a novel parameter-efficient fine-tuning method, matches state-of-the-art performance while requiring no server-side adapters, improving upon traditional prompt tuning.
Probing the Decision Boundaries of In-context Learning in Large Language Models
·3963 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
LLMs’ in-context learning, though effective, exhibits surprisingly irregular decision boundaries that hinder generalization; this paper reveals the issue and proposes methods to smooth these boundaries.
Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context
·1358 words·7 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Berkeley
Pretrained transformers surprisingly learn low-dimensional nonlinear functions efficiently from few in-context examples, outperforming baseline algorithms.
Preference Learning Algorithms Do Not Learn Preference Rankings
·2930 words·14 mins·
Natural Language Processing
Large Language Models
🏢 New York University
Despite common belief, state-of-the-art preference learning algorithms for LLMs achieve surprisingly low ranking accuracy, highlighting significant flaws in current alignment techniques.
Prediction-Powered Ranking of Large Language Models
·4368 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Max Planck Institute for Software Systems
This paper presents a novel statistical framework for ranking LLMs using pairwise comparisons, accounting for the uncertainty introduced when an LLM stands in for human preferences.
Predicting the Performance of Foundation Models via Agreement-on-the-Line
·4845 words·23 mins·
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Foundation model OOD performance can be reliably predicted via ensemble diversity, especially through random linear head initialization, enabling precise estimates without extensive OOD labels.
Pre-trained Large Language Models Use Fourier Features to Compute Addition
·7726 words·37 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
Pre-trained LLMs surprisingly use Fourier features to perform addition, with MLP layers approximating magnitude and attention layers handling modular arithmetic; this mechanism requires pre-training.
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
·3011 words·15 mins·
Large Language Models
🏢 Nanjing University
Researchers developed Policy Learning from Tutorial Books (PLfB), a novel method that trains AI agents using knowledge from tutorial books instead of relying solely on real-world data.