Large Language Models
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
·4652 words·22 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Meta AI
Rainbow Teaming, a novel black-box approach, generates diverse adversarial prompts to enhance LLM robustness and safety, achieving an attack success rate of over 90% across various models.
Questioning the Survey Responses of Large Language Models
·2706 words·13 mins·
Large Language Models
🏢 Max Planck Institute for Intelligent Systems
LLM survey responses are systematically biased, often masking genuine model capabilities and leading to misleading alignment conclusions.
Query-Based Adversarial Prompt Generation
·1773 words·9 mins·
Natural Language Processing
Large Language Models
🏢 University of Washington
Researchers developed a query-based attack that generates adversarial prompts, fooling language models into producing harmful outputs with significantly higher success rates than previous methods.
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
·3782 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 ETH Zurich
QuaRot rotates LLM weights and activations to eliminate outliers, enabling end-to-end 4-bit inference with near-lossless quantization.
Quantifying the Gain in Weak-to-Strong Generalization
·2368 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
Weakly supervised strong models outperform weak models; this gain is precisely quantified by the strong model’s misfit error on weak labels.
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
·3333 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT
QuanTA: Quantum-inspired Tensor Adaptation efficiently fine-tunes LLMs with high-rank updates, surpassing low-rank methods like LoRA for complex tasks while minimizing additional parameters.
QTIP: Quantization with Trellises and Incoherence Processing
·2586 words·13 mins·
Large Language Models
🏢 Cornell University
QTIP: ultra-high-dimensional LLM quantization using trellis codes for faster, higher-quality inference.
QBB: Quantization with Binary Bases for LLMs
·1816 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Samsung AI Cambridge
QBB: A novel post-training quantization method for LLMs dramatically improves efficiency by replacing multiplications with summations, achieving state-of-the-art results with minimal accuracy loss.
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
·3701 words·18 mins·
Large Language Models
🏢 Yandex
PV-Tuning achieves a new state of the art in extreme LLM compression by going beyond traditional straight-through estimation (STE). This novel framework provides more accurate and efficient fine-tuning of compressed models.
Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
·1305 words·7 mins·
Natural Language Processing
Large Language Models
🏢 City University of Hong Kong
Transformers excel at in-context learning (ICL), solving new tasks with just prompts. This paper provides a mathematical explanation, showing how transformers harness multi-concept word semantics to achieve efficient in-context learning.
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
·1760 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Northwestern University
RLHF’s overoptimization problem is mitigated by RPO, a novel algorithm that uses SFT loss as an implicit adversarial regularizer, ensuring efficient and effective LLM alignment.
Protecting Your LLMs with Information Bottleneck
·2699 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
IBProtector shields LLMs from harmful outputs via prompt compression, selectively preserving essential information using a trainable extractor.
Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
·2063 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
LoPA, a novel parameter-efficient fine-tuning method, matches state-of-the-art performance while requiring no server-side adapters, improving upon traditional prompt tuning.
Probing the Decision Boundaries of In-context Learning in Large Language Models
·3963 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
LLMs’ in-context learning, though effective, exhibits surprisingly irregular decision boundaries that hinder generalization; this paper reveals the issue and proposes methods to smooth these boundaries.
Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context
·1358 words·7 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Berkeley
Pretrained transformers surprisingly learn low-dimensional nonlinear functions efficiently from few in-context examples, outperforming baseline algorithms.
Preference Learning Algorithms Do Not Learn Preference Rankings
·2930 words·14 mins·
Natural Language Processing
Large Language Models
🏢 New York University
Despite common belief, state-of-the-art preference learning algorithms for LLMs achieve surprisingly low ranking accuracy, highlighting significant flaws in current alignment techniques.
Prediction-Powered Ranking of Large Language Models
·4368 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Max Planck Institute for Software Systems
This paper presents a novel statistical framework for ranking LLMs using pairwise comparisons, accounting for the uncertainty introduced when an LLM stands in for human preferences.
Predicting the Performance of Foundation Models via Agreement-on-the-Line
·4845 words·23 mins·
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Foundation model OOD performance can be reliably predicted via ensemble diversity, especially through random linear head initialization, enabling precise estimates without extensive OOD labels.
Pre-trained Large Language Models Use Fourier Features to Compute Addition
·7726 words·37 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
Pre-trained LLMs surprisingly use Fourier features to perform addition, with MLP layers approximating magnitude and attention layers handling modular arithmetic; this mechanism requires pre-training.
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
·3011 words·15 mins·
Large Language Models
🏢 Nanjing University
Researchers developed Policy Learning from Tutorial Books (PLfB), a novel method that trains AI agents using knowledge from tutorial books instead of relying solely on real-world data.