Natural Language Processing
Protecting Your LLMs with Information Bottleneck
·2699 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
IBProtector shields LLMs from harmful outputs via prompt compression, selectively preserving essential information using a trainable extractor.
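A rough sketch of the information-bottleneck flavor of this kind of prompt compression, assuming a trainable per-token gate as the extractor and a downstream task loss supplied from elsewhere; this illustrates the shape of the objective, not IBProtector's actual architecture or training recipe:

```python
import torch
import torch.nn as nn

class TokenGateExtractor(nn.Module):
    """Scores each prompt token; high scores mean 'keep this token'."""
    def __init__(self, d_model):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, token_embeds):  # (batch, seq, d_model)
        return torch.sigmoid(self.scorer(token_embeds)).squeeze(-1)  # (batch, seq)

def ib_style_loss(task_loss, keep_prob, beta=0.1):
    # IB trade-off: keep enough of the prompt to preserve task-relevant
    # information (low task_loss) while compressing it (few kept tokens)
    return task_loss + beta * keep_prob.mean()

# toy usage with random stand-ins for real embeddings and a real task loss
extractor = TokenGateExtractor(d_model=64)
keep_prob = extractor(torch.randn(2, 16, 64))
loss = ib_style_loss(torch.tensor(1.3), keep_prob)
loss.backward()
```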
Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
·2063 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
LoPA: a novel parameter-efficient fine-tuning method that matches state-of-the-art performance while requiring no server-side adapters, improving upon traditional prompt tuning.
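The low-rank trick itself is compact: instead of learning a full prompt_len × d_model soft prompt, learn two small factors whose product is the prompt. A generic sketch of that parameterization (LoPA additionally composes shared and instance-specific components, which this omits):

```python
import torch
import torch.nn as nn

class LowRankSoftPrompt(nn.Module):
    def __init__(self, prompt_len=20, d_model=768, rank=4):
        super().__init__()
        # rank * (prompt_len + d_model) params instead of prompt_len * d_model
        self.A = nn.Parameter(torch.randn(prompt_len, rank) * 0.02)
        self.B = nn.Parameter(torch.randn(rank, d_model) * 0.02)

    def forward(self, input_embeds):  # (batch, seq, d_model)
        prompt = (self.A @ self.B).unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)  # prepend to the sequence
```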
Probing the Decision Boundaries of In-context Learning in Large Language Models
·3963 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Los Angeles
LLMs’ in-context learning, though effective, exhibits surprisingly irregular decision boundaries, hindering generalization; this paper reveals this issue and proposes methods to improve smoothness via…
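The probing recipe is easy to reproduce in miniature: serialize labeled 2-D points as in-context examples, then sweep a grid of query points and record the model's predicted label at each one; the label map traces the decision boundary. A sketch, where query_llm is a hypothetical helper wrapping whichever model is being probed:

```python
import numpy as np

def build_icl_prompt(examples, query):
    # labeled 2-D points as in-context examples, then an unlabeled query
    lines = [f"Input: {x:.2f} {y:.2f} Label: {label}" for (x, y), label in examples]
    lines.append(f"Input: {query[0]:.2f} {query[1]:.2f} Label:")
    return "\n".join(lines)

examples = [((0.5, 0.5), "A"), ((-0.5, -0.5), "B")]
grid = [(x, y) for x in np.linspace(-1, 1, 25) for y in np.linspace(-1, 1, 25)]
# boundary = [query_llm(build_icl_prompt(examples, q)) for q in grid]  # query_llm is hypothetical
```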
Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach
·3286 words·16 mins·
AI Generated
Natural Language Processing
Text Generation
🏢 Department of Mathematical and Statistical Sciences, University of Alberta, Canada
ChatGPT amplifies gender bias in job applications, revealing AI’s potential to worsen labor market inequality.
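The masked-language-model approach named in the title is a standard probing recipe: mask a pronoun slot in a job-related template and compare the probabilities the model assigns to gendered fillers. A minimal illustration with Hugging Face's fill-mask pipeline, using bert-base-uncased and an invented template as stand-ins for the paper's actual setup:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
template = "[MASK] is applying for the software engineer position."

# restrict candidates to gendered pronouns and compare their scores
for result in fill(template, targets=["he", "she"]):
    print(result["token_str"], round(result["score"], 4))
```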
Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context
·1358 words·7 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Berkeley
Pretrained transformers surprisingly learn low-dimensional nonlinear functions efficiently from few in-context examples, outperforming baseline algorithms.
Preference Learning Algorithms Do Not Learn Preference Rankings
·2930 words·14 mins·
Natural Language Processing
Large Language Models
🏢 New York University
Contrary to common belief, state-of-the-art preference learning algorithms for LLMs achieve surprisingly low ranking accuracy, highlighting significant flaws in current alignment techniques.
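The metric at the center of the paper is simple to state: does the learned reward rank the chosen response above the rejected one? For DPO-style methods the implicit reward is r(x, y) = β · log(πθ(y|x) / πref(y|x)), so ranking accuracy reduces to a comparison of log-probability gaps. A sketch, assuming sequence log-probs are already computed:

```python
import torch

def ranking_accuracy(logp_chosen, logp_rejected,
                     ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # DPO's implicit reward: beta * log(pi_theta / pi_ref)
    r_chosen = beta * (logp_chosen - ref_logp_chosen)
    r_rejected = beta * (logp_rejected - ref_logp_rejected)
    return (r_chosen > r_rejected).float().mean().item()

# toy usage with random stand-ins for real sequence log-probs
n = 1000
acc = ranking_accuracy(torch.randn(n), torch.randn(n), torch.randn(n), torch.randn(n))
```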
Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning
·3127 words·15 mins·
AI Generated
Natural Language Processing
Machine Translation
🏢 Microsoft Research
PCformer boosts Transformer performance by using a predictor-corrector learning framework and exponential moving average coefficient learning for high-order prediction, achieving state-of-the-art resu…
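For intuition: a residual connection x + f(x) can be read as an explicit-Euler step of dx/dt = f(x), and predictor-corrector methods refine that step. A Heun-style sketch of the idea; PCformer's actual design uses higher-order prediction with exponential-moving-average coefficient learning, which this omits:

```python
import torch

def heun_block(x, f):
    # predictor: plain residual (explicit Euler) step
    x_pred = x + f(x)
    # corrector: trapezoidal average of the slopes at x and at the prediction
    return x + 0.5 * (f(x) + f(x_pred))

f = torch.nn.Linear(16, 16)  # stand-in for a transformer sublayer
y = heun_block(torch.randn(2, 16), f)
```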
Prediction-Powered Ranking of Large Language Models
·4368 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Max Planck Institute for Software Systems
This paper presents a novel statistical framework for ranking LLMs using pairwise comparisons, accounting for the uncertainty introduced when using an LLM instead of human preferences. The framework …
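"Prediction-powered" refers to the prediction-powered-inference recipe: judge a large set of comparisons with an LLM, then debias with a small human-labeled set. A minimal sketch of the point estimate for one model pair's win rate (the paper constructs full confidence and rank sets on top of this idea):

```python
import numpy as np

def ppi_win_rate(llm_unlabeled, llm_labeled, human_labeled):
    """Prediction-powered estimate of a pairwise win rate.

    llm_unlabeled: 0/1 LLM judgments on the large unlabeled comparison set
    llm_labeled / human_labeled: LLM and human judgments on the same small
    labeled set, used to estimate the LLM judge's bias
    """
    bias = np.mean(human_labeled) - np.mean(llm_labeled)
    return np.mean(llm_unlabeled) + bias
```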
Predicting the Performance of Foundation Models via Agreement-on-the-Line
·4845 words·23 mins·
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Foundation model OOD performance prediction is reliably achieved via ensemble diversity, especially through random linear head initialization, enabling precise estimates without extensive OOD labels…
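Mechanically: across a collection of models, each pair's in-distribution vs. out-of-distribution agreement tends to fall on a line, and when it does, that line shares its slope and intercept with the accuracy line, so OOD accuracy can be read off without OOD labels. A sketch omitting the probit scaling typically applied in this line of work:

```python
import numpy as np

def predict_ood_accuracy(id_agreement, ood_agreement, id_accuracy):
    # fit the agreement line across model pairs; requires no OOD labels
    slope, intercept = np.polyfit(id_agreement, ood_agreement, deg=1)
    # under agreement-on-the-line, accuracy follows the same line
    return slope * np.asarray(id_accuracy) + intercept
```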
Pre-trained Large Language Models Use Fourier Features to Compute Addition
·7726 words·37 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Los Angeles
Pre-trained LLMs surprisingly use Fourier features to perform addition, with MLP layers approximating magnitude and attention layers handling modular arithmetic; this mechanism requires pre-training.
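The numerical core of the claim: representing a residue n by cos(2πkn/p) and sin(2πkn/p) turns addition into rotation, so the features of (a + b) mod p follow from those of a and b via the angle-addition identities. A small check (p, k, a, b are arbitrary):

```python
import numpy as np

p, k = 113, 5  # modulus and Fourier frequency, arbitrary for illustration
a, b = 47, 88

ca, sa = np.cos(2 * np.pi * k * a / p), np.sin(2 * np.pi * k * a / p)
cb, sb = np.cos(2 * np.pi * k * b / p), np.sin(2 * np.pi * k * b / p)

# angle-addition identities yield the features of (a + b) mod p
c_sum, s_sum = ca * cb - sa * sb, sa * cb + ca * sb
assert np.allclose([c_sum, s_sum],
                   [np.cos(2 * np.pi * k * (a + b) / p),
                    np.sin(2 * np.pi * k * (a + b) / p)])
```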
Policy Improvement using Language Feedback Models
·3358 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Boosting AI instruction following, Language Feedback Models (LFMs) leverage Large Language Models (LLMs) to identify desirable behaviors from visual trajectories, significantly improving task completi…
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs
·2083 words·10 mins·
Natural Language Processing
Question Answering
🏢 Alibaba Cloud Computing
Plan-on-Graph (PoG) revolutionizes KG-augmented LLMs with a self-correcting adaptive planning paradigm, enabling more efficient and accurate reasoning over knowledge graphs by dynamically adjusting ex…
Pipeline Parallelism with Controllable Memory
·3116 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Sea AI Lab
New pipeline parallelism framework achieves up to 55% higher throughput and 50% less memory usage in large language model training by systematically controlling activation memory.
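For context on the memory claim: peak activation memory in pipeline parallelism is set by how many microbatches are in flight per stage, which the schedule controls. A back-of-the-envelope accounting for two classic schedules (illustrative only; the paper's framework composes schedule building blocks beyond these):

```python
def inflight_microbatches(stage, num_stages, num_microbatches, schedule):
    """Peak number of microbatches whose activations a stage must hold."""
    if schedule == "gpipe":  # all forwards complete before any backward
        return num_microbatches
    if schedule == "1f1b":   # one-forward-one-backward steady state
        return min(num_microbatches, num_stages - stage)
    raise ValueError(schedule)

# stage 0 of an 8-stage 1F1B pipeline peaks at 8 microbatches vs. 32 under GPipe
print(inflight_microbatches(0, 8, 32, "1f1b"), inflight_microbatches(0, 8, 32, "gpipe"))
```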
PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation
·3549 words·17 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Zhejiang University
PhyloGen uses a genomic language model to generate and optimize phylogenetic trees, offering faster and more accurate evolutionary analysis than traditional methods.
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
·4408 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
Bi-directional Preference Optimization (BiPO) generates superior steering vectors for personalized LLM control, improving upon existing methods by directly influencing the generation probability of hu…
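However the vector is obtained (BiPO's contribution is the optimization), applying a steering vector is mechanically simple: add it to a chosen layer's hidden states at inference time. A generic PyTorch-hook sketch, assuming a vector v has already been trained; the layer index below is illustrative:

```python
import torch

def attach_steering(layer, v, alpha=1.0):
    """Add steering vector v to a layer's output hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * v  # broadcasts over batch and sequence
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# handle = attach_steering(model.model.layers[13], v, alpha=1.5)
# ... generate ...
# handle.remove()
```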
Perplexity-aware Correction for Robust Alignment with Noisy Preferences
·3067 words·15 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Shandong University
PerpCorrect: Robust LLM alignment despite noisy human preferences, achieved via perplexity-based noisy preference detection and correction.
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
·2033 words·10 mins·
Natural Language Processing
Large Language Models
🏢 College of Computer, National University of Defense Technology
This study reveals that large language models struggle with semi-open-ended questions, often hallucinating or providing insufficient answers. Researchers explored this by creating a new dataset of su…
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
·1920 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Academy for Engineering and Technology, Fudan University
PediatricsGPT: a novel Chinese pediatric LLM assistant trained on a large, high-quality dataset (PedCorpus) outperforms existing models, paving the way for improved pediatric healthcare.
Parameter Competition Balancing for Model Merging
·3629 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Harbin Institute of Technology
PCB-MERGING: A training-free model merging technique boosts performance by intelligently balancing parameter competition across multiple tasks.
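PCB-MERGING works in the now-standard task-vector setting: each fine-tuned model contributes its delta from the shared base, and merging combines those deltas. A sketch of that backbone with the weights left as inputs; PCB's actual contribution, computing parameter-wise weights from intra- and inter-task competition, is omitted here:

```python
import torch

def merge_task_vectors(base_sd, finetuned_sds, weights):
    """Merge fine-tuned models as base + sum_i w_i * (theta_i - base)."""
    merged = {k: v.clone() for k, v in base_sd.items()}
    for sd, w in zip(finetuned_sds, weights):
        for k in merged:
            merged[k] += w * (sd[k] - base_sd[k])
    return merged
```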
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
·1639 words·8 mins·
Natural Language Processing
Large Language Models
🏢 MIT
DeltaNet, a linear transformer boosting associative recall, now trains efficiently via a novel algorithm, scaling to large language models and outperforming existing linear baselines.
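For reference, the delta-rule recurrence itself is short; the paper's contribution is computing it in parallel over sequence length rather than step by step as below. A naive sequential sketch:

```python
import torch

def delta_rule_attention(q, k, v, beta):
    """Naive O(T) recurrence: S_t = S_{t-1} + beta_t (v_t - S_{t-1} k_t) k_t^T."""
    T, d_k = k.shape
    S = torch.zeros(v.shape[1], d_k)  # associative memory mapping keys to values
    outputs = []
    for t in range(T):
        v_old = S @ k[t]              # value currently stored under key k_t
        S = S + torch.outer(beta[t] * (v[t] - v_old), k[t])  # delta-rule update
        outputs.append(S @ q[t])
    return torch.stack(outputs)

# toy usage
T, d = 8, 4
out = delta_rule_attention(torch.randn(T, d), torch.randn(T, d),
                           torch.randn(T, d), torch.rand(T))
```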