Natural Language Processing
Protecting Your LLMs with Information Bottleneck
·2699 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
IBProtector shields LLMs from harmful outputs via prompt compression, selectively preserving essential information using a trainable extractor.
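A rough sketch of the information-bottleneck flavor of this kind of prompt compression, assuming a trainable per-token gate as the extractor and a downstream task loss supplied from elsewhere; this illustrates the shape of the objective, not IBProtector's actual architecture or training recipe:

```python
import torch
import torch.nn as nn

class TokenGateExtractor(nn.Module):
    """Scores each prompt token; high scores mean 'keep this token'."""
    def __init__(self, d_model):
        super().__init__()
        self.scorer = nn.Linear(d_model, 1)

    def forward(self, token_embeds):  # (batch, seq, d_model)
        return torch.sigmoid(self.scorer(token_embeds)).squeeze(-1)  # (batch, seq)

def ib_style_loss(task_loss, keep_prob, beta=0.1):
    # IB trade-off: keep enough of the prompt to preserve task-relevant
    # information (low task_loss) while compressing it (few kept tokens)
    return task_loss + beta * keep_prob.mean()

# toy usage with random stand-ins for real embeddings and a real task loss
extractor = TokenGateExtractor(d_model=64)
keep_prob = extractor(torch.randn(2, 16, 64))
loss = ib_style_loss(torch.tensor(1.3), keep_prob)
loss.backward()
```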
Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
·2063 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
LoPA: a novel parameter-efficient fine-tuning method that matches state-of-the-art performance while requiring no server-side adapters, improving upon traditional prompt tuning.
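The low-rank trick itself is compact: instead of learning a full prompt_len × d_model soft prompt, learn two small factors whose product is the prompt. A generic sketch of that parameterization (LoPA additionally composes shared and instance-specific components, which this omits):

```python
import torch
import torch.nn as nn

class LowRankSoftPrompt(nn.Module):
    def __init__(self, prompt_len=20, d_model=768, rank=4):
        super().__init__()
        # rank * (prompt_len + d_model) params instead of prompt_len * d_model
        self.A = nn.Parameter(torch.randn(prompt_len, rank) * 0.02)
        self.B = nn.Parameter(torch.randn(rank, d_model) * 0.02)

    def forward(self, input_embeds):  # (batch, seq, d_model)
        prompt = (self.A @ self.B).unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)  # prepend to the sequence
```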
Probing the Decision Boundaries of In-context Learning in Large Language Models
·3963 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Los Angeles
LLMs’ in-context learning, though effective, exhibits surprisingly irregular decision boundaries, hindering generalization; this paper reveals this issue and proposes methods to improve smoothness via…
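The probing recipe is easy to reproduce in miniature: serialize labeled 2-D points as in-context examples, then sweep a grid of query points and record the model's predicted label at each one; the label map traces the decision boundary. A sketch, where query_llm is a hypothetical helper wrapping whichever model is being probed:

```python
import numpy as np

def build_icl_prompt(examples, query):
    # labeled 2-D points as in-context examples, then an unlabeled query
    lines = [f"Input: {x:.2f} {y:.2f} Label: {label}" for (x, y), label in examples]
    lines.append(f"Input: {query[0]:.2f} {query[1]:.2f} Label:")
    return "\n".join(lines)

examples = [((0.5, 0.5), "A"), ((-0.5, -0.5), "B")]
grid = [(x, y) for x in np.linspace(-1, 1, 25) for y in np.linspace(-1, 1, 25)]
# boundary = [query_llm(build_icl_prompt(examples, q)) for q in grid]  # query_llm is hypothetical
```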
Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach
·3286 words·16 mins·
AI Generated
Natural Language Processing
Text Generation
🏢 Department of Mathematical and Statistical Sciences, University of Alberta, Canada
ChatGPT amplifies gender bias in job applications, revealing AI’s potential to worsen labor market inequality.
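The masked-language-model approach named in the title is a standard probing recipe: mask a pronoun slot in a job-related template and compare the probabilities the model assigns to gendered fillers. A minimal illustration with Hugging Face's fill-mask pipeline, using bert-base-uncased and an invented template as stand-ins for the paper's actual setup:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
template = "[MASK] is applying for the software engineer position."

# restrict candidates to gendered pronouns and compare their scores
for result in fill(template, targets=["he", "she"]):
    print(result["token_str"], round(result["score"], 4))
```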
Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context
·1358 words·7 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Berkeley
Pretrained transformers surprisingly learn low-dimensional nonlinear functions efficiently from few in-context examples, outperforming baseline algorithms.
Preference Learning Algorithms Do Not Learn Preference Rankings
·2930 words·14 mins·
Natural Language Processing
Large Language Models
🏢 New York University
Contrary to common belief, state-of-the-art preference learning algorithms for LLMs achieve surprisingly low ranking accuracy, highlighting significant flaws in current alignment techniques.
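The metric at the center of the paper is simple to state: does the learned reward rank the chosen response above the rejected one? For DPO-style methods the implicit reward is r(x, y) = β · log(πθ(y|x) / πref(y|x)), so ranking accuracy reduces to a comparison of log-probability gaps. A sketch, assuming sequence log-probs are already computed:

```python
import torch

def ranking_accuracy(logp_chosen, logp_rejected,
                     ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # DPO's implicit reward: beta * log(pi_theta / pi_ref)
    r_chosen = beta * (logp_chosen - ref_logp_chosen)
    r_rejected = beta * (logp_rejected - ref_logp_rejected)
    return (r_chosen > r_rejected).float().mean().item()

# toy usage with random stand-ins for real sequence log-probs
n = 1000
acc = ranking_accuracy(torch.randn(n), torch.randn(n), torch.randn(n), torch.randn(n))
```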
Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning
·3127 words·15 mins·
AI Generated
Natural Language Processing
Machine Translation
🏢 Microsoft Research
PCformer boosts Transformer performance by using a predictor-corrector learning framework and exponential moving average coefficient learning for high-order prediction, achieving state-of-the-art resu…
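For intuition: a residual connection x + f(x) can be read as an explicit-Euler step of dx/dt = f(x), and predictor-corrector methods refine that step. A Heun-style sketch of the idea; PCformer's actual design uses higher-order prediction with exponential-moving-average coefficient learning, which this omits:

```python
import torch

def heun_block(x, f):
    # predictor: plain residual (explicit Euler) step
    x_pred = x + f(x)
    # corrector: trapezoidal average of the slopes at x and at the prediction
    return x + 0.5 * (f(x) + f(x_pred))

f = torch.nn.Linear(16, 16)  # stand-in for a transformer sublayer
y = heun_block(torch.randn(2, 16), f)
```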
Prediction-Powered Ranking of Large Language Models
·4368 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Max Planck Institute for Software Systems
This paper presents a novel statistical framework for ranking LLMs using pairwise comparisons, accounting for the uncertainty introduced when using an LLM instead of human preferences. The framework …
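"Prediction-powered" refers to the prediction-powered-inference recipe: judge a large set of comparisons with an LLM, then debias with a small human-labeled set. A minimal sketch of the point estimate for one model pair's win rate (the paper constructs full confidence and rank sets on top of this idea):

```python
import numpy as np

def ppi_win_rate(llm_unlabeled, llm_labeled, human_labeled):
    """Prediction-powered estimate of a pairwise win rate.

    llm_unlabeled: 0/1 LLM judgments on the large unlabeled comparison set
    llm_labeled / human_labeled: LLM and human judgments on the same small
    labeled set, used to estimate the LLM judge's bias
    """
    bias = np.mean(human_labeled) - np.mean(llm_labeled)
    return np.mean(llm_unlabeled) + bias
```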
Predicting the Performance of Foundation Models via Agreement-on-the-Line
·4845 words·23 mins·
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
Foundation model OOD performance prediction is reliably achieved via ensemble diversity, especially through random linear head initialization, enabling precise estimates without extensive OOD labels…
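Mechanically: across a collection of models, each pair's in-distribution vs. out-of-distribution agreement tends to fall on a line, and when it does, that line shares its slope and intercept with the accuracy line, so OOD accuracy can be read off without OOD labels. A sketch omitting the probit scaling typically applied in this line of work:

```python
import numpy as np

def predict_ood_accuracy(id_agreement, ood_agreement, id_accuracy):
    # fit the agreement line across model pairs; requires no OOD labels
    slope, intercept = np.polyfit(id_agreement, ood_agreement, deg=1)
    # under agreement-on-the-line, accuracy follows the same line
    return slope * np.asarray(id_accuracy) + intercept
```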
Pre-trained Large Language Models Use Fourier Features to Compute Addition
·7726 words·37 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Los Angeles
Pre-trained LLMs surprisingly use Fourier features to perform addition, with MLP layers approximating magnitude and attention layers handling modular arithmetic; this mechanism requires pre-training.
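The numerical core of the claim: representing a residue n by cos(2πkn/p) and sin(2πkn/p) turns addition into rotation, so the features of (a + b) mod p follow from those of a and b via the angle-addition identities. A small check (p, k, a, b are arbitrary):

```python
import numpy as np

p, k = 113, 5  # modulus and Fourier frequency, arbitrary for illustration
a, b = 47, 88

ca, sa = np.cos(2 * np.pi * k * a / p), np.sin(2 * np.pi * k * a / p)
cb, sb = np.cos(2 * np.pi * k * b / p), np.sin(2 * np.pi * k * b / p)

# angle-addition identities yield the features of (a + b) mod p
c_sum, s_sum = ca * cb - sa * sb, sa * cb + ca * sb
assert np.allclose([c_sum, s_sum],
                   [np.cos(2 * np.pi * k * (a + b) / p),
                    np.sin(2 * np.pi * k * (a + b) / p)])
```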
Policy Improvement using Language Feedback Models
·3358 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Boosting AI instruction following, Language Feedback Models (LFMs) leverage Large Language Models (LLMs) to identify desirable behaviors from visual trajectories, significantly improving task completi…
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs
·2083 words·10 mins·
Natural Language Processing
Question Answering
🏢 Alibaba Cloud Computing
Plan-on-Graph (PoG) revolutionizes KG-augmented LLMs with a self-correcting adaptive planning paradigm, enabling more efficient and accurate reasoning over knowledge graphs by dynamically adjusting ex…
Pipeline Parallelism with Controllable Memory
·3116 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Sea AI Lab
New pipeline parallelism framework achieves up to 55% higher throughput and 50% less memory usage in large language model training by systematically controlling activation memory.
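For context on the memory claim: peak activation memory in pipeline parallelism is set by how many microbatches are in flight per stage, which the schedule controls. A back-of-the-envelope accounting for two classic schedules (illustrative only; the paper's framework composes schedule building blocks beyond these):

```python
def inflight_microbatches(stage, num_stages, num_microbatches, schedule):
    """Peak number of microbatches whose activations a stage must hold."""
    if schedule == "gpipe":  # all forwards complete before any backward
        return num_microbatches
    if schedule == "1f1b":   # one-forward-one-backward steady state
        return min(num_microbatches, num_stages - stage)
    raise ValueError(schedule)

# stage 0 of an 8-stage 1F1B pipeline peaks at 8 microbatches vs. 32 under GPipe
print(inflight_microbatches(0, 8, 32, "1f1b"), inflight_microbatches(0, 8, 32, "gpipe"))
```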
PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation
·3549 words·17 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Zhejiang University
PhyloGen uses a genomic language model to generate and optimize phylogenetic trees, offering faster and more accurate evolutionary analysis than traditional methods.
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
·4408 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
Bi-directional Preference Optimization (BiPO) generates superior steering vectors for personalized LLM control, improving upon existing methods by directly influencing the generation probability of hu…
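However the vector is obtained (BiPO's contribution is the optimization), applying a steering vector is mechanically simple: add it to a chosen layer's hidden states at inference time. A generic PyTorch-hook sketch, assuming a vector v has already been trained; the layer index below is illustrative:

```python
import torch

def attach_steering(layer, v, alpha=1.0):
    """Add steering vector v to a layer's output hidden states."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * v  # broadcasts over batch and sequence
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# handle = attach_steering(model.model.layers[13], v, alpha=1.5)
# ... generate ...
# handle.remove()
```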
Perplexity-aware Correction for Robust Alignment with Noisy Preferences
·3067 words·15 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Shandong University
PerpCorrect: Robust LLM alignment despite noisy human preferences, achieved via perplexity-based noisy preference detection and correction.
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
·2033 words·10 mins·
Natural Language Processing
Large Language Models
🏢 College of Computer, National University of Defense Technology
This study reveals that large language models struggle with semi-open-ended questions, often hallucinating or providing insufficient answers. Researchers explored this by creating a new dataset of su…
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
·1920 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Academy for Engineering and Technology, Fudan University
PediatricsGPT: a novel Chinese pediatric LLM assistant trained on a large, high-quality dataset (PedCorpus) outperforms existing models, paving the way for improved pediatric healthcare.
Parameter Competition Balancing for Model Merging
·3629 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Harbin Institute of Technology
PCB-MERGING: A training-free model merging technique boosts performance by intelligently balancing parameter competition across multiple tasks.
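PCB-MERGING works in the now-standard task-vector setting: each fine-tuned model contributes its delta from the shared base, and merging combines those deltas. A sketch of that backbone with the weights left as inputs; PCB's actual contribution, computing parameter-wise weights from intra- and inter-task competition, is omitted here:

```python
import torch

def merge_task_vectors(base_sd, finetuned_sds, weights):
    """Merge fine-tuned models as base + sum_i w_i * (theta_i - base)."""
    merged = {k: v.clone() for k, v in base_sd.items()}
    for sd, w in zip(finetuned_sds, weights):
        for k in merged:
            merged[k] += w * (sd[k] - base_sd[k])
    return merged
```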
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
·1639 words·8 mins·
Natural Language Processing
Large Language Models
🏢 MIT
DeltaNet, a linear transformer boosting associative recall, now trains efficiently via a novel algorithm, scaling to large language models and outperforming existing linear baselines.
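For reference, the delta-rule recurrence itself is short; the paper's contribution is computing it in parallel over sequence length rather than step by step as below. A naive sequential sketch:

```python
import torch

def delta_rule_attention(q, k, v, beta):
    """Naive O(T) recurrence: S_t = S_{t-1} + beta_t (v_t - S_{t-1} k_t) k_t^T."""
    T, d_k = k.shape
    S = torch.zeros(v.shape[1], d_k)  # associative memory mapping keys to values
    outputs = []
    for t in range(T):
        v_old = S @ k[t]              # value currently stored under key k_t
        S = S + torch.outer(beta[t] * (v[t] - v_old), k[t])  # delta-rule update
        outputs.append(S @ q[t])
    return torch.stack(outputs)

# toy usage
T, d = 8, 4
out = delta_rule_attention(torch.randn(T, d), torch.randn(T, d),
                           torch.randn(T, d), torch.rand(T))
```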