Natural Language Processing
Aligning Large Language Models with Representation Editing: A Control Perspective
·2249 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Cornell University
RE-Control: Aligning LLMs via dynamic representation editing using optimal control theory, achieving superior alignment with significantly fewer resources than fine-tuning.
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
·2033 words·10 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
ALI-Agent uses LLM-powered agents for in-depth, adaptive assessment of LLMs’ alignment with human values, overcoming limitations of existing static benchmarks.
Algorithmic progress in language models
·4934 words·24 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT FutureTech
Language model algorithms have improved drastically, halving compute needs every 8 months since 2012 and outpacing Moore’s Law; however, compute scaling, not algorithms, drove most recent performance ga…
Algorithmic Capabilities of Random Transformers
·3079 words·15 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT
Randomly initialized transformers, with only embedding layers optimized, surprisingly excel at various algorithmic tasks, revealing inherent capabilities even before training.
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
·2613 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Tongji University
AlchemistCoder enhances code LLMs by pioneering hindsight tuning on multi-source data, harmonizing conflicting styles via AlchemistPrompts, and achieving state-of-the-art performance.
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
·5046 words·24 mins·
AI Generated
Natural Language Processing
Question Answering
🏢 ByteDance Research
AGILE, a novel reinforcement learning framework, significantly enhances LLM agents’ performance on complex conversational tasks by integrating memory, tools, expert interactions, and reflection, outpe…
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
·2659 words·13 mins·
Natural Language Processing
Large Language Models
🏢 University of Chicago
AGENTPOISON: A novel backdoor attack compromises LLM agents by poisoning their memory or knowledge bases, achieving high success rates with minimal performance impact.
Agent Planning with World Knowledge Model
·2981 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Zhejiang University
This paper introduces a parametric World Knowledge Model (WKM) that improves AI agent planning by integrating both global task knowledge and dynamic state knowledge, thereby overcoming current LLMs’ limi…
Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
·1740 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Peking University
Adversarial Representation Engineering (ARE) offers a unified, interpretable approach for editing large language models (LLMs) by using a representation sensor as an editing oracle, enhancing model sa…
Adversarial Moment-Matching Distillation of Large Language Models
·2972 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 SI-TECH Information Technology
Boosting LLM efficiency, this study introduces adversarial moment-matching distillation, outperforming existing methods by matching action-value moments for superior knowledge transfer and achieving s…
Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
·1987 words·10 mins·
Natural Language Processing
Large Language Models
🏢 National Key Laboratory for Novel Software Technology, Nanjing University
TP-LLaMA boosts tool-augmented LLMs by optimizing inference trajectories using preference learning from both successful and failed attempts, achieving superior performance and efficiency.
Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models
·2348 words·12 mins·
AI Generated
Natural Language Processing
Vision-Language Models
🏢 Greater Bay Area Institute for Innovation, Hunan University
RAIL, a novel continual learning method for vision-language models, tackles catastrophic forgetting and preserves zero-shot abilities without domain-identity hints or reference data, using a recursiv…
Adaptive Layer Sparsity for Large Language Models via Activation Correlation Assessment
·2979 words·14 mins·
Natural Language Processing
Large Language Models
🏢 University of Birmingham
Adaptive Layer Sparsity (ALS) advances large language model (LLM) compression by intelligently pruning less important layers, achieving significant size reduction without performance loss. It o…
Adaptable Logical Control for Large Language Models
·2047 words·10 mins·
Natural Language Processing
Large Language Models
🏢 UCLA
Ctrl-G: A neuro-symbolic framework enables adaptable control of LLM generation by combining any LLM with a Hidden Markov Model (HMM), ensuring outputs adhere to logical constraints specified as determ…
AdaNovo: Towards Robust De Novo Peptide Sequencing in Proteomics against Data Biases
·1791 words·9 mins·
Natural Language Processing
Text Generation
🏢 Westlake University
AdaNovo tackles data biases in de novo peptide sequencing by using Conditional Mutual Information, significantly improving PTM identification and overall accuracy.
Ad Auctions for LLMs via Retrieval Augmented Generation
·2337 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
This paper introduces segment auctions, maximizing logarithmic social welfare, for integrating ads into LLM outputs via Retrieval Augmented Generation, balancing ad revenue and output quality.
Accuracy is Not All You Need
·5583 words·27 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
LLM compression accuracy hides crucial behavioral changes; use % flips and KL-divergence for better evaluation.
Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
·2272 words·11 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
Probe sampling accelerates Greedy Coordinate Gradient (GCG) and other prompt optimization methods by up to 5.6x, achieving equal or better attack success rates, making LLM safety research faster and m…
Accelerating Blockwise Parallel Language Models with Draft Refinement
·2883 words·14 mins·
Natural Language Processing
Large Language Models
🏢 KAIST AI
Boost LLM inference speed by 3x! This paper accelerates blockwise parallel decoding (BPD) by cleverly refining draft predictions, resulting in faster text generation for large language models.
Abrupt Learning in Transformers: A Case Study on Matrix Completion
·5285 words·25 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Michigan
Transformers exhibit abrupt learning: training loss plateaus, then suddenly drops. This study uses matrix completion to demonstrate the phenomenon, providing insights into the model’s algorithmic sh…