Natural Language Processing
Aligning Large Language Models with Representation Editing: A Control Perspective
·2249 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Cornell University
RE-Control: Aligning LLMs via dynamic representation editing using optimal control theory, achieving superior alignment with significantly fewer resources than fine-tuning.
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
·2033 words·10 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
ALI-Agent uses LLM-powered agents for in-depth, adaptive assessment of LLMs’ alignment with human values, overcoming limitations of existing static benchmarks.
Algorithmic progress in language models
·4934 words·24 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT FutureTech
Language model algorithms have improved drastically, halving compute needs every 8 months since 2012 and outpacing Moore’s Law; however, compute scaling, not algorithms, drove most recent performance ga…
Algorithmic Capabilities of Random Transformers
·3079 words·15 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 MIT
Randomly initialized transformers, with only embedding layers optimized, surprisingly excel at various algorithmic tasks, revealing inherent capabilities even before training.
AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
·2613 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Tongji University
AlchemistCoder enhances code LLMs by pioneering hindsight tuning on multi-source data, harmonizing conflicting styles via AlchemistPrompts, and achieving state-of-the-art performance.
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
·5046 words·24 mins·
AI Generated
Natural Language Processing
Question Answering
🏢 ByteDance Research
AGILE, a novel reinforcement learning framework, significantly enhances LLM agents’ performance on complex conversational tasks by integrating memory, tools, expert interactions, and reflection, outpe…
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
·2659 words·13 mins·
Natural Language Processing
Large Language Models
🏢 University of Chicago
AGENTPOISON: A novel backdoor attack compromises LLM agents by poisoning their memory or knowledge bases, achieving high success rates with minimal performance impact.
Agent Planning with World Knowledge Model
·2981 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Zhejiang University
This paper introduces a parametric World Knowledge Model (WKM) that improves AI agent planning by integrating both global task knowledge and dynamic state knowledge, thereby overcoming current LLMs’ limi…
Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
·1740 words·9 mins·
Natural Language Processing
Large Language Models
🏢 Peking University
Adversarial Representation Engineering (ARE) offers a unified, interpretable approach for editing large language models (LLMs) by using a representation sensor as an editing oracle, enhancing model sa…
Adversarial Moment-Matching Distillation of Large Language Models
·2972 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 SI-TECH Information Technology
Boosting LLM efficiency, this study introduces adversarial moment-matching distillation, outperforming existing methods by matching action-value moments for superior knowledge transfer and achieving s…
Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
·1987 words·10 mins·
Natural Language Processing
Large Language Models
🏢 National Key Laboratory for Novel Software Technology, Nanjing University
TP-LLaMA boosts tool-augmented LLMs by optimizing inference trajectories using preference learning from both successful and failed attempts, achieving superior performance and efficiency.
Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models
·2348 words·12 mins·
AI Generated
Natural Language Processing
Vision-Language Models
🏢 Greater Bay Area Institute for Innovation, Hunan University
RAIL, a novel continual learning method for vision-language models, tackles catastrophic forgetting and preserves zero-shot abilities without domain-identity hints or reference data, using a recursiv…
Adaptive Layer Sparsity for Large Language Models via Activation Correlation Assessment
·2979 words·14 mins·
Natural Language Processing
Large Language Models
🏢 University of Birmingham
Adaptive Layer Sparsity (ALS) advances large language model (LLM) compression by intelligently pruning less important layers, achieving significant size reduction without performance loss. It o…
Adaptable Logical Control for Large Language Models
·2047 words·10 mins·
Natural Language Processing
Large Language Models
🏢 UCLA
Ctrl-G: A neuro-symbolic framework enables adaptable control of LLM generation by combining any LLM with a Hidden Markov Model (HMM), ensuring outputs adhere to logical constraints specified as determ…
AdaNovo: Towards Robust De Novo Peptide Sequencing in Proteomics against Data Biases
·1791 words·9 mins·
Natural Language Processing
Text Generation
🏢 Westlake University
AdaNovo tackles data biases in de novo peptide sequencing by using Conditional Mutual Information, significantly improving PTM identification and overall accuracy.
Ad Auctions for LLMs via Retrieval Augmented Generation
·2337 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
This paper introduces segment auctions, maximizing logarithmic social welfare, for integrating ads into LLM outputs via Retrieval Augmented Generation, balancing ad revenue and output quality.
Accuracy is Not All You Need
·5583 words·27 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft Research
LLM compression accuracy hides crucial behavioral changes; use % flips and KL-divergence for better evaluation.
Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
·2272 words·11 mins·
Natural Language Processing
Large Language Models
🏢 National University of Singapore
Probe sampling accelerates Greedy Coordinate Gradient (GCG) and other prompt optimization methods by up to 5.6x, achieving equal or better attack success rates, making LLM safety research faster and m…
Accelerating Blockwise Parallel Language Models with Draft Refinement
·2883 words·14 mins·
Natural Language Processing
Large Language Models
🏢 KAIST AI
Boost LLM inference speed by 3x! This paper accelerates blockwise parallel decoding (BPD) by cleverly refining draft predictions, resulting in faster text generation for large language models.
Abrupt Learning in Transformers: A Case Study on Matrix Completion
·5285 words·25 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Michigan
Transformers exhibit abrupt learning: training loss plateaus, then suddenly drops. This study uses matrix completion to demonstrate the phenomenon, providing insights into the model’s algorithmic sh…