Large Language Models
Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
·2475 words·12 mins
AI Generated
Natural Language Processing
Large Language Models
Peking University
Richelieu: a self-evolving LLM-based agent that masters Diplomacy, a complex game of strategic planning and negotiation, without any human data by integrating self-play for continuous improvement.
Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference
·2947 words·14 mins
Natural Language Processing
Large Language Models
UC Santa Barbara
Reverse the forget-retain objectives for efficient LLM unlearning: a small assistant model is trained on the reversed objectives, and its logits are subtracted from the target model's.
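A minimal sketch of that logit-difference recipe, under my reading of the title and summary (the assistant model, the subtraction coefficient alpha, and the toy logits below are assumptions, not the paper's exact formulation):

```python
# Sketch of unlearning via logit difference (assumed formulation): a small
# assistant model is trained with the REVERSED objectives (remember the forget
# set, forget the retain set); subtracting its logits from the target model's
# then suppresses exactly the knowledge to be unlearned.
import numpy as np

def unlearned_logits(target_logits: np.ndarray,
                     assistant_logits: np.ndarray,
                     alpha: float = 1.0) -> np.ndarray:
    """Logits of the 'unlearned' model as a weighted logit difference."""
    return target_logits - alpha * assistant_logits

# Toy next-token distributions over a 5-token vocabulary
target = np.array([2.0, 1.0, 0.5, 0.1, -1.0])     # target model logits
assistant = np.array([3.0, -1.0, 0.0, 0.0, 0.0])  # assistant peaks on forget-set token 0
probs = np.exp(unlearned_logits(target, assistant))
probs /= probs.sum()
print(probs.round(3))  # probability mass shifts away from the forget-set token
```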
Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models
·2992 words·15 mins
Natural Language Processing
Large Language Models
Ant Group
PaRO boosts LLM training speed by up to 266% through refined model state partitioning and optimized communication.
Rethinking LLM Memorization through the Lens of Adversarial Compression
·2014 words·10 mins
Natural Language Processing
Large Language Models
Carnegie Mellon University
Researchers propose Adversarial Compression Ratio (ACR) to assess LLM memorization, offering an adversarial, flexible, and computationally efficient method for monitoring data misuse and compliance.
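As a concrete illustration of the metric (a toy sketch, not the authors' implementation: the whitespace tokenizer and the example prompt are stand-ins, and in practice the shortest eliciting prompt is found by adversarial prompt optimization):

```python
# Toy sketch of the Adversarial Compression Ratio: a string counts as
# memorized when some prompt that elicits it verbatim is SHORTER (in tokens)
# than the string itself. Whitespace tokenization is for illustration only.

def token_len(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated tokens."""
    return len(text.split())

def acr(target: str, shortest_eliciting_prompt: str) -> float:
    """ACR = |target tokens| / |tokens of shortest prompt eliciting target|."""
    return token_len(target) / token_len(shortest_eliciting_prompt)

# Hypothetical example: a 12-token quote elicited verbatim by a 3-token prompt.
target = "the quick brown fox jumps over the lazy dog again and again"
prompt = "recite fox passage"   # in practice found by adversarial prompt search
ratio = acr(target, prompt)
print(f"ACR = {ratio:.2f}; memorized: {ratio > 1.0}")  # ACR = 4.00; memorized: True
```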
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
·2799 words·14 mins
AI Generated
Natural Language Processing
Large Language Models
Tsinghua University
ReST-MCTS*: A novel LLM self-training method using process reward guided tree search, outperforming existing methods by generating higher-quality reasoning traces for improved model accuracy.
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
·3628 words·18 mins
Large Language Models
Tel Aviv University
New research resolves discrepancies in language model scaling laws, revealing three key factors driving the differences and improving accuracy in predicting optimal model size based on compute budget.
Reranking Laws for Language Generation: A Communication-Theoretic Perspective
·1835 words·9 mins
Large Language Models
Instituto Superior Técnico, Universidade de Lisboa
Boost LLM reliability by adding redundancy! This paper uses a communication-theoretic framework to show that generating multiple LLM outputs and reranking them significantly reduces errors, even with imperfect rerankers.
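The redundancy intuition is easy to simulate. A toy Monte Carlo sketch under my own simplifying assumptions, not the paper's analysis: each of n candidates is independently acceptable with probability p, and an imperfect reranker surfaces a correct candidate with probability q whenever one exists, giving P(error) = 1 - q(1 - (1 - p)^n).

```python
# Toy redundancy model (illustrative assumptions, not the paper's framework):
# an error occurs if no candidate is correct, or the noisy reranker fails to
# surface a correct one. Closed form: P(error) = 1 - q * (1 - (1 - p)**n).
import random

def rerank_error(p: float, q: float, n: int, trials: int = 100_000) -> float:
    errors = 0
    for _ in range(trials):
        any_correct = any(random.random() < p for _ in range(n))
        if not (any_correct and random.random() < q):
            errors += 1
    return errors / trials

for n in (1, 2, 4, 8, 16):
    print(f"n={n:2d}  error~{rerank_error(p=0.3, q=0.9, n=n):.3f}")
```

The error rate falls quickly with n and then flattens at the reranker's own noise floor (1 - q), matching the "even with imperfect rerankers" claim.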
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe
·5026 words·24 mins
AI Generated
Natural Language Processing
Large Language Models
University of Cambridge
This research unveils a compute-optimal recipe for fine-tuning language models into high-quality text embedding models, offering practical guidance and scaling laws for resource-constrained settings.
Representation Noising: A Defence Mechanism Against Harmful Finetuning
·3502 words·17 mins
Natural Language Processing
Large Language Models
Dalhousie University
RepNoise: a novel defense against harmful fine-tuning of LLMs by removing information about harmful representations, generalizing across different harmful tasks, and maintaining LLM capabilities.
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
·4799 words·23 mins
AI Generated
Natural Language Processing
Large Language Models
Korea Advanced Institute of Science and Technology
ReMoDetect leverages reward models to identify and classify LLM-generated text. By using continual preference fine-tuning and incorporating human/LLM mixed text, ReMoDetect achieves state-of-the-art performance.
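Schematically, the detection statistic is just a reward score with a threshold. In this toy sketch the reward model is a stand-in, whereas the paper fine-tunes a real preference model:

```python
# Toy sketch: use a reward model's score as the detection statistic, on the
# premise that aligned LLMs are trained to score highly under such models.
# `toy_reward_model` is a stand-in, not the authors' detector.

def detect_llm_generated(text: str, reward_model, threshold: float) -> bool:
    """Flag text as LLM-generated when its reward score exceeds a threshold."""
    return reward_model(text) > threshold

def toy_reward_model(text: str) -> float:
    # Stand-in scorer: favors longer, more uniformly 'polished' wording.
    words = text.split()
    return sum(len(w) for w in words) / max(len(words), 1)

print(detect_llm_generated(
    "Certainly! Here is a comprehensive, well-structured overview.",
    toy_reward_model, 5.0))  # True for this toy scorer
```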
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
·2925 words·14 mins
Natural Language Processing
Large Language Models
Shanghai Jiao Tong University
POAD enhances LLM agents by decomposing language agent optimization to the token level, achieving finer-grained credit assignment and improved learning efficiency and generalization.
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
·3598 words·17 mins
AI Generated
Natural Language Processing
Large Language Models
University of Illinois Urbana-Champaign
Regularizing hidden states improves reward model generalization in RLHF for LLMs, boosting accuracy and mitigating over-optimization.
ReFT: Representation Finetuning for Language Models
·3382 words·16 mins
Large Language Models
Stanford University
ReFT: Revolutionizing language model finetuning by directly manipulating hidden representations, achieving superior efficiency and performance compared to existing methods.
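For the mechanics, here is a numpy sketch of a LoReFT-style intervention as I read the paper; the dimensions and random parameters are illustrative, and R, W, b would be learned while the base model stays frozen:

```python
# Sketch of a low-rank representation edit: modify a hidden state h only
# inside an r-dimensional subspace, h' = h + R^T (W h + b - R h), where R has
# orthonormal rows. Parameters are random here; in training they are learned.
import numpy as np

d, r = 16, 4                                     # hidden size, intervention rank
rng = np.random.default_rng(0)

R = np.linalg.qr(rng.normal(size=(d, r)))[0].T   # (r, d) with orthonormal rows
W = rng.normal(size=(r, d)) * 0.01               # learned projection (random here)
b = np.zeros(r)                                  # learned bias (zero here)

def loreft(h: np.ndarray) -> np.ndarray:
    """Apply the low-rank representation edit to one hidden vector."""
    return h + R.T @ (W @ h + b - R @ h)

h = rng.normal(size=d)
delta = loreft(h) - h
# The edit lies entirely in R's row space: projecting it onto that space is a no-op.
print(np.allclose(delta, R.T @ (R @ delta)))     # True
```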
Reflective Multi-Agent Collaboration based on Large Language Models
·2567 words·13 mins
Natural Language Processing
Large Language Models
Gaoling School of Artificial Intelligence, Renmin University of China
COPPER enhances LLM-based multi-agent collaboration via a self-reflection mechanism and counterfactual PPO. It improves reflection quality, alleviates credit assignment issues, and shows strong performance.
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models
·2065 words·10 mins
Natural Language Processing
Large Language Models
Wuhan University
Reference Trustable Decoding (RTD) offers a training-free paradigm for adapting large language models, enabling efficient and cost-effective task adaptation without any parameter updates.
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
·2727 words·13 mins
Natural Language Processing
Large Language Models
MIT CSAIL
Cross-Layer Attention (CLA) shrinks Transformer Key-Value cache 2x, improving LLMs’ memory efficiency without accuracy loss.
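The memory arithmetic is easy to sanity-check with a back-of-envelope sketch (the model dimensions below are hypothetical, chosen only for illustration):

```python
# Back-of-envelope KV-cache accounting (illustrative numbers, not from the
# paper): CLA lets groups of adjacent layers share one KV cache, so a sharing
# factor of 2 halves KV memory.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch,
                   bytes_per_elem=2, sharing_factor=1):
    """Factor of 2 covers keys+values; sharing_factor = layers per shared KV."""
    layers_with_kv = n_layers // sharing_factor
    return 2 * layers_with_kv * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

base = kv_cache_bytes(32, 8, 128, 4096, 1)                     # per-layer KV
cla2 = kv_cache_bytes(32, 8, 128, 4096, 1, sharing_factor=2)   # CLA, factor 2
print(f"baseline: {base / 2**30:.2f} GiB, CLA-2: {cla2 / 2**30:.2f} GiB")
```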
Recursive Introspection: Teaching Language Model Agents How to Self-Improve
·2681 words·13 mins
Natural Language Processing
Large Language Models
Carnegie Mellon University
RISE: Recursive Introspection teaches LLMs to iteratively improve their responses, enabling self-correction and enhanced performance on challenging reasoning tasks.
Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training
·2387 words·12 mins
Natural Language Processing
Large Language Models
New York University
Overparameterized neural networks surprisingly recover from catastrophic interference when trained cyclically on repeated data sequences, exhibiting anticipatory knowledge reactivation.
Reasons and Solutions for the Decline in Model Performance after Editing
·2167 words·11 mins
Natural Language Processing
Large Language Models
Peking University
Boosting large language model performance after knowledge editing: a new method (D4S) minimizes model damage by regulating the explosive growth of edited layers' parameters, enabling multiple effective edits.
Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer
·1495 words·8 mins
AI Generated
Natural Language Processing
Large Language Models
Courant Institute
New surrogate loss functions for learning-to-defer achieve Bayes-consistency, realizable H-consistency, and H-consistency bounds simultaneously, resolving open questions and improving L2D performance.