Large Language Models
Policy Improvement using Language Feedback Models
·3358 words·16 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Microsoft Research
To boost AI instruction following, Language Feedback Models (LFMs) leverage Large Language Models (LLMs) to identify desirable behaviors from visual trajectories, significantly improving task completi…
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
·3100 words·15 mins
Large Language Models
🏢 Peking University
PiSSA, a novel parameter-efficient fine-tuning method, surpasses LoRA by initializing adapter matrices using the principal components of the original model, achieving faster convergence and enhanced p…
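The blurb above describes PiSSA's core idea: initialize the adapter from the principal singular components of the original weight, rather than LoRA's random/zero initialization. A minimal numpy sketch of that initialization, with made-up dimensions and rank (not the paper's exact training recipe):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))  # pretrained weight (toy size)
r = 4                              # adapter rank (assumed)

# Top-r singular triplets of W become the trainable adapter A @ B;
# the residual stays in the frozen base weight.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * np.sqrt(S[:r])            # (64, r)
B = np.sqrt(S[:r])[:, None] * Vt[:r]     # (r, 32)
W_res = W - A @ B                        # frozen residual

# At initialization, residual + adapter reconstructs W exactly,
# so fine-tuning starts from the original model's behavior.
assert np.allclose(W_res + A @ B, W)
```

Because `A @ B` is the best rank-r approximation of `W`, gradient updates act on the directions that matter most, which is the intuition behind the faster convergence claimed above.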
Pipeline Parallelism with Controllable Memory
·3116 words·15 mins
Natural Language Processing
Large Language Models
🏢 Sea AI Lab
A new pipeline parallelism framework achieves up to 55% higher throughput and 50% less memory usage in large language model training by systematically controlling activation memory.
PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation
·3549 words·17 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Zhejiang University
PhyloGen uses a genomic language model to generate and optimize phylogenetic trees, offering faster and more accurate evolutionary analysis than traditional methods.
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
·4408 words·21 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
Bi-directional Preference Optimization (BiPO) generates superior steering vectors for personalized LLM control, improving upon existing methods by directly influencing the generation probability of hu…
Perplexity-aware Correction for Robust Alignment with Noisy Preferences
·3067 words·15 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Shandong University
PerpCorrect: Robust LLM alignment despite noisy human preferences, achieved via perplexity-based noisy preference detection and correction.
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
·2033 words·10 mins
Natural Language Processing
Large Language Models
🏢 College of Computer, National University of Defense Technology
This study reveals that large language models struggle with semi-open-ended questions, often hallucinating or providing insufficient answers. Researchers explored this by creating a new dataset of su…
PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
·1920 words·10 mins
Natural Language Processing
Large Language Models
🏢 Academy for Engineering and Technology, Fudan University
PediatricsGPT: a novel Chinese pediatric LLM assistant trained on a large, high-quality dataset (PedCorpus) outperforms existing models, paving the way for improved pediatric healthcare.
Parameter Competition Balancing for Model Merging
·3629 words·18 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Harbin Institute of Technology
PCB-MERGING: A training-free model merging technique boosts performance by intelligently balancing parameter competition across multiple tasks.
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
·1639 words·8 mins
Natural Language Processing
Large Language Models
🏢 MIT
DeltaNet, a linear transformer boosting associative recall, now trains efficiently via a novel algorithm, scaling to large language models and outperforming existing linear baselines.
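For context on the blurb above: DeltaNet replaces the additive fast-weight update of a linear transformer with an error-correcting delta-rule update, which is what the paper parallelizes over sequence length. A sequential numpy sketch of that recurrence (toy dimensions; this shows the per-step rule being parallelized, not the paper's chunkwise algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_k, d_v = 16, 8, 8
q = rng.standard_normal((T, d_k))
k = rng.standard_normal((T, d_k))
k /= np.linalg.norm(k, axis=1, keepdims=True)   # unit-norm keys
v = rng.standard_normal((T, d_v))
beta = rng.uniform(0.0, 1.0, size=T)            # per-step write strength

S = np.zeros((d_v, d_k))                        # fast-weight state
outputs = []
for t in range(T):
    # Delta rule: move the value currently stored under key k_t
    # toward v_t by a fraction beta_t (instead of blindly adding).
    S = S + beta[t] * np.outer(v[t] - S @ k[t], k[t])
    outputs.append(S @ q[t])
outputs = np.stack(outputs)
```

With `beta = 1` and a unit-norm key, the update makes `S @ k_t` equal `v_t` exactly, which is why the delta rule improves associative recall over plain linear attention.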
Panacea: Pareto Alignment via Preference Adaptation for LLMs
·2565 words·13 mins
Natural Language Processing
Large Language Models
🏢 Peking University
Panacea: a novel LLM alignment method achieving Pareto optimality via online preference adaptation using a single model.
PaCE: Parsimonious Concept Engineering for Large Language Models
·2526 words·12 mins
Natural Language Processing
Large Language Models
🏢 Johns Hopkins University
PaCE, a novel activation engineering framework, efficiently aligns LLMs by removing undesirable concepts from activations using sparse coding, achieving state-of-the-art performance while preserving l…
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
·2892 words·14 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Gaoling School of Artificial Intelligence, Renmin University of China
Over-parameterized Distillation Framework (OPDF) boosts knowledge distillation by efficiently over-parameterizing student models via tensor decomposition, significantly improving performance without i…
Order-Independence Without Fine Tuning
·1791 words·9 mins
Natural Language Processing
Large Language Models
🏢 Harvard University
Set-Based Prompting guarantees order-independent LLM outputs by modifying input representations, eliminating unwanted inconsistencies without fine-tuning.
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
·1896 words·9 mins
Natural Language Processing
Large Language Models
🏢 Google Research
Orchid: a novel deep learning architecture using data-dependent convolution achieves quasilinear scalability and outperforms attention-based models on various sequence modeling tasks.
Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning
·4495 words·22 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 KAIST
LLMs boost tabular data prediction by generating optimized features via decision tree reasoning, outperforming existing methods.
Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives
·2599 words·13 mins
Natural Language Processing
Large Language Models
🏢 CISPA Helmholtz Center for Information Security
Open LLMs outperform closed alternatives for private data adaptation, offering superior privacy, performance, and lower costs.
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
·1619 words·8 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Illinois Urbana-Champaign
This paper proposes a novel, reward-free RLHF framework using a general preference oracle, surpassing existing reward-based approaches in efficiency and generalizability.
Online Adaptation of Language Models with a Memory of Amortized Contexts
·2374 words·12 mins
Natural Language Processing
Large Language Models
🏢 KAIST
MAC: Efficiently updates large language models (LLMs) using a memory of compressed contexts for improved real-time knowledge retention and adaptation.
OneBit: Towards Extremely Low-bit Large Language Models
·2001 words·10 mins
Natural Language Processing
Large Language Models
🏢 Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
OneBit achieves surprisingly good performance in 1-bit quantized LLMs by pairing a novel 1-bit parameter representation with an effective parameter-initialization scheme.
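To make the 1-bit representation concrete: the idea is that a weight matrix is kept as a ±1 sign matrix plus a small amount of full-precision per-row and per-column scaling. An illustrative numpy sketch (toy sizes; the rank-1 factorization of the magnitudes is an assumption about the initialization, not the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8))   # toy full-precision weight

sign = np.sign(W)                  # the 1-bit part
absW = np.abs(W)

# Approximate the magnitudes with a rank-1 factor a @ b^T,
# i.e. one FP scale per row and one per column.
U, S, Vt = np.linalg.svd(absW, full_matrices=False)
a = U[:, 0] * S[0]                 # per-row scales
b = Vt[0]                          # per-column scales
W_hat = sign * np.outer(a, b)      # 1-bit reconstruction

rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

Storage drops from one float per entry to one bit per entry plus `rows + cols` floats, at the cost of the reconstruction error `rel_err`, which training then compensates for.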