
Natural Language Processing

FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
·3523 words·17 mins
AI Generated Natural Language Processing Large Language Models 🏢 Beijing University of Posts and Telecommunications
FM-Delta: Lossless compression halves cloud storage for massive fine-tuned language models, saving costs without sacrificing accuracy.
FlowLLM: Flow Matching for Material Generation with Large Language Models as Base Distributions
·2004 words·10 mins
AI Generated Natural Language Processing Large Language Models 🏢 Meta AI
FlowLLM revolutionizes material design by cleverly merging large language models and Riemannian flow matching, yielding a 300% boost in stable material generation!
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
·1833 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Maryland
FLORA enables efficient & private federated fine-tuning of LLMs via novel stacking-based heterogeneous low-rank adaptation, surpassing existing methods.
FLAME : Factuality-Aware Alignment for Large Language Models
·2851 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Waterloo
FLAME: A novel alignment method enhances large language model factuality by addressing hallucination in supervised fine-tuning and reinforcement learning, resulting in more accurate and helpful AI assistants.
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
·1351 words·7 mins
Natural Language Processing Large Language Models 🏢 University of Michigan
Researchers crack the code of in-context learning in Transformers, revealing how architecture, low-rank parameters, and data correlations influence model optimization and generalization.
Fight Back Against Jailbreaking via Prompt Adversarial Tuning
·2100 words·10 mins
Natural Language Processing Large Language Models 🏢 Peking University
Prompt Adversarial Tuning (PAT) defends against LLM jailbreaking by training a protective prompt prefix. PAT uses adversarial and benign prompts to optimize this prefix, significantly reducing jailbreak success rates.
Federated Fine-tuning of Large Language Models under Heterogeneous Tasks and Client Resources
·3653 words·18 mins
AI Generated Natural Language Processing Large Language Models 🏢 Alibaba Group
FlexLoRA: Efficient Federated Fine-tuning of LLMs for Heterogeneous Tasks and Resources.
FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model
·3348 words·16 mins
AI Generated Natural Language Processing Topic Modeling 🏢 Nanyang Technological University
FASTopic: a pretrained transformer-based topic model achieving superior speed, adaptivity, stability, and transferability compared to existing methods.
Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time
·2326 words·11 mins
Natural Language Processing Text Generation 🏢 UC Los Angeles
Accelerated discrete diffusion model sampling is achieved via novel discrete non-Markov diffusion models (DNDM) with predetermined transition times, enabling a training-free algorithm that significantly speeds up generation.
Fast Best-of-N Decoding via Speculative Rejection
·1456 words·7 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
Speculative Rejection: A novel algorithm boosts Large Language Model (LLM) alignment by speeding up inference-time alignment by 16-32x!
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
·2598 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 SenseTime Research
LLM-Infused Diffuser boosts text-to-image generation by smartly integrating LLMs, surpassing existing models in prompt understanding and image quality.
Exploiting LLM Quantization
·1836 words·9 mins
Natural Language Processing Large Language Models 🏢 ETH Zurich
LLM quantization, while improving efficiency, creates a security risk: attackers can craft seemingly benign models that exhibit malicious behavior only when quantized.
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion
·2629 words·13 mins
Natural Language Processing Large Language Models 🏢 Warsaw University of Technology
D2DMoE boosts Transformer efficiency by up to 60% via smart activation sparsity and dynamic expert selection, outperforming existing methods.
Explaining Datasets in Words: Statistical Models with Natural Language Parameters
·2281 words·11 mins
Natural Language Processing Large Language Models 🏢 UC Berkeley
This paper introduces a model-agnostic algorithm that uses natural language predicates to make statistical model parameters directly interpretable, significantly improving explainability.
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
·3278 words·16 mins
Natural Language Processing Vision-Language Models 🏢 University of Chinese Academy of Sciences
DEVIL: a novel text-to-video evaluation protocol focusing on video dynamics, resulting in more realistic video generation.
Estimating the Hallucination Rate of Generative AI
·3412 words·17 mins
Natural Language Processing Large Language Models 🏢 Department of Statistics, Columbia University
New method estimates hallucination rates in generative AI’s in-context learning, improving model reliability.
ESPACE: Dimensionality Reduction of Activations for Model Compression
·2254 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 NVIDIA Research
ESPACE: A novel LLM compression technique achieving 50% model size reduction with minimal accuracy loss by cleverly projecting activations onto principal components.
Entity Alignment with Noisy Annotations from Large Language Models
·1820 words·9 mins
Natural Language Processing Large Language Models 🏢 Hong Kong Polytechnic University
LLM4EA: A novel framework efficiently merges knowledge graphs using LLMs, overcoming noisy annotations and high costs via active learning and unsupervised label refinement, boosting accuracy and efficiency.
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus
·3384 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏢 Advanced AI Innovation Center, Hitachi
Boosting AI reasoning! New research enhances LLMs’ logical abilities via a principled synthetic logic corpus, achieving substantial improvements across logic, math, and coding benchmarks.
Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control
·3239 words·16 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
Boosting LLM trustworthiness, researchers introduce Sparse Activation Control, a training-free method that concurrently enhances safety, factuality, and bias mitigation by selectively controlling attention heads.