Natural Language Processing
Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication
·1851 words·9 mins·
Natural Language Processing
Dialogue Systems
🏢 University of Southampton
AI agents developed a communication system using spatial relationships, achieving over 90% accuracy in conveying relative positions of objects within a scene.
SparseLLM: Towards Global Pruning of Pre-trained Language Models
·2184 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Emory University
SparseLLM globally prunes large language models efficiently by decomposing the problem into manageable subproblems, achieving significant performance improvements, especially at high sparsity.
SpaceByte: Towards Deleting Tokenization from Large Language Modeling
·1675 words·8 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
SpaceByte: A novel byte-level decoder architecture achieving near-tokenized-model performance without tokenization!
Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases
·2889 words·14 mins·
Natural Language Processing
Text Summarization
🏢 Purdue University
ProRec, a novel framework, bridges the binary-source semantic gap by using a binary-source encoder-decoder model and LLMs, achieving significant improvements in zero-shot binary summarization and func…
Soft-Label Integration for Robust Toxicity Classification
·2918 words·14 mins·
AI Generated
Natural Language Processing
Text Classification
🏢 Northwestern University
Boosting toxicity classification robustness, this paper introduces a novel bi-level optimization framework integrating crowdsourced soft-labels and GroupDRO to enhance resistance against out-of-distri…
Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
·2028 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Technical University of Munich
Open-source LLMs are vulnerable to embedding space attacks, which efficiently bypass safety mechanisms and enable data extraction, even after unlearning.
SnapKV: LLM Knows What You are Looking for Before Generation
·2730 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Illinois Urbana-Champaign
SnapKV: Slashing LLM memory usage & boosting speed via smart KV cache compression!
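The idea behind the compression teased above — letting the most recent queries vote on which earlier key/value positions matter, and evicting the rest — can be sketched in a few lines. This is a toy NumPy illustration of the selection step only, not the paper's implementation; the function name, window size, and budget are illustrative.

```python
import numpy as np

def snapkv_select(attn, window=4, keep=8):
    """Toy sketch of SnapKV-style KV cache selection.

    attn: (num_queries, num_keys) attention weights.
    Scores each prefix key by the attention it receives from the last
    `window` queries, keeps the top `keep` keys, and always retains the
    observation window's own positions.
    """
    num_keys = attn.shape[1]
    prefix_len = num_keys - window
    # Vote: total attention the recent queries pay to each prefix key.
    votes = attn[-window:, :prefix_len].sum(axis=0)
    kept_prefix = np.sort(np.argsort(votes)[-keep:])
    return np.concatenate([kept_prefix, np.arange(prefix_len, num_keys)])

rng = np.random.default_rng(0)
scores = rng.random((4, 64))
attn = scores / scores.sum(axis=1, keepdims=True)
kept = snapkv_select(attn, window=4, keep=8)  # 12 positions instead of 64
```

In a real cache this index set would be used to gather the retained K/V tensors before generation begins, which is where the memory and latency savings come from.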
Smoothie: Label Free Language Model Routing
·3245 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
SMOOTHIE: Label-free LLM routing achieves up to 10% accuracy gains by using a latent variable model to estimate LLM quality without labeled data.
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
·2599 words·13 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
SMALLTOLARGE (S2L) revolutionizes large language model (LLM) fine-tuning by using a small model to summarize training loss trajectories, enabling efficient data selection for larger models.
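The mechanism in this teaser — recording each example's training-loss trajectory with a small proxy model, clustering the trajectories, and sampling evenly across clusters — admits a compact sketch. Plain k-means stands in for the clustering here; the function name, sizes, and synthetic trajectories are all illustrative.

```python
import numpy as np

def s2l_select(trajectories, n_clusters=3, per_cluster=2, seed=0):
    """Toy sketch of S2L-style data selection: cluster loss trajectories
    from a small proxy model, then sample a balanced subset per cluster."""
    rng = np.random.default_rng(seed)
    X = np.asarray(trajectories, dtype=float)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(20):  # plain k-means iterations
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    picked = []  # balanced sampling: equal budget from each cluster
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        picked.extend(rng.choice(idx, min(per_cluster, len(idx)), replace=False))
    return sorted(picked)

# Two synthetic trajectory shapes: flat-easy vs steadily-decreasing losses.
rng = np.random.default_rng(1)
flat = 1.0 + rng.normal(0.0, 0.05, (10, 5))
steep = np.linspace(4.0, 1.0, 5) + rng.normal(0.0, 0.05, (10, 5))
chosen = s2l_select(np.vstack([flat, steep]), n_clusters=2, per_cluster=3)
```

The point of the design is that trajectories are cheap to record on a small model, yet similar trajectories tend to indicate similar training value for the large model.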
SLTrain: a sparse plus low rank approach for parameter and memory efficient pretraining
·4422 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 RIKEN AIP
SLTrain: Sparsity+low-rank pretraining boosts LLM efficiency by up to 73% memory reduction without performance loss!
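The parameterization behind the memory savings is simple to illustrate: each weight matrix is learned as a low-rank product plus a sparse matrix with a fixed support, so only the factors and the nonzero entries are trainable. A minimal NumPy sketch, with illustrative sizes and density:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4          # layer width and low rank (illustrative sizes)
density = 0.03        # fraction of trainable sparse entries

# SLTrain-style parameterization: W = B @ A + S, where B, A are dense
# low-rank factors and S is sparse on a fixed random support.
B = rng.normal(size=(d, r))
A = rng.normal(size=(r, d))
support = rng.random((d, d)) < density
S = np.where(support, rng.normal(size=(d, d)), 0.0)
W = B @ A + S

# Trainable parameter count vs a full dense matrix of the same shape.
params = B.size + A.size + int(support.sum())
```

With these sizes the combined factors hold roughly 600 parameters against 4,096 for the dense matrix, which is the kind of gap that translates into pretraining memory reduction at LLM scale.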
SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM
·4826 words·23 mins·
AI Generated
Natural Language Processing
Vision-Language Models
🏢 School of Data Science, Fudan University
SlowFocus significantly improves fine-grained temporal understanding in video LLMs by using mixed-frequency sampling and a novel multi-frequency attention mechanism.
SlimGPT: Layer-wise Structured Pruning for Large Language Models
·2966 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Alibaba Group
SlimGPT: Achieve near-optimal LLM structured pruning via Batched Greedy Pruning and Incremental Pruning Ratio, improving efficiency without sacrificing accuracy.
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
·2353 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Google Research
Self Logits Evolution Decoding (SLED) boosts LLM factuality by up to 20% without extra data or fine-tuning!
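The intuition — that intermediate layers carry latent factual knowledge which can correct the final output distribution — can be gestured at with a heavily simplified sketch. This is not SLED's actual update rule, only a layer-consensus nudge with an illustrative step size `alpha`:

```python
import numpy as np

def sled_sketch(final_logits, early_logits_list, alpha=0.1):
    """Simplified layer-contrast sketch: pull the final layer's output
    distribution toward the consensus of earlier layers' predictions.
    Illustrative only; not the paper's exact evolution step."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()
    p_final = softmax(final_logits)
    p_latent = np.mean([softmax(z) for z in early_logits_list], axis=0)
    # Gradient-style step moving the logits toward the latent estimate.
    return final_logits + alpha * (p_latent - p_final)
```

Because the correction happens purely at decoding time on the logits, no extra data or fine-tuning is needed, which matches the claim above.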
SIRIUS: Contextual Sparsity with Correction for Efficient LLMs
·5392 words·26 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
SIRIUS: A novel correction mechanism boosts the efficiency of contextually sparse LLMs for complex reasoning tasks, achieving significant latency reduction.
SimPO: Simple Preference Optimization with a Reference-Free Reward
·3091 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
SimPO: a simpler, reference-free reward algorithm significantly outperforming existing offline preference optimization methods, achieving higher accuracy and efficiency in aligning LLMs with human pre…
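SimPO's implicit reward is the length-normalized log-likelihood of a response under the policy itself, with no reference model; preferences are fit with a logistic loss that demands a target margin. A minimal sketch, with illustrative hyperparameter values:

```python
import math

def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """Toy SimPO objective: the reward of a response is its sequence
    log-probability averaged over length and scaled by beta; the loss
    is a Bradley-Terry style logistic loss with target margin gamma."""
    r_w = beta * logp_chosen / len_chosen      # preferred response reward
    r_l = beta * logp_rejected / len_rejected  # dispreferred response reward
    return -math.log(1.0 / (1.0 + math.exp(-(r_w - r_l - gamma))))
```

A pair where the chosen response is far likelier per token yields a near-zero loss, while a mis-ranked pair is penalized heavily; the length normalization is what removes the usual bias toward longer responses.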
Simplified and Generalized Masked Diffusion for Discrete Data
·2082 words·10 mins·
Natural Language Processing
Text Generation
🏢 Google DeepMind
Simplified and generalized masked diffusion models achieve state-of-the-art results in discrete data generation, surpassing previous methods in text and image modeling.
Simple and Effective Masked Diffusion Language Models
·2145 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Cornell Tech
Simple masked discrete diffusion models achieve state-of-the-art language modeling results, closing the performance gap with autoregressive methods by using a novel training recipe and a Rao-Blackwell…
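The forward process these masked (absorbing-state) diffusion models share is easy to sketch: each token is independently replaced by a MASK symbol with a probability set by the noise level, and the reverse model is trained to predict the originals at masked positions. The token id and values below are purely illustrative.

```python
import numpy as np

MASK = -1  # illustrative absorbing-state token id

def mask_forward(x, t, rng):
    """Forward (noising) process of an absorbing-state masked diffusion
    model: each token is independently replaced by MASK with
    probability t in [0, 1]."""
    keep = rng.random(x.shape) >= t
    return np.where(keep, x, MASK)

tokens = np.array([5, 2, 7, 1, 9, 3])
x_half = mask_forward(tokens, 0.5, np.random.default_rng(0))
```

At t = 0 the sequence is untouched and at t = 1 it is fully masked, so sampling walks from all-MASK back to clean text by iteratively unmasking.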
SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices
·2275 words·11 mins·
Natural Language Processing
Speech Recognition
🏢 Peking University
SILENCE, a novel lightweight system, protects user privacy in offloaded speech understanding on resource-constrained devices by selectively masking short-term audio details without impacting long-term…
Should We Really Edit Language Models? On the Evaluation of Edited Language Models
·3638 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Hong Kong University of Science and Technology
Language model editing’s limitations exposed: Scaling current methods leads to knowledge loss and compromised safety, urging research into more robust techniques.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
·3020 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Google DeepMind
ShiftAddLLM accelerates pretrained LLMs via post-training, multiplication-less reparameterization, achieving significant memory and energy reductions with comparable or better accuracy than existing m…
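The multiplication-less idea — rounding weights to signed powers of two so that each product becomes a bit-shift and an add — can be sketched at scalar level. This toy version ignores the paper's grouped quantization and lookup tables, and assumes weights whose rounded exponent is non-negative:

```python
import math

def quantize_pow2(w):
    """Round a weight to the nearest signed power of two, so that
    multiplying by it reduces to a bit-shift. Returns (sign, exponent)."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))

def shiftadd_dot(weights, x_ints):
    """Multiplication-less dot product: each term is sign * (x << exp)."""
    acc = 0
    for w, x in zip(weights, x_ints):
        sign, exp = quantize_pow2(w)
        if sign:
            acc += sign * (x << exp)  # shift-and-add replaces multiply
    return acc
```

For example, weights [2.1, -3.9, 1.0] against inputs [3, 1, 2] give a shift-add result of 4 versus an exact dot product of 4.4 — the accuracy cost of the rounding is what the post-training reparameterization works to recover.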