Natural Language Processing

SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
·3127 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 ETH Zurich
SWT-Bench, a new benchmark, reveals that LLMs excel at generating tests for real-world bug fixes, surpassing dedicated test generation systems and significantly improving code-fix precision.
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
·3239 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
SwitchHead: A novel MoE attention mechanism accelerates Transformers by significantly reducing computation and memory, matching baseline performance.
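A minimal sketch of the mixture-of-experts attention idea, assuming top-1 hard routing on the value projection only; the class name, routing rule, and missing causal mask are illustrative simplifications, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEAttentionSketch(nn.Module):
    """Attention where each token's value projection is routed to one expert."""
    def __init__(self, d_model=256, n_heads=4, n_experts=4):
        super().__init__()
        self.h, self.d = n_heads, d_model // n_heads
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v_experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_experts)])
        self.router = nn.Linear(d_model, n_experts)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, D = x.shape
        q = self.q(x).view(B, T, self.h, self.d).transpose(1, 2)   # (B, h, T, d)
        k = self.k(x).view(B, T, self.h, self.d).transpose(1, 2)
        expert = self.router(x).argmax(-1)                          # (B, T) top-1 routing
        # For clarity every expert is computed and then gathered; a real
        # implementation dispatches each token only to its selected expert,
        # which is where the compute and memory savings come from.
        v = torch.stack([e(x) for e in self.v_experts], dim=2)      # (B, T, E, D)
        v = v.gather(2, expert[..., None, None].expand(B, T, 1, D)).squeeze(2)
        v = v.view(B, T, self.h, self.d).transpose(1, 2)
        att = F.softmax(q @ k.transpose(-2, -1) / self.d ** 0.5, dim=-1)  # no causal mask
        y = (att @ v).transpose(1, 2).reshape(B, T, D)
        return self.out(y)
```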
SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
·2772 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
SVFT: a novel parameter-efficient fine-tuning method achieves near full fine-tuning accuracy using only 0.006% to 0.25% of parameters, significantly outperforming existing techniques.
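A hedged sketch of the singular-vector idea behind such methods, showing only the simplest "diagonal" variant (SVFT also learns sparse off-diagonal coefficients; `SVFTLinearSketch` is an illustrative name, not the paper's code):

```python
import torch
import torch.nn as nn

class SVFTLinearSketch(nn.Module):
    """Freeze the pretrained weight's singular vectors; train tiny coefficients."""
    def __init__(self, pretrained_weight: torch.Tensor):
        super().__init__()
        U, S, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
        self.register_buffer("U", U)    # frozen left singular vectors
        self.register_buffer("S", S)    # frozen singular values
        self.register_buffer("Vh", Vh)  # frozen right singular vectors
        # Only this vector (one scalar per singular value) is trainable.
        self.delta = nn.Parameter(torch.zeros_like(S))

    def forward(self, x):
        W = self.U @ torch.diag(self.S + self.delta) @ self.Vh
        return x @ W.T
```

Because only `delta` trains, the trainable count per layer is min(out, in) scalars, which is how such methods keep the trainable fraction of parameters tiny.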
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
·2903 words·14 mins
Natural Language Processing Text Generation 🏢 University of Washington
Generate multiple text drafts from a single language model pass with Superposed Decoding, significantly boosting efficiency!
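A rough sketch of the superposition trick, assuming a HuggingFace-style model that accepts `inputs_embeds` (with `embed = model.get_input_embeddings()`); handing the shared top-k tokens to the k drafts is a simplification of the paper's per-draft, n-gram-interpolated distributions:

```python
import torch

@torch.no_grad()
def superposed_generate(model, embed, prompt_ids, k=3, steps=8):
    """Grow k drafts using ONE forward pass per step instead of k."""
    w = torch.full((k,), 1.0 / k)                  # mixing weights (assumed uniform)
    drafts = [list(prompt_ids) for _ in range(k)]
    for _ in range(steps):
        prompt = embed(torch.tensor(prompt_ids))   # (P, d)
        # Superpose the k drafts' embeddings at every position past the prompt.
        tails = torch.stack([
            embed(torch.tensor(d[len(prompt_ids):], dtype=torch.long))
            for d in drafts
        ])                                         # (k, T, d)
        mixed = (w[:, None, None] * tails).sum(0)  # (T, d) superposed suffix
        inputs = torch.cat([prompt, mixed], dim=0)[None]
        logits = model(inputs_embeds=inputs).logits[0, -1]  # single pass
        # Simplification: give the k most likely tokens to the k drafts.
        for d, t in zip(drafts, logits.topk(k).indices):
            d.append(t.item())
    return drafts
```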
Stress-Testing Capability Elicitation With Password-Locked Models
·2650 words·13 mins
Natural Language Processing Large Language Models 🏢 Redwood Research
Fine-tuning, even on a single demonstration, effectively uncovers hidden LLM capabilities, surpassing simple prompting methods.
StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses
·2873 words·14 mins
Natural Language Processing Dialogue Systems 🏢 Gaoling School of Artificial Intelligence, Renmin University of China
StreamingDialogue revolutionizes prolonged dialogue learning by compressing long contexts into conversational attention sinks, minimizing information loss and achieving a 4x speedup with 18x less memory.
Streaming Long Video Understanding with Large Language Models
·2706 words·13 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
VideoStreaming, a novel vision-language model, enables efficient and accurate understanding of arbitrarily long videos using a constant number of tokens via streaming encoding and adaptive memory selection.
Stratified Prediction-Powered Inference for Effective Hybrid Evaluation of Language Models
·1611 words·8 mins
AI Generated Natural Language Processing Large Language Models 🏢 Google DeepMind
Stratified Prediction-Powered Inference (StratPPI) significantly improves language model evaluation by combining human and automated ratings, using stratified sampling for enhanced accuracy and tighter confidence intervals.
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
·4275 words·21 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab
StrategyLLM uses four LLM agents to generate consistent, generalizable few-shot prompts, significantly improving LLM problem-solving performance across various tasks.
Stealth edits to large language models
·3221 words·16 mins
Natural Language Processing Large Language Models 🏢 King's College London
Researchers unveil stealth edits for large language models, introducing a new metric to assess editability and revealing LLMs' vulnerability to malicious attacks.
Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
·1847 words·9 mins
Natural Language Processing Large Language Models 🏢 Huawei Noah's Ark Lab
Star-Agents automates data optimization for instruction-tuned LLMs via multi-agent collaboration, achieving a 12% average performance boost.
SSDM: Scalable Speech Dysfluency Modeling
·2807 words·14 mins
Natural Language Processing Large Language Models 🏢 UC Berkeley
SSDM: Scalable Speech Dysfluency Modeling tackles challenges in speech dysfluency analysis by using articulatory gestures for scalable alignment and a connectionist subsequence aligner for efficient dysfluency alignment.
SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
·2142 words·11 mins
Natural Language Processing Large Language Models 🏢 Rice University
SS1: A novel GPU-friendly operator accelerates deep learning inference by leveraging structured parameter sharing, achieving superior quality-efficiency tradeoffs compared to existing methods.
SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation
·2001 words·10 mins
Natural Language Processing Question Answering 🏢 Daegu Gyeongbuk Institute of Science and Technology
SpikedAttention: Training-free transformer-to-SNN conversion achieving state-of-the-art accuracy and 42% energy reduction!
SpeedLoader: An I/O efficient scheme for heterogeneous and distributed LLM operation
·1914 words·9 mins
Natural Language Processing Large Language Models 🏢 National University of Singapore
SpeedLoader: A groundbreaking I/O-efficient scheme dramatically boosts LLM training & inference speed on diverse hardware, even with limited resources!
SpeechAlign: Aligning Speech Generation to Human Preferences
·1822 words·9 mins
Natural Language Processing Text Generation 🏢 Fudan University
SpeechAlign: Iteratively aligning speech generation models to human preferences via preference optimization, bridging distribution gaps for improved speech quality.
Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
·1644 words·8 mins
AI Generated Natural Language Processing Large Language Models 🏢 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences
Boosting LLM inference speed, a CTC-based draft model significantly improves speculative decoding’s acceptance rate, leading to faster inference.
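For orientation, here is the generic draft-then-verify loop that speculative decoding methods share, sketched with greedy drafting for simplicity. The paper's CTC-based draft model would stand in for `draft` below; the acceptance test shown is the standard rejection-sampling rule, not the paper's exact procedure:

```python
import torch

@torch.no_grad()
def speculative_step(target, draft, ids, gamma=4):
    """ids: (1, T) token prompt; returns ids extended by accepted tokens."""
    # 1) The cheap draft model proposes gamma tokens autoregressively.
    prop = ids
    for _ in range(gamma):
        nxt = draft(prop).logits[:, -1].argmax(-1, keepdim=True)
        prop = torch.cat([prop, nxt], dim=-1)
    # 2) The target model scores ALL proposals in one forward pass.
    p = target(prop).logits.softmax(-1)  # (1, T+gamma, V)
    q = draft(prop).logits.softmax(-1)
    # 3) Accept token i with probability min(1, p/q); stop at first reject.
    out, T = ids, ids.shape[1]
    for i in range(gamma):
        t = prop[0, T + i]
        if torch.rand(()) < (p[0, T + i - 1, t] / q[0, T + i - 1, t]).clamp(max=1.0):
            out = torch.cat([out, t.view(1, 1)], dim=-1)
        else:
            break
    return out
```

The higher the draft model's acceptance rate, the more of the gamma proposals survive step 3 per target pass, which is exactly the quantity the CTC-based draft model improves.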
Spectral Editing of Activations for Large Language Model Alignment
·2511 words·12 mins
Natural Language Processing Large Language Models 🏢 Institute for Language, Cognition and Computation, University of Edinburgh
Spectral Editing of Activations (SEA) improves large language model truthfulness and fairness by projecting input representations to maximize covariance with positive demonstrations while minimizing covariance with negative ones.
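A toy sketch of what a covariance-based spectral edit can look like, assuming paired neutral/positive/negative activation matrices and an illustrative `keep` threshold; this follows the general recipe, not SEA's exact equations:

```python
import torch

def spectral_edit_projection(H, H_pos, H_neg, keep=0.9):
    """H, H_pos, H_neg: (N, d) paired activations; returns a (d, d) edit matrix."""
    Hc = H - H.mean(0)
    # Cross-covariance of neutral activations with positive/negative demos.
    C_pos = Hc.T @ (H_pos - H_pos.mean(0)) / len(H)
    C_neg = Hc.T @ (H_neg - H_neg.mean(0)) / len(H)

    def top_directions(C):
        # Leading singular directions explaining `keep` of the covariance mass.
        U, S, _ = torch.linalg.svd(C, full_matrices=False)
        r = int(torch.searchsorted(S.cumsum(0) / S.sum(), torch.tensor(keep))) + 1
        return U[:, :r]

    U_pos = top_directions(C_pos)                 # directions to keep
    U_neg = top_directions(C_neg)                 # directions to remove
    P_keep = U_pos @ U_pos.T
    P_drop = torch.eye(H.shape[1]) - U_neg @ U_neg.T
    return P_keep @ P_drop  # apply at inference: h_edited = P @ h
```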
Spectral Adapter: Fine-Tuning in Spectral Space
·3909 words·19 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
Spectral Adapter boosts parameter-efficient fine-tuning by incorporating pretrained weight matrices’ spectral information, enhancing efficiency and multi-adapter fusion.
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
·2263 words·11 mins
Natural Language Processing Large Language Models 🏢 Yandex & HSE University
SpecExec achieves massively parallel speculative decoding, enabling interactive 50B+ parameter LLM inference on consumer devices at 4-6 tokens/second.