Natural Language Processing

SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
·3127 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 ETH Zurich
SWT-Bench, a new benchmark, reveals that LLMs excel at generating tests for real-world bug fixes, surpassing dedicated test generation systems and significantly improving code-fix precision.
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
·3239 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
SwitchHead: A novel MoE attention mechanism accelerates Transformers by significantly reducing computation and memory, matching baseline performance.
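A minimal sketch of the mixture-of-experts attention idea, assuming top-1 hard routing on the value projection only; the class name, routing rule, and missing causal mask are illustrative simplifications, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEAttentionSketch(nn.Module):
    """Attention where each token's value projection is routed to one expert."""
    def __init__(self, d_model=256, n_heads=4, n_experts=4):
        super().__init__()
        self.h, self.d = n_heads, d_model // n_heads
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v_experts = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_experts)])
        self.router = nn.Linear(d_model, n_experts)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):
        B, T, D = x.shape
        q = self.q(x).view(B, T, self.h, self.d).transpose(1, 2)   # (B, h, T, d)
        k = self.k(x).view(B, T, self.h, self.d).transpose(1, 2)
        expert = self.router(x).argmax(-1)                          # (B, T) top-1 routing
        # For clarity every expert is computed and then gathered; a real
        # implementation dispatches each token only to its selected expert,
        # which is where the compute and memory savings come from.
        v = torch.stack([e(x) for e in self.v_experts], dim=2)      # (B, T, E, D)
        v = v.gather(2, expert[..., None, None].expand(B, T, 1, D)).squeeze(2)
        v = v.view(B, T, self.h, self.d).transpose(1, 2)
        att = F.softmax(q @ k.transpose(-2, -1) / self.d ** 0.5, dim=-1)  # no causal mask
        y = (att @ v).transpose(1, 2).reshape(B, T, D)
        return self.out(y)
```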
SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
·2772 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
SVFT: a novel parameter-efficient fine-tuning method achieves near full fine-tuning accuracy using only 0.006% to 0.25% of parameters, significantly outperforming existing techniques.
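A hedged sketch of the singular-vector idea behind such methods, showing only the simplest "diagonal" variant (SVFT also learns sparse off-diagonal coefficients; `SVFTLinearSketch` is an illustrative name, not the paper's code):

```python
import torch
import torch.nn as nn

class SVFTLinearSketch(nn.Module):
    """Freeze the pretrained weight's singular vectors; train tiny coefficients."""
    def __init__(self, pretrained_weight: torch.Tensor):
        super().__init__()
        U, S, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
        self.register_buffer("U", U)    # frozen left singular vectors
        self.register_buffer("S", S)    # frozen singular values
        self.register_buffer("Vh", Vh)  # frozen right singular vectors
        # Only this vector (one scalar per singular value) is trainable.
        self.delta = nn.Parameter(torch.zeros_like(S))

    def forward(self, x):
        W = self.U @ torch.diag(self.S + self.delta) @ self.Vh
        return x @ W.T
```

Because only `delta` trains, the trainable count per layer is min(out, in) scalars, which is how such methods keep the trainable fraction of parameters tiny.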
Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
·2903 words·14 mins
Natural Language Processing Text Generation 🏢 University of Washington
Generate multiple text drafts from a single language model pass with Superposed Decoding, significantly boosting efficiency!
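A rough sketch of the superposition trick, assuming a HuggingFace-style model that accepts `inputs_embeds` (with `embed = model.get_input_embeddings()`); handing the shared top-k tokens to the k drafts is a simplification of the paper's per-draft, n-gram-interpolated distributions:

```python
import torch

@torch.no_grad()
def superposed_generate(model, embed, prompt_ids, k=3, steps=8):
    """Grow k drafts using ONE forward pass per step instead of k."""
    w = torch.full((k,), 1.0 / k)                  # mixing weights (assumed uniform)
    drafts = [list(prompt_ids) for _ in range(k)]
    for _ in range(steps):
        prompt = embed(torch.tensor(prompt_ids))   # (P, d)
        # Superpose the k drafts' embeddings at every position past the prompt.
        tails = torch.stack([
            embed(torch.tensor(d[len(prompt_ids):], dtype=torch.long))
            for d in drafts
        ])                                         # (k, T, d)
        mixed = (w[:, None, None] * tails).sum(0)  # (T, d) superposed suffix
        inputs = torch.cat([prompt, mixed], dim=0)[None]
        logits = model(inputs_embeds=inputs).logits[0, -1]  # single pass
        # Simplification: give the k most likely tokens to the k drafts.
        for d, t in zip(drafts, logits.topk(k).indices):
            d.append(t.item())
    return drafts
```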
Stress-Testing Capability Elicitation With Password-Locked Models
·2650 words·13 mins
Natural Language Processing Large Language Models 🏢 Redwood Research
Fine-tuning, even on a single demonstration, effectively uncovers hidden LLM capabilities, surpassing simple prompting methods.
StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses
·2873 words·14 mins
Natural Language Processing Dialogue Systems 🏢 Gaoling School of Artificial Intelligence, Renmin University of China
StreamingDialogue revolutionizes prolonged dialogue learning by compressing long contexts into conversational attention sinks, minimizing information loss and achieving a 4x speedup with 18x less memory.
Streaming Long Video Understanding with Large Language Models
·2706 words·13 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
VideoStreaming, a novel vision-language model, enables efficient and accurate understanding of arbitrarily long videos using a constant number of tokens via streaming encoding and adaptive memory selection.
Stratified Prediction-Powered Inference for Effective Hybrid Evaluation of Language Models
·1611 words·8 mins
AI Generated Natural Language Processing Large Language Models 🏢 Google DeepMind
Stratified Prediction-Powered Inference (StratPPI) significantly improves language model evaluation by combining human and automated ratings, using stratified sampling for enhanced accuracy and tighter confidence intervals.
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
·4275 words·21 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tencent AI Lab
StrategyLLM uses four LLM agents to generate consistent, generalizable few-shot prompts, significantly improving LLM problem-solving performance across various tasks.
Stealth edits to large language models
·3221 words·16 mins
Natural Language Processing Large Language Models 🏢 King's College London
Researchers unveil stealth edits for large language models, introducing a new metric to assess editability and revealing LLMs' vulnerability to malicious attacks.
Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
·1847 words·9 mins
Natural Language Processing Large Language Models 🏢 Huawei Noah's Ark Lab
Star-Agents automates data optimization for instruction-tuned LLMs via multi-agent collaboration, achieving a 12% average performance boost.
SSDM: Scalable Speech Dysfluency Modeling
·2807 words·14 mins
Natural Language Processing Large Language Models 🏢 UC Berkeley
SSDM: Scalable Speech Dysfluency Modeling tackles challenges in speech dysfluency analysis by using articulatory gestures for scalable alignment and a connectionist subsequence aligner for efficient dysfluency alignment.
SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
·2142 words·11 mins
Natural Language Processing Large Language Models 🏢 Rice University
SS1: A novel GPU-friendly operator accelerates deep learning inference by leveraging structured parameter sharing, achieving superior quality-efficiency tradeoffs compared to existing methods.
SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation
·2001 words·10 mins
Natural Language Processing Question Answering 🏢 Daegu Gyeongbuk Institute of Science and Technology
SpikedAttention: Training-free transformer-to-SNN conversion achieving state-of-the-art accuracy and 42% energy reduction!
SpeedLoader: An I/O efficient scheme for heterogeneous and distributed LLM operation
·1914 words·9 mins
Natural Language Processing Large Language Models 🏢 National University of Singapore
SpeedLoader: A groundbreaking I/O-efficient scheme dramatically boosts LLM training & inference speed on diverse hardware, even with limited resources!
SpeechAlign: Aligning Speech Generation to Human Preferences
·1822 words·9 mins
Natural Language Processing Text Generation 🏢 Fudan University
SpeechAlign: Iteratively aligning speech generation models to human preferences via preference optimization, bridging distribution gaps for improved speech quality.
Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
·1644 words·8 mins
AI Generated Natural Language Processing Large Language Models 🏢 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences
Boosting LLM inference speed, a CTC-based draft model significantly improves speculative decoding’s acceptance rate, leading to faster inference.
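For orientation, here is the generic draft-then-verify loop that speculative decoding methods share, sketched with greedy drafting for simplicity. The paper's CTC-based draft model would stand in for `draft` below; the acceptance test shown is the standard rejection-sampling rule, not the paper's exact procedure:

```python
import torch

@torch.no_grad()
def speculative_step(target, draft, ids, gamma=4):
    """ids: (1, T) token prompt; returns ids extended by accepted tokens."""
    # 1) The cheap draft model proposes gamma tokens autoregressively.
    prop = ids
    for _ in range(gamma):
        nxt = draft(prop).logits[:, -1].argmax(-1, keepdim=True)
        prop = torch.cat([prop, nxt], dim=-1)
    # 2) The target model scores ALL proposals in one forward pass.
    p = target(prop).logits.softmax(-1)  # (1, T+gamma, V)
    q = draft(prop).logits.softmax(-1)
    # 3) Accept token i with probability min(1, p/q); stop at first reject.
    out, T = ids, ids.shape[1]
    for i in range(gamma):
        t = prop[0, T + i]
        if torch.rand(()) < (p[0, T + i - 1, t] / q[0, T + i - 1, t]).clamp(max=1.0):
            out = torch.cat([out, t.view(1, 1)], dim=-1)
        else:
            break
    return out
```

The higher the draft model's acceptance rate, the more of the gamma proposals survive step 3 per target pass, which is exactly the quantity the CTC-based draft model improves.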
Spectral Editing of Activations for Large Language Model Alignment
·2511 words·12 mins
Natural Language Processing Large Language Models 🏢 Institute for Language, Cognition and Computation, University of Edinburgh
Spectral Editing of Activations (SEA) improves large language model truthfulness and fairness by projecting input representations to maximize covariance with positive demonstrations while minimizing covariance with negative ones.
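A toy sketch of what a covariance-based spectral edit can look like, assuming paired neutral/positive/negative activation matrices and an illustrative `keep` threshold; this follows the general recipe, not SEA's exact equations:

```python
import torch

def spectral_edit_projection(H, H_pos, H_neg, keep=0.9):
    """H, H_pos, H_neg: (N, d) paired activations; returns a (d, d) edit matrix."""
    Hc = H - H.mean(0)
    # Cross-covariance of neutral activations with positive/negative demos.
    C_pos = Hc.T @ (H_pos - H_pos.mean(0)) / len(H)
    C_neg = Hc.T @ (H_neg - H_neg.mean(0)) / len(H)

    def top_directions(C):
        # Leading singular directions explaining `keep` of the covariance mass.
        U, S, _ = torch.linalg.svd(C, full_matrices=False)
        r = int(torch.searchsorted(S.cumsum(0) / S.sum(), torch.tensor(keep))) + 1
        return U[:, :r]

    U_pos = top_directions(C_pos)                 # directions to keep
    U_neg = top_directions(C_neg)                 # directions to remove
    P_keep = U_pos @ U_pos.T
    P_drop = torch.eye(H.shape[1]) - U_neg @ U_neg.T
    return P_keep @ P_drop  # apply at inference: h_edited = P @ h
```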
Spectral Adapter: Fine-Tuning in Spectral Space
·3909 words·19 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
Spectral Adapter boosts parameter-efficient fine-tuning by incorporating pretrained weight matrices’ spectral information, enhancing efficiency and multi-adapter fusion.
SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
·2263 words·11 mins
Natural Language Processing Large Language Models 🏢 Yandex & HSE University
SpecExec achieves massively parallel speculative decoding, enabling interactive 50B+ parameter LLM inference on consumer devices at 4-6 tokens/second.