Natural Language Processing
Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication
·1851 words·9 mins·
Natural Language Processing
Dialogue Systems
🏢 University of Southampton
AI agents developed a communication system using spatial relationships, achieving over 90% accuracy in conveying relative positions of objects within a scene.
SparseLLM: Towards Global Pruning of Pre-trained Language Models
·2184 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Emory University
SparseLLM globally prunes large language models efficiently by decomposing the problem into manageable subproblems, achieving significant performance improvements, especially at high sparsity.
SpaceByte: Towards Deleting Tokenization from Large Language Modeling
·1675 words·8 mins·
Natural Language Processing
Large Language Models
🏢 Rice University
SpaceByte: A novel byte-level decoder architecture achieving near-tokenized-model performance without tokenization!
Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases
·2889 words·14 mins·
Natural Language Processing
Text Summarization
🏢 Purdue University
ProRec, a novel framework, bridges the binary-source semantic gap by using a binary-source encoder-decoder model and LLMs, achieving significant improvements in zero-shot binary summarization and func…
Soft-Label Integration for Robust Toxicity Classification
·2918 words·14 mins·
AI Generated
Natural Language Processing
Text Classification
🏢 Northwestern University
Boosting toxicity classification robustness, this paper introduces a novel bi-level optimization framework integrating crowdsourced soft-labels and GroupDRO to enhance resistance against out-of-distri…
Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
·2028 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Technical University of Munich
Open-source LLMs are vulnerable to embedding space attacks, which efficiently bypass safety mechanisms and enable data extraction, even after unlearning.
SnapKV: LLM Knows What You are Looking for Before Generation
·2730 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Illinois Urbana-Champaign
SnapKV: Slashing LLM memory usage & boosting speed via smart KV cache compression!
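The idea behind the compression teased above — letting the most recent queries vote on which earlier key/value positions matter, and evicting the rest — can be sketched in a few lines. This is a toy NumPy illustration of the selection step only, not the paper's implementation; the function name, window size, and budget are illustrative.

```python
import numpy as np

def snapkv_select(attn, window=4, keep=8):
    """Toy sketch of SnapKV-style KV cache selection.

    attn: (num_queries, num_keys) attention weights.
    Scores each prefix key by the attention it receives from the last
    `window` queries, keeps the top `keep` keys, and always retains the
    observation window's own positions.
    """
    num_keys = attn.shape[1]
    prefix_len = num_keys - window
    # Vote: total attention the recent queries pay to each prefix key.
    votes = attn[-window:, :prefix_len].sum(axis=0)
    kept_prefix = np.sort(np.argsort(votes)[-keep:])
    return np.concatenate([kept_prefix, np.arange(prefix_len, num_keys)])

rng = np.random.default_rng(0)
scores = rng.random((4, 64))
attn = scores / scores.sum(axis=1, keepdims=True)
kept = snapkv_select(attn, window=4, keep=8)  # 12 positions instead of 64
```

In a real cache this index set would be used to gather the retained K/V tensors before generation begins, which is where the memory and latency savings come from.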
Smoothie: Label Free Language Model Routing
·3245 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
SMOOTHIE: Label-free LLM routing achieves up to 10% accuracy gains by using a latent variable model to estimate LLM quality without labeled data.
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
·2599 words·13 mins·
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
SMALLTOLARGE (S2L) revolutionizes large language model (LLM) fine-tuning by using a small model to summarize training loss trajectories, enabling efficient data selection for larger models.
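The mechanism in this teaser — recording each example's training-loss trajectory with a small proxy model, clustering the trajectories, and sampling evenly across clusters — admits a compact sketch. Plain k-means stands in for the clustering here; the function name, sizes, and synthetic trajectories are all illustrative.

```python
import numpy as np

def s2l_select(trajectories, n_clusters=3, per_cluster=2, seed=0):
    """Toy sketch of S2L-style data selection: cluster loss trajectories
    from a small proxy model, then sample a balanced subset per cluster."""
    rng = np.random.default_rng(seed)
    X = np.asarray(trajectories, dtype=float)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(20):  # plain k-means iterations
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    picked = []  # balanced sampling: equal budget from each cluster
    for k in range(n_clusters):
        idx = np.flatnonzero(labels == k)
        picked.extend(rng.choice(idx, min(per_cluster, len(idx)), replace=False))
    return sorted(picked)

# Two synthetic trajectory shapes: flat-easy vs steadily-decreasing losses.
rng = np.random.default_rng(1)
flat = 1.0 + rng.normal(0.0, 0.05, (10, 5))
steep = np.linspace(4.0, 1.0, 5) + rng.normal(0.0, 0.05, (10, 5))
chosen = s2l_select(np.vstack([flat, steep]), n_clusters=2, per_cluster=3)
```

The point of the design is that trajectories are cheap to record on a small model, yet similar trajectories tend to indicate similar training value for the large model.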
SLTrain: a sparse plus low rank approach for parameter and memory efficient pretraining
·4422 words·21 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 RIKEN AIP
SLTrain: Sparsity+low-rank pretraining boosts LLM efficiency by up to 73% memory reduction without performance loss!
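The parameterization behind the memory savings is simple to illustrate: each weight matrix is learned as a low-rank product plus a sparse matrix with a fixed support, so only the factors and the nonzero entries are trainable. A minimal NumPy sketch, with illustrative sizes and density:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4          # layer width and low rank (illustrative sizes)
density = 0.03        # fraction of trainable sparse entries

# SLTrain-style parameterization: W = B @ A + S, where B, A are dense
# low-rank factors and S is sparse on a fixed random support.
B = rng.normal(size=(d, r))
A = rng.normal(size=(r, d))
support = rng.random((d, d)) < density
S = np.where(support, rng.normal(size=(d, d)), 0.0)
W = B @ A + S

# Trainable parameter count vs a full dense matrix of the same shape.
params = B.size + A.size + int(support.sum())
```

With these sizes the combined factors hold roughly 600 parameters against 4,096 for the dense matrix, which is the kind of gap that translates into pretraining memory reduction at LLM scale.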
SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM
·4826 words·23 mins·
AI Generated
Natural Language Processing
Vision-Language Models
🏢 School of Data Science, Fudan University
SlowFocus significantly improves fine-grained temporal understanding in video LLMs by using mixed-frequency sampling and a novel multi-frequency attention mechanism.
SlimGPT: Layer-wise Structured Pruning for Large Language Models
·2966 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Alibaba Group
SlimGPT: Achieve near-optimal LLM structured pruning via Batched Greedy Pruning and Incremental Pruning Ratio, improving efficiency without sacrificing accuracy.
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
·2353 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Google Research
Self Logits Evolution Decoding (SLED) boosts LLM factuality by up to 20% without extra data or fine-tuning!
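The intuition — that intermediate layers carry latent factual knowledge which can correct the final output distribution — can be gestured at with a heavily simplified sketch. This is not SLED's actual update rule, only a layer-consensus nudge with an illustrative step size `alpha`:

```python
import numpy as np

def sled_sketch(final_logits, early_logits_list, alpha=0.1):
    """Simplified layer-contrast sketch: pull the final layer's output
    distribution toward the consensus of earlier layers' predictions.
    Illustrative only; not the paper's exact evolution step."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()
    p_final = softmax(final_logits)
    p_latent = np.mean([softmax(z) for z in early_logits_list], axis=0)
    # Gradient-style step moving the logits toward the latent estimate.
    return final_logits + alpha * (p_latent - p_final)
```

Because the correction happens purely at decoding time on the logits, no extra data or fine-tuning is needed, which matches the claim above.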
SIRIUS: Contextual Sparsity with Correction for Efficient LLMs
·5392 words·26 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
SIRIUS: A novel correction mechanism boosts the efficiency of contextually sparse LLMs for complex reasoning tasks, achieving significant latency reduction.
SimPO: Simple Preference Optimization with a Reference-Free Reward
·3091 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
SimPO: a simpler, reference-free reward algorithm significantly outperforming existing offline preference optimization methods, achieving higher accuracy and efficiency in aligning LLMs with human pre…
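SimPO's implicit reward is the length-normalized log-likelihood of a response under the policy itself, with no reference model; preferences are fit with a logistic loss that demands a target margin. A minimal sketch, with illustrative hyperparameter values:

```python
import math

def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """Toy SimPO objective: the reward of a response is its sequence
    log-probability averaged over length and scaled by beta; the loss
    is a Bradley-Terry style logistic loss with target margin gamma."""
    r_w = beta * logp_chosen / len_chosen      # preferred response reward
    r_l = beta * logp_rejected / len_rejected  # dispreferred response reward
    return -math.log(1.0 / (1.0 + math.exp(-(r_w - r_l - gamma))))
```

A pair where the chosen response is far likelier per token yields a near-zero loss, while a mis-ranked pair is penalized heavily; the length normalization is what removes the usual bias toward longer responses.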
Simplified and Generalized Masked Diffusion for Discrete Data
·2082 words·10 mins·
Natural Language Processing
Text Generation
🏢 Google DeepMind
Simplified and generalized masked diffusion models achieve state-of-the-art results in discrete data generation, surpassing previous methods in text and image modeling.
Simple and Effective Masked Diffusion Language Models
·2145 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Cornell Tech
Simple masked discrete diffusion models achieve state-of-the-art language modeling results, closing the performance gap with autoregressive methods by using a novel training recipe and a Rao-Blackwell…
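The forward process these masked (absorbing-state) diffusion models share is easy to sketch: each token is independently replaced by a MASK symbol with a probability set by the noise level, and the reverse model is trained to predict the originals at masked positions. The token id and values below are purely illustrative.

```python
import numpy as np

MASK = -1  # illustrative absorbing-state token id

def mask_forward(x, t, rng):
    """Forward (noising) process of an absorbing-state masked diffusion
    model: each token is independently replaced by MASK with
    probability t in [0, 1]."""
    keep = rng.random(x.shape) >= t
    return np.where(keep, x, MASK)

tokens = np.array([5, 2, 7, 1, 9, 3])
x_half = mask_forward(tokens, 0.5, np.random.default_rng(0))
```

At t = 0 the sequence is untouched and at t = 1 it is fully masked, so sampling walks from all-MASK back to clean text by iteratively unmasking.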
SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices
·2275 words·11 mins·
Natural Language Processing
Speech Recognition
🏢 Peking University
SILENCE, a novel lightweight system, protects user privacy in offloaded speech understanding on resource-constrained devices by selectively masking short-term audio details without impacting long-term…
Should We Really Edit Language Models? On the Evaluation of Edited Language Models
·3638 words·18 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Hong Kong University of Science and Technology
Language model editing’s limitations exposed: Scaling current methods leads to knowledge loss and compromised safety, urging research into more robust techniques.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
·3020 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Google DeepMind
ShiftAddLLM accelerates pretrained LLMs via post-training, multiplication-less reparameterization, achieving significant memory and energy reductions with comparable or better accuracy than existing m…
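The multiplication-less idea — rounding weights to signed powers of two so that each product becomes a bit-shift and an add — can be sketched at scalar level. This toy version ignores the paper's grouped quantization and lookup tables, and assumes weights whose rounded exponent is non-negative:

```python
import math

def quantize_pow2(w):
    """Round a weight to the nearest signed power of two, so that
    multiplying by it reduces to a bit-shift. Returns (sign, exponent)."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))

def shiftadd_dot(weights, x_ints):
    """Multiplication-less dot product: each term is sign * (x << exp)."""
    acc = 0
    for w, x in zip(weights, x_ints):
        sign, exp = quantize_pow2(w)
        if sign:
            acc += sign * (x << exp)  # shift-and-add replaces multiply
    return acc
```

For example, weights [2.1, -3.9, 1.0] against inputs [3, 1, 2] give a shift-add result of 4 versus an exact dot product of 4.4 — the accuracy cost of the rounding is what the post-training reparameterization works to recover.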