🏢 Rice University

SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
·2142 words·11 mins
Natural Language Processing Large Language Models 🏢 Rice University
SS1: A novel GPU-friendly operator accelerates deep learning inference by leveraging structured parameter sharing, achieving superior quality-efficiency tradeoffs compared to existing methods.
SpaceByte: Towards Deleting Tokenization from Large Language Modeling
·1675 words·8 mins
Natural Language Processing Large Language Models 🏢 Rice University
SpaceByte: A novel byte-level decoder architecture achieving near-tokenized-model performance without tokenization!
Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
·2063 words·10 mins
Natural Language Processing Large Language Models 🏢 Rice University
LoPA: a novel parameter-efficient fine-tuning method that matches state-of-the-art performance while requiring no server-side adapters, improving on traditional prompt tuning.
Optimal Hypothesis Selection in (Almost) Linear Time
·1628 words·8 mins
AI Theory Optimization 🏢 Rice University
This paper presents the first almost linear-time algorithm achieving the optimal accuracy parameter for hypothesis selection, solving a decades-long open problem.
Optimal Algorithms for Augmented Testing of Discrete Distributions
·1848 words·9 mins
AI Theory Optimization 🏢 Rice University
Leveraging predictions, this research presents novel algorithms for uniformity, identity, and closeness testing of discrete distributions that achieve information-theoretically optimal sample complexity…
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
·2513 words·12 mins
Natural Language Processing Large Language Models 🏢 Rice University
NoMAD-Attention achieves up to 2x speedup in 4-bit quantized LLaMA inference on CPUs by replacing computationally expensive multiply-add operations with ultra-low-latency in-register lookups.
Learning Transferable Features for Implicit Neural Representations
·4038 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Rice University
STRAINER: A new framework enabling faster, higher-quality INR fitting by leveraging features that transfer across similar signals, significantly boosting INR performance.
Fair GLASSO: Estimating Fair Graphical Models with Unbiased Statistical Behavior
·1979 words·10 mins
AI Theory Fairness 🏢 Rice University
Fair GLASSO ensures fair Gaussian graphical models by introducing novel bias metrics and a penalized maximum likelihood estimator to mitigate group biases in data.