🏢 University of Texas at Austin

MatFormer: Nested Transformer for Elastic Inference
·3341 words·16 mins
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
MatFormer: Train one universal model, extract hundreds of accurate submodels for elastic inference!
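The nested trick is easy to picture: every smaller submodel uses a prefix of the full model's FFN neurons. Below is a minimal PyTorch sketch of that slicing under illustrative sizes; `NestedFFN` and its dimensions are hypothetical, and the real MatFormer jointly trains all granularities rather than slicing after the fact.

```python
# A minimal sketch of MatFormer-style nesting (hypothetical simplification;
# the real model jointly optimizes all granularities during training).
import torch
import torch.nn as nn

class NestedFFN(nn.Module):
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.w_in = nn.Linear(d_model, d_ff)
        self.w_out = nn.Linear(d_ff, d_model)

    def forward(self, x, ratio=1.0):
        # A submodel uses only the first `ratio` fraction of FFN neurons,
        # so every smaller model is literally nested inside the largest one.
        m = int(self.w_in.out_features * ratio)
        h = torch.relu(nn.functional.linear(x, self.w_in.weight[:m], self.w_in.bias[:m]))
        return nn.functional.linear(h, self.w_out.weight[:, :m], self.w_out.bias)

ffn = NestedFFN()
x = torch.randn(4, 512)
full = ffn(x, ratio=1.0)    # the universal model
small = ffn(x, ratio=0.25)  # an extracted submodel, no retraining needed
```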
LoFiT: Localized Fine-tuning on LLM Representations
·4045 words·19 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Texas at Austin
LoFiT: Localized fine-tuning boosts LLMs’ performance by selectively training only a small subset of attention heads, achieving accuracy comparable to other methods while using significantly fewer parameters.
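A hedged sketch of the localized-tuning idea: the LLM stays frozen and only tiny offset vectors, added to the outputs of a few selected attention heads, are learned. `add_head_offsets` and the chosen head indices below are illustrative, not the paper's API.

```python
# Sketch: learn small per-head offsets while everything else stays frozen.
import torch

def add_head_offsets(attn_out, offsets, head_dim, selected_heads):
    # attn_out: (batch, seq, n_heads * head_dim), before the output projection.
    out = attn_out.clone()
    for h in selected_heads:
        s = h * head_dim
        out[..., s:s + head_dim] = out[..., s:s + head_dim] + offsets[h]
    return out

n_heads, head_dim = 12, 64
offsets = torch.nn.Parameter(torch.zeros(n_heads, head_dim))  # the only trained parameters
x = torch.randn(2, 16, n_heads * head_dim)
y = add_head_offsets(x, offsets, head_dim, selected_heads=[3, 7])
```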
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS
·2198 words·11 mins
3D Vision 🏢 University of Texas at Austin
LightGaussian achieves 15x compression of 3D Gaussian scene representations, boosting rendering speed to 200+ FPS while maintaining visual quality, solving storage and efficiency issues in real-time novel view synthesis.
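Compression of this kind typically starts by pruning insignificant Gaussians. Here is a toy NumPy sketch of that step using a crude opacity-times-volume score; the paper's actual significance score also accounts for ray hit counts, and pruning is followed by distillation and vector quantization.

```python
# Toy pruning step in the spirit of LightGaussian (score is a stand-in).
import numpy as np

def prune_gaussians(opacity, scale, keep_ratio=0.34):
    significance = opacity * scale.prod(axis=1)  # crude volume-weighted score
    k = int(len(opacity) * keep_ratio)
    return np.argsort(significance)[-k:]         # indices of Gaussians to keep

opacity = np.random.rand(10_000)
scale = np.random.rand(10_000, 3)
kept = prune_gaussians(opacity, scale)
print(f"kept {len(kept)} of {len(opacity)} Gaussians")
```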
Learning Noisy Halfspaces with a Margin: Massart is No Harder than Random
·289 words·2 mins
Active Learning 🏢 University of Texas at Austin
Proper learning of noisy halfspaces with a margin is achievable with sample complexity matching that of random classification noise, defying prior expectations.
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
·1771 words·9 mins
Meta Learning 🏢 University of Texas at Austin
Softmax attention in transformers adapts its attention window to function Lipschitzness and noise, enabling efficient in-context learning.
Improved Sample Complexity Bounds for Diffusion Model Training
·360 words·2 mins
Machine Learning Deep Learning 🏢 University of Texas at Austin
Training high-quality diffusion models efficiently is now possible, thanks to novel sample complexity bounds improving exponentially on previous work.
Identifying General Mechanism Shifts in Linear Causal Representations
·3163 words·15 mins
AI Generated AI Theory Representation Learning 🏢 University of Texas at Austin
Researchers can now pinpoint the sources of data shifts in complex linear causal systems using a new algorithm, even with limited perfect interventions, opening exciting possibilities for causal discovery.
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
·1914 words·9 mins
Large Language Models 🏢 University of Texas at Austin
HydraLoRA: Asymmetric LoRA boosts LLM fine-tuning efficiency by sharing parameters across tasks while specializing others, outperforming existing methods.
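The asymmetry is the interesting part: one shared low-rank down-projection `A`, several specialized up-projections `B`, mixed by a learned router. A minimal PyTorch sketch under those assumptions (sizes and the softmax router are illustrative):

```python
# Asymmetric LoRA block in the spirit of HydraLoRA (illustrative sizes).
import torch
import torch.nn as nn

class AsymmetricLoRA(nn.Module):
    def __init__(self, d=768, r=8, n_experts=3):
        super().__init__()
        self.A = nn.Linear(d, r, bias=False)                # shared across experts
        self.B = nn.ModuleList(nn.Linear(r, d, bias=False)  # one head per expert
                               for _ in range(n_experts))
        self.router = nn.Linear(d, n_experts)

    def forward(self, x):
        gate = torch.softmax(self.router(x), dim=-1)        # (..., n_experts)
        h = self.A(x)                                       # shared low-rank code
        experts = torch.stack([b(h) for b in self.B], dim=-1)
        return (experts * gate.unsqueeze(-2)).sum(dim=-1)   # weighted mixture

delta = AsymmetricLoRA()(torch.randn(2, 10, 768))  # added to the frozen layer's output
```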
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
·4260 words·20 mins
AI Generated Computer Vision Video Understanding 🏢 University of Texas at Austin
HOI-Swap: a novel diffusion model swaps objects in videos while preserving natural hand-object interactions, producing seamless, high-quality edits.
Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions
·2222 words·11 mins
Machine Learning Deep Learning 🏢 University of Texas at Austin
Hierarchical Hybrid Sliced Wasserstein (H2SW) solves the challenge of comparing complex, heterogeneous joint distributions by introducing novel slicing operators, leading to a scalable and statistically sound metric.
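For intuition, here is a toy NumPy version of the sliced-Wasserstein backbone that H2SW builds on. The paper's contribution is the hierarchical hybrid slicing operators for heterogeneous supports (e.g., meshes with attached features), which this plain Gaussian-projection sketch deliberately omits.

```python
# Toy sliced-Wasserstein distance: random 1-D projections + quantile matching.
import numpy as np

def sliced_w2(x, y, n_proj=128, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_proj, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # random unit directions
    px, py = x @ theta.T, y @ theta.T                      # 1-D projections
    px.sort(axis=0)                                        # sorting gives the
    py.sort(axis=0)                                        # 1-D optimal coupling
    return np.sqrt(np.mean((px - py) ** 2))

x = np.random.randn(256, 5)
y = np.random.randn(256, 5) + 1.0
print(sliced_w2(x, y))  # grows with the mean shift between x and y
```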
Heterogeneity-Guided Client Sampling: Towards Fast and Efficient Non-IID Federated Learning
·2418 words·12 mins
Machine Learning Federated Learning 🏢 University of Texas at Austin
HiCS-FL: A novel federated learning client sampling method that leverages data heterogeneity for faster, more efficient global model training in non-IID settings.
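A hedged sketch of heterogeneity-guided sampling: HiCS-FL estimates each client's data heterogeneity (from its updates to the output layer) and samples more heterogeneous clients with higher probability. The scores and softmax sampler below are stand-ins for that estimate.

```python
# Sample clients with probability increasing in estimated heterogeneity.
import numpy as np

def sample_clients(heterogeneity, k, temperature=1.0, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.asarray(heterogeneity) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(p), size=k, replace=False, p=p)

het = [0.2, 1.5, 0.9, 0.1, 1.2]   # estimated per-client heterogeneity
print(sample_clients(het, k=2))   # biased toward the non-IID clients
```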
Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models
·4898 words·23 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Texas at Austin
This paper introduces a rate-distortion framework for prompt compression in LLMs, bridging the gap between existing methods and optimal performance by formulating prompt compression as a linear program.
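A toy version of the linear-programming view, assuming a tiny prompt alphabet: choose the conditional distribution p(c|x) over compressed prompts to minimize expected distortion under a budget on expected compressed length. All sizes, costs, and the SciPy formulation below are illustrative, not the paper's exact LP.

```python
# Toy rate-distortion LP: minimize expected distortion s.t. a rate budget.
import numpy as np
from scipy.optimize import linprog

n_prompts, n_codes = 4, 3
px = np.full(n_prompts, 1 / n_prompts)  # prompt distribution
dist = np.random.default_rng(1).uniform(size=(n_prompts, n_codes))  # d(x, c)
length = np.array([1.0, 2.0, 4.0])      # rate cost of each compressed prompt
budget = 2.0

# Variables: p(c|x), flattened row-major (x major, c minor).
c = (px[:, None] * dist).ravel()                         # expected distortion
A_ub = (px[:, None] * length[None, :]).ravel()[None, :]  # expected rate <= budget
A_eq = np.kron(np.eye(n_prompts), np.ones(n_codes))      # each row of p(c|x) sums to 1
res = linprog(c, A_ub=A_ub, b_ub=[budget], A_eq=A_eq,
              b_eq=np.ones(n_prompts), bounds=(0, 1))
print(res.x.reshape(n_prompts, n_codes))                 # optimal p(c|x)
```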
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
·2248 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Texas at Austin
Ms-PoE, a simple plug-and-play positional encoding, significantly improves LLMs’ ability to utilize long contexts by mitigating the ‘lost-in-the-middle’ problem and enhancing the capacity to capture information located in the middle of the context.
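The core mechanic, as we understand it, is head-specific rescaling of position indices so some heads effectively shrink long distances. A small PyTorch sketch follows; the ratio range and the plug-in point into RoPE are assumptions.

```python
# Per-head position rescaling: each head sees distances at a different scale.
import torch

def multi_scale_positions(seq_len, n_heads, min_ratio=1.2, max_ratio=1.8):
    ratios = torch.linspace(min_ratio, max_ratio, n_heads)  # one scale per head
    pos = torch.arange(seq_len, dtype=torch.float32)
    return pos[None, :] / ratios[:, None]  # (n_heads, seq_len) rescaled indices

# These scaled indices would replace the raw positions fed to RoPE.
print(multi_scale_positions(seq_len=8, n_heads=4))
```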
Expressive Gaussian Human Avatars from Monocular RGB Video
·1431 words·7 mins
Computer Vision 3D Vision 🏢 University of Texas at Austin
EVA: a novel method that generates expressive 3D Gaussian human avatars from monocular RGB videos, excelling in detailed hand and facial expressions via context-aware density control and improved SMPL-X alignment.
Efficient Discrepancy Testing for Learning with Distribution Shift
·1471 words·7 mins
Machine Learning Transfer Learning 🏢 University of Texas at Austin
Provably efficient algorithms for learning with distribution shift are introduced, generalizing and improving prior work by achieving near-optimal error rates and offering universal learners for large classes of test distributions.
Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning
·1800 words·9 mins
Machine Learning Reinforcement Learning 🏢 University of Texas at Austin
Dynamic Model Predictive Shielding (DMPS) dynamically optimizes reinforcement learning objectives while maintaining provable safety, achieving higher rewards.
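Schematically, a model-predictive shield executes the learned action only if a short model rollout shows a safe recovery plan still exists; DMPS's twist is that the backup planner also optimizes task reward rather than merely halting. Every callable in this sketch is a placeholder, not the paper's API.

```python
# Schematic model-predictive shield around a learned policy.
def shielded_action(state, policy, dynamics, backup_plan, is_safe, horizon=5):
    proposed = policy(state)
    s = dynamics(state, proposed)               # imagine taking the action
    for a in backup_plan(s, horizon):           # then following the recovery plan
        if not is_safe(s):
            return backup_plan(state, horizon)[0]  # shield: recover immediately
        s = dynamics(s, a)
    return proposed if is_safe(s) else backup_plan(state, horizon)[0]

# Toy 1-D usage: stay within |x| <= 10.
act = shielded_action(
    0.0,
    policy=lambda s: 1.0,
    dynamics=lambda s, a: s + a,
    backup_plan=lambda s, h: [-1.0] * h,
    is_safe=lambda s: abs(s) <= 10,
)
```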
Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning
·1850 words·9 mins
Machine Learning Reinforcement Learning 🏢 University of Texas at Austin
DUSDi: A novel method for learning disentangled skills in unsupervised reinforcement learning, enabling efficient reuse for diverse downstream tasks.
Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration
·1669 words·8 mins
Machine Learning Reinforcement Learning 🏢 University of Texas at Austin
DUPLEX: a novel RL method trains diverse, near-optimal policies in complex, dynamic environments by explicitly maximizing policy diversity using successor features. It outperforms existing methods in both policy diversity and performance.
Detecting Bugs with Substantial Monetary Consequences by LLM and Rule-based Reasoning
·2263 words·11 mins
AI Applications Finance 🏢 University of Texas at Austin
Hybrid LLM & rule-based system accurately detects costly smart contract bugs!
Communication Efficient Distributed Training with Distributed Lion
·1698 words·8 mins
Machine Learning Optimization 🏢 University of Texas at Austin
Distributed Lion: Training large AI models efficiently by communicating only binary or low-precision vectors between workers and a server, significantly reducing communication costs and maintaining comparable performance.
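The communication saving comes from Lion's sign-shaped update: each worker only sends the sign of its local update, and the server can aggregate by majority vote. A minimal NumPy sketch (constants and the exact protocol details are illustrative):

```python
# Workers send sign vectors; the server takes a coordinate-wise majority vote.
import numpy as np

def lion_local_direction(grad, momentum, beta1=0.9):
    return np.sign(beta1 * momentum + (1 - beta1) * grad)  # binary per coordinate

def server_step(params, worker_dirs, lr=1e-4):
    vote = np.sign(np.sum(worker_dirs, axis=0))            # majority vote
    return params - lr * vote

params = np.zeros(4)
grads = [np.random.randn(4) for _ in range(5)]             # one gradient per worker
dirs = np.stack([lion_local_direction(g, momentum=np.zeros(4)) for g in grads])
params = server_step(params, dirs)
```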