Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
·7149 words·34 mins·
AI Generated
AI Theory
Optimization
🏢 University of Maryland
Learning rate warmup improves deep learning performance by enabling larger learning rates, which push networks toward better-conditioned regions of the loss landscape.
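Warmup itself is a standard schedule: ramp the learning rate up from near zero before holding it at its target value. A minimal sketch (the `base_lr` and `warmup_steps` values here are illustrative, not from the paper):

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linearly ramp the learning rate from ~0 to base_lr over warmup_steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# Early steps take small updates, avoiding large moves while the loss
# landscape around the initialization is still poorly conditioned.
early_lr = warmup_lr(0)       # tiny fraction of base_lr
steady_lr = warmup_lr(5000)   # full base_lr after warmup
```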
Transformers Can Do Arithmetic with the Right Embeddings
·3154 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
Researchers enhanced transformer performance on arithmetic tasks by introducing Abacus Embeddings, which encode each digit’s position, enabling improved generalization and unlocking multi-step reasoning.
Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations
·1994 words·10 mins·
Computer Vision
Video Understanding
🏢 University of Maryland
ConVRT: A novel framework restores turbulence-distorted videos by decoupling spatial and temporal information in a neural representation, achieving temporally consistent mitigation.
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
·2560 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
SHED, a Shapley value-based framework, efficiently refines instruction-tuning datasets for LLMs, producing high-performing subsets as small as 10% of the original size that transfer well across different models.
QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos
·3903 words·19 mins·
AI Generated
Computer Vision
3D Vision
🏢 University of Maryland
QUEEN: A novel framework for quantized and efficient streaming of free-viewpoint videos achieving high compression, quality, and speed.
Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory
·4015 words·19 mins·
AI Generated
AI Theory
Interpretability
🏢 University of Maryland
Counterfactual Clamping Attack (CCA) improves model reconstruction using counterfactual explanations by leveraging decision boundary proximity, offering theoretical guarantees and enhanced fidelity.
Loki: Low-rank Keys for Efficient Sparse Attention
·3255 words·16 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
Loki accelerates attention mechanisms in LLMs by exploiting the low dimensionality of key vectors, dynamically selecting key tokens based on approximate attention scores.
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
Maintaining watermark strength while speeding up LLM generation via speculative sampling is provably impossible; this paper establishes the trade-off and offers methods that prioritize either watermark strength or sampling speed.
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
·1833 words·9 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
FLORA enables efficient and private federated fine-tuning of LLMs via a novel stacking-based heterogeneous low-rank adaptation, surpassing existing methods.
Fairness and Efficiency in Online Class Matching
·1500 words·8 mins·
AI Generated
AI Theory
Fairness
🏢 University of Maryland
The first non-wasteful algorithm achieving a 1/2-approximation for class envy-freeness, class proportionality, and utilitarian social welfare in online class matching.
FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
·1605 words·8 mins·
Machine Learning
Federated Learning
🏢 University of Maryland
FACT, a novel federated learning mechanism, eliminates free-riding and incentivizes truthful agent behavior by introducing a penalty system and a competitive environment, boosting model performance significantly.
Estimating Epistemic and Aleatoric Uncertainty with a Single Model
·2171 words·11 mins·
Machine Learning
Deep Learning
🏢 University of Maryland
HyperDM accurately estimates both epistemic and aleatoric uncertainty using a single model, overcoming the computational limitations of existing ensemble methods.
Dueling over Dessert: Mastering the Art of Repeated Cake Cutting
·2291 words·11 mins·
AI Theory
Fairness
🏢 University of Maryland
Repeated cake-cutting game reveals that strategic players can exploit myopic opponents, but equitable outcomes are achievable through specific strategies.
DMesh: A Differentiable Mesh Representation
·3349 words·16 mins·
Computer Vision
3D Vision
🏢 University of Maryland
DMesh: A novel differentiable mesh representation enabling efficient gradient-based optimization for diverse 3D shape applications.
Differentiable Quantum Computing for Large-scale Linear Control
·1462 words·7 mins·
AI Generated
AI Applications
Robotics
🏢 University of Maryland
Quantum algorithm achieves super-quadratic speedup for large-scale linear control, offering a novel approach to address the computational challenges of optimizing complex dynamical systems.
Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance
·3386 words·16 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Maryland
Equivariant Graph Neural Networks boost multi-agent reinforcement learning by improving sample efficiency and generalization, overcoming inherent exploration biases.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
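The core idea is to exclude a pseudorandom subset of token positions from the next-token training loss, so the model never sees a full verbatim sequence during training. A minimal sketch of such a masked loss, assuming per-token log-probabilities are already available; the 1-in-`k` dropping rule, `k=4`, and function names are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def goldfish_mask(n_tokens, k=4, seed=0):
    """Pseudorandom boolean mask that drops roughly 1/k of token positions."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, k, size=n_tokens) != 0  # False => excluded from loss

def goldfish_loss(token_log_probs, k=4, seed=0):
    """Mean negative log-likelihood over the kept positions only."""
    log_probs = np.asarray(token_log_probs)
    keep = goldfish_mask(len(log_probs), k, seed)
    return -np.mean(log_probs[keep])
```

Because the dropped positions never contribute gradient, the model cannot simply memorize an entire training passage token by token, while the loss on the remaining ~(k-1)/k of positions keeps ordinary language modeling intact.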
Ad Auctions for LLMs via Retrieval Augmented Generation
·2337 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
This paper introduces segment auctions, maximizing logarithmic social welfare, for integrating ads into LLM outputs via Retrieval Augmented Generation, balancing ad revenue and output quality.