
🏢 University of Maryland

Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
·7149 words·34 mins
AI Generated AI Theory Optimization 🏢 University of Maryland
Deep learning’s learning-rate warmup improves performance by enabling larger learning rates, pushing networks into better-conditioned regions of the loss landscape.
Transformers Can Do Arithmetic with the Right Embeddings
·3154 words·15 mins
Natural Language Processing Large Language Models 🏢 University of Maryland
Researchers enhanced transformer performance on arithmetic tasks by introducing Abacus Embeddings, which encode each digit’s position, enabling improved generalization and unlocking multi-step reasoning.
Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations
·1994 words·10 mins
Computer Vision Video Understanding 🏢 University of Maryland
ConVRT: a novel framework that restores turbulence-distorted videos by decoupling spatial and temporal information in a neural representation, achieving temporally consistent mitigation.
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
·2560 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
SHED, a Shapley value-based framework, efficiently refines instruction-tuning datasets for LLMs, producing high-performing subsets, only 10% of the original size, that transfer well across different models.
QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos
·3903 words·19 mins
AI Generated Computer Vision 3D Vision 🏢 University of Maryland
QUEEN: A novel framework for quantized and efficient streaming of free-viewpoint videos achieving high compression, quality, and speed.
Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory
·4015 words·19 mins
AI Generated AI Theory Interpretability 🏢 University of Maryland
Counterfactual Clamping Attack (CCA) improves model reconstruction using counterfactual explanations by leveraging decision boundary proximity, offering theoretical guarantees and enhanced fidelity.
Loki: Low-rank Keys for Efficient Sparse Attention
·3255 words·16 mins
Natural Language Processing Large Language Models 🏢 University of Maryland
Loki: Low-rank Keys for Efficient Sparse Attention accelerates attention mechanisms in LLMs by exploiting the low-dimensionality of key vectors. It dynamically selects key tokens based on approximate…
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
Simultaneously maintaining watermark strength and speculative sampling speedup in LLM generation is impossible; this paper proves the trade-off and offers methods prioritizing either watermark strength or speed.
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
·1833 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Maryland
FLoRA enables efficient and private federated fine-tuning of LLMs via a novel stacking-based heterogeneous low-rank adaptation, surpassing existing methods.
Fairness and Efficiency in Online Class Matching
·1500 words·8 mins
AI Generated AI Theory Fairness 🏢 University of Maryland
First non-wasteful algorithm achieving 1/2-approximation for class envy-freeness, class proportionality, and utilitarian social welfare in online class matching.
FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
·1605 words·8 mins
Machine Learning Federated Learning 🏢 University of Maryland
FACT, a novel federated learning mechanism, eliminates free-riding and incentivizes truthful agent behavior by introducing a penalty system and a competitive environment, boosting model performance significantly.
Estimating Epistemic and Aleatoric Uncertainty with a Single Model
·2171 words·11 mins
Machine Learning Deep Learning 🏢 University of Maryland
HyperDM accurately estimates both epistemic and aleatoric uncertainty using a single model, overcoming the computational limitations of existing ensemble methods.
Dueling over Dessert: Mastering the Art of Repeated Cake Cutting
·2291 words·11 mins
AI Theory Fairness 🏢 University of Maryland
Repeated cake-cutting game reveals that strategic players can exploit myopic opponents, but equitable outcomes are achievable through specific strategies.
DMesh: A Differentiable Mesh Representation
·3349 words·16 mins
Computer Vision 3D Vision 🏢 University of Maryland
DMesh: A novel differentiable mesh representation enabling efficient gradient-based optimization for diverse 3D shape applications.
Differentiable Quantum Computing for Large-scale Linear Control
·1462 words·7 mins
AI Generated AI Applications Robotics 🏢 University of Maryland
Quantum algorithm achieves super-quadratic speedup for large-scale linear control, offering a novel approach to address the computational challenges of optimizing complex dynamical systems.
Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance
·3386 words·16 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Maryland
Equivariant Graph Neural Networks boost multi-agent reinforcement learning by improving sample efficiency and generalization, overcoming inherent exploration biases.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
Ad Auctions for LLMs via Retrieval Augmented Generation
·2337 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
This paper introduces segment auctions, maximizing logarithmic social welfare, for integrating ads into LLM outputs via Retrieval Augmented Generation, balancing ad revenue and output quality.