Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
·7149 words·34 mins·
AI Generated
AI Theory
Optimization
🏢 University of Maryland
Learning rate warmup improves deep learning performance by enabling larger learning rates, which push networks toward better-conditioned regions of the loss landscape.
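Warmup itself is a standard schedule: ramp the learning rate up from near zero before holding it at its target value. A minimal sketch (the `base_lr` and `warmup_steps` values here are illustrative, not from the paper):

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linearly ramp the learning rate from ~0 to base_lr over warmup_steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

# Early steps take small updates, avoiding large moves while the loss
# landscape around the initialization is still poorly conditioned.
early_lr = warmup_lr(0)       # tiny fraction of base_lr
steady_lr = warmup_lr(5000)   # full base_lr after warmup
```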
Transformers Can Do Arithmetic with the Right Embeddings
·3154 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
Researchers enhanced transformer performance on arithmetic tasks by introducing Abacus Embeddings, which encode each digit’s position, enabling improved generalization and unlocking multi-step reasoning.
Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations
·1994 words·10 mins·
Computer Vision
Video Understanding
🏢 University of Maryland
ConVRT: A novel framework restores turbulence-distorted videos by decoupling spatial and temporal information in a neural representation, achieving temporally consistent mitigation.
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
·2560 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
SHED, a Shapley value-based framework, efficiently refines instruction-tuning datasets for LLMs, producing high-performing subsets as small as 10% of the original size that transfer well across different models.
QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos
·3903 words·19 mins·
AI Generated
Computer Vision
3D Vision
🏢 University of Maryland
QUEEN: A novel framework for quantized and efficient streaming of free-viewpoint videos achieving high compression, quality, and speed.
Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory
·4015 words·19 mins·
AI Generated
AI Theory
Interpretability
🏢 University of Maryland
Counterfactual Clamping Attack (CCA) improves model reconstruction using counterfactual explanations by leveraging decision boundary proximity, offering theoretical guarantees and enhanced fidelity.
Loki: Low-rank Keys for Efficient Sparse Attention
·3255 words·16 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
Loki accelerates attention mechanisms in LLMs by exploiting the low dimensionality of key vectors, dynamically selecting key tokens based on approximate attention scores.
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
·2218 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
Maintaining watermark strength while speeding up LLM generation via speculative sampling is provably impossible; this paper establishes the trade-off and offers methods that prioritize either watermark strength or sampling speed.
FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
·1833 words·9 mins·
Natural Language Processing
Large Language Models
🏢 University of Maryland
FLORA enables efficient and private federated fine-tuning of LLMs via a novel stacking-based heterogeneous low-rank adaptation, surpassing existing methods.
Fairness and Efficiency in Online Class Matching
·1500 words·8 mins·
AI Generated
AI Theory
Fairness
🏢 University of Maryland
The first non-wasteful algorithm achieving a 1/2-approximation for class envy-freeness, class proportionality, and utilitarian social welfare in online class matching.
FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
·1605 words·8 mins·
Machine Learning
Federated Learning
🏢 University of Maryland
FACT, a novel federated learning mechanism, eliminates free-riding and incentivizes truthful agent behavior by introducing a penalty system and a competitive environment, boosting model performance significantly.
Estimating Epistemic and Aleatoric Uncertainty with a Single Model
·2171 words·11 mins·
Machine Learning
Deep Learning
🏢 University of Maryland
HyperDM accurately estimates both epistemic and aleatoric uncertainty using a single model, overcoming the computational limitations of existing ensemble methods.
Dueling over Dessert: Mastering the Art of Repeated Cake Cutting
·2291 words·11 mins·
AI Theory
Fairness
🏢 University of Maryland
Repeated cake-cutting game reveals that strategic players can exploit myopic opponents, but equitable outcomes are achievable through specific strategies.
DMesh: A Differentiable Mesh Representation
·3349 words·16 mins·
Computer Vision
3D Vision
🏢 University of Maryland
DMesh: A novel differentiable mesh representation enabling efficient gradient-based optimization for diverse 3D shape applications.
Differentiable Quantum Computing for Large-scale Linear Control
·1462 words·7 mins·
AI Generated
AI Applications
Robotics
🏢 University of Maryland
Quantum algorithm achieves super-quadratic speedup for large-scale linear control, offering a novel approach to address the computational challenges of optimizing complex dynamical systems.
Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance
·3386 words·16 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Maryland
Equivariant Graph Neural Networks boost multi-agent reinforcement learning by improving sample efficiency and generalization, overcoming inherent exploration biases.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
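The core idea is to exclude a pseudorandom subset of token positions from the next-token training loss, so the model never sees a full verbatim sequence during training. A minimal sketch of such a masked loss, assuming per-token log-probabilities are already available; the 1-in-`k` dropping rule, `k=4`, and function names are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def goldfish_mask(n_tokens, k=4, seed=0):
    """Pseudorandom boolean mask that drops roughly 1/k of token positions."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, k, size=n_tokens) != 0  # False => excluded from loss

def goldfish_loss(token_log_probs, k=4, seed=0):
    """Mean negative log-likelihood over the kept positions only."""
    log_probs = np.asarray(token_log_probs)
    keep = goldfish_mask(len(log_probs), k, seed)
    return -np.mean(log_probs[keep])
```

Because the dropped positions never contribute gradient, the model cannot simply memorize an entire training passage token by token, while the loss on the remaining ~(k-1)/k of positions keeps ordinary language modeling intact.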
Ad Auctions for LLMs via Retrieval Augmented Generation
·2337 words·11 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Maryland
This paper introduces segment auctions, maximizing logarithmic social welfare, for integrating ads into LLM outputs via Retrieval Augmented Generation, balancing ad revenue and output quality.