🏢 Georgia Institute of Technology
Zeroth-Order Sampling Methods for Non-Log-Concave Distributions: Alleviating Metastability by Denoising Diffusion
·2790 words·14 mins·
AI Theory
Sampling
🏢 Georgia Institute of Technology
Zeroth-Order Diffusion Monte Carlo (ZOD-MC) efficiently samples from non-log-concave distributions using only zeroth-order queries, overcoming metastability issues and outperforming state-of-the-art s…
Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack
·2382 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Vaccine: a novel technique safeguards LLMs against harmful fine-tuning attacks by creating invariant hidden embeddings.
Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data
·1955 words·10 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Deep learning scaling laws are explained by novel approximation and estimation theories for transformers on low-dimensional data, resolving discrepancies between theory and practice.
Semi-supervised Knowledge Transfer Across Multi-omic Single-cell Data
·2468 words·12 mins·
Machine Learning
Semi-Supervised Learning
🏢 Georgia Institute of Technology
DANCE, a novel semi-supervised framework, efficiently transfers cell types across multi-omic single-cell data even with limited labeled samples, outperforming current state-of-the-art methods.
Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models
·1703 words·8 mins·
AI Theory
Robustness
🏢 Georgia Institute of Technology
Selective Projection Decay (SPD) enhances robust fine-tuning of foundation models by selectively applying weight decay, improving generalization and out-of-distribution robustness.
Quantitative Convergences of Lie Group Momentum Optimizers
·1602 words·8 mins·
Machine Learning
Optimization
🏢 Georgia Institute of Technology
A novel momentum algorithm (Lie NAG-SC) achieves accelerated Lie group optimization with proven convergence rates, surpassing existing methods in efficiency.
Provable Acceleration of Nesterov's Accelerated Gradient for Asymmetric Matrix Factorization and Linear Neural Networks
·1572 words·8 mins·
AI Theory
Optimization
🏢 Georgia Institute of Technology
This paper proves Nesterov’s Accelerated Gradient achieves faster convergence for rectangular matrix factorization and linear neural networks, using a novel unbalanced initialization.
Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks
·1447 words·7 mins·
Machine Learning
Optimization
🏢 Georgia Institute of Technology
New analysis reveals how reweighted least-squares algorithms for linear diagonal networks achieve favorable performance in high-dimensional settings, improving upon existing theoretical guarantees and…
Online Relational Inference for Evolving Multi-agent Interacting Systems
·2683 words·13 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Georgia Institute of Technology
ORI: a novel online relational inference framework efficiently identifies hidden interaction graphs in evolving multi-agent systems using streaming data and real-time adaptation.
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
·2933 words·14 mins·
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Lisa: a novel lazy safety alignment method safeguards LLMs against harmful fine-tuning attacks by introducing a proximal term to constrain model drift, significantly improving alignment performance.
Learning Spatially-Aware Language and Audio Embeddings
·3744 words·18 mins·
Multimodal Learning
Audio-Visual Learning
🏢 Georgia Institute of Technology
ELSA: a new model that learns spatially aware language and audio embeddings, achieving state-of-the-art performance in semantic retrieval and 3D sound source localization.
Large Pre-trained time series models for cross-domain Time series analysis tasks
·1870 words·9 mins·
Machine Learning
Self-Supervised Learning
🏢 Georgia Institute of Technology
Large Pre-trained Time-series Models (LPTM) achieve superior forecasting and time-series classification results using a novel adaptive segmentation method, requiring up to 40% less data and 50% less …
Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning
·1899 words·9 mins·
AI Theory
Privacy
🏢 Georgia Institute of Technology
Langevin unlearning offers a novel, privacy-preserving machine unlearning framework based on noisy gradient descent, handling both convex and non-convex problems efficiently.
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
·2980 words·14 mins·
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
HYDRA, a novel model factorization framework, significantly improves black-box LLM personalization by capturing both user-specific behavior and shared knowledge, achieving a 9.01% average relative imp…
High-dimensional (Group) Adversarial Training in Linear Regression
·1556 words·8 mins·
AI Generated
Machine Learning
Optimization
🏢 Georgia Institute of Technology
Adversarial training achieves minimax-optimal prediction error in high-dimensional linear regression under ℓ∞-perturbation, improving upon existing methods.
Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models
·3176 words·15 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Georgia Institute of Technology
BeNeDiff uses generative diffusion models to disentangle and interpret neural dynamics linked to specific behaviors, providing interpretable quantifications of behavior in multi-brain-region datasets.
Diffusion Policy Attacker: Crafting Adversarial Attacks for Diffusion-based Policies
·2633 words·13 mins·
AI Applications
Robotics
🏢 Georgia Institute of Technology
DP-Attacker unveils diffusion-based policy vulnerabilities by crafting effective adversarial attacks, significantly impacting robot safety and paving the way for more robust AI.
Differentially Private Graph Diffusion with Applications in Personalized PageRanks
·1969 words·10 mins·
AI Theory
Privacy
🏢 Georgia Institute of Technology
This paper introduces a novel differentially private graph diffusion framework ensuring edge-level privacy, significantly improving utility-privacy trade-offs for personalized PageRank computation.
Derivative-enhanced Deep Operator Network
·3502 words·17 mins·
Machine Learning
Deep Learning
🏢 Georgia Institute of Technology
Derivative-enhanced DeepONets boost PDE solution accuracy and derivative approximation, particularly valuable with limited training data.
Certified Machine Unlearning via Noisy Stochastic Gradient Descent
·2364 words·12 mins·
AI Generated
AI Theory
Privacy
🏢 Georgia Institute of Technology
This paper introduces a novel machine unlearning method using projected noisy stochastic gradient descent, providing the first approximate unlearning guarantee under convexity, significantly improving…