Machine Learning
Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff
·592 words·3 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 MIT
LOLIPOP: A novel algorithm achieving near-optimal regret for offline contextual Markov Decision Processes (CMDPs) using only O(H log T) offline density estimation oracle calls.
Offline Behavior Distillation
·1729 words·9 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 School of Computer Science, University of Sydney
This paper introduces Offline Behavior Distillation (OBD) to synthesize compact expert behavioral data from massive sub-optimal RL data, enabling faster policy learning.
Off-Dynamics Reinforcement Learning via Domain Adaptation and Reward Augmented Imitation
·6706 words·32 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Johns Hopkins University
DARAIL, a novel algorithm, tackles off-dynamics reinforcement learning by combining reward modification with imitation learning to transfer a learned policy from a source to a target domain. This app…
Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality
·1532 words·8 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Illinois Urbana-Champaign
Model-free policy gradient methods using occupancy functions are developed for online and offline RL, achieving computational efficiency and handling arbitrary data distributions.
OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
·2351 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 Carnegie Mellon University
OASIS, a novel data-centric approach, shapes offline data distributions toward safer, higher-reward policies using a conditional diffusion model, outperforming existing offline safe RL methods.
Nuclear Norm Regularization for Deep Learning
·1763 words·9 mins·
Machine Learning
Deep Learning
🏢 MIT
This paper presents a novel, efficient method for Jacobian nuclear norm regularization in deep learning, replacing computationally expensive SVDs with equivalent Frobenius norm computations, thereby e…
Normalization and effective learning rates in reinforcement learning
·2714 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
Normalize-and-Project (NaP) boosts reinforcement learning by stabilizing layer normalization, preventing plasticity loss, and enabling effective learning rate control.
Nonstationary Sparse Spectral Permanental Process
·2196 words·11 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Center for Applied Statistics and School of Statistics, Renmin University of China
Nonstationary Sparse Spectral Permanental Process (NSSPP) enhances point process modeling by using sparse spectral representations, enabling flexible, efficient, nonstationary kernel learning.
Nonparametric Evaluation of Noisy ICA Solutions
·3684 words·18 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Department of Computer Science, UT Austin
Adaptive algorithm selection for noisy ICA is achieved via a novel nonparametric independence score, improving accuracy and efficiency.
Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks
·1457 words·7 mins·
AI Generated
Machine Learning
Deep Learning
🏢 UC Santa Barbara
Overparameterized ConvResNets surprisingly excel at prediction; this study proves they efficiently learn smooth functions on low-dimensional manifolds, avoiding the curse of dimensionality.
Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data
·1738 words·9 mins·
Machine Learning
Federated Learning
🏢 KTH Royal Institute of Technology
This paper proposes a novel federated learning algorithm for nonconvex problems on compact smooth manifolds, achieving both computational and communication efficiency while mitigating client drift.
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
·4994 words·24 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
AI models struggle with changing data; this paper introduces Soft Resets, a novel learning approach that uses an adaptive drift to gracefully guide parameters toward initialization, improving adaptability.
Non-parametric classification via expand-and-sparsify representation
·1563 words·8 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Wichita State University
New non-parametric classifiers using expand-and-sparsify representation achieve minimax-optimal convergence, adapting to low-dimensional manifold structure.
Non-Euclidean Mixture Model for Social Network Embedding
·2185 words·11 mins·
Machine Learning
Representation Learning
🏢 UC Los Angeles
Non-Euclidean Mixture Model (NMM-GNN) outperforms existing methods by using spherical and hyperbolic spaces to model homophily and social influence in social network embedding, improving link prediction.
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins·
AI Generated
Machine Learning
Optimization
🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimations. The analysis c…
Noether's Razor: Learning Conserved Quantities
·2052 words·10 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Imperial College London
Noether’s Razor learns conserved quantities and symmetries directly from data via Bayesian model selection, improving dynamical systems modeling accuracy and generalizability.
No-Regret Bandit Exploration based on Soft Tree Ensemble Model
·1480 words·7 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 LY Corporation
A novel stochastic bandit algorithm using soft tree ensemble models achieves lower cumulative regret than existing ReLU-based neural bandit algorithms, offering a constrained yet effective hypothesis space.
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
·5380 words·26 mins·
Machine Learning
Reinforcement Learning
🏢 CLAIRE, EPFL
Deep RL agents trained under non-stationarity suffer performance collapse due to representation degradation; this work reveals this in PPO and introduces Proximal Feature Optimization (PFO) to mitigate it.
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
·4811 words·23 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Oxford
AI agents learn better with well-designed training environments. This paper reveals flaws in current environment-selection methods and introduces Sampling for Learnability (SFL), a new approach that …
Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms
·2334 words·11 mins·
Machine Learning
Optimization
🏢 Stanford University
Newton Losses enhance training of neural networks with complex objectives by using second-order information from loss functions, achieving significant performance gains.