🏢 University of Toronto
Your contrastive learning problem is secretly a distribution alignment problem
·381 words·2 mins·
Machine Learning
Self-Supervised Learning
🏢 University of Toronto
Contrastive learning is reframed as a distribution alignment problem, leading to a flexible framework (GCA) that improves representation learning with unbalanced optimal transport.
Temporal-Difference Learning Using Distributed Error Signals
·2668 words·13 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Toronto
The Artificial Dopamine (AD) algorithm achieves performance comparable to backpropagation-based methods in complex RL tasks by using only synchronously distributed per-layer TD errors, demonstrating the suffici…
Simplifying Constraint Inference with Inverse Reinforcement Learning
·1653 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 University of Toronto
This paper simplifies constraint inference in reinforcement learning, demonstrating that standard inverse RL methods can effectively infer constraints from expert data, surpassing complex, previously …
Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood
·217 words·2 mins·
AI Theory
Optimization
🏢 University of Toronto
This paper introduces contextual Shtarkov sums, a new complexity measure characterizing minimax regret in sequential probability assignment with contexts, and derives the minimax optimal algorithm, co…
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
·1771 words·9 mins·
Machine Learning
Reinforcement Learning
🏢 University of Toronto
ExPerior leverages expert demonstrations to enhance online decision-making, even when experts rely on contextual information unobserved by the learner.
Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences
·2131 words·11 mins·
Generative Learning
🏢 University of Toronto
Curated synthetic data provably optimizes human preferences in iterative generative model training, maximizing expected reward while mitigating variance.
SCube: Instant Large-Scale Scene Reconstruction using VoxSplats
·3116 words·15 mins·
Computer Vision
3D Vision
🏢 University of Toronto
SCube: Instant large-scale 3D scene reconstruction from sparse images using VoxSplats, a novel 3D Gaussian splat representation.
RGFN: Synthesizable Molecular Generation Using GFlowNets
·3785 words·18 mins·
AI Applications
Healthcare
🏢 University of Toronto
Reaction-GFlowNet (RGFN) revolutionizes small molecule discovery by generating synthesizable molecules directly within the chemical reaction space, dramatically expanding the search space for drug dis…
Reward Machines for Deep RL in Noisy and Uncertain Environments
·2032 words·10 mins·
Machine Learning
Reinforcement Learning
🏢 University of Toronto
Deep RL agents can now effectively learn complex tasks even with noisy, uncertain sensor readings by exploiting the structure of Reward Machines.
Random Cycle Coding: Lossless Compression of Cluster Assignments via Bits-Back Coding
·1446 words·7 mins·
AI Theory
Optimization
🏢 University of Toronto
Random Cycle Coding (RCC) optimally compresses cluster assignments in large datasets, saving up to 70% storage in vector databases by eliminating the need for integer IDs.
Quantum Deep Equilibrium Models
·1736 words·9 mins·
Machine Learning
Deep Learning
🏢 University of Toronto
Quantum Deep Equilibrium Models (QDEQs) achieve higher QML performance with shallower circuits by using a DEQ training paradigm, improving near-term quantum computation efficiency.
Policy Aggregation
·1384 words·7 mins·
AI Theory
Fairness
🏢 University of Toronto
This paper introduces efficient algorithms that leverage social choice theory to aggregate multiple individual preferences, resulting in a desirable collective AI policy.
Paths to Equilibrium in Games
·265 words·2 mins·
AI Theory
Optimization
🏢 University of Toronto
In n-player games, a satisficing path always exists leading from any initial strategy profile to a Nash equilibrium by allowing unsatisfied players to explore suboptimal strategies.
On the Efficiency of ERM in Feature Learning
·316 words·2 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Toronto
ERM’s efficiency in feature learning surprisingly remains high even with massive feature maps; its excess risk asymptotically matches an oracle procedure’s, implying potential for streamlined feature-…
Observational Scaling Laws and the Predictability of Language Model Performance
·4816 words·23 mins·
Large Language Models
🏢 University of Toronto
Researchers predict language model performance from observations of existing models, bypassing costly training and revealing surprising predictability in complex scaling phenomena.
Neur2BiLO: Neural Bilevel Optimization
·2909 words·14 mins·
AI Theory
Optimization
🏢 University of Toronto
Neur2BiLO, a neural network-based heuristic, solves mixed-integer bilevel optimization problems extremely fast, achieving high-quality solutions for diverse applications.
Minimum Entropy Coupling with Bottleneck
·2823 words·14 mins·
AI Theory
Optimization
🏢 University of Toronto
A novel lossy compression framework, Minimum Entropy Coupling with Bottleneck (MEC-B), extends existing methods by integrating a bottleneck for controlled stochasticity, enhancing performance in scen…
Maia-2: A Unified Model for Human-AI Alignment in Chess
·2577 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 University of Toronto
Maia-2, a unified model for human-AI alignment in chess, coherently captures human play across skill levels, significantly improving AI-human alignment and paving the way for AI-guided teaching.
LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language
·5678 words·27 mins·
Natural Language Processing
Large Language Models
🏢 University of Toronto
LLM Processes leverage LLMs to create probabilistic regression models guided by natural language, enabling seamless integration of expert knowledge and improving prediction accuracy.
Linguistic Collapse: Neural Collapse in (Large) Language Models
·6528 words·31 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Toronto
Scaling causal language models reveals a connection between neural collapse properties, model size, and improved generalization, highlighting NC’s broader relevance to LLMs.