🏢 Columbia University
Towards a 'Universal Translator' for Neural Dynamics at Single-Cell, Single-Spike Resolution
·2778 words·14 mins·
Machine Learning
Self-Supervised Learning
🏢 Columbia University
A new self-supervised learning approach, Multi-task Masking (MtM), significantly improves the prediction accuracy of neural population activity by capturing neural dynamics at multiple spatial scales,…
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
·336 words·2 mins·
Natural Language Processing
Large Language Models
🏢 Columbia University
New research precisely defines the computational limits of training large language models, revealing a sharp threshold based on parameter matrix entries, paving the way for faster algorithms.
The Fairness-Quality Tradeoff in Clustering
·2122 words·10 mins·
AI Generated
AI Theory
Fairness
🏢 Columbia University
Novel algorithms trace the optimal balance between clustering quality and fairness, revealing all non-dominated solutions for various objectives.
SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
·1972 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Columbia University
SEMCODER: A novel 6.7B parameter code LLM surpasses GPT-3.5-turbo’s performance on code generation and execution reasoning by employing ‘monologue reasoning’: training the model to verbally explain cod…
Randomized Strategic Facility Location with Predictions
·1312 words·7 mins·
AI Theory
Optimization
🏢 Columbia University
Randomized strategies improve truthful learning-augmented mechanisms for strategic facility location, achieving better approximations than deterministic methods.
Promoting Fairness Among Dynamic Agents in Online-Matching Markets under Known Stationary Arrival Distributions
·1572 words·8 mins·
AI Generated
AI Theory
Fairness
🏢 Columbia University
This paper presents novel algorithms for online matching markets that prioritize fairness among dynamic agents, achieving asymptotic optimality in various scenarios and offering extensions to group-le…
Partial Transportability for Domain Generalization
·2485 words·12 mins·
AI Theory
Generalization
🏢 Columbia University
This paper introduces a novel technique to bound prediction risks in new domains using causal diagrams, enabling reliable AI performance guarantees.
Nonparametric Instrumental Variable Regression through Stochastic Approximate Gradients
·1348 words·7 mins·
AI Theory
Causality
🏢 Columbia University
SAGD-IV: a novel functional stochastic gradient descent algorithm for stable nonparametric instrumental variable regression, excelling in handling binary outcomes and various loss functions.
Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making
·1773 words·9 mins·
AI Theory
Fairness
🏢 Columbia University
This paper uncovers bias amplification in AI decision-making, showing how fair prediction scores can become discriminatory after thresholding, and urges stronger regulatory oversight.
Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
·2759 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 Columbia University
Greedy algorithms for linear contextual bandits achieve poly-logarithmic regret under the novel Local Anti-Concentration condition, expanding applicable distributions beyond Gaussians and uniforms.
Is Cross-validation the Gold Standard to Estimate Out-of-sample Model Performance?
·1790 words·9 mins·
AI Theory
Optimization
🏢 Columbia University
Cross-validation isn’t always superior; simple plug-in methods often perform equally well for estimating out-of-sample model performance, especially when considering computational costs.
Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
·3248 words·16 mins·
AI Generated
Machine Learning
Transfer Learning
🏢 Columbia University
Multi-task learning and finetuning show surprising feature reuse biases, including a novel ‘nested feature selection’ regime where finetuning prioritizes a sparse subset of pretrained features, signif…
Group-wise oracle-efficient algorithms for online multi-group learning
·316 words·2 mins·
AI Theory
Fairness
🏢 Columbia University
Oracle-efficient algorithms conquer online multi-group learning, achieving sublinear regret even with massive, overlapping groups, paving the way for fair and efficient large-scale online systems.
Fair Secretaries with Unfair Predictions
·1586 words·8 mins·
AI Theory
Fairness
🏢 Columbia University
Fair algorithms can leverage biased predictions to improve performance while guaranteeing fairness for all candidates.
Extensive-Form Game Solving via Blackwell Approachability on Treeplexes
·2500 words·12 mins·
Reinforcement Learning
🏢 Columbia University
First algorithmic framework for Blackwell approachability on treeplexes, enabling stepsize-invariant EFG solvers with state-of-the-art convergence rates.
Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis
·3550 words·17 mins·
AI Generated
Machine Learning
Representation Learning
🏢 Columbia University
Supervised Independent Subspace PCA (sisPCA) disentangles interpretable factors in high-dimensional data by leveraging supervision to maximize subspace dependence on target variables while minimizing …
Disentangled Representation Learning in Non-Markovian Causal Systems
·2882 words·14 mins·
AI Theory
Causality
🏢 Columbia University
This paper introduces graphical criteria and an algorithm for disentangling causal factors from heterogeneous data in non-Markovian settings, advancing causal representation learning.
Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference
·1804 words·9 mins·
Machine Learning
Gaussian Processes
🏢 Columbia University
Computation-Aware Gaussian Processes (CaGP) achieve linear-time inference and model selection, enabling efficient training of GPs on large datasets without compromising uncertainty quantification, sur…
Community Detection Guarantees using Embeddings Learned by Node2Vec
·2609 words·13 mins·
AI Generated
AI Theory
Representation Learning
🏢 Columbia University
Node2Vec, a popular network embedding method, is proven to consistently recover community structure in stochastic block models, paving the way for more reliable unsupervised community detection.
Causal Imitation for Markov Decision Processes: a Partial Identification Approach
·1601 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 Columbia University
This paper presents novel causal imitation learning algorithms using partial identification to achieve expert performance even when unobserved confounders affect Markov Decision Processes.