🏢 University of Cambridge
Zero-Shot Tokenizer Transfer
·2795 words·14 mins·
Natural Language Processing
Large Language Models
🏢 University of Cambridge
Zero-Shot Tokenizer Transfer (ZeTT) detaches language models from their tokenizers via a hypernetwork, enabling efficient on-the-fly tokenizer swapping without retraining, significantly improving LLM …
Zero-Shot Reinforcement Learning from Low Quality Data
·4722 words·23 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Cambridge
Zero-shot RL struggles with low-quality data; this paper introduces conservative algorithms that significantly boost performance on such data without sacrificing performance on high-quality data.
TinyTTA: Efficient Test-time Adaptation via Early-exit Ensembles on Edge Devices
·2263 words·11 mins·
Machine Learning
Deep Learning
🏢 University of Cambridge
TinyTTA enables efficient test-time adaptation on memory-constrained edge devices using a novel self-ensemble and early-exit strategy, improving accuracy and reducing memory usage.
TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models
·8456 words·40 mins·
AI Generated
Machine Learning
Generative Models
🏢 University of Cambridge
TabEBM: Class-specific EBMs boost tabular data augmentation, improving classification accuracy, especially on small datasets, by generating high-quality synthetic data.
Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments
·2758 words·13 mins·
Machine Learning
Self-Supervised Learning
🏢 University of Cambridge
Self-healing machine learning (SHML) autonomously diagnoses and fixes model performance degradation caused by data shifts, outperforming reason-agnostic methods.
Second-order forward-mode optimization of recurrent neural networks for neuroscience
·2260 words·11 mins·
🏢 University of Cambridge
SOFO: a novel second-order optimizer enables efficient and memory-friendly RNN training for neuroscience tasks, surpassing Adam’s performance, especially on long time horizons.
Rule Extrapolation in Language Modeling: A Study of Compositional Generalization on OOD Prompts
·2787 words·14 mins·
Large Language Models
🏢 University of Cambridge
LLMs struggle with out-of-distribution (OOD) generalization. This research introduces ‘rule extrapolation’ using formal languages to rigorously evaluate OOD behavior in various LLM architectures, rev…
Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe
·5026 words·24 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Cambridge
This research unveils a compute-optimal recipe for fine-tuning language models into high-quality text embedding models, offering practical guidance and scaling laws for resource-constrained settings.
Relational Concept Bottleneck Models
·2454 words·12 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Cambridge
Relational Concept Bottleneck Models (R-CBMs) merge interpretable CBMs with powerful GNNs for high-performing, explainable relational deep learning.
Recurrent neural network dynamical systems for biological vision
·2292 words·11 mins·
Image Classification
🏢 University of Cambridge
CordsNet: a hybrid CNN-RNN architecture enabling biologically realistic, robust image recognition through continuous-time recurrent dynamics.
Predicting Ground State Properties: Constant Sample Complexity and Deep Learning Algorithms
·1574 words·8 mins·
Machine Learning
Deep Learning
🏢 University of Cambridge
Deep learning algorithms now predict quantum ground state properties with constant sample complexity, regardless of system size, improving upon previous methods.
Predicting Future Actions of Reinforcement Learning Agents
·1902 words·9 mins·
AI Applications
Robotics
🏢 University of Cambridge
Predicting RL agent behavior is key for safety and interaction; this study reveals that explicitly planning agents are significantly easier to predict due to their internal plans.
Partially Observable Cost-Aware Active-Learning with Large Language Models
·3564 words·17 mins·
AI Generated
Machine Learning
Active Learning
🏢 University of Cambridge
µPOCA: a new active learning approach maximizes model generalization using strategically acquired labels/features in data-scarce, costly scenarios with partial observability, leveraging LLMs for effic…
On conditional diffusion models for PDE simulations
·5766 words·28 mins·
Machine Learning
Deep Learning
🏢 University of Cambridge
This paper introduces novel autoregressive sampling and hybrid training strategies for score-based diffusion models, significantly boosting PDE forecasting and assimilation accuracy.
Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks
·2633 words·13 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Cambridge
Researchers introduce Geometric Parameterization (GmP), a novel neural network parameterization resolving instability in ReLU network training, leading to faster convergence and better generalization.
Multi-language Diversity Benefits Autoformalization
·1698 words·8 mins·
Natural Language Processing
Large Language Models
🏢 University of Cambridge
Researchers created MMA, a large multilingual dataset of informal-formal mathematical pairs, leveraging a language model for reverse translation. Fine-tuned models achieved significantly improved aut…
Localized Adaptive Risk Control
·2386 words·12 mins·
AI Generated
AI Theory
Fairness
🏢 University of Cambridge
Localized Adaptive Risk Control (L-ARC) improves fairness and reliability of online prediction by providing localized statistical risk guarantees, surpassing existing methods in high-stakes applicatio…
Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes
·3448 words·17 mins·
Machine Learning
Gaussian Processes
🏢 University of Cambridge
Accelerate Gaussian process hyperparameter optimization by up to 72x using novel linear system solver techniques.
HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data
·1708 words·9 mins·
AI Applications
Healthcare
🏢 University of Cambridge
HEALNet: a novel multimodal fusion network achieving state-of-the-art performance on biomedical survival analysis by effectively integrating heterogeneous data while handling missing modalities.
GRANOLA: Adaptive Normalization for Graph Neural Networks
·3044 words·15 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Cambridge
GRANOLA: A novel graph-adaptive normalization layer significantly boosts GNN performance by dynamically adjusting node features based on the input graph’s unique structure.