🏢 University of Cambridge

Zero-Shot Tokenizer Transfer

26 September 2024·2795 words·14 mins

Natural Language Processing Large Language Models 🏢 University of Cambridge

Zero-Shot Tokenizer Transfer (ZeTT) detaches language models from their tokenizers via a hypernetwork, enabling efficient on-the-fly tokenizer swapping without retraining, significantly improving LLM …

Zero-Shot Reinforcement Learning from Low Quality Data

26 September 2024·4722 words·23 mins

AI Generated Machine Learning Reinforcement Learning 🏢 University of Cambridge

Zero-shot RL struggles with low-quality data; this paper introduces conservative algorithms that significantly boost performance on such data without sacrificing performance on high-quality data.

TinyTTA: Efficient Test-time Adaptation via Early-exit Ensembles on Edge Devices

26 September 2024·2263 words·11 mins

Machine Learning Deep Learning 🏢 University of Cambridge

TinyTTA enables efficient test-time adaptation on memory-constrained edge devices using a novel self-ensemble and early-exit strategy, improving accuracy and reducing memory usage.

TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models

26 September 2024·8456 words·40 mins

AI Generated Machine Learning Generative Models 🏢 University of Cambridge

TabEBM: Class-specific EBMs boost tabular data augmentation, improving classification accuracy, especially on small datasets, by generating high-quality synthetic data.

Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments

26 September 2024·2758 words·13 mins

Machine Learning Self-Supervised Learning 🏢 University of Cambridge

Self-healing machine learning (SHML) autonomously diagnoses and fixes model performance degradation caused by data shifts, outperforming reason-agnostic methods.

Second-order forward-mode optimization of recurrent neural networks for neuroscience

26 September 2024·2260 words·11 mins

🏢 University of Cambridge

SOFO: a novel second-order optimizer enables efficient and memory-friendly RNN training for neuroscience tasks, surpassing Adam’s performance, especially on long time horizons.

Rule Extrapolation in Language Modeling: A Study of Compositional Generalization on OOD Prompts

26 September 2024·2787 words·14 mins

Large Language Models 🏢 University of Cambridge

LLMs struggle with out-of-distribution (OOD) generalization. This research introduces ‘rule extrapolation’ using formal languages to rigorously evaluate OOD behavior in various LLM architectures, rev…

Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe

26 September 2024·5026 words·24 mins

AI Generated Natural Language Processing Large Language Models 🏢 University of Cambridge

This research unveils a compute-optimal recipe for fine-tuning language models into high-quality text embedding models, offering practical guidance and scaling laws for resource-constrained settings.

Relational Concept Bottleneck Models

26 September 2024·2454 words·12 mins

AI Generated Machine Learning Deep Learning 🏢 University of Cambridge

Relational Concept Bottleneck Models (R-CBMs) merge interpretable CBMs with powerful GNNs for high-performing, explainable relational deep learning.

Recurrent neural network dynamical systems for biological vision

26 September 2024·2292 words·11 mins

Image Classification 🏢 University of Cambridge

CordsNet: a hybrid CNN-RNN architecture enabling biologically realistic, robust image recognition through continuous-time recurrent dynamics.

Predicting Ground State Properties: Constant Sample Complexity and Deep Learning Algorithms

26 September 2024·1574 words·8 mins

Machine Learning Deep Learning 🏢 University of Cambridge

Deep learning algorithms now predict quantum ground state properties with constant sample complexity, regardless of system size, improving upon previous methods.

Predicting Future Actions of Reinforcement Learning Agents

26 September 2024·1902 words·9 mins

AI Applications Robotics 🏢 University of Cambridge

Predicting RL agent behavior is key for safety and interaction; this study finds that agents that plan explicitly are significantly easier to predict because of their internal plans.

Partially Observable Cost-Aware Active-Learning with Large Language Models

26 September 2024·3564 words·17 mins

AI Generated Machine Learning Active Learning 🏢 University of Cambridge

µPOCA: a new active learning approach maximizes model generalization using strategically acquired labels/features in data-scarce, costly scenarios with partial observability, leveraging LLMs for effic…

On conditional diffusion models for PDE simulations

26 September 2024·5766 words·28 mins

Machine Learning Deep Learning 🏢 University of Cambridge

This paper introduces novel autoregressive sampling and hybrid training strategies for score-based diffusion models, significantly boosting PDE forecasting and assimilation accuracy.

Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks

26 September 2024·2633 words·13 mins

AI Generated Machine Learning Deep Learning 🏢 University of Cambridge

Researchers introduce Geometric Parameterization (GmP), a novel neural network parameterization resolving instability in ReLU network training, leading to faster convergence and better generalization.

Multi-language Diversity Benefits Autoformalization

26 September 2024·1698 words·8 mins

Natural Language Processing Large Language Models 🏢 University of Cambridge

Researchers created MMA, a large multilingual dataset of informal-formal mathematical pairs, leveraging a language model for reverse translation. Fine-tuned models achieved significantly improved aut…

Localized Adaptive Risk Control

26 September 2024·2386 words·12 mins

AI Generated AI Theory Fairness 🏢 University of Cambridge

Localized Adaptive Risk Control (L-ARC) improves fairness and reliability of online prediction by providing localized statistical risk guarantees, surpassing existing methods in high-stakes applicatio…

Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes

26 September 2024·3448 words·17 mins

Machine Learning Gaussian Processes 🏢 University of Cambridge

Accelerate Gaussian process hyperparameter optimization by up to 72× using novel linear system solver techniques.

HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data

26 September 2024·1708 words·9 mins

AI Applications Healthcare 🏢 University of Cambridge

HEALNet: a novel multimodal fusion network achieving state-of-the-art performance on biomedical survival analysis by effectively integrating heterogeneous data while handling missing modalities.

GRANOLA: Adaptive Normalization for Graph Neural Networks

26 September 2024·3044 words·15 mins

AI Generated Machine Learning Deep Learning 🏢 University of Cambridge

GRANOLA: A novel graph-adaptive normalization layer significantly boosts GNN performance by dynamically adjusting node features based on the input graph’s unique structure.