
Posters

2024

MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
·2036 words·10 mins
AI Generated Natural Language Processing Large Language Models 🏒 Peking University
MemoryFormer drastically cuts large language model computation by replacing fully-connected layers with hashing-based memory lookups, enabling faster and more scalable AI.
Memory-Efficient LLM Training with Online Subspace Descent
·1794 words·9 mins
Natural Language Processing Large Language Models 🏒 University of Texas at Austin
Online Subspace Descent: a novel memory-efficient LLM training algorithm guaranteed to converge, closing the performance gap with full-rank methods.
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
·3095 words·15 mins
AI Generated Machine Learning Meta Learning 🏒 National University of Singapore
FG²U: a novel memory-efficient algorithm for unbiased stochastic approximation of meta-gradients in large-scale bi-level optimization, showing superior performance across diverse tasks.
MeMo: Meaningful, Modular Controllers via Noise Injection
·3100 words·15 mins
AI Applications Robotics 🏒 MIT
MeMo: a novel framework for pretraining meaningful, modular robot controllers via noise injection, enabling efficient transfer learning across different robot morphologies and tasks.
Membership Inference Attacks against Large Vision-Language Models
·3357 words·16 mins
Multimodal Learning Vision-Language Models 🏒 LIONS, EPFL
First benchmark for detecting training data in large vision-language models (VLLMs) improves data security.
Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
·2645 words·13 mins
Natural Language Processing Large Language Models 🏒 Tsinghua University
SPV-MIA, a novel membership inference attack, significantly improves the accuracy of identifying training data in fine-tuned LLMs by using self-prompt calibration and probabilistic variation.
MeLLoC: Lossless Compression with High-order Mechanism Learning
·1838 words·9 mins
AI Generated AI Applications Healthcare 🏒 Fudan University
MeLLoC: Mechanism Learning for Lossless Compression, a novel approach that combines high-order mechanism learning with classical encoding, significantly improves lossless compression for scientific data…
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
·1897 words·9 mins
Natural Language Processing Large Language Models 🏒 Meta AI
MEGALODON: A new neural architecture for LLMs, enabling unlimited context length with improved efficiency and accuracy.
MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning
·2561 words·13 mins
Natural Language Processing Question Answering 🏒 University of Washington
MEDIQ benchmark revolutionizes LLM evaluation by shifting from static to interactive clinical reasoning, revealing LLMs’ struggles with proactive information-seeking and highlighting the importance of…
Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification
·1761 words·9 mins
AI Applications Healthcare 🏒 University of North Carolina - Charlotte
Medformer: A novel multi-granularity patching transformer achieves state-of-the-art performance in medical time series classification, excelling in challenging subject-independent settings.
Med-Real2Sim: Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning
·2852 words·14 mins
AI Applications Healthcare 🏒 UC Berkeley
Med-Real2Sim uses physics-informed self-supervised learning to build non-invasive medical digital twins, enabling in-silico clinical trials and unsupervised disease detection.
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
·2097 words·10 mins
Natural Language Processing Interpretability 🏒 MIT
New metrics and p-annealing improve sparse autoencoder training for better language model interpretability.
Measuring Per-Unit Interpretability at Scale Without Humans
·4136 words·20 mins
Computer Vision Interpretability 🏒 Tübingen AI Center
New scalable method measures per-unit interpretability in vision DNNs without human evaluation, revealing anti-correlation between model performance and interpretability.
Measuring Mutual Policy Divergence for Multi-Agent Sequential Exploration
·2042 words·10 mins
Machine Learning Reinforcement Learning 🏒 Xi'an Jiaotong University
MADPO, a novel MARL framework, uses mutual policy divergence maximization with conditional Cauchy-Schwarz divergence to enhance exploration and agent heterogeneity in sequential updating, outperforming…
Measuring Déjà vu Memorization Efficiently
·2794 words·14 mins
Computer Vision Representation Learning 🏒 FAIR at Meta
New method efficiently measures how well AI models memorize training data, revealing that open-source models memorize less than expected.
Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance
·2532 words·12 mins
Natural Language Processing Large Language Models 🏒 Harbin Institute of Technology
Boosting LLMs’ abstract reasoning via ‘Meaningful Learning’: A new dataset and learning paradigm significantly enhance LLMs’ capacity for abstract reasoning, moving beyond simple memorization.
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
·272 words·2 mins
AI Theory Optimization 🏒 MIT
Researchers establish basis-free conditions for SGD learnability in two-layer neural networks learning subspace-sparse polynomials with Gaussian input, offering insights into training dynamics.
MC-DiT: Contextual Enhancement via Clean-to-Clean Reconstruction for Masked Diffusion Models
·2494 words·12 mins
Computer Vision Image Generation 🏒 Shanghai Jiao Tong University
MC-DiT: A novel training paradigm for masked diffusion models achieving state-of-the-art image generation by leveraging clean-to-clean reconstruction.
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
·2836 words·14 mins
AI Generated Machine Learning Reinforcement Learning 🏒 NVIDIA Corporation
MEow, a novel MaxEnt RL framework, achieves superior performance by unifying policy evaluation and improvement steps, enabling exact soft value function calculation without Monte Carlo approximation.
Maximizing utility in multi-agent environments by anticipating the behavior of other learners
·1732 words·9 mins
AI Theory Optimization 🏒 MIT
Optimizing against learning agents: new algorithms and computational limits for maximizing utility when other players follow learning algorithms.