
Posters

2024

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
BAM efficiently upcycles pre-trained dense models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance at reduced computational cost.
Balancing Context Length and Mixing Times for Reinforcement Learning at Scale
·1724 words·9 mins
Machine Learning Reinforcement Learning 🏢 IBM Research
Longer context in RL boosts generalization but slows down learning; this paper reveals the crucial tradeoff and offers theoretical insights.
BAKU: An Efficient Transformer for Multi-Task Policy Learning
·4209 words·20 mins
AI Applications Robotics 🏢 New York University
BAKU: A simple transformer enables efficient multi-task robot policy learning, achieving 91% success on real-world tasks with limited data.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong, Shenzhen
BAdam: a memory-efficient optimization method enabling full-parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or s…
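To make the block coordinate descent idea concrete, here is a minimal, hypothetical sketch (not the authors' code): only one parameter block is trainable at a time, so Adam's optimizer state only ever exists for that block.

```python
import torch
import torch.nn as nn

# Toy illustration of block coordinate descent with Adam's update rule:
# blocks are visited cyclically, and optimizer state is held only for the
# active block, which is where the memory savings come from.
model = nn.Sequential(nn.Linear(16, 16), nn.Linear(16, 16), nn.Linear(16, 1))
blocks = [list(layer.parameters()) for layer in model]
x, y = torch.randn(32, 16), torch.randn(32, 1)

for epoch in range(3):
    for block in blocks:                        # cycle through parameter blocks
        for p in model.parameters():
            p.requires_grad_(False)             # freeze everything ...
        for p in block:
            p.requires_grad_(True)              # ... except the active block
        opt = torch.optim.Adam(block, lr=1e-3)  # Adam state for this block only
        for _ in range(10):                     # a few inner steps per block
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()
```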
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety exam…
Back to the Continuous Attractor
·5636 words·27 mins
AI Generated AI Theory Generalization 🏢 Champalimaud Centre for the Unknown
Despite their brittleness, continuous attractors remain functionally robust analog memory models because persistent slow manifolds survive bifurcations, enabling accurate approximation and generaliza…
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins
Natural Language Processing Large Language Models 🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long seque…
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
·2514 words·12 mins
AI Theory Interpretability 🏢 Max Planck Institute for Informatics
B-cosification: cheaply transform any pre-trained deep neural network into an inherently interpretable model.
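For context, a loose sketch of the B-cos transform that B-cosification builds on: a linear layer whose response is attenuated by the alignment term |cos(x, w)|^(B−1), so neurons must align with their input to produce large outputs. Details (norms, bias handling, the exact B) are assumptions here, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BcosLinear(nn.Module):
    """Hedged sketch of a B-cos layer: out = (w_hat^T x) * |cos(x, w)|^(B-1)."""
    def __init__(self, d_in, d_out, B=2.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in**0.5)
        self.B = B

    def forward(self, x):
        w_hat = F.normalize(self.weight, dim=1)           # unit-norm rows
        lin = x @ w_hat.t()                               # w_hat^T x
        cos = lin / (x.norm(dim=-1, keepdim=True) + 1e-9) # cosine alignment
        return lin * cos.abs() ** (self.B - 1)            # alignment pressure

print(BcosLinear(8, 4)(torch.randn(2, 8)).shape)          # torch.Size([2, 4])
```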
B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data
·1511 words·8 mins
Machine Learning Deep Learning 🏢 Chinese University of Hong Kong, Shenzhen
B-ary Tree Push-Pull (BTPP) achieves linear speedup for distributed learning on heterogeneous data, significantly outperforming state-of-the-art methods with minimal communication.
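A rough sketch of the communication pattern behind such tree-based schemes (illustrative only; the actual BTPP algorithm couples two B-ary trees with gradient tracking): values are pushed up a B-ary tree to the root and the aggregate is pulled back down, so each node talks to at most B+1 neighbors per round.

```python
import numpy as np

def bary_parent(i, b):
    # Parent of node i in an array-encoded B-ary tree rooted at node 0.
    return None if i == 0 else (i - 1) // b

def tree_push_pull(values, b=2):
    n = len(values)
    acc = values.astype(float).copy()
    for i in range(n - 1, 0, -1):         # push: children accumulate into parents
        acc[bary_parent(i, b)] += acc[i]
    return np.full(n, acc[0] / n)         # pull: broadcast the average back down

print(tree_push_pull(np.arange(8.0)))     # every node ends with the mean 3.5
```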
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
·2546 words·12 mins
Multimodal Learning Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AWT: a novel framework that boosts vision-language models’ zero-shot capabilities by augmenting inputs, weighting them dynamically, and leveraging optimal transport to enhance semantic correlations.
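As a generic illustration of the optimal-transport ingredient (not AWT's actual pipeline), a few Sinkhorn iterations suffice to match a set of augmented image views to a set of text prompts under a cost matrix:

```python
import torch

def sinkhorn(C, a, b, eps=0.1, iters=100):
    # Entropic OT: alternate scaling until the plan's marginals match a and b.
    K = torch.exp(-C / eps)                   # Gibbs kernel from cost matrix C
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]        # transport plan

C = torch.rand(4, 3)                          # e.g. 4 image views vs. 3 prompts
a, b = torch.full((4,), 0.25), torch.full((3,), 1 / 3)
P = sinkhorn(C, a, b)
print(P.sum(dim=1), P.sum(dim=0))             # marginals ≈ a and b
```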
Avoiding Undesired Future with Minimal Cost in Non-Stationary Environments
·2100 words·10 mins
Machine Learning Reinforcement Learning 🏢 National Key Laboratory for Novel Software Technology, Nanjing University, China
AUF-MICNS: A novel sequential method efficiently solves the avoiding undesired future problem by dynamically updating influence relations in non-stationary environments while minimizing action costs.
AverNet: All-in-one Video Restoration for Time-varying Unknown Degradations
·2558 words·13 mins
AI Generated Computer Vision Video Understanding 🏢 College of Computer Science, Sichuan University, China
AverNet: All-in-one video restoration that handles time-varying, unknown degradations.
Average gradient outer product as a mechanism for deep neural collapse
·2027 words·10 mins
AI Theory Optimization 🏢 UC San Diego
Deep Neural Collapse (DNC) explained via Average Gradient Outer Product (AGOP).
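The quantity in question has a compact definition: AGOP = (1/n) Σᵢ ∇f(xᵢ)∇f(xᵢ)ᵀ, whose top eigenvectors capture the input directions the model is most sensitive to. A minimal sketch for a scalar toy function (the function itself is hypothetical):

```python
import torch

def agop(f, X):
    # Average Gradient Outer Product: (1/n) * sum_i grad f(x_i) grad f(x_i)^T
    G = torch.zeros(X.shape[1], X.shape[1])
    for x in X:
        x = x.clone().requires_grad_(True)
        (g,) = torch.autograd.grad(f(x), x)
        G += torch.outer(g, g)
    return G / len(X)

f = lambda x: (x[0] * x[1]).tanh()            # hypothetical scalar model
print(agop(f, torch.randn(64, 3)))            # 3x3 positive semi-definite matrix
```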
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
AvaTaR: A novel automated framework that optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
·2151 words·11 mins
Multimodal Learning Audio-Visual Learning 🏢 University of Surrey, UK
AV-GS: A novel Audio-Visual Gaussian Splatting model that uses geometry- and material-aware priors to efficiently synthesize realistic binaural audio from a single audio source.
AV-Cloud: Spatial Audio Rendering Through Audio-Visual Cloud Splatting
·2151 words·11 mins
Multimodal Learning Audio-Visual Learning 🏢 University of Washington
AV-Cloud: Real-time, high-quality 3D spatial audio rendering synced with visuals, bypassing pre-rendered images for immersive virtual experiences.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
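The repurposing idea in a hedged nutshell: embed fixed-length series segments as "tokens", run them through a (normally frozen, pre-trained) decoder backbone, and decode the last hidden state into the next segment, autoregressively. The tiny encoder below is a stand-in for an LLM, not the paper's setup.

```python
import torch
import torch.nn as nn

SEG = 24                                     # points per segment/"token"
embed = nn.Linear(SEG, 256)                  # trainable input projection
backbone = nn.TransformerEncoder(            # stand-in for a frozen, causal LLM
    nn.TransformerEncoderLayer(256, 4, batch_first=True), 2)
head = nn.Linear(256, SEG)                   # trainable output projection

def forecast(series, horizon_segments):
    segs = series.reshape(1, -1, SEG)        # (batch, n_segments, SEG)
    for _ in range(horizon_segments):
        h = backbone(embed(segs))
        nxt = head(h[:, -1:])                # decode the next segment
        segs = torch.cat([segs, nxt], dim=1)
    return segs.flatten()

print(forecast(torch.sin(torch.linspace(0, 12, 96)), 2).shape)  # 96 + 48 points
```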
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method…
Autoregressive Policy Optimization for Constrained Allocation Tasks
·2331 words·11 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Munich Center for Machine Learning
PASPO: a novel autoregressive policy optimization method for constrained allocation tasks that guarantees constraint satisfaction and outperforms existing methods.
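Why autoregressive allocation can satisfy a budget constraint by construction, in toy form (the Beta "policy" is a hypothetical stand-in for a learned conditional distribution): each asset receives a fraction of whatever budget remains, so the total can never exceed the budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def autoregressive_allocation(n_assets, budget=1.0):
    alloc, remaining = np.zeros(n_assets), budget
    for i in range(n_assets - 1):
        frac = rng.beta(2.0, 2.0)             # policy proposes a fraction in (0, 1)
        alloc[i] = frac * remaining           # spend part of what is left
        remaining -= alloc[i]
    alloc[-1] = remaining                     # the last asset takes the remainder
    return alloc

a = autoregressive_allocation(5)
print(a, a.sum())                             # allocations sum to exactly 1.0
```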
Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI
·2433 words·12 mins
AI Applications Healthcare 🏢 University Medical Center Göttingen
Autoregressive Image Diffusion (AID) generates coherent MRI image sequences from undersampled data, outperforming standard diffusion models by exploiting inter-image dependencies.