
Posters

2024

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
BAM efficiently upcycles pre-trained dense models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance at reduced computational cost.
Balancing Context Length and Mixing Times for Reinforcement Learning at Scale
·1724 words·9 mins
Machine Learning Reinforcement Learning 🏢 IBM Research
Longer context in RL boosts generalization but slows down learning; this paper reveals the crucial tradeoff and offers theoretical insights.
BAKU: An Efficient Transformer for Multi-Task Policy Learning
·4209 words·20 mins
AI Applications Robotics 🏢 New York University
BAKU: A simple transformer enables efficient multi-task robot policy learning, achieving 91% success on real-world tasks with limited data.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong, Shenzhen
BAdam: a memory-efficient optimization method enabling full-parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or s…
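To make the block coordinate descent idea concrete, here is a minimal, hypothetical sketch (not the authors' code): only one parameter block is trainable at a time, so Adam's optimizer state only ever exists for that block.

```python
import torch
import torch.nn as nn

# Toy illustration of block coordinate descent with Adam's update rule:
# blocks are visited cyclically, and optimizer state is held only for the
# active block, which is where the memory savings come from.
model = nn.Sequential(nn.Linear(16, 16), nn.Linear(16, 16), nn.Linear(16, 1))
blocks = [list(layer.parameters()) for layer in model]
x, y = torch.randn(32, 16), torch.randn(32, 1)

for epoch in range(3):
    for block in blocks:                        # cycle through parameter blocks
        for p in model.parameters():
            p.requires_grad_(False)             # freeze everything ...
        for p in block:
            p.requires_grad_(True)              # ... except the active block
        opt = torch.optim.Adam(block, lr=1e-3)  # Adam state for this block only
        for _ in range(10):                     # a few inner steps per block
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()
```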
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety exam…
Back to the Continuous Attractor
·5636 words·27 mins
AI Generated AI Theory Generalization 🏢 Champalimaud Centre for the Unknown
Despite their brittleness, continuous attractors remain functionally robust analog memory models because persistent slow manifolds survive bifurcations, enabling accurate approximation and generaliza…
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins
Natural Language Processing Large Language Models 🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long seque…
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
·2514 words·12 mins
AI Theory Interpretability 🏢 Max Planck Institute for Informatics
B-cosification: cheaply transform any pre-trained deep neural network into an inherently interpretable model.
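For context, a loose sketch of the B-cos transform that B-cosification builds on: a linear layer whose response is attenuated by the alignment term |cos(x, w)|^(B−1), so neurons must align with their input to produce large outputs. Details (norms, bias handling, the exact B) are assumptions here, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BcosLinear(nn.Module):
    """Hedged sketch of a B-cos layer: out = (w_hat^T x) * |cos(x, w)|^(B-1)."""
    def __init__(self, d_in, d_out, B=2.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in**0.5)
        self.B = B

    def forward(self, x):
        w_hat = F.normalize(self.weight, dim=1)           # unit-norm rows
        lin = x @ w_hat.t()                               # w_hat^T x
        cos = lin / (x.norm(dim=-1, keepdim=True) + 1e-9) # cosine alignment
        return lin * cos.abs() ** (self.B - 1)            # alignment pressure

print(BcosLinear(8, 4)(torch.randn(2, 8)).shape)          # torch.Size([2, 4])
```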
B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data
·1511 words·8 mins
Machine Learning Deep Learning 🏢 Chinese University of Hong Kong, Shenzhen
B-ary Tree Push-Pull (BTPP) achieves linear speedup for distributed learning on heterogeneous data, significantly outperforming state-of-the-art methods with minimal communication.
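A rough sketch of the communication pattern behind such tree-based schemes (illustrative only; the actual BTPP algorithm couples two B-ary trees with gradient tracking): values are pushed up a B-ary tree to the root and the aggregate is pulled back down, so each node talks to at most B+1 neighbors per round.

```python
import numpy as np

def bary_parent(i, b):
    # Parent of node i in an array-encoded B-ary tree rooted at node 0.
    return None if i == 0 else (i - 1) // b

def tree_push_pull(values, b=2):
    n = len(values)
    acc = values.astype(float).copy()
    for i in range(n - 1, 0, -1):         # push: children accumulate into parents
        acc[bary_parent(i, b)] += acc[i]
    return np.full(n, acc[0] / n)         # pull: broadcast the average back down

print(tree_push_pull(np.arange(8.0)))     # every node ends with the mean 3.5
```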
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
·2546 words·12 mins
Multimodal Learning Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AWT: a novel framework that boosts vision-language models’ zero-shot capabilities by augmenting inputs, weighting them dynamically, and leveraging optimal transport to enhance semantic correlations.
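As a generic illustration of the optimal-transport ingredient (not AWT's actual pipeline), a few Sinkhorn iterations suffice to match a set of augmented image views to a set of text prompts under a cost matrix:

```python
import torch

def sinkhorn(C, a, b, eps=0.1, iters=100):
    # Entropic OT: alternate scaling until the plan's marginals match a and b.
    K = torch.exp(-C / eps)                   # Gibbs kernel from cost matrix C
    u = torch.ones_like(a)
    for _ in range(iters):
        v = b / (K.t() @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]        # transport plan

C = torch.rand(4, 3)                          # e.g. 4 image views vs. 3 prompts
a, b = torch.full((4,), 0.25), torch.full((3,), 1 / 3)
P = sinkhorn(C, a, b)
print(P.sum(dim=1), P.sum(dim=0))             # marginals ≈ a and b
```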
Avoiding Undesired Future with Minimal Cost in Non-Stationary Environments
·2100 words·10 mins
Machine Learning Reinforcement Learning 🏢 National Key Laboratory for Novel Software Technology, Nanjing University, China
AUF-MICNS: A novel sequential method efficiently solves the avoiding undesired future problem by dynamically updating influence relations in non-stationary environments while minimizing action costs.
AverNet: All-in-one Video Restoration for Time-varying Unknown Degradations
·2558 words·13 mins
AI Generated Computer Vision Video Understanding 🏢 College of Computer Science, Sichuan University, China
AverNet: All-in-one video restoration that handles time-varying, unknown degradations.
Average gradient outer product as a mechanism for deep neural collapse
·2027 words·10 mins
AI Theory Optimization 🏢 UC San Diego
Deep Neural Collapse (DNC) explained via Average Gradient Outer Product (AGOP).
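The quantity in question has a compact definition: AGOP = (1/n) Σᵢ ∇f(xᵢ)∇f(xᵢ)ᵀ, whose top eigenvectors capture the input directions the model is most sensitive to. A minimal sketch for a scalar toy function (the function itself is hypothetical):

```python
import torch

def agop(f, X):
    # Average Gradient Outer Product: (1/n) * sum_i grad f(x_i) grad f(x_i)^T
    G = torch.zeros(X.shape[1], X.shape[1])
    for x in X:
        x = x.clone().requires_grad_(True)
        (g,) = torch.autograd.grad(f(x), x)
        G += torch.outer(g, g)
    return G / len(X)

f = lambda x: (x[0] * x[1]).tanh()            # hypothetical scalar model
print(agop(f, torch.randn(64, 3)))            # 3x3 positive semi-definite matrix
```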
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
AvaTaR: A novel automated framework that optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
·2151 words·11 mins
Multimodal Learning Audio-Visual Learning 🏢 University of Surrey, UK
AV-GS: A novel Audio-Visual Gaussian Splatting model that uses geometry- and material-aware priors to efficiently synthesize realistic binaural audio from a single audio source.
AV-Cloud: Spatial Audio Rendering Through Audio-Visual Cloud Splatting
·2151 words·11 mins
Multimodal Learning Audio-Visual Learning 🏢 University of Washington
AV-Cloud: Real-time, high-quality 3D spatial audio rendering synced with visuals, bypassing pre-rendered images for immersive virtual experiences.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
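The repurposing idea in a hedged nutshell: embed fixed-length series segments as "tokens", run them through a (normally frozen, pre-trained) decoder backbone, and decode the last hidden state into the next segment, autoregressively. The tiny encoder below is a stand-in for an LLM, not the paper's setup.

```python
import torch
import torch.nn as nn

SEG = 24                                     # points per segment/"token"
embed = nn.Linear(SEG, 256)                  # trainable input projection
backbone = nn.TransformerEncoder(            # stand-in for a frozen, causal LLM
    nn.TransformerEncoderLayer(256, 4, batch_first=True), 2)
head = nn.Linear(256, SEG)                   # trainable output projection

def forecast(series, horizon_segments):
    segs = series.reshape(1, -1, SEG)        # (batch, n_segments, SEG)
    for _ in range(horizon_segments):
        h = backbone(embed(segs))
        nxt = head(h[:, -1:])                # decode the next segment
        segs = torch.cat([segs, nxt], dim=1)
    return segs.flatten()

print(forecast(torch.sin(torch.linspace(0, 12, 96)), 2).shape)  # 96 + 48 points
```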
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method…
Autoregressive Policy Optimization for Constrained Allocation Tasks
·2331 words·11 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Munich Center for Machine Learning
PASPO: a novel autoregressive policy optimization method for constrained allocation tasks that guarantees constraint satisfaction and outperforms existing methods.
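Why autoregressive allocation can satisfy a budget constraint by construction, in toy form (the Beta "policy" is a hypothetical stand-in for a learned conditional distribution): each asset receives a fraction of whatever budget remains, so the total can never exceed the budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def autoregressive_allocation(n_assets, budget=1.0):
    alloc, remaining = np.zeros(n_assets), budget
    for i in range(n_assets - 1):
        frac = rng.beta(2.0, 2.0)             # policy proposes a fraction in (0, 1)
        alloc[i] = frac * remaining           # spend part of what is left
        remaining -= alloc[i]
    alloc[-1] = remaining                     # the last asset takes the remainder
    return alloc

a = autoregressive_allocation(5)
print(a, a.sum())                             # allocations sum to exactly 1.0
```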
Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI
·2433 words·12 mins
AI Applications Healthcare 🏢 University Medical Center Göttingen
Autoregressive Image Diffusion (AID) generates coherent MRI image sequences from undersampled data, outperforming standard diffusion models by exploiting inter-image dependencies.