Posters
2024
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins·
Natural Language Processing
Large Language Models
🏢 University of Oxford
BAM! efficiently upcycles pre-trained models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance with reduced computational costs.
Balancing Context Length and Mixing Times for Reinforcement Learning at Scale
·1724 words·9 mins·
Machine Learning
Reinforcement Learning
🏢 IBM Research
Longer context in RL boosts generalization but slows down learning; this paper reveals the crucial tradeoff and offers theoretical insights.
BAKU: An Efficient Transformer for Multi-Task Policy Learning
·4209 words·20 mins·
AI Applications
Robotics
🏢 New York University
BAKU: A simple transformer enables efficient multi-task robot policy learning, achieving 91% success on real-world tasks with limited data.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Chinese University of Hong Kong, Shenzhen
BAdam: A memory-efficient optimization method enabling full parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or superior performance.
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins·
Natural Language Processing
Large Language Models
🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety examples.
Back to the Continuous Attractor
·5636 words·27 mins·
AI Generated
AI Theory
Generalization
🏢 Champalimaud Centre for the Unknown
Despite their brittleness, continuous attractors remain functionally robust analog memory models due to persistent slow manifolds surviving bifurcations, enabling accurate approximation and generalization.
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins·
Natural Language Processing
Large Language Models
🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long sequences.
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
·2514 words·12 mins·
AI Theory
Interpretability
🏢 Max Planck Institute for Informatics
B-cosification: cheaply transform any pre-trained deep neural network into an inherently interpretable model.
B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data
·1511 words·8 mins·
Machine Learning
Deep Learning
🏢 Chinese University of Hong Kong, Shenzhen
B-ary Tree Push-Pull (BTPP) achieves linear speedup for distributed learning on heterogeneous data, significantly outperforming state-of-the-art methods with minimal communication.
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
·2546 words·12 mins·
Multimodal Learning
Vision-Language Models
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AWT: A novel framework boosts vision-language models’ zero-shot capabilities by augmenting inputs, weighting them dynamically, and leveraging optimal transport to enhance semantic correlations.
Avoiding Undesired Future with Minimal Cost in Non-Stationary Environments
·2100 words·10 mins·
Machine Learning
Reinforcement Learning
🏢 National Key Laboratory for Novel Software Technology, Nanjing University, China
AUF-MICNS: A novel sequential method efficiently solves the avoiding undesired future problem by dynamically updating influence relations in non-stationary environments while minimizing action costs.
AverNet: All-in-one Video Restoration for Time-varying Unknown Degradations
·2558 words·13 mins·
AI Generated
Computer Vision
Video Understanding
🏢 College of Computer Science, Sichuan University, China
AverNet: All-in-one video restoration that handles time-varying, unknown degradations.
Average gradient outer product as a mechanism for deep neural collapse
·2027 words·10 mins·
loading
·
loading
AI Theory
Optimization
🏢 UC San Diego
Deep Neural Collapse (DNC) explained via Average Gradient Outer Product (AGOP).
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
AvaTaR: A novel automated framework optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
·2151 words·11 mins·
Multimodal Learning
Audio-Visual Learning
🏢 University of Surrey, UK
AV-GS: A novel Audio-Visual Gaussian Splatting model uses geometry- and material-aware priors to efficiently synthesize realistic binaural audio from a single audio source.
AV-Cloud: Spatial Audio Rendering Through Audio-Visual Cloud Splatting
·2151 words·11 mins·
Multimodal Learning
Audio-Visual Learning
🏢 University of Washington
AV-Cloud: Real-time, high-quality 3D spatial audio rendering synced with visuals, bypassing pre-rendered images for immersive virtual experiences.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method.
Autoregressive Policy Optimization for Constrained Allocation Tasks
·2331 words·11 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Munich Center for Machine Learning
PASPO: a novel autoregressive policy optimization method for constrained allocation tasks guarantees constraint satisfaction and outperforms existing methods.
Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI
·2433 words·12 mins·
AI Applications
Healthcare
🏢 University Medical Center Göttingen
Autoregressive Image Diffusion (AID) generates coherent MRI image sequences from undersampled data, outperforming standard diffusion models by exploiting inter-image dependencies.