Posters
2024
Unified Graph Augmentations for Generalized Contrastive Learning on Graphs
·2324 words·11 mins·
Machine Learning
Self-Supervised Learning
🏢 Hebei University of Technology
Unified Graph Augmentations (UGA) module boosts graph contrastive learning by unifying diverse augmentation strategies, improving model generalizability and efficiency.
Unified Generative and Discriminative Training for Multi-modal Large Language Models
·3972 words·19 mins·
Multimodal Learning
Vision-Language Models
🏢 Zhejiang University
Unified generative-discriminative training boosts multimodal large language models (MLLMs)! Sugar, a novel approach, leverages dynamic sequence alignment and a triple kernel to enhance global and fin…
Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
·3024 words·15 mins·
AI Generated
Computer Vision
3D Vision
🏢 Korea University
Unified Domain Generalization and Adaptation (UDGA) tackles 3D object detection’s domain adaptation challenges by leveraging multi-view overlap and label-efficient learning, achieving state-of-the-art…
Unified Covariate Adjustment for Causal Inference
·1452 words·7 mins·
AI Theory
Causality
🏢 Purdue University
Unified Covariate Adjustment (UCA) offers a scalable, doubly robust estimator for a wide array of causal estimands beyond standard methods.
UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior
·3219 words·16 mins·
AI Generated
Computer Vision
3D Vision
🏢 Xiamen University
UniDSeg uses Visual Foundation Models to create a unified framework for adaptable and generalizable cross-domain 3D semantic segmentation, achieving state-of-the-art results.
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
·2138 words·11 mins·
Natural Language Processing
Large Language Models
🏢 ETH Zurich
UniBias unveils and mitigates LLM bias by identifying and eliminating biased internal components (FFN vectors and attention heads), significantly improving in-context learning performance and robustne…
UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner
·2866 words·14 mins·
AI Generated
Multimodal Learning
Audio-Visual Learning
🏢 Tsinghua University
UniAudio 1.5 uses a novel LLM-driven audio codec to enable frozen LLMs to perform various audio tasks with just a few examples, opening new avenues for efficient few-shot cross-modal learning.
UniAR: A Unified model for predicting human Attention and Responses on visual content
·2440 words·12 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Google Research
UniAR: A unified model predicts human attention and preferences across diverse visual content (images, webpages, designs), achieving state-of-the-art performance and enabling human-centric improvement…
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
·2974 words·14 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Tsinghua University
Uni-Med, a novel unified medical foundation model, tackles multi-task learning challenges by using Connector-MoE to efficiently bridge modalities, achieving competitive performance across six medical …
Unelicitable Backdoors via Cryptographic Transformer Circuits
·1600 words·8 mins·
AI Theory
Safety
🏢 Contramont Research
Researchers unveil unelicitable backdoors in language models, using cryptographic transformer circuits, defying conventional detection methods and raising crucial AI safety concerns.
Understanding Visual Feature Reliance through the Lens of Complexity
·3993 words·19 mins·
AI Generated
Computer Vision
Image Classification
🏢 Google DeepMind
Deep learning models favor simple features, hindering generalization; this paper introduces a new feature complexity metric revealing a spectrum of simple-to-complex features, their learning dynamics,…
Understanding Transformers via N-Gram Statistics
·3310 words·16 mins·
Natural Language Processing
Large Language Models
🏢 Google DeepMind
LLMs’ inner workings remain elusive. This study uses N-gram statistics to approximate transformer predictions, revealing how LLMs learn from simple to complex statistical rules, and how model variance…
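To make the idea of approximating transformer predictions with N-gram statistics concrete, here is a minimal, hypothetical sketch in Python: it estimates a next-token distribution from raw bigram counts and compares it to a model's distribution via total variation distance. The toy corpus, the helper names, and the placeholder model distribution are assumptions for illustration, not the paper's code or data.

```python
# Sketch: compare an N-gram rule's next-token distribution with a model's.
# Illustrative only; corpus, helpers, and model_dist are assumed for this example.
from collections import Counter, defaultdict

def ngram_predictor(tokens, n=2):
    """Estimate P(next token | previous n-1 tokens) from raw counts."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context, nxt = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
        counts[context][nxt] += 1

    def predict(context):
        c = counts.get(tuple(context), Counter())
        total = sum(c.values())
        return {tok: freq / total for tok, freq in c.items()} if total else {}

    return predict

def total_variation(p, q):
    """Distance between two next-token distributions (dicts: token -> prob)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

# Toy usage: how far is a bigram rule from a (hypothetical) LLM's prediction?
corpus = "the cat sat on the mat the cat ate the fish".split()
predict = ngram_predictor(corpus, n=2)
ngram_dist = predict(["the"])                        # e.g. {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
model_dist = {"cat": 0.6, "mat": 0.2, "fish": 0.2}   # placeholder for an LLM's next-token probs
print(total_variation(ngram_dist, model_dist))
```

Per-context distances of this kind are one simple way to quantify where a transformer's predictions are well described by simple statistical rules and where they are not.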
Understanding Transformer Reasoning Capabilities via Graph Algorithms
·2280 words·11 mins·
Natural Language Processing
Question Answering
🏢 Google Research
Transformers excel at graph reasoning, with logarithmic depth proving necessary and sufficient for parallelizable tasks; single-layer transformers solve retrieval tasks efficiently.
Understanding the Role of Equivariance in Self-supervised Learning
·2016 words·10 mins·
AI Generated
Machine Learning
Self-Supervised Learning
🏢 MIT
The generalization ability of equivariant self-supervised learning (E-SSL) is rigorously analyzed through an information-theoretic lens, revealing key design principles for improved performance.
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
·2181 words·11 mins·
Multimodal Learning
Vision-Language Models
🏢 Princeton University
Vision-language models struggle with multi-object reasoning due to the binding problem; this paper reveals human-like capacity limits in VLMs and proposes solutions.
Understanding the Gains from Repeated Self-Distillation
·2009 words·10 mins·
Machine Learning
Optimization
🏢 University of Washington
Repeated self-distillation significantly reduces excess risk in linear regression, achieving up to a factor-of-d improvement over single-step self-distillation, where d is the input dimension.
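As a rough illustration of the mechanism named above, the sketch below applies self-distillation repeatedly to ridge regression on synthetic data: each round refits the model on the previous model's predictions instead of the noisy labels. The data-generating process, regularization strength, and number of rounds are assumptions made for this example, not the paper's setup.

```python
# Sketch: repeated self-distillation for ridge regression on synthetic data.
# Illustrative only; data, lambda, and round counts are assumed, not from the paper.
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def self_distill(X, y, lam, rounds):
    """Each round refits ridge regression on the previous model's predictions."""
    targets = y
    for _ in range(rounds):
        w = ridge_fit(X, targets, lam)
        targets = X @ w          # teacher predictions become the next round's targets
    return w

rng = np.random.default_rng(0)
n, d = 200, 20
w_star = rng.normal(size=d)                     # ground-truth parameter
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)       # noisy labels

for k in (1, 2, 5):
    w_k = self_distill(X, y, lam=1.0, rounds=k)
    print(k, np.linalg.norm(w_k - w_star) ** 2)  # parameter error after k distillation steps
```

Repeated distillation acts as an extra regularization knob on top of lambda; the paper's result concerns how much this can shrink excess risk relative to a single distillation step.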
Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective
·2537 words·12 mins·
Machine Learning
Deep Learning
🏢 University of Tokyo
A mean-field theory explains Fourier Neural Operator (FNO) behavior, linking expressivity to trainability by identifying ordered and chaotic phases that correspond to vanishing or exploding gradients,…
Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
·1911 words·9 mins·
AI Generated
AI Theory
Generalization
🏢 Peking University
This work systematically investigates the approximation properties of Transformer networks for sequence modeling, revealing the distinct roles of key components (self-attention, positional encoding, f…
Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
·1735 words·9 mins·
Natural Language Processing
Large Language Models
🏢 ETH Zurich
A unifying framework reveals hidden connections among attention, recurrent, and state-space models, boosting foundation model efficiency.
Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data
·1955 words·10 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Deep learning scaling laws are explained by novel approximation and estimation theories for transformers on low-dimensional data, resolving discrepancies between theory and practice.