Posters
2024
Unified Graph Augmentations for Generalized Contrastive Learning on Graphs
·2324 words·11 mins·
Machine Learning
Self-Supervised Learning
🏢 Hebei University of Technology
Unified Graph Augmentations (UGA) module boosts graph contrastive learning by unifying diverse augmentation strategies, improving model generalizability and efficiency.
Unified Generative and Discriminative Training for Multi-modal Large Language Models
·3972 words·19 mins·
Multimodal Learning
Vision-Language Models
🏢 Zhejiang University
Unified generative-discriminative training boosts multimodal large language models (MLLMs)! Sugar, a novel approach, leverages dynamic sequence alignment and a triple kernel to enhance global and fin…
Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
·3024 words·15 mins·
AI Generated
Computer Vision
3D Vision
🏢 Korea University
Unified Domain Generalization and Adaptation (UDGA) tackles 3D object detection’s domain adaptation challenges by leveraging multi-view overlap and label-efficient learning, achieving state-of-the-art…
Unified Covariate Adjustment for Causal Inference
·1452 words·7 mins·
AI Theory
Causality
🏢 Purdue University
Unified Covariate Adjustment (UCA) offers a scalable, doubly robust estimator for a wide array of causal estimands beyond standard methods.
UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior
·3219 words·16 mins·
AI Generated
Computer Vision
3D Vision
🏢 Xiamen University
UniDSeg uses Visual Foundation Models to create a unified framework for adaptable and generalizable cross-domain 3D semantic segmentation, achieving state-of-the-art results.
UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
·2138 words·11 mins·
Natural Language Processing
Large Language Models
🏢 ETH Zurich
UniBias unveils and mitigates LLM bias by identifying and eliminating biased internal components (FFN vectors and attention heads), significantly improving in-context learning performance and robustne…
UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner
·2866 words·14 mins·
AI Generated
Multimodal Learning
Audio-Visual Learning
🏢 Tsinghua University
UniAudio 1.5 uses a novel LLM-driven audio codec to enable frozen LLMs to perform various audio tasks with just a few examples, opening new avenues for efficient few-shot cross-modal learning.
UniAR: A Unified model for predicting human Attention and Responses on visual content
·2440 words·12 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Google Research
UniAR: A unified model predicts human attention and preferences across diverse visual content (images, webpages, designs), achieving state-of-the-art performance and enabling human-centric improvement…
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
·2974 words·14 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Tsinghua University
Uni-Med, a novel unified medical foundation model, tackles multi-task learning challenges by using Connector-MoE to efficiently bridge modalities, achieving competitive performance across six medical …
Unelicitable Backdoors via Cryptographic Transformer Circuits
·1600 words·8 mins·
AI Theory
Safety
🏢 Contramont Research
Researchers unveil unelicitable backdoors in language models, using cryptographic transformer circuits, defying conventional detection methods and raising crucial AI safety concerns.
Understanding Visual Feature Reliance through the Lens of Complexity
·3993 words·19 mins·
AI Generated
Computer Vision
Image Classification
🏢 Google DeepMind
Deep learning models favor simple features, hindering generalization; this paper introduces a new feature complexity metric revealing a spectrum of simple-to-complex features, their learning dynamics,…
Understanding Transformers via N-Gram Statistics
·3310 words·16 mins·
Natural Language Processing
Large Language Models
🏢 Google DeepMind
LLMs’ inner workings remain elusive. This study uses N-gram statistics to approximate transformer predictions, revealing how LLMs learn from simple to complex statistical rules, and how model variance…
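To make the idea of approximating transformer predictions with N-gram statistics concrete, here is a minimal, hypothetical sketch in Python: it estimates a next-token distribution from raw bigram counts and compares it to a model's distribution via total variation distance. The toy corpus, the helper names, and the placeholder model distribution are assumptions for illustration, not the paper's code or data.

```python
# Sketch: compare an N-gram rule's next-token distribution with a model's.
# Illustrative only; corpus, helpers, and model_dist are assumed for this example.
from collections import Counter, defaultdict

def ngram_predictor(tokens, n=2):
    """Estimate P(next token | previous n-1 tokens) from raw counts."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context, nxt = tuple(tokens[i:i + n - 1]), tokens[i + n - 1]
        counts[context][nxt] += 1

    def predict(context):
        c = counts.get(tuple(context), Counter())
        total = sum(c.values())
        return {tok: freq / total for tok, freq in c.items()} if total else {}

    return predict

def total_variation(p, q):
    """Distance between two next-token distributions (dicts: token -> prob)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

# Toy usage: how far is a bigram rule from a (hypothetical) LLM's prediction?
corpus = "the cat sat on the mat the cat ate the fish".split()
predict = ngram_predictor(corpus, n=2)
ngram_dist = predict(["the"])                        # e.g. {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
model_dist = {"cat": 0.6, "mat": 0.2, "fish": 0.2}   # placeholder for an LLM's next-token probs
print(total_variation(ngram_dist, model_dist))
```

Per-context distances of this kind are one simple way to quantify where a transformer's predictions are well described by simple statistical rules and where they are not.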
Understanding Transformer Reasoning Capabilities via Graph Algorithms
·2280 words·11 mins·
Natural Language Processing
Question Answering
🏢 Google Research
Transformers excel at graph reasoning, with logarithmic depth proving necessary and sufficient for parallelizable tasks; single-layer transformers solve retrieval tasks efficiently.
Understanding the Role of Equivariance in Self-supervised Learning
·2016 words·10 mins·
AI Generated
Machine Learning
Self-Supervised Learning
🏢 MIT
The generalization ability of equivariant self-supervised learning (E-SSL) is rigorously analyzed through an information-theoretic lens, revealing key design principles for improved performance.
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
·2181 words·11 mins·
Multimodal Learning
Vision-Language Models
🏢 Princeton University
Vision-language models struggle with multi-object reasoning due to the binding problem; this paper reveals human-like capacity limits in VLMs and proposes solutions.
Understanding the Gains from Repeated Self-Distillation
·2009 words·10 mins·
Machine Learning
Optimization
🏢 University of Washington
Repeated self-distillation significantly reduces excess risk in linear regression, achieving up to a factor-of-d improvement over single-step self-distillation, where d is the input dimension.
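As a rough illustration of the mechanism named above, the sketch below applies self-distillation repeatedly to ridge regression on synthetic data: each round refits the model on the previous model's predictions instead of the noisy labels. The data-generating process, regularization strength, and number of rounds are assumptions made for this example, not the paper's setup.

```python
# Sketch: repeated self-distillation for ridge regression on synthetic data.
# Illustrative only; data, lambda, and round counts are assumed, not from the paper.
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def self_distill(X, y, lam, rounds):
    """Each round refits ridge regression on the previous model's predictions."""
    targets = y
    for _ in range(rounds):
        w = ridge_fit(X, targets, lam)
        targets = X @ w          # teacher predictions become the next round's targets
    return w

rng = np.random.default_rng(0)
n, d = 200, 20
w_star = rng.normal(size=d)                     # ground-truth parameter
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)       # noisy labels

for k in (1, 2, 5):
    w_k = self_distill(X, y, lam=1.0, rounds=k)
    print(k, np.linalg.norm(w_k - w_star) ** 2)  # parameter error after k distillation steps
```

Repeated distillation acts as an extra regularization knob on top of lambda; the paper's result concerns how much this can shrink excess risk relative to a single distillation step.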
Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective
·2537 words·12 mins·
Machine Learning
Deep Learning
🏢 University of Tokyo
A mean-field theory explains Fourier Neural Operator (FNO) behavior, linking expressivity to trainability by identifying ordered and chaotic phases that correspond to vanishing or exploding gradients,…
Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
·1911 words·9 mins·
AI Generated
AI Theory
Generalization
🏢 Peking University
This work systematically investigates the approximation properties of Transformer networks for sequence modeling, revealing the distinct roles of key components (self-attention, positional encoding, f…
Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
·1735 words·9 mins·
Natural Language Processing
Large Language Models
🏢 ETH Zurich
A unifying framework reveals hidden connections among attention, recurrent, and state-space models, boosting foundation model efficiency.
Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data
·1955 words·10 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Deep learning scaling laws are explained by novel approximation and estimation theories for transformers on low-dimensional data, resolving discrepancies between theory and practice.