
Posters

2024

TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation
·2847 words·14 mins
Computer Vision Image Generation 🏢 Korea Advanced Institute of Science and Technology (KAIST)
Boosting diffusion-based human image animation, Test-time Procrustes Calibration (TPC) ensures high-quality outputs by aligning reference and target images, overcoming common compositional misalignmen…
Towards Unsupervised Model Selection for Domain Adaptive Object Detection
·1885 words·9 mins
Computer Vision Object Detection 🏢 University of Electronic Science and Technology of China
Unsupervised model selection for domain adaptive object detection is achieved via a new Detection Adaptation Score (DAS), effectively selecting optimal models without target labels by leveraging the f…
Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model
·3139 words·15 mins
Multimodal Learning Vision-Language Models 🏢 Renmin University of China
Stable Diffusion’s text-to-image generation is sped up by 25% by removing text guidance after the initial shape generation, revealing that the [EOS] token is key to early-stage image construction.
Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens
·2618 words·13 mins
Natural Language Processing Large Language Models 🏢 Renmin University of China
Transformers’ in-context learning (ICL) is explained using representation learning, revealing the ICL process as gradient descent on a dual model and offering modifiable attention layers for enhanced …
Towards Understanding Extrapolation: a Causal Lens
·2076 words·10 mins
Machine Learning Transfer Learning 🏢 Carnegie Mellon University
This work unveils a causal lens on extrapolation, offering theoretical guarantees for accurate predictions on out-of-support data, even with limited target samples.
Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
·2032 words·10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 SYCAMORE, EPFL
This paper proposes a novel solution to the transferability problem in inverse reinforcement learning (IRL) using principal angles to measure the similarity between transition laws. It provides suffi…
Towards the Dynamics of a DNN Learning Symbolic Interactions
·1849 words·9 mins
AI Theory Interpretability 🏢 Shanghai Jiao Tong University
DNNs learn interactions in two phases: initially removing complex interactions, then gradually learning higher-order ones, leading to overfitting.
Towards Stable Representations for Protein Interface Prediction
·2364 words·12 mins
AI Generated Machine Learning Representation Learning 🏢 Hong Kong University of Science and Technology
ATProt: Adversarial training makes protein interface prediction robust to flexibility!
Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing
·3866 words·19 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong Polytechnic University
Causal Representation Editing (CRE) improves safe image generation by precisely removing unsafe concepts from diffusion models, enhancing efficiency and flexibility.
Towards Robust Multimodal Sentiment Analysis with Incomplete Data
·3583 words·17 mins
AI Generated Natural Language Processing Sentiment Analysis 🏢 School of Data Science, the Chinese University of Hong Kong, Shenzhen
A robust multimodal sentiment analysis (MSA) model, the Language-dominated Noise-resistant Learning Network (LNLN), handles incomplete data by correcting the dominant modality (language) and using a multimodal …
Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
·2412 words·12 mins
Computer Vision Image Segmentation 🏢 KAIST
PixelCLIP: Open-vocabulary semantic segmentation without pixel-level labels! Leveraging unlabeled image masks from Vision Foundation Models and an online clustering algorithm, PixelCLIP achieves imp…
Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework
·2522 words·12 mins
AI Applications Hardware Design 🏢 University of Science and Technology of China
A novel regularized triangle-shaped neural network framework, T-Net, achieves highly accurate and scalable logic circuit generation, significantly outperforming existing methods.
Towards Neuron Attributions in Multi-Modal Large Language Models
·1551 words·8 mins
Natural Language Processing Large Language Models 🏢 University of Science and Technology of China
NAM: a novel neuron attribution method for MLLMs, revealing modality-specific semantic knowledge and enabling multi-modal knowledge editing.
Towards Multi-Domain Learning for Generalizable Video Anomaly Detection
·2936 words·14 mins
Computer Vision Video Understanding 🏢 Kyung Hee University
Researchers propose Multi-Domain learning for Video Anomaly Detection (MDVAD) to create generalizable models handling conflicting abnormality criteria across diverse datasets, improving accuracy and a…
Towards Multi-dimensional Explanation Alignment for Medical Classification
·2650 words·13 mins
AI Applications Healthcare 🏢 King Abdullah University of Science and Technology
Med-MICN: a novel end-to-end framework for medical image classification, achieving superior accuracy and multi-dimensional interpretability by aligning neural symbolic reasoning, concept semantics, an…
Towards Learning Group-Equivariant Features for Domain Adaptive 3D Detection
·1931 words·10 mins
Computer Vision 3D Vision 🏢 University of Oxford
GroupEXP-DA boosts domain adaptive 3D object detection by using a grouping-exploration strategy to reduce bias in pseudo-label collection and account for multiple factors affecting object perception i…
Towards Human-AI Complementarity with Prediction Sets
·1995 words·10 mins
AI Applications Human-AI Interaction 🏢 Fondazione Bruno Kessler & University of Trento
Greedy algorithms outperform conformal prediction in creating prediction sets that maximize human expert accuracy in classification tasks.
Towards Harmless Rawlsian Fairness Regardless of Demographic Prior
·3255 words·16 mins
AI Generated AI Theory Fairness 🏢 School of Computer Science and Engineering, Beihang University
VFair achieves harmless Rawlsian fairness in regression tasks without relying on sensitive demographic data by minimizing the variance of training losses.
Towards Global Optimal Visual In-Context Learning Prompt Selection
·2618 words·13 mins
AI Generated Computer Vision Image Segmentation 🏢 Fudan University
Partial2Global: A novel VICL framework achieving globally optimal prompt selection, significantly improving visual in-context learning across various tasks.
Towards Flexible Visual Relationship Segmentation
·3217 words·16 mins
AI Generated Computer Vision Image Segmentation 🏢 Microsoft Research
FleVRS: One unified model masters standard, promptable, and open-vocabulary visual relationship segmentation, outperforming existing methods.