Posters

TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation

26 September 2024·2847 words·14 mins

Computer Vision Image Generation 🏢 Korea Advanced Institute of Science and Technology (KAIST)

Test-time Procrustes Calibration (TPC) boosts diffusion-based human image animation, ensuring high-quality outputs by aligning reference and target images, overcoming common compositional misalignmen…

Towards Unsupervised Model Selection for Domain Adaptive Object Detection

26 September 2024·1885 words·9 mins

Computer Vision Object Detection 🏢 University of Electronic Science and Technology of China

Unsupervised model selection for domain adaptive object detection is achieved via a new Detection Adaptation Score (DAS), effectively selecting optimal models without target labels by leveraging the f…

Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model

26 September 2024·3139 words·15 mins

Multimodal Learning Vision-Language Models 🏢 Renmin University of China

Stable Diffusion’s text-to-image generation is sped up by 25% by removing text guidance after the initial shape generation, revealing that the [EOS] token is key to early-stage image construction.

Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens

26 September 2024·2618 words·13 mins

Natural Language Processing Large Language Models 🏢 Renmin University of China

Transformers’ in-context learning (ICL) is explained through representation learning, which reveals the ICL process as gradient descent on a dual model and offers modifiable attention layers for enhanced …

Towards Understanding Extrapolation: a Causal Lens

26 September 2024·2076 words·10 mins

Machine Learning Transfer Learning 🏢 Carnegie Mellon University

This work unveils a causal lens on extrapolation, offering theoretical guarantees for accurate predictions on out-of-support data, even with limited target samples.

Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning

26 September 2024·2032 words·10 mins

AI Generated Machine Learning Reinforcement Learning 🏢 SYCAMORE, EPFL

This paper proposes a novel solution to the transferability problem in inverse reinforcement learning (IRL) using principal angles to measure the similarity between transition laws. It provides suffi…

Towards the Dynamics of a DNN Learning Symbolic Interactions

26 September 2024·1849 words·9 mins

AI Theory Interpretability 🏢 Shanghai Jiao Tong University

DNNs learn interactions in two phases: initially removing complex interactions, then gradually learning higher-order ones, leading to overfitting.

Towards Stable Representations for Protein Interface Prediction

26 September 2024·2364 words·12 mins

AI Generated Machine Learning Representation Learning 🏢 Hong Kong University of Science and Technology

ATProt: Adversarial training makes protein interface prediction robust to flexibility!

Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing

26 September 2024·3866 words·19 mins

AI Generated Multimodal Learning Vision-Language Models 🏢 Hong Kong Polytechnic University

Causal Representation Editing (CRE) improves safe image generation by precisely removing unsafe concepts from diffusion models, enhancing efficiency and flexibility.

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

26 September 2024·3583 words·17 mins

AI Generated Natural Language Processing Sentiment Analysis 🏢 School of Data Science, the Chinese University of Hong Kong, Shenzhen

The Language-dominated Noise-resistant Learning Network (LNLN), a robust multimodal sentiment analysis (MSA) model, handles incomplete data by correcting the dominant modality (language) and using a multimodal …

Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels

26 September 2024·2412 words·12 mins

Computer Vision Image Segmentation 🏢 KAIST

PixelCLIP: Open-vocabulary semantic segmentation without pixel-level labels! Leveraging unlabeled image masks from Vision Foundation Models and an online clustering algorithm, PixelCLIP achieves imp…

Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework

26 September 2024·2522 words·12 mins

AI Applications Hardware Design 🏢 University of Science and Technology of China

A novel regularized triangle-shaped neural network framework, T-Net, achieves highly accurate and scalable logic circuit generation, significantly outperforming existing methods.

Towards Neuron Attributions in Multi-Modal Large Language Models

26 September 2024·1551 words·8 mins

Natural Language Processing Large Language Models 🏢 University of Science and Technology of China

NAM: a novel neuron attribution method for MLLMs, revealing modality-specific semantic knowledge and enabling multi-modal knowledge editing.

Towards Multi-Domain Learning for Generalizable Video Anomaly Detection

26 September 2024·2936 words·14 mins

Computer Vision Video Understanding 🏢 Kyung Hee University

Researchers propose Multi-Domain learning for Video Anomaly Detection (MDVAD) to create generalizable models handling conflicting abnormality criteria across diverse datasets, improving accuracy and a…

Towards Multi-dimensional Explanation Alignment for Medical Classification

26 September 2024·2650 words·13 mins

AI Applications Healthcare 🏢 King Abdullah University of Science and Technology

Med-MICN: a novel end-to-end framework for medical image classification, achieving superior accuracy and multi-dimensional interpretability by aligning neural symbolic reasoning, concept semantics, an…

Towards Learning Group-Equivariant Features for Domain Adaptive 3D Detection

26 September 2024·1931 words·10 mins

Computer Vision 3D Vision 🏢 University of Oxford

GroupEXP-DA boosts domain adaptive 3D object detection by using a grouping-exploration strategy to reduce bias in pseudo-label collection and account for multiple factors affecting object perception i…

Towards Human-AI Complementarity with Prediction Sets

26 September 2024·1995 words·10 mins

AI Applications Human-AI Interaction 🏢 Fondazione Bruno Kessler & University of Trento

Greedy algorithms outperform conformal prediction in creating prediction sets that maximize human expert accuracy in classification tasks.

Towards Harmless Rawlsian Fairness Regardless of Demographic Prior

26 September 2024·3255 words·16 mins

AI Generated AI Theory Fairness 🏢 School of Computer Science and Engineering, Beihang University

VFair achieves harmless Rawlsian fairness in regression tasks without relying on sensitive demographic data by minimizing the variance of training losses.

Towards Global Optimal Visual In-Context Learning Prompt Selection

26 September 2024·2618 words·13 mins

AI Generated Computer Vision Image Segmentation 🏢 Fudan University

Partial2Global: A novel VICL framework achieving globally optimal prompt selection, significantly improving visual in-context learning across various tasks.

Towards Flexible Visual Relationship Segmentation

26 September 2024·3217 words·16 mins

AI Generated Computer Vision Image Segmentation 🏢 Microsoft Research

FleVRS: One unified model masters standard, promptable, and open-vocabulary visual relationship segmentation, outperforming existing methods.