
🏢 Ningbo Institute of Digital Twin, Eastern Institute of Technology

Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models
·2293 words·11 mins
Representation Learning · Multimodal Learning
GEM is a framework that combines a bidirectional graph with multimodal large language models (MLLMs) to achieve fine-grained, relation-aware disentanglement in unsupervised representation learning, and it is reported to outperform existing methods.