🏢 Ningbo Institute of Digital Twin, Eastern Institute of Technology
Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models
Representation Learning
Multimodal Learning
GEM is a framework that combines a bidirectional graph with multimodal large language models (MLLMs) to achieve fine-grained, relation-aware disentanglement in unsupervised representation learning, outperforming existing methods.