🏢 University of Western Australia
Referring Human Pose and Mask Estimation In the Wild
2191 words · 11 mins
Multimodal Learning
Vision-Language Models
The new RefHuman dataset and the UniPHD model deliver state-of-the-art referring human pose and mask estimation in the wild from text or positional prompts.