🏢 University of Western Australia

Referring Human Pose and Mask Estimation In the Wild

26 September 2024 · 2191 words · 11 mins

Multimodal Learning · Vision-Language Models

This work introduces RefHuman, a new dataset, and UniPHD, a model that achieves state-of-the-art referring human pose and mask estimation in the wild from text or positional prompts.