
🏢 University of Hong Kong

Goku: Flow Based Video Generative Foundation Models
·3430 words·17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Hong Kong
Goku is a family of joint image-and-video generation models built on rectified flow Transformers, achieving industry-leading performance with a robust data pipeline and training infrastructure.
Teaching Language Models to Critique via Reinforcement Learning
·4328 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Hong Kong
LLMs learn to critique and refine their output via reinforcement learning, significantly improving code generation.
GameFactory: Creating New Games with Generative Interactive Videos
·3286 words·16 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 University of Hong Kong
GameFactory uses AI to generate entirely new games within diverse, open-domain scenes by learning action controls from a small dataset and transferring them to pre-trained video models.
FashionComposer: Compositional Fashion Image Generation
·2265 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Hong Kong
FashionComposer generates fashion images through flexible composition of garments, faces, and poses.
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
·3117 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Hong Kong
UniReal is a universal framework for image generation and editing that unifies diverse tasks by learning real-world dynamics from video data, achieving highly realistic and versatile results.
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
·3555 words·17 mins
AI Generated 🤗 Daily Papers AI Applications Robotics 🏢 University of Hong Kong
Moto uses latent motion tokens as a bridging language for robot manipulation, achieving superior performance with limited data.
TEXGen: a Generative Diffusion Model for Mesh Textures
·3720 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Hong Kong
TEXGen is a generative diffusion model that creates high-resolution 3D mesh textures directly from text and image prompts, exceeding prior methods in quality and efficiency.
SAMPart3D: Segment Any Part in 3D Objects
·3136 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 University of Hong Kong
SAMPart3D performs zero-shot 3D part segmentation across granularities, scaling to large datasets and handling part ambiguity.