🏢 University of Hong Kong
Goku: Flow Based Video Generative Foundation Models
·3430 words·17 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Hong Kong
Goku is a family of joint image-and-video generation models built on rectified flow Transformers, achieving industry-leading performance with a robust data pipeline and training infrastructure.
Teaching Language Models to Critique via Reinforcement Learning
·4328 words·21 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Hong Kong
LLMs learn to critique and refine their output via reinforcement learning, significantly improving code generation.
GameFactory: Creating New Games with Generative Interactive Videos
·3286 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 University of Hong Kong
GameFactory uses AI to generate entirely new games within diverse, open-domain scenes by learning action controls from a small dataset and transferring them to pre-trained video models.
FashionComposer: Compositional Fashion Image Generation
·2265 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Hong Kong
FashionComposer generates fashion images through flexible composition of garments, faces, and poses.
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics
·3117 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Hong Kong
UniReal is a universal framework for image generation and editing that unifies diverse tasks by learning real-world dynamics from video data, producing highly realistic and versatile results.
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
·3555 words·17 mins·
AI Generated
🤗 Daily Papers
AI Applications
Robotics
🏢 University of Hong Kong
Moto uses latent motion tokens as a bridging language for robot manipulation, achieving superior performance with limited data.
TEXGen: a Generative Diffusion Model for Mesh Textures
·3720 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Hong Kong
TEXGen is a generative diffusion model that creates high-resolution 3D mesh textures directly from text and image prompts, exceeding prior methods in quality and efficiency.
SAMPart3D: Segment Any Part in 3D Objects
·3136 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Hong Kong
SAMPart3D performs zero-shot 3D part segmentation across granularities, scales to large datasets, and handles part ambiguity.