🏢 Meta

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation
4248 words · 20 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Meta
Scaling visual tokenizers dramatically improves image and video generation, achieving state-of-the-art results and outperforming existing methods with less computation by focusing on decoder scaling…
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
3304 words · 16 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Meta
Through-The-Mask uses mask-based motion trajectories to generate realistic videos from images and text, overcoming limitations of existing methods in handling complex multi-object motion.
VisualLens: Personalization through Visual History
2160 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering 🏢 Meta
VisualLens leverages a user's visual history for personalized recommendations, improving on the state of the art by 5-10% and exceeding GPT-4’s performance.