🏢 Snap Inc.

Slicing Vision Transformer for Flexible Inference
·2922 words·14 mins
Computer Vision Image Classification 🏢 Snap Inc.
Scala: One-shot training enables flexible ViT inference!
SF-V: Single Forward Video Generation Model
·1607 words·8 mins
Computer Vision Video Understanding 🏢 Snap Inc.
Researchers developed SF-V, a single-step image-to-video generation model, achieving a 23x speedup compared to existing models without sacrificing quality, paving the way for real-time video synthesis…
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
·5994 words·29 mins
AI Generated Computer Vision Image Generation 🏢 Snap Inc.
BitsFusion achieves 7.9x smaller Stable Diffusion models by quantizing UNet weights to 1.99 bits, surprisingly improving image generation quality!
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
·2478 words·12 mins
Computer Vision Image Generation 🏢 Snap Inc.
AsCAN, a novel hybrid architecture, achieves superior efficiency and performance in image recognition and generation by asymmetrically combining convolutional and transformer blocks.
4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models
·1721 words·9 mins
Computer Vision Video Understanding 🏢 Snap Inc.
4Real generates photorealistic 4D scenes from text prompts using video diffusion models, surpassing object-centric approaches in realism and efficiency.