🏢 Snap Inc.

Slicing Vision Transformer for Flexible Inference

26 September 2024·2922 words·14 mins

Computer Vision Image Classification 🏢 Snap Inc.

Scala: One-shot training enables flexible ViT inference!

SF-V: Single Forward Video Generation Model

26 September 2024·1607 words·8 mins

Computer Vision Video Understanding 🏢 Snap Inc.

Researchers developed SF-V, a single-step image-to-video generation model, achieving a 23x speedup compared to existing models without sacrificing quality, paving the way for real-time video synthesis…

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

26 September 2024·5994 words·29 mins

AI Generated Computer Vision Image Generation 🏢 Snap Inc.

BitsFusion achieves 7.9x smaller Stable Diffusion models by quantizing UNet weights to 1.99 bits, surprisingly improving image generation quality!

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

26 September 2024·2478 words·12 mins

Computer Vision Image Generation 🏢 Snap Inc.

AsCAN, a novel hybrid architecture, achieves superior efficiency and performance in image recognition and generation by asymmetrically combining convolutional and transformer blocks.

4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models

26 September 2024·1721 words·9 mins

Computer Vision Video Understanding 🏢 Snap Inc.

4Real generates photorealistic 4D scenes from text prompts using video diffusion models, surpassing object-centric approaches in realism and efficiency.