🏢 University of Toronto

MMFactory: A Universal Solution Search Engine for Vision-Language Tasks

24 December 2024·2929 words·14 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 University of Toronto

MMFactory: A universal framework for vision-language tasks, offering diverse programmatic solutions based on user needs and constraints, outperforming existing methods.

Wonderland: Navigating 3D Scenes from a Single Image

16 December 2024·3153 words·15 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 University of Toronto

Wonderland: Generating wide-scope 3D scenes from a single image.

Mind the Time: Temporally-Controlled Multi-Event Video Generation

6 December 2024·4541 words·22 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 University of Toronto

MinT: Generating coherent videos with precisely timed, multiple events via temporal control, surpassing existing methods.

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

27 November 2024·2596 words·13 mins

AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 University of Toronto

AC3D achieves precise 3D camera control in video diffusion transformers by analyzing camera motion’s spectral properties, optimizing pose conditioning, and using a curated dataset of dynamic videos.

SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

7 November 2024·3777 words·18 mins

AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Toronto

SG-I2V: Zero-shot controllable image-to-video generation using a self-guided approach that leverages pre-trained models for precise object and camera motion control.

Minimum Entropy Coupling with Bottleneck

29 October 2024·2581 words·13 mins

AI Generated 🤗 Daily Papers AI Theory Optimization 🏢 University of Toronto

A new lossy compression framework handles reconstruction distribution divergence by integrating a bottleneck, extending minimum entropy coupling and offering guaranteed performance.