
Image Generation

TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation
·2847 words·14 mins
Computer Vision Image Generation 🏢 Korea Advanced Institute of Science and Technology (KAIST)
Test-time Procrustes Calibration (TPC) boosts diffusion-based human image animation, ensuring high-quality outputs by aligning reference and target images to overcome common compositional misalignmen…
TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge
·1979 words·10 mins
Computer Vision Image Generation 🏢 School of Integrated Circuits, Xidian University
TinyLUT achieves 10x lower memory consumption and superior accuracy in image restoration on edge devices using innovative separable mapping and dynamic discretization of LUTs.
Time-Varying LoRA: Towards Effective Cross-Domain Fine-Tuning of Diffusion Models
·3031 words·15 mins
Computer Vision Image Generation 🏢 Southern University of Science and Technology
Terra, a novel time-varying low-rank adapter, enables effective cross-domain fine-tuning of diffusion models by creating a continuous parameter manifold, facilitating efficient knowledge sharing and g…
The GAN is dead; long live the GAN! A Modern GAN Baseline
·3072 words·15 mins
Computer Vision Image Generation 🏢 Brown University
R3GAN, a minimalist GAN baseline, surpasses state-of-the-art models by using a novel regularized relativistic GAN loss and modern architectures, proving GANs can be trained efficiently without relying…
TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
·2216 words·11 mins
Image Generation 🏢 Institute of Information Engineering, Chinese Academy of Sciences
TextCtrl: a novel diffusion-based scene text editing method using prior guidance control, achieving superior style fidelity and accuracy with a new real-world benchmark dataset, ScenePair.
Taming Generative Diffusion Prior for Universal Blind Image Restoration
·4450 words·21 mins
AI Generated Computer Vision Image Generation 🏢 Fudan University
BIR-D tames generative diffusion models for universal blind image restoration, dynamically updating parameters to handle various complex degradations without assuming degradation model types.
Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs
·2142 words·11 mins
Computer Vision Image Generation 🏢 Advanced Micro Devices Inc.
DoSSR: A novel SR model boosts efficiency by 5-7x, achieving state-of-the-art performance with only 5 sampling steps by cleverly integrating a domain shift equation into pretrained diffusion models.
SyncTweedies: A General Generative Framework Based on Synchronized Diffusions
·4065 words·20 mins
AI Generated Computer Vision Image Generation 🏢 KAIST
SyncTweedies: a zero-shot diffusion synchronization framework generates diverse visual content (images, panoramas, 3D textures) by synchronizing multiple diffusion processes without fine-tuning, demon…
Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques
·3213 words·16 mins
Computer Vision Image Generation 🏢 Institute of Information Engineering, CAS
Boosting diffusion model features: This paper introduces GATE, a novel method to suppress ‘content shift’ in diffusion features, improving their quality via off-the-shelf generation techniques.
Stylus: Automatic Adapter Selection for Diffusion Models
·3022 words·15 mins
Image Generation 🏢 University of California, Berkeley
Stylus: an automatic adapter selection system for diffusion models, boosts image quality and diversity by intelligently composing task-specific adapters based on prompt keywords.
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
·2393 words·12 mins
Image Generation 🏢 Nankai University
StoryDiffusion enhances long-range image & video generation by introducing a simple yet effective self-attention mechanism and a semantic motion predictor, achieving high content consistency without t…
StepbaQ: Stepping backward as Correction for Quantized Diffusion Models
·2381 words·12 mins
AI Generated Computer Vision Image Generation 🏢 MediaTek
StepbaQ enhances quantized diffusion models by correcting accumulated quantization errors via a novel sampling step correction mechanism, significantly improving model accuracy without modifying exist…
Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation
·3451 words·17 mins
Computer Vision Image Generation 🏢 Munich Center for Machine Learning
Stable-Pose: precise human pose guidance for text-to-image synthesis via pose-guided transformers.
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
·2011 words·10 mins
Computer Vision Image Generation 🏢 School of Data Science, University of Science and Technology of China
DiGIT stabilizes image autoregressive models’ latent space using a novel discrete tokenizer from self-supervised learning, achieving state-of-the-art image generation.
Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics
·2499 words·12 mins
Computer Vision Image Generation 🏢 University of Oxford
D³GM, a novel score-based diffusion model, enhances stability & generalizability in solving inverse problems by leveraging measure-preserving dynamics, enabling robust image reconstruction across dive…
SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening
·2088 words·10 mins
Computer Vision Image Generation 🏢 University of Electronic Science and Technology of China
SSDiff: A novel spatial-spectral integrated diffusion model for superior remote sensing pansharpening.
SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams
·2631 words·13 mins
Image Generation 🏢 Peking University
SpikeReveal: Self-supervised learning unlocks sharp video sequences from blurry, real-world spike camera data, overcoming limitations of prior supervised approaches.
Spatio-Temporal Interactive Learning for Efficient Image Reconstruction of Spiking Cameras
·2395 words·12 mins
Computer Vision Image Generation 🏢 Peking University
STIR: A novel spatio-temporal network reconstructs high-quality images from spiking camera data by jointly refining motion and intensity information for efficient and accurate high-speed imaging.
Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
·2786 words·14 mins
Computer Vision Image Generation 🏢 University of Washington
Smoothed Energy Guidance (SEG) improves unconditional image generation by reducing self-attention’s energy curvature, leading to higher-quality outputs with fewer artifacts.
Slight Corruption in Pre-training Data Makes Better Diffusion Models
·4250 words·20 mins
Image Generation 🏢 Carnegie Mellon University
Slightly corrupting pre-training data significantly improves diffusion models’ image generation quality, diversity, and fidelity.