Image Generation

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?

26 September 2024·2761 words·13 mins

AI Generated Computer Vision Image Generation 🏢 Mohamed Bin Zayed University of Artificial Intelligence

The Concept-Incremental Flexible Customization (CIFC) model tackles catastrophic forgetting and concept neglect when continually adapting text-to-image diffusion models, enabling flexible personalization.

How Diffusion Models Learn to Factorize and Compose

26 September 2024·3926 words·19 mins

AI Generated Computer Vision Image Generation 🏢 MIT

Diffusion models surprisingly learn factorized representations, enabling compositional generalization, but struggle with interpolation; training with independent factors drastically improves data effi…

Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models

26 September 2024·2415 words·12 mins

Computer Vision Image Generation 🏢 Qualcomm AI Research

Hollowed Net efficiently personalizes text-to-image diffusion models on-device by temporarily removing deep U-Net layers during training, drastically reducing memory usage without sacrificing performa…

High-Resolution Image Harmonization with Adaptive-Interval Color Transformation

26 September 2024·3030 words·15 mins

Computer Vision Image Generation 🏢 Harbin Institute of Technology

AICT: Adaptive-Interval Color Transformation harmonizes high-resolution images by predicting pixel-wise color changes, adaptively adjusting sampling intervals to capture local variations, and using a …

Hierarchical Uncertainty Exploration via Feedforward Posterior Trees

26 September 2024·5486 words·26 mins

AI Generated Computer Vision Image Generation 🏢 Technion-Israel Institute of Technology

Visualizing high-dimensional posterior distributions is challenging. This paper introduces ‘Posterior Trees,’ a novel method using tree-structured neural network predictions for hierarchical uncertai…

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation

26 September 2024·3034 words·15 mins

Computer Vision Image Generation 🏢 360 AI Research

HiCo: Hierarchical Controllable Diffusion Model achieves superior layout-to-image generation by disentangling spatial layouts through a multi-branch network structure, resulting in high-quality images…

HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach

26 September 2024·2844 words·14 mins

Computer Vision Image Generation 🏢 HSE University

HairFastGAN achieves realistic and robust hairstyle transfer in near real-time using a novel encoder-based approach, significantly outperforming optimization-based methods.

HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion

26 September 2024·3966 words·19 mins

AI Generated Computer Vision Image Generation 🏢 Shenzhen University

HairDiffusion uses latent diffusion models and a multi-stage blending technique to achieve vivid, multi-colored hair editing in images, preserving other facial features.

Guiding a Diffusion Model with a Bad Version of Itself

26 September 2024·1887 words·9 mins

Image Generation 🏢 NVIDIA

Boost image quality in diffusion models without reducing variation using Autoguidance: guide a high-quality model with a less-trained version of itself!

Gradient-free Decoder Inversion in Latent Diffusion Models

26 September 2024·2408 words·12 mins

Computer Vision Image Generation 🏢 Seoul National University

This paper introduces a novel gradient-free decoder inversion method for latent diffusion models, improving efficiency and memory usage compared to existing gradient-based methods. The method is theo…

Goal Conditioned Reinforcement Learning for Photo Finishing Tuning

26 September 2024·3405 words·16 mins

Computer Vision Image Generation 🏢 Shanghai AI Laboratory

This paper introduces a goal-conditioned reinforcement learning approach that efficiently tunes photo finishing pipelines, achieving high-quality results in fewer iterations than optimization-based me…

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

26 September 2024·2403 words·12 mins

Computer Vision Image Generation 🏢 Sony AI

GenWarp generates high-quality novel image views from a single input image by using a semantic-preserving generative warping framework, outperforming existing methods.

Generating compositional scenes via Text-to-image RGBA Instance Generation

26 September 2024·4227 words·20 mins

AI Generated Computer Vision Image Generation 🏢 University of Edinburgh

This paper introduces a novel multi-stage generation framework for creating compositional scenes with fine-grained control by leveraging a trained diffusion model to produce individual scene component…

Generalizable and Animatable Gaussian Head Avatar

26 September 2024·3445 words·17 mins

AI Generated Computer Vision Image Generation 🏢 University of Tokyo

One-shot animatable head avatar reconstruction is achieved using a novel dual-lifting method that generates 3D Gaussians from a single image, enabling real-time expression control and rendering with s…

General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process

26 September 2024·2623 words·13 mins

Computer Vision Image Generation 🏢 Shanghai Jiao Tong University

Part-Aware Diffusion Model (PA-Diffusion) enables precise and efficient manipulation of articulated objects in real images by using abstract 3D models and dynamic feature maps, overcoming limitations …

FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images

26 September 2024·1953 words·10 mins

Image Generation 🏢 Shanghai Jiao Tong University

FuseAnyPart: Swap facial parts seamlessly using multiple reference images via diffusion, achieving high-fidelity results.

From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models

26 September 2024·3334 words·16 mins

Computer Vision Image Generation 🏢 Tsinghua University

Diffusion models, while excelling in image generation, are vulnerable to data poisoning. This paper demonstrates a BadNets-like attack’s effectiveness against diffusion models, causing image misalign…

FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space

26 September 2024·3551 words·17 mins

AI Generated Computer Vision Image Generation 🏢 University of Science and Technology of China

FreqMark: Robust invisible image watermarking via latent frequency space optimization, resisting regeneration attacks and achieving >90% bit accuracy with high image quality.

Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion

26 September 2024·2351 words·12 mins

Computer Vision Image Generation 🏢 University of Electronic Science and Technology of China

FeINFN, a novel Fourier-enhanced Implicit Neural Fusion Network, achieves state-of-the-art hyperspectral image fusion by innovatively combining spatial and frequency information in both the spatial an…

Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting

26 September 2024·3838 words·19 mins

AI Generated Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology

This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.