Image Generation

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
·2761 words·13 mins
AI Generated Computer Vision Image Generation 🏢 Mohamed Bin Zayed University of Artificial Intelligence
The Concept-Incremental Flexible Customization (CIFC) model tackles catastrophic forgetting and concept neglect when text-to-image diffusion models are continually adapted, enabling flexible personalization.
How Diffusion Models Learn to Factorize and Compose
·3926 words·19 mins
AI Generated Computer Vision Image Generation 🏢 MIT
Diffusion models surprisingly learn factorized representations, enabling compositional generalization, but struggle with interpolation; training with independent factors drastically improves data effi…
Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models
·2415 words·12 mins
Computer Vision Image Generation 🏢 Qualcomm AI Research
Hollowed Net efficiently personalizes text-to-image diffusion models on-device by temporarily removing deep U-Net layers during training, drastically reducing memory usage without sacrificing performa…
High-Resolution Image Harmonization with Adaptive-Interval Color Transformation
·3030 words·15 mins
Computer Vision Image Generation 🏢 Harbin Institute of Technology
AICT: Adaptive-Interval Color Transformation harmonizes high-resolution images by predicting pixel-wise color changes, adaptively adjusting sampling intervals to capture local variations, and using a …
Hierarchical Uncertainty Exploration via Feedforward Posterior Trees
·5486 words·26 mins
AI Generated Computer Vision Image Generation 🏢 Technion-Israel Institute of Technology
Visualizing high-dimensional posterior distributions is challenging. This paper introduces ‘Posterior Trees,’ a novel method using tree-structured neural network predictions for hierarchical uncertai…
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation
·3034 words·15 mins
Computer Vision Image Generation 🏢 360 AI Research
HiCo: Hierarchical Controllable Diffusion Model achieves superior layout-to-image generation by disentangling spatial layouts through a multi-branch network structure, resulting in high-quality images…
HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach
·2844 words·14 mins
Computer Vision Image Generation 🏢 HSE University
HairFastGAN achieves realistic and robust hairstyle transfer in near real-time using a novel encoder-based approach, significantly outperforming optimization-based methods.
HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion
·3966 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Shenzhen University
HairDiffusion uses latent diffusion models and a multi-stage blending technique to achieve vivid, multi-colored hair editing in images, preserving other facial features.
Guiding a Diffusion Model with a Bad Version of Itself
·1887 words·9 mins
Image Generation 🏢 NVIDIA
Boost image quality in diffusion models without reducing variation using Autoguidance: guide a high-quality model with a less-trained version of itself!
Gradient-free Decoder Inversion in Latent Diffusion Models
·2408 words·12 mins
Computer Vision Image Generation 🏢 Seoul National University
This paper introduces a novel gradient-free decoder inversion method for latent diffusion models, improving efficiency and memory usage compared to existing gradient-based methods. The method is theo…
Goal Conditioned Reinforcement Learning for Photo Finishing Tuning
·3405 words·16 mins
Computer Vision Image Generation 🏢 Shanghai AI Laboratory
This paper introduces a goal-conditioned reinforcement learning approach that efficiently tunes photo finishing pipelines, achieving high-quality results in fewer iterations than optimization-based me…
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping
·2403 words·12 mins
Computer Vision Image Generation 🏢 Sony AI
GenWarp generates high-quality novel image views from a single input image by using a semantic-preserving generative warping framework, outperforming existing methods.
Generating compositional scenes via Text-to-image RGBA Instance Generation
·4227 words·20 mins
AI Generated Computer Vision Image Generation 🏢 University of Edinburgh
This paper introduces a novel multi-stage generation framework for creating compositional scenes with fine-grained control by leveraging a trained diffusion model to produce individual scene component…
Generalizable and Animatable Gaussian Head Avatar
·3445 words·17 mins
AI Generated Computer Vision Image Generation 🏢 University of Tokyo
One-shot animatable head avatar reconstruction is achieved using a novel dual-lifting method that generates 3D Gaussians from a single image, enabling real-time expression control and rendering with s…
General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process
·2623 words·13 mins
Computer Vision Image Generation 🏢 Shanghai Jiao Tong University
Part-Aware Diffusion Model (PA-Diffusion) enables precise and efficient manipulation of articulated objects in real images by using abstract 3D models and dynamic feature maps, overcoming limitations …
FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images
·1953 words·10 mins
Image Generation 🏢 Shanghai Jiao Tong University
FuseAnyPart: Swap facial parts seamlessly using multiple reference images via diffusion, achieving high-fidelity results.
From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models
·3334 words·16 mins
Computer Vision Image Generation 🏢 Tsinghua University
Diffusion models, while excelling in image generation, are vulnerable to data poisoning. This paper demonstrates a BadNets-like attack’s effectiveness against diffusion models, causing image misalign…
FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space
·3551 words·17 mins
AI Generated Computer Vision Image Generation 🏢 University of Science and Technology of China
FreqMark: Robust invisible image watermarking via latent frequency space optimization, resisting regeneration attacks and achieving >90% bit accuracy with high image quality.
Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion
·2351 words·12 mins
Computer Vision Image Generation 🏢 University of Electronic Science and Technology of China
FeINFN, a novel Fourier-enhanced Implicit Neural Fusion Network, achieves state-of-the-art hyperspectral image fusion by combining spatial and frequency information in both the spatial an…
Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting
·3838 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology
This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.