Image Generation

ReFIR: Grounding Large Restoration Models with Retrieval Augmentation

26 September 2024·3091 words·15 mins

Computer Vision Image Generation 🏢 Tsinghua University

ReFIR enhances Large Restoration Models’ accuracy by incorporating retrieved images as external knowledge, mitigating hallucination without retraining.

RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance

26 September 2024·4522 words·22 mins

AI Generated Computer Vision Image Generation 🏢 Georgia Tech

RefDrop: A training-free method enhances image and video generation consistency by directly controlling the influence of reference features on the diffusion process, enabling precise manipulation of c…

ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration

26 September 2024·2816 words·14 mins

Computer Vision Image Generation 🏢 MediaTek

ReF-LDM uses reference images to improve the accuracy of face image restoration, achieving high-quality results faithful to the subject’s true appearance.

RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

26 September 2024·2658 words·13 mins

Computer Vision Image Generation 🏢 Peking University

RectifID personalizes image generation by cleverly guiding a diffusion model using off-the-shelf classifiers, achieving identity preservation without needing extra training data.

Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model

26 September 2024·2463 words·12 mins

Computer Vision Image Generation 🏢 College of Computer Science and Technology, Tongji University

SRStitcher revolutionizes image stitching by integrating fusion and rectangling into a unified inpainting model, eliminating model training and achieving superior performance and stability.

RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models

26 September 2024·2585 words·13 mins

Computer Vision Image Generation 🏢 Tsinghua University

RealCompo: A novel training-free framework dynamically balances realism and compositionality in text-to-image generation, achieving state-of-the-art results.

Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network

26 September 2024·2135 words·11 mins

Image Generation 🏢 Tsinghua University

CORUN-Colabator: a novel cooperative unfolding network and coherence-based label generator achieves state-of-the-art real-world image dehazing by effectively integrating physical knowledge and generat…

RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees

26 September 2024·2208 words·11 mins

Computer Vision Image Generation 🏢 University of Minnesota

RAW: A novel watermark framework ensures the authenticity of AI-generated images by embedding learnable watermarks directly into the image data, providing provable guarantees even under adversarial at…

PuLID: Pure and Lightning ID Customization via Contrastive Alignment

26 September 2024·3805 words·18 mins

AI Generated Computer Vision Image Generation 🏢 ByteDance Inc.

PuLID: Lightning-fast, tuning-free ID customization for text-to-image generation!

PTQ4DiT: Post-training Quantization for Diffusion Transformers

26 September 2024·2510 words·12 mins

AI Generated Computer Vision Image Generation 🏢 University of Illinois Chicago

PTQ4DiT achieves 8-bit and even 4-bit weight precision for Diffusion Transformers, significantly improving efficiency for image generation without sacrificing quality.

Prune and Repaint: Content-Aware Image Retargeting for any Ratio

26 September 2024·2137 words·11 mins

Computer Vision Image Generation 🏢 Southeast University

Prune and Repaint: A new content-aware method for superior image retargeting across any aspect ratio, preserving key features and avoiding artifacts.

PromptFix: You Prompt and We Fix the Photo

26 September 2024·5744 words·27 mins

AI Generated Computer Vision Image Generation 🏢 University of Rochester

PromptFix: a novel framework enables diffusion models to precisely follow instructions for diverse image processing tasks, using a new high-frequency guidance sampling method and an auxiliary prompt a…

Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models

26 September 2024·3455 words·17 mins

Computer Vision Image Generation 🏢 Xi'an Jiaotong University

Prompt-Agnostic Adversarial Perturbation (PAP) defends customized diffusion models against image tampering, achieving superior generalization over prompt-specific methods.

Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors

26 September 2024·2800 words·14 mins

Computer Vision Image Generation 🏢 Department of Computing and Mathematical Sciences, Caltech

Principled Probabilistic Imaging uses diffusion models as plug-and-play priors for accurate posterior sampling in inverse problems, surpassing existing methods.

PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference

26 September 2024·4348 words·21 mins

Computer Vision Image Generation 🏢 City University of Hong Kong

PrefPaint: Aligning image inpainting diffusion models with human preferences using reinforcement learning, resulting in significantly improved visual appeal.

Phased Consistency Models

26 September 2024·5013 words·24 mins

Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology

Phased Consistency Models (PCMs) revolutionize diffusion model generation by overcoming the limitations of Latent Consistency Models (LCMs), achieving superior speed and quality in image and video generation.

PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator

26 September 2024·2348 words·12 mins

AI Generated Computer Vision Image Generation 🏢 ByteDance

PeRFlow accelerates diffusion models by straightening their sampling trajectories using a piecewise reflow operation, enabling fast and high-quality image generation with minimal computational cost.

PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher

26 September 2024·2966 words·14 mins

Computer Vision Image Generation 🏢 Stanford University

PaGoDA: Train high-resolution image generators efficiently by progressively growing a one-step generator from a low-resolution diffusion model. This innovative pipeline drastically cuts training cost…

Optical Diffusion Models for Image Generation

26 September 2024·1966 words·10 mins

Computer Vision Image Generation 🏢 Google Research

Researchers created an energy-efficient optical system for generating images using light propagation, drastically reducing the latency and energy consumption of diffusion models.

OneActor: Consistent Subject Generation via Cluster-Conditioned Guidance

26 September 2024·3168 words·15 mins

Computer Vision Image Generation 🏢 Xi'an Jiaotong University

OneActor: One-shot tuning for consistent subject image generation that bypasses laborious backbone tuning via semantic guidance and achieves a 4x speedup.