Image Generation
ReFIR: Grounding Large Restoration Models with Retrieval Augmentation
·3091 words·15 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
ReFIR enhances Large Restoration Models’ accuracy by incorporating retrieved images as external knowledge, mitigating hallucination without retraining.
RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance
·4522 words·22 mins·
AI Generated
Computer Vision
Image Generation
🏢 Georgia Tech
RefDrop: A training-free method enhances image and video generation consistency by directly controlling the influence of reference features on the diffusion process, enabling precise manipulation of c…
ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration
·2816 words·14 mins·
Computer Vision
Image Generation
🏢 MediaTek
ReF-LDM uses reference images to improve the accuracy of face image restoration, achieving high-quality results faithful to the subject’s true appearance.
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
·2658 words·13 mins·
Computer Vision
Image Generation
🏢 Peking University
RectifID personalizes rectified-flow image generation by guiding it with off-the-shelf classifiers, preserving identity without extra training data.
Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model
·2463 words·12 mins·
Computer Vision
Image Generation
🏢 College of Computer Science and Technology, Tongji University
SRStitcher reworks image stitching by integrating fusion and rectangling into a unified inpainting model, which eliminates model training and improves performance and stability.
RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
·2585 words·13 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
RealCompo: A novel training-free framework that dynamically balances realism and compositionality in text-to-image generation, achieving state-of-the-art results.
Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network
·2135 words·11 mins·
Image Generation
🏢 Tsinghua University
CORUN-Colabator: a novel cooperative unfolding network and coherence-based label generator achieves state-of-the-art real-world image dehazing by effectively integrating physical knowledge and generat…
RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees
·2208 words·11 mins·
Computer Vision
Image Generation
🏢 University of Minnesota
RAW: A novel watermark framework ensures the authenticity of AI-generated images by embedding learnable watermarks directly into the image data, providing provable guarantees even under adversarial at…
PuLID: Pure and Lightning ID Customization via Contrastive Alignment
·3805 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 ByteDance Inc.
PuLID: Lightning-fast, tuning-free ID customization for text-to-image generation.
PTQ4DiT: Post-training Quantization for Diffusion Transformers
·2510 words·12 mins·
AI Generated
Computer Vision
Image Generation
🏢 University of Illinois Chicago
PTQ4DiT achieves 8-bit and even 4-bit weight precision for Diffusion Transformers, significantly improving efficiency for image generation without sacrificing quality.
Prune and Repaint: Content-Aware Image Retargeting for any Ratio
·2137 words·11 mins·
Computer Vision
Image Generation
🏢 Southeast University
Prune and Repaint: A new content-aware method for superior image retargeting across any aspect ratio, preserving key features and avoiding artifacts.
PromptFix: You Prompt and We Fix the Photo
·5744 words·27 mins·
AI Generated
Computer Vision
Image Generation
🏢 University of Rochester
PromptFix: a novel framework enables diffusion models to precisely follow instructions for diverse image processing tasks, using a new high-frequency guidance sampling method and an auxiliary prompt a…
Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models
·3455 words·17 mins·
Computer Vision
Image Generation
🏢 Xi'an Jiaotong University
Prompt-Agnostic Adversarial Perturbation (PAP) defends customized diffusion models against image tampering, achieving superior generalization over prompt-specific methods.
Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors
·2800 words·14 mins·
Computer Vision
Image Generation
🏢 Department of Computing and Mathematical Sciences, Caltech
Principled Probabilistic Imaging uses diffusion models as plug-and-play priors for accurate posterior sampling in inverse problems, surpassing existing methods.
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
·4348 words·21 mins·
Computer Vision
Image Generation
🏢 City University of Hong Kong
PrefPaint: Aligning image inpainting diffusion models with human preferences using reinforcement learning, resulting in significantly improved visual appeal.
Phased Consistency Models
·5013 words·24 mins·
Computer Vision
Image Generation
🏢 Hong Kong University of Science and Technology
Phased Consistency Models (PCMs) overcome limitations of Latent Consistency Models (LCMs), improving both speed and quality in image and video generation.
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
·2348 words·12 mins·
AI Generated
Computer Vision
Image Generation
🏢 ByteDance
PeRFlow accelerates diffusion models by straightening their sampling trajectories using a piecewise reflow operation, enabling fast and high-quality image generation with minimal computational cost.
PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher
·2966 words·14 mins·
Computer Vision
Image Generation
🏢 Stanford University
PaGoDA: Train high-resolution image generators efficiently by progressively growing a one-step generator from a low-resolution diffusion model. This innovative pipeline drastically cuts training cost…
Optical Diffusion Models for Image Generation
·1966 words·10 mins·
Computer Vision
Image Generation
🏢 Google Research
Researchers created an energy-efficient optical system for generating images using light propagation, drastically reducing the latency and energy consumption of diffusion models.
OneActor: Consistent Subject Generation via Cluster-Conditioned Guidance
·3168 words·15 mins·
Computer Vision
Image Generation
🏢 Xi'an Jiaotong University
OneActor: One-shot tuning for consistent subject image generation that bypasses laborious backbone tuning via semantic guidance, achieving a 4x speedup.