Image Generation

Single Image Reflection Separation via Dual-Stream Interactive Transformers
·2158 words·11 mins
Computer Vision Image Generation 🏒 College of Intelligence and Computing, Tianjin University
Dual-Stream Interactive Transformers (DSIT) revolutionizes single image reflection separation by using a novel dual-attention mechanism that captures inter- and intra-layer correlations, significantly…
Simple and Fast Distillation of Diffusion Models
·3151 words·15 mins
Computer Vision Image Generation 🏒 Zhejiang University
Simple and Fast Distillation (SFD) shortens the fine-tuning time of diffusion model distillation by up to 1000x, achieving state-of-the-art results in few-step image generation with minimal fine-tuning.
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling
·2221 words·11 mins
Computer Vision Image Generation 🏒 Tsinghua University
ShowMaker: Generating high-fidelity 2D human conversational videos using fine-grained diffusion modeling and 2D key points.
SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
·3049 words·15 mins
AI Generated Computer Vision Image Generation 🏒 DAMO Academy, Alibaba Group
SHMT: Self-supervised Hierarchical Makeup Transfer uses latent diffusion models to realistically and precisely apply diverse makeup styles to faces, even without paired training data, achieving high f…
SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
·2658 words·13 mins
AI Generated Computer Vision Image Generation 🏒 Peking University
SemFlow: A unified framework uses rectified flow to seamlessly bridge semantic segmentation and image synthesis, achieving competitive results and offering reversible image-mask transformations.
Self-Play Fine-tuning of Diffusion Models for Text-to-image Generation
·4025 words·19 mins
AI Generated Computer Vision Image Generation 🏒 UC Los Angeles
Self-Play Fine-Tuning (SPIN-Diffusion) revolutionizes diffusion model training, achieving superior text-to-image results with less data via iterative self-improvement, surpassing supervised and RLHF m…
Score Distillation via Reparametrized DDIM
·4128 words·20 mins
Computer Vision Image Generation 🏒 MIT
Researchers improved 3D shape generation from 2D diffusion models by showing that existing Score Distillation Sampling is a reparameterized version of DDIM and fixing its high-variance noise issue via…
Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
·3730 words·18 mins
Computer Vision Image Generation 🏒 State Grid Corporation of China
Logistic Schedule: A novel noise schedule revolutionizes image editing by improving DDIM inversion, enhancing content preservation and edit fidelity without model retraining!
Scene Graph Disentanglement and Composition for Generalizable Complex Image Generation
·2541 words·12 mins
Image Generation 🏒 Shanghai Jiao Tong University
DisCo: a novel framework for generalizable complex image generation using scene graph disentanglement and composition, achieving superior performance over existing methods.
Scaling the Codebook Size of VQ-GAN to 100,000 with a Utilization Rate of 99%
·2947 words·14 mins
Computer Vision Image Generation 🏒 Microsoft Research
VQGAN-LC massively scales VQGAN’s codebook to 100,000 entries while maintaining a 99% utilization rate, significantly boosting image generation and downstream task performance.
ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization
·2551 words·12 mins
Computer Vision Image Generation 🏒 Wuhan University
ROBIN: A novel watermarking method for diffusion models that actively conceals robust watermarks using adversarial optimization, enabling strong, imperceptible, and verifiable image authentication.
Return of Unconditional Generation: A Self-supervised Representation Generation Method
·2725 words·13 mins
Image Generation 🏒 Massachusetts Institute of Technology
Revolutionizing image generation, Representation-Conditioned Generation (RCG) achieves state-of-the-art results in unconditional image synthesis by leveraging self-supervised representations to condit…
Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
·3245 words·16 mins
AI Generated Computer Vision Image Generation 🏒 Amazon Web Services Shanghai AI Lab
This paper presents a novel regional cross-attention module for rich-context layout-to-image generation, significantly improving image accuracy while addressing limitations of existing methods. Two n…
Rethinking Score Distillation as a Bridge Between Image Distributions
·2251 words·11 mins
Computer Vision Image Generation 🏒 UC Berkeley
Researchers enhanced image generation by improving score distillation sampling via a novel Schrödinger Bridge framework, improving realism without computational overhead.
Rethinking No-reference Image Exposure Assessment from Holism to Pixel: Models, Datasets and Benchmarks
·2343 words·11 mins
Computer Vision Image Generation 🏒 Beijing University of Posts and Telecommunications
Revolutionizing image exposure assessment, Pixel-level IEA Network (P-IEANet) achieves state-of-the-art performance with a novel pixel-level approach, a new dataset (IEA40K), and a benchmark of 19 met…
Rethinking Imbalance in Image Super-Resolution for Efficient Inference
·2134 words·11 mins
Computer Vision Image Generation 🏒 Harbin Institute of Technology
WBSR: A novel framework for efficient image super-resolution that tackles data and model imbalances, delivering superior performance with approximately 34% lower computational cost.
RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models
·2570 words·13 mins
Computer Vision Image Generation 🏒 Hong Kong University of Science and Technology
RestoreAgent, an AI-powered image restoration agent, autonomously identifies and corrects multiple image degradations, exceeding human expert performance.
Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise
·2678 words·13 mins
Computer Vision Image Generation 🏒 College of Computer Science, Nankai University
Resfusion, a novel framework, accelerates image restoration by integrating residual noise into the diffusion process, achieving superior results with fewer steps.
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
·3815 words·18 mins
AI Generated Computer Vision Image Generation 🏒 Technical University of Munich
ReNO: Boost one-step text-to-image models by cleverly optimizing initial noise using reward signals, achieving state-of-the-art results efficiently.
Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising
·2025 words·10 mins
Computer Vision Image Generation 🏒 National University of Singapore
Remix-DiT: Boosting diffusion model image generation quality by cleverly mixing smaller basis models into numerous specialized denoisers, improving efficiency and lowering costs!