
Image Generation

One-Step Effective Diffusion Network for Real-World Image Super-Resolution
·2247 words·11 mins
Computer Vision Image Generation 🏢 Hong Kong Polytechnic University
OSEDiff: One-step diffusion network for real-world image super-resolution, achieving comparable or better results than multi-step methods with significantly reduced computational cost and improved ima…
One-Step Diffusion Distillation through Score Implicit Matching
·2065 words·10 mins
Computer Vision Image Generation 🏢 Peking University
Score Implicit Matching (SIM) revolutionizes diffusion model distillation by creating high-quality, single-step generators from complex, multi-step models, achieving comparable performance and enablin…
On improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
·3235 words·16 mins
Computer Vision Image Generation 🏢 FAIR at Meta
Researchers achieve state-of-the-art image generation by disentangling semantic and control metadata in diffusion models and optimizing pre-training across resolutions.
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
·3306 words·16 mins
Image Generation 🏢 Institute of Information Engineering, Chinese Academy of Sciences
This research unlocks superior discriminative features from diffusion models by identifying key activation properties for effective feature selection, surpassing state-of-the-art methods.
Neural Residual Diffusion Models for Deep Scalable Vision Generation
·1912 words·9 mins
AI Generated Computer Vision Image Generation 🏢 Tsinghua University
Neural-RDM: A novel framework for deep, scalable vision generation using residual diffusion models, achieving state-of-the-art results on image and video benchmarks.
Neural Gaffer: Relighting Any Object via Diffusion
·2042 words·10 mins
Computer Vision Image Generation 🏢 Cornell University
Neural Gaffer: Relighting any object via diffusion using a single image and an environment map to produce high-quality, realistic relit images.
Neural Cover Selection for Image Steganography
·3814 words·18 mins
AI Generated Computer Vision Image Generation 🏢 University of Texas at Austin
This study introduces a neural cover selection framework for image steganography, optimizing latent spaces in generative models to improve message recovery and image quality.
Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
·2318 words·11 mins
Image Generation 🏢 Google DeepMind
Neural Assets enables intuitive 3D multi-object scene editing via image diffusion models by using per-object representations to control individual object poses, achieving state-of-the-art results.
Multistep Distillation of Diffusion Models via Moment Matching
·2156 words·11 mins
Computer Vision Image Generation 🏢 Google DeepMind
New method distills slow diffusion models into fast, few-step models by matching data expectations, achieving state-of-the-art results on ImageNet.
MonkeySee: Space-time-resolved reconstructions of natural images from macaque multi-unit activity
·2882 words·14 mins
AI Generated Computer Vision Image Generation 🏢 Donders Institute for Brain, Cognition and Behaviour
MonkeySee reconstructs natural images from macaque brain signals with high accuracy using a novel CNN decoder, advancing neural decoding and offering insights into visual perception.
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
·1847 words·9 mins
Computer Vision Image Generation 🏢 Zhejiang University
MimicTalk generates realistic, expressive talking videos in minutes using a pre-trained model adapted to individual identities.
MC-DiT: Contextual Enhancement via Clean-to-Clean Reconstruction for Masked Diffusion Models
·2494 words·12 mins
Computer Vision Image Generation 🏢 Shanghai Jiao Tong University
MC-DiT: A novel training paradigm for masked diffusion models achieving state-of-the-art image generation by leveraging clean-to-clean reconstruction.
Masked Pre-training Enables Universal Zero-shot Denoiser
·4914 words·24 mins
Computer Vision Image Generation 🏢 University of Science and Technology of China
Masked pre-training enables a universal, fast zero-shot image denoiser.
Locating What You Need: Towards Adapting Diffusion Models to OOD Concepts In-the-Wild
·3829 words·18 mins
AI Generated Computer Vision Image Generation 🏢 Zhejiang University
CATOD framework improves text-to-image generation by actively learning high-quality training data to accurately depict out-of-distribution concepts.
Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner
·2916 words·14 mins
AI Generated Computer Vision Image Generation 🏢 Beijing University of Posts and Telecommunications
LucidDrag: Semantic-aware dragging transforms image editing with an intention reasoner and collaborative guidance, achieving superior accuracy, image fidelity, and semantic diversity.
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
·3223 words·16 mins
Computer Vision Image Generation 🏢 ETH Zurich
LiteVAE: A new autoencoder design for latent diffusion models boosts efficiency sixfold without sacrificing image quality, achieving faster training and lower memory needs via the 2D discrete wavelet …
Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
·2463 words·12 mins
Computer Vision Image Generation 🏢 National University of Singapore
Learning-to-Cache (L2C) dramatically accelerates diffusion transformers by intelligently caching layer computations, achieving significant speedups with minimal performance loss.
Learning Transferable Features for Implicit Neural Representations
·4038 words·19 mins
AI Generated Computer Vision Image Generation 🏢 Rice University
STRAINER: A new framework that enables faster, higher-quality INR fitting by leveraging transferable features across similar signals.
Learning Image Priors Through Patch-Based Diffusion Models for Solving Inverse Problems
·3556 words·17 mins
Computer Vision Image Generation 🏢 University of Michigan
PaDIS: Patch-based diffusion inverse solver learns efficient image priors from image patches, enabling high-resolution inverse problem solutions with reduced computational costs and data needs.
Learning Group Actions on Latent Representations
·2124 words·10 mins
Computer Vision Image Generation 🏢 University of Virginia
This paper proposes a novel method to model group actions within autoencoders by learning these actions in the latent space, enhancing model versatility and improving performance in various real-world…