Image Generation
Hidden in the Noise: Two-Stage Robust Watermarking for Images
·3984 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 New York University
WIND: A novel, distortion-free image watermarking method leveraging diffusion models’ initial noise for robust AI-generated content authentication.
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
·3014 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance
AnyDressing: Customizable multi-garment virtual dressing via a novel latent diffusion model!
NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
·3265 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Tencent AI Lab
NVComposer: A novel generative NVS model boosts synthesis quality by implicitly inferring spatial relationships from multiple sparse, unposed images, eliminating reliance on external alignment.
MV-Adapter: Multi-view Consistent Image Generation Made Easy
·3888 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 School of Software, Beihang University
MV-Adapter easily transforms existing image generators into multi-view consistent image generators, improving efficiency and adaptability.
Imagine360: Immersive 360 Video Generation from Perspective Anchor
·2648 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Chinese University of Hong Kong
Imagine360: Generating immersive 360° videos from perspective videos, improving quality and accessibility of 360° content creation.
CleanDIFT: Diffusion Features without Noise
·3337 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 CompVis @ LMU Munich, MCML
CleanDIFT revolutionizes diffusion feature extraction by leveraging clean images and a lightweight fine-tuning method, significantly boosting performance across various tasks without noise or timestep…
SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance
·4159 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 VinAI Research
SNOOPI supercharges one-step diffusion model distillation with enhanced guidance, achieving state-of-the-art performance by stabilizing training and enabling negative prompt control.
Scaling Image Tokenizers with Grouped Spherical Quantization
·7140 words·34 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Jülich Supercomputing Centre
GSQ-GAN, a novel image tokenizer, achieves superior reconstruction quality with 16x downsampling using grouped spherical quantization, enabling efficient scaling for high-fidelity image generation.
TinyFusion: Diffusion Transformers Learned Shallow
·4225 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 National University of Singapore
TinyFusion, a novel learnable depth pruning method, crafts efficient shallow diffusion transformers with superior post-fine-tuning performance, achieving a 2x speedup with less than 7% of the original…
Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis
·3884 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Yandex Research
SWITTI: a novel scale-wise transformer achieves 7x faster text-to-image generation than state-of-the-art diffusion models, while maintaining competitive image quality.
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
·2333 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 SketchX, CVSSP, University of Surrey
NitroFusion achieves high-fidelity single-step image generation using a dynamic adversarial training approach with a specialized discriminator pool, dramatically improving speed and quality.
Negative Token Merging: Image-based Adversarial Feature Guidance
·2311 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Washington
NegToMe: Image-based adversarial guidance improves image generation diversity and reduces similarity to copyrighted content without training, simply by using images instead of negative text prompts.
Open-Sora Plan: Open-Source Large Video Generation Model
·4618 words·22 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
Open-Sora Plan introduces an open-source large video generation model capable of producing high-resolution videos with long durations, based on various user inputs.
TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models
·6566 words·31 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Machine Learning Group, CITEC, Bielefeld University
TryOffDiff reconstructs realistic garment images from single photos of clothed people, tackling the Virtual Try-Off task that complements conventional virtual try-on.
ROICtrl: Boosting Instance Control for Visual Generation
·3855 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Show Lab, National University of Singapore
ROICtrl boosts instance control in visual generation via ROI-Align and a new ROI-Unpool operation, delivering precise regional control with high efficiency.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
·5402 words·26 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of Cambridge
FAM Diffusion: Generate high-res images seamlessly from pre-trained diffusion models, solving structural and texture inconsistencies without retraining!
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis
·3637 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanyang Technological University
Omegance: One parameter precisely controls image detail in diffusion models, enabling flexible granularity adjustments without model changes or retraining.
Identity-Preserving Text-to-Video Generation by Frequency Decomposition
·2775 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
ConsisID achieves high-quality, identity-preserving text-to-video generation using a tuning-free diffusion transformer model that leverages frequency decomposition for effective identity control.
DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting
·2489 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Dalian University of Technology
DreamMix enhances image inpainting by disentangling object attributes for precise editing, enabling both identity preservation and flexible text-driven modifications.
DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching
·3048 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Samsung R&D Institute UK
DreamCache enables efficient, high-quality personalized image generation without finetuning by caching reference image features and using lightweight conditioning adapters.