Skip to main content

Image Generation

Stylecodes: Encoding Stylistic Information For Image Generation
·237 words·2 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 String
StyleCodes enables easy style sharing for image generation by encoding styles as compact strings, enhancing control and collaboration while minimizing quality loss.
Continuous Speculative Decoding for Autoregressive Image Generation
·1799 words·9 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Chinese Academy of Sciences
Researchers have developed Continuous Speculative Decoding, boosting autoregressive image generation speed by up to 2.33x while maintaining image quality.
FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on
·2555 words·12 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tencent
FitDiT boosts virtual try-on realism by enhancing garment details via Diffusion Transformers, improving texture and size accuracy for high-fidelity virtual fashion.
MagicQuill: An Intelligent Interactive Image Editing System
·4923 words·24 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 HKUST
MagicQuill: an intelligent interactive image editing system enabling intuitive, precise image edits via brushstrokes and real-time intent prediction by a multimodal LLM.
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
·3438 words·17 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Waterloo
OmniEdit, a novel instruction-based image editing model, surpasses existing methods by leveraging specialist supervision and high-quality data, achieving superior performance across diverse editing ta…
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
·3087 words·15 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA Research
Edify Image: groundbreaking pixel-perfect photorealistic image generation using cascaded pixel-space diffusion models with a novel Laplacian diffusion process, enabling diverse applications including …
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
·3359 words·16 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA Research
Add-it: Training-free object insertion in images using pretrained diffusion models by cleverly balancing information from the scene, text prompt, and generated image, achieving state-of-the-art result…
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
·2454 words·12 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tencent AI Lab
StdGEN: Generate high-quality, semantically decomposed 3D characters from a single image in minutes, enabling flexible customization for various applications.
SVDQunat: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
·4041 words·19 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 MIT
SVDQuant boosts 4-bit diffusion models by absorbing outliers via low-rank components, achieving 3.5x memory reduction and 3x speedup on 12B parameter models.
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
·3777 words·18 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Toronto
SG-I2V: Zero-shot controllable image-to-video generation using a self-guided approach that leverages pre-trained models for precise object and camera motion control.
Training-free Regional Prompting for Diffusion Transformers
·1817 words·9 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Peking University
Training-free Regional Prompting for FLUX boosts compositional text-to-image generation by cleverly manipulating attention mechanisms, achieving fine-grained control without retraining.
Randomized Autoregressive Visual Generation
·4145 words·20 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ByteDance
Randomized Autoregressive Modeling (RAR) sets a new state-of-the-art in image generation by cleverly introducing randomness during training to improve the model’s ability to learn from bidirectional c…
Constant Acceleration Flow
·3289 words·16 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Korea University
Constant Acceleration Flow (CAF) dramatically speeds up diffusion model generation by using a constant acceleration equation, outperforming state-of-the-art methods with improved accuracy and few-step…
In-Context LoRA for Diffusion Transformers
·392 words·2 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tongyi Lab
In-Context LoRA empowers existing text-to-image models for high-fidelity multi-image generation by simply concatenating images and using minimal task-specific LoRA tuning.
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models
·2152 words·11 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Peking University
HelloMeme enhances text-to-image models by integrating spatial knitting attentions, enabling high-fidelity meme video generation while preserving model generalization.