Image Generation
Autoregressive Image Generation with Randomized Parallel Decoding
·3693 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Westlake University
ARPG: An autoregressive model that generates high-quality images in random order with parallel decoding, outperforming existing methods in efficiency, memory use, and quality.
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
·410 words·2 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 KAIST
Silent Branding Attack: A trigger-free data poisoning attack that makes text-to-image diffusion models embed brand logos in their outputs without any text trigger, raising ethical concerns for image generation tools.
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
·3137 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 NVIDIA
SANA-Sprint: An efficient diffusion model for ultra-fast text-to-image generation with continuous-time consistency distillation, achieving state-of-the-art performance in speed and quality.
PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
·2966 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Technical University of Munich
PerCoV2: Open ultra-low bit-rate perceptual image compression using implicit hierarchical masked image modeling, built on Stable Diffusion 3 for bandwidth-constrained applications.
Neighboring Autoregressive Modeling for Efficient Visual Generation
·3102 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Zhejiang University, China
NAR: Neighboring Autoregressive Modeling for efficient visual generation by locality-preserved, parallel decoding.
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
·2300 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Hong Kong University of Science and Technology
LightGen: Efficient image generation via knowledge distillation and direct preference optimization.
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
·3702 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
WISE: Evaluates world knowledge in text-to-image generation.
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model
·3772 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance
Seedream 2.0: A native Chinese-English bilingual image generation model that understands cultural nuances and excels in text rendering.
RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories
·2040 words·10 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance Inc.
RayFlow: Accelerating diffusion with instance-aware adaptive flow, boosting speed & quality!
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
·4256 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Samsung Research
PLADIS: Sparsity boosts attention for diffusion models, enhancing text-to-image generation at inference time!
Effective and Efficient Masked Image Generation Models
·4167 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Renmin University of China
eMIGM: A unified, efficient masked image generation model achieving state-of-the-art performance with fewer resources.
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer
·2653 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Tiamat AI
EasyControl: Efficient & flexible control for Diffusion Transformers, enabling sophisticated image generation.
Learning Few-Step Diffusion Models by Trajectory Distribution Matching
·4283 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Hong Kong University of Science and Technology
TDM: a new diffusion distillation paradigm unifying trajectory distillation and distribution matching, surpassing teachers in a data-free manner with state-of-the-art performance and low training cost…
ProReflow: Progressive Reflow with Decomposed Velocity
·1902 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Tsinghua University
ProReflow: Improves diffusion model efficiency via progressive training and direction-focused velocity alignment.
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
·2593 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Hong Kong University of Science and Technology
RectifiedHR: Enables training-free high-resolution image generation via energy rectification, boosting both efficiency and effectiveness.
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
·3985 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Shanghai Jiao Tong University
Q-Eval-100K: A new, large dataset for evaluating visual quality and text alignment in AI-generated content.
Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator
·2905 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 NVIDIA Research
Likelihood-based generative models get a GAN-like boost via a new Direct Discriminative Optimization, ditching the joint training complexity.
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think
·2523 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
DREAM ENGINE: Text-image interleaved control made easy, unifying text and visual cues for creative image generation.
GCC: Generative Color Constancy via Diffusing a Color Checker
·362 words·2 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 National Yang Ming Chiao Tung University
GCC: Color constancy through diffusion, inpainting a color checker for stable illumination estimation.
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment
·1433 words·7 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Hefei University of Technology
M3-AGIQA: A multimodal AI solution that comprehensively assesses AI-generated image quality, achieving state-of-the-art performance by distilling online MLLM capabilities into a local model.