Computer Vision
AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario
·3145 words·15 mins·
AI Generated
Computer Vision
Image Generation
🏢 Shanghai Jiao Tong University
AnyFit: Controllable virtual try-on for any attire combination across any scenario, exceeding existing methods in accuracy and scalability.
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
·2384 words·12 mins·
Computer Vision
3D Vision
🏢 DAMO Academy, Alibaba Group
Animate3D animates any 3D model using multi-view video diffusion, achieving superior spatiotemporal consistency and straightforward mesh animation.
An Image is Worth 32 Tokens for Reconstruction and Generation
·2076 words·10 mins·
Computer Vision
Image Generation
🏢 ByteDance
Image generation gets a speed boost with TiTok, a novel 1D image tokenizer that uses just 32 tokens for high-quality image reconstruction and generation, achieving up to 410x faster processing than st…
An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations
·3657 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 Peking University
EMDiffusion trains clean diffusion models from corrupted data using an expectation-maximization algorithm, achieving state-of-the-art results on diverse imaging tasks.
Amnesia as a Catalyst for Enhancing Black Box Pixel Attacks in Image Classification and Object Detection
·2866 words·14 mins·
AI Generated
Computer Vision
Object Detection
🏢 Korea Aerospace University
RFPAR: A novel reinforcement learning-based attack enhances black-box pixel attacks by minimizing randomness and patch dependency, achieving state-of-the-art results in both image classification and o…
AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
·2409 words·12 mins·
AI Generated
Computer Vision
3D Vision
🏢 Tsinghua University
AlphaTablets advances 3D planar reconstruction from monocular videos with a rectangle-based representation featuring continuous surfaces and precise boundaries, achieving state-of-the-ar…
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization
·2692 words·13 mins·
Computer Vision
Image Generation
🏢 ByteDance
DiMR improves image generation fidelity by combining multi-resolution networks with time-dependent layer normalization in diffusion models, achieving state-of-the-art results on ImageNet.
All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation
·5531 words·26 mins·
AI Generated
Computer Vision
Image Coding
🏢 Nanjing University
Multi-Path Aggregation (MPA) achieves comparable performance to state-of-the-art methods in multi-task image coding, by unifying feature representations with a novel all-in-one architecture and a two-…
Aligning Diffusion Models by Optimizing Human Utility
·3826 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 UC Los Angeles
Diffusion-KTO: Aligning text-to-image models with human preferences using simple likes/dislikes, maximizing expected human utility.
AirSketch: Generative Motion to Sketch
·2122 words·10 mins·
Computer Vision
Image Generation
🏢 University of Central Florida
AirSketch generates aesthetically pleasing sketches directly from noisy hand-motion tracking data using a self-supervised controllable diffusion model, eliminating the need for expensive AR/VR equipme…
AID: Attention Interpolation of Text-to-Image Diffusion
·3646 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 National University of Singapore
AID, a training-free method, improves image interpolation by fusing inner/outer interpolated attention layers and selecting coefficients from a beta distribution, enhancing consi…
Adversarial Schrödinger Bridge Matching
·3165 words·15 mins·
Computer Vision
Image Generation
🏢 Skoltech
Accelerates Schrödinger Bridge Matching with a discrete-time IMF procedure that uses only a few steps, achieving results comparable to existing hundred-step methods via a D-GAN implementation.
Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
·4124 words·20 mins·
AI Generated
Computer Vision
Image Classification
🏢 Reichman University
SaSPA, a novel data augmentation method, boosts fine-grained visual classification accuracy by generating diverse, class-consistent synthetic images using structural and subject-preserving techniques.
AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks
·2221 words·11 mins·
Computer Vision
Adversarial Attacks
🏢 Guangdong Key Lab of Information Security
AdvAD: A non-parametric diffusion process crafts imperceptible adversarial examples by subtly guiding an initial noise towards a target distribution, achieving high attack success rates with minimal p…
AdjointDEIS: Efficient Gradients for Diffusion Models
·1797 words·9 mins·
Computer Vision
Face Recognition
🏢 Clarkson University
AdjointDEIS: Efficient gradients for diffusion models via bespoke ODE solvers, simplifying backpropagation and improving guided generation.
AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection
·3853 words·19 mins·
AI Generated
Computer Vision
Object Detection
🏢 Shanghai AI Laboratory
AdaptiveISP uses reinforcement learning to create a scene-adaptive ISP pipeline that dynamically optimizes for object detection, surpassing existing methods in accuracy and efficiency.
Adaptive Important Region Selection with Reinforced Hierarchical Search for Dense Object Detection
·2760 words·13 mins·
loading
·
loading
Computer Vision
Object Detection
🏢 Rochester Institute of Technology
The AIRS framework, guided by Evidential Q-learning, dynamically balances exploration and exploitation to achieve superior dense object detection accuracy by adaptively selecting important regions.
Adaptive Domain Learning for Cross-domain Image Denoising
·2302 words·11 mins·
Computer Vision
Image Denoising
🏢 Hong Kong University of Science and Technology
Adaptive Domain Learning (ADL) efficiently trains a cross-domain RAW image denoising model using limited target data and existing source data by intelligently discarding harmful source data and levera…
Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
·3442 words·17 mins·
Computer Vision
Image Generation
🏢 UC San Diego
FG-DMs improve image synthesis by jointly modeling image and condition distributions, achieving higher object recall and enabling flexible editing.
AdaPKC: PeakConv with Adaptive Peak Receptive Field for Radar Semantic Segmentation
·2841 words·14 mins·
Computer Vision
Image Segmentation
🏢 Tsinghua University
AdaPKC upgrades PeakConv for superior radar semantic segmentation by dynamically adjusting its receptive field, outperforming current state-of-the-art methods.