Computer Vision
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos
·2627 words·13 mins·
Computer Vision
Scene Understanding
🏢 University of Arkansas
CYCLO: A novel cyclic graph transformer excels at multi-object relationship modeling in aerial videos.
CV-VAE: A Compatible Video VAE for Latent Generative Video Models
·3396 words·16 mins·
AI Generated
Computer Vision
Video Understanding
🏢 Tencent AI Lab
CV-VAE: A compatible video VAE enabling efficient, high-quality latent video generation by bridging the gap between image and video latent spaces.
Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
·1703 words·8 mins·
Computer Vision
Image Classification
🏢 Gwangju Institute of Science and Technology
CUFIT: a novel curriculum fine-tuning paradigm significantly improves medical image classification accuracy despite noisy labels by leveraging pre-trained Vision Foundation Models.
Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance
·2880 words·14 mins·
Computer Vision
Image Generation
🏢 UC Los Angeles
Ctrl-X: Zero-shot text-to-image generation with training-free structure & appearance control!
CryoSPIN: Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference
·1731 words·9 mins·
Computer Vision
3D Vision
🏢 University of Toronto
CryoSPIN revolutionizes ab-initio cryo-EM reconstruction with semi-amortized pose inference, achieving faster and more accurate 3D structure determination.
CryoGEM: Physics-Informed Generative Cryo-Electron Microscopy
·2131 words·11 mins·
Computer Vision
Image Generation
🏢 ShanghaiTech University
CryoGEM: A physics-informed generative model creates realistic synthetic cryo-EM datasets, boosting particle picking and pose estimation accuracy for higher-resolution protein structure determination.
Cross-video Identity Correlating for Person Re-identification Pre-training
·1957 words·10 mins·
Computer Vision
Person Re-Identification
🏢 String
Cross-video Identity-cOrrelating pre-training (CION) revolutionizes person re-identification by leveraging identity correlation across videos for superior model pre-training, achieving state-of-the-ar…
Cross-Scale Self-Supervised Blind Image Deblurring via Implicit Neural Representation
·3186 words·15 mins·
Computer Vision
Image Generation
🏢 National University of Singapore
Self-supervised blind image deblurring (BID) breakthrough! A novel cross-scale consistency loss and progressive training scheme using implicit neural representations achieves superior performance wit…
Cross-Modality Perturbation Synergy Attack for Person Re-identification
·1933 words·10 mins·
Computer Vision
Face Recognition
🏢 Xiamen University
Cross-Modality Perturbation Synergy (CMPS) attack: A novel universal perturbation method for cross-modality person re-identification, effectively misleading ReID models by leveraging gradients from di…
CRAYM: Neural Field Optimization via Camera RAY Matching
·2649 words·13 mins·
Computer Vision
3D Vision
🏢 Shenzhen University
CRAYM: Neural field optimization via camera RAY matching enhances 3D reconstruction by using camera rays, not pixels, improving both novel view synthesis and geometry.
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
·2236 words·11 mins·
Computer Vision
Video Understanding
🏢 Tsinghua University
COVE: Consistent high-quality video editing achieved by leveraging diffusion feature correspondence for temporal consistency.
CountGD: Multi-Modal Open-World Counting
·2520 words·12 mins·
Computer Vision
Object Detection
🏢 University of Oxford
CountGD: A new multi-modal model counts objects in images using text or visual examples, significantly improving open-world counting accuracy.
CoSW: Conditional Sample Weighting for Smoke Segmentation with Label Noise
·2129 words·10 mins·
Computer Vision
Image Segmentation
🏢 East China University of Science and Technology
CoSW: a novel conditional sample weighting method for robust smoke segmentation achieves state-of-the-art results by handling inconsistent noisy labels through a multi-prototype framework.
COSMIC: Compress Satellite Image Efficiently via Diffusion Compensation
·3381 words·16 mins·
Computer Vision
Image Compression
🏢 Tsinghua University
COSMIC efficiently compresses satellite images via a lightweight encoder and diffusion compensation, enabling practical onboard processing and high compression ratios.
CosAE: Learnable Fourier Series for Image Restoration
·2867 words·14 mins·
Computer Vision
Image Restoration
🏢 NVIDIA Research
CosAE: a novel autoencoder using learnable Fourier series achieves state-of-the-art image restoration by encoding frequency coefficients in its narrow bottleneck, preserving fine details even with ext…
Cooperative Hardware-Prompt Learning for Snapshot Compressive Imaging
·1775 words·9 mins·
Computer Vision
Image Generation
🏢 Rochester Institute of Technology
Federated Hardware-Prompt Learning (FedHP) enables robust cross-hardware SCI training by aligning inconsistent data distributions using a hardware-conditioned prompter, outperforming existing FL metho…
Contrastive-Equivariant Self-Supervised Learning Improves Alignment with Primate Visual Area IT
·2007 words·10 mins·
Computer Vision
Self-Supervised Learning
🏢 Center for Neural Science, New York University
Self-supervised learning models can now better predict primate IT neural responses by preserving structured variability to input transformations, improving alignment with biological visual perception.
Continuous Spatiotemporal Events Decoupling through Spike-based Bayesian Computation
·1859 words·9 mins·
Computer Vision
Image Segmentation
🏢 Peking University
A spiking neural network segments mixed-motion event streams via spike-based Bayesian computation, achieving efficient real-time motion decoupling.
Continuous Heatmap Regression for Pose Estimation via Implicit Neural Representation
·2522 words·12 mins·
AI Generated
Computer Vision
3D Vision
🏢 Nanjing University of Science and Technology
NerPE: continuous heatmap regression via implicit neural representation resolves the accuracy-limiting quantization errors in human pose estimation, achieving sub-pixel precision.
ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model
·1913 words·9 mins·
Computer Vision
3D Vision
🏢 Nanyang Technological University
ContextGS: Revolutionizing 3D scene compression with an anchor-level autoregressive model, achieving 15x size reduction in 3D Gaussian Splatting while boosting rendering quality.