Computer Vision

Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection

26 September 2024·2705 words·13 mins

Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology

Object-centric occupancy completion boosts 3D object detection accuracy by using temporal information from long sequences to precisely reconstruct object shapes, particularly for incomplete or distant…

Towards Combating Frequency Simplicity-biased Learning for Domain Generalization

26 September 2024·2276 words·11 mins

Computer Vision Domain Generalization 🏢 Shenzhen University

This paper introduces novel data augmentation modules that dynamically adjust the frequency characteristics of datasets, preventing neural networks from over-relying on simple frequency-based shortcut…

Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model

26 September 2024·2381 words·12 mins

Computer Vision Image Segmentation 🏢 Wuhan University

SGNet cultivates general segmentation models for ultra images by integrating surrounding context, achieving significant performance improvements across various datasets.

Toward Dynamic Non-Line-of-Sight Imaging with Mamba Enforced Temporal Consistency

26 September 2024·2152 words·11 mins

Computer Vision 3D Vision 🏢 University of Science and Technology of China

Dynamic NLOS imaging gets a speed boost! The new ST-Mamba method leverages temporal consistency across frames for high-resolution video reconstruction, overcoming the speed limitations of traditional methods.

Toward Approaches to Scalability in 3D Human Pose Estimation

26 September 2024·2344 words·12 mins

Computer Vision 3D Vision 🏢 Korea University

Boosting 3D human pose estimation: Biomechanical Pose Generator and Binary Depth Coordinates enhance accuracy and scalability.

TopoFR: A Closer Look at Topology Alignment on Face Recognition

26 September 2024·2430 words·12 mins

Computer Vision Face Recognition 🏢 Zhejiang University

TopoFR enhances face recognition by aligning topological structures between the input and latent spaces. Using persistent homology, it preserves crucial structural information in the data and mitigates overfitting. A har…

To Err Like Human: Affective Bias-Inspired Measures for Visual Emotion Recognition Evaluation

26 September 2024·1759 words·9 mins

Computer Vision Image Classification 🏢 Nankai University

This paper introduces novel metrics for visual emotion recognition evaluation, considering the psychological distance between emotions to better reflect human perception, improving the assessment of m…

TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge

26 September 2024·1979 words·10 mins

Computer Vision Image Generation 🏢 School of Integrated Circuits, Xidian University

TinyLUT achieves 10x lower memory consumption and superior accuracy in image restoration on edge devices using innovative separable mapping and dynamic discretization of LUTs.

Time-Varying LoRA: Towards Effective Cross-Domain Fine-Tuning of Diffusion Models

26 September 2024·3031 words·15 mins

Computer Vision Image Generation 🏢 Southern University of Science and Technology

Terra, a novel time-varying low-rank adapter, enables effective cross-domain fine-tuning of diffusion models by creating a continuous parameter manifold, facilitating efficient knowledge sharing and g…

The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better

26 September 2024·2874 words·14 mins

AI Generated Computer Vision Image Classification 🏢 University of Washington

Using real images retrieved from a generator’s training data outperforms using synthetic images generated by that same model for image classification.

The GAN is dead; long live the GAN! A Modern GAN Baseline

26 September 2024·3072 words·15 mins

Computer Vision Image Generation 🏢 Brown University

R3GAN, a minimalist GAN baseline, surpasses state-of-the-art models by using a novel regularized relativistic GAN loss and modern architectures, proving GANs can be trained efficiently without relying…

TFS-NeRF: Template-Free NeRF for Semantic 3D Reconstruction of Dynamic Scene

26 September 2024·2695 words·13 mins

AI Generated Computer Vision 3D Vision 🏢 Faculty of IT, Monash University

TFS-NeRF: A template-free neural radiance field efficiently reconstructs semantically separable 3D geometries of dynamic scenes featuring multiple interacting entities from sparse RGB videos.

Test-Time Dynamic Image Fusion

26 September 2024·3589 words·17 mins

AI Generated Computer Vision Image Fusion 🏢 Tianjin University

The Test-Time Dynamic Image Fusion (TTD) paradigm provably improves image fusion by dynamically weighting the source data according to their relative dominance, reducing generalization error without extra trainin…

Tensor-Based Synchronization and the Low-Rankness of the Block Trifocal Tensor

26 September 2024·1554 words·8 mins

Computer Vision 3D Vision 🏢 University of Minnesota

Low-rank block trifocal tensor unlocks accurate, efficient camera pose synchronization.

Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations

26 September 2024·1994 words·10 mins

Computer Vision Video Understanding 🏢 University of Maryland

ConVRT: A novel framework restores turbulence-distorted videos by decoupling spatial and temporal information in a neural representation, achieving temporally consistent mitigation.

Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis

26 September 2024·2005 words·10 mins

Computer Vision 3D Vision 🏢 Peking University

This research introduces a template-free articulated Gaussian splatting method for real-time dynamic view synthesis, automatically discovering object skeletons from videos to enable reposing.

TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network

26 September 2024·2452 words·12 mins

Computer Vision Image Segmentation 🏢 Intelligent Science and Technology Academy of CASIC

TARSS-Net: A novel temporal-aware radar semantic segmentation network uses a data-driven approach to aggregate temporal information, enhancing accuracy and performance.

TARP-VP: Towards Evaluation of Transferred Adversarial Robustness and Privacy on Label Mapping Visual Prompting Models

26 September 2024·2161 words·11 mins

Computer Vision Image Classification 🏢 University of Liverpool

TARP-VP reveals a surprising lack of trade-off between adversarial robustness and privacy for label mapping visual prompting models, showing that transferred adversarial training significantly improve…

Target-Guided Adversarial Point Cloud Transformer Towards Recognition Against Real-world Corruptions

26 September 2024·3740 words·18 mins

AI Generated Computer Vision 3D Vision 🏢 Beijing Institute of Technology

APCT: A novel architecture that enhances 3D point cloud recognition by using an adversarial feature erasing mechanism to improve global structure capture and robustness against real-world corruptions.

TAPTRv2: Attention-based Position Update Improves Tracking Any Point

26 September 2024·1868 words·9 mins

Computer Vision Video Understanding 🏢 South China University of Technology

TAPTRv2 enhances point tracking by introducing an attention-based position update, eliminating cost-volume reliance for improved accuracy and efficiency.