Computer Vision
Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection
·2705 words·13 mins·
Computer Vision
3D Vision
🏢 Hong Kong University of Science and Technology
Object-centric occupancy completion boosts 3D object detection accuracy by using temporal information from long sequences to precisely reconstruct object shapes, particularly for incomplete or distant…
Towards Combating Frequency Simplicity-biased Learning for Domain Generalization
·2276 words·11 mins·
Computer Vision
Domain Generalization
🏢 Shenzhen University
This paper introduces novel data augmentation modules that dynamically adjust the frequency characteristics of datasets, preventing neural networks from over-relying on simple frequency-based shortcut…
Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model
·2381 words·12 mins·
Computer Vision
Image Segmentation
🏢 Wuhan University
SGNet cultivates general segmentation models for ultra images by integrating surrounding context, achieving significant performance improvements across various datasets.
Toward Dynamic Non-Line-of-Sight Imaging with Mamba Enforced Temporal Consistency
·2152 words·11 mins·
Computer Vision
3D Vision
🏢 University of Science and Technology of China
Dynamic NLOS imaging gets a speed boost! The new ST-Mamba method leverages temporal consistency across frames for high-resolution video reconstruction, overcoming the speed limitations of traditional methods.
Toward Approaches to Scalability in 3D Human Pose Estimation
·2344 words·12 mins·
Computer Vision
3D Vision
🏢 Korea University
Boosting 3D human pose estimation: Biomechanical Pose Generator and Binary Depth Coordinates enhance accuracy and scalability.
TopoFR: A Closer Look at Topology Alignment on Face Recognition
·2430 words·12 mins·
Computer Vision
Face Recognition
🏢 Zhejiang University
TopoFR enhances face recognition by aligning topological structures between input and latent spaces. Using persistent homology, it preserves crucial data structure info, overcoming overfitting. A har…
To Err Like Human: Affective Bias-Inspired Measures for Visual Emotion Recognition Evaluation
·1759 words·9 mins·
Computer Vision
Image Classification
🏢 Nankai University
This paper introduces novel metrics for visual emotion recognition evaluation, considering the psychological distance between emotions to better reflect human perception, improving the assessment of m…
TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge
·1979 words·10 mins·
Computer Vision
Image Generation
🏢 School of Integrated Circuits, Xidian University
TinyLUT achieves 10x lower memory consumption and superior accuracy in image restoration on edge devices using innovative separable mapping and dynamic discretization of LUTs.
Time-Varying LoRA: Towards Effective Cross-Domain Fine-Tuning of Diffusion Models
·3031 words·15 mins·
Computer Vision
Image Generation
🏢 Southern University of Science and Technology
Terra, a novel time-varying low-rank adapter, enables effective cross-domain fine-tuning of diffusion models by creating a continuous parameter manifold, facilitating efficient knowledge sharing and g…
The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
·2874 words·14 mins·
AI Generated
Computer Vision
Image Classification
🏢 University of Washington
Using real images retrieved from a generator’s training data outperforms using synthetic images generated by that same model for image classification.
The GAN is dead; long live the GAN! A Modern GAN Baseline
·3072 words·15 mins·
Computer Vision
Image Generation
🏢 Brown University
R3GAN, a minimalist GAN baseline, surpasses state-of-the-art models by using a novel regularized relativistic GAN loss and modern architectures, proving GANs can be trained efficiently without relying…
TFS-NeRF: Template-Free NeRF for Semantic 3D Reconstruction of Dynamic Scene
·2695 words·13 mins·
AI Generated
Computer Vision
3D Vision
🏢 Faculty of IT, Monash University
TFS-NeRF: A template-free neural radiance field efficiently reconstructs semantically separable 3D geometries of dynamic scenes featuring multiple interacting entities from sparse RGB videos.
Test-Time Dynamic Image Fusion
·3589 words·17 mins·
AI Generated
Computer Vision
Image Fusion
🏢 Tianjin University
The Test-Time Dynamic Image Fusion (TTD) paradigm provably improves image fusion by dynamically weighting source data based on their relative dominance, reducing generalization error without extra trainin…
Tensor-Based Synchronization and the Low-Rankness of the Block Trifocal Tensor
·1554 words·8 mins·
Computer Vision
3D Vision
🏢 University of Minnesota
Low-rank block trifocal tensor unlocks accurate, efficient camera pose synchronization.
Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations
·1994 words·10 mins·
Computer Vision
Video Understanding
🏢 University of Maryland
ConVRT: A novel framework restores turbulence-distorted videos by decoupling spatial and temporal information in a neural representation, achieving temporally consistent mitigation.
Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis
·2005 words·10 mins·
Computer Vision
3D Vision
🏢 Peking University
This research introduces a template-free articulated Gaussian splatting method for real-time dynamic view synthesis, automatically discovering object skeletons from videos to enable reposing.
TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network
·2452 words·12 mins·
Computer Vision
Image Segmentation
🏢 Intelligent Science and Technology Academy of CASIC
TARSS-Net: A novel temporal-aware radar semantic segmentation network uses a data-driven approach to aggregate temporal information, enhancing accuracy and performance.
TARP-VP: Towards Evaluation of Transferred Adversarial Robustness and Privacy on Label Mapping Visual Prompting Models
·2161 words·11 mins·
Computer Vision
Image Classification
🏢 University of Liverpool
TARP-VP reveals a surprising lack of trade-off between adversarial robustness and privacy for label mapping visual prompting models, showing that transferred adversarial training significantly improve…
Target-Guided Adversarial Point Cloud Transformer Towards Recognition Against Real-world Corruptions
·3740 words·18 mins·
AI Generated
Computer Vision
3D Vision
🏢 Beijing Institute of Technology
APCT: a novel architecture enhances 3D point cloud recognition by using an adversarial feature erasing mechanism to improve global structure capture and robustness against real-world corruptions.
TAPTRv2: Attention-based Position Update Improves Tracking Any Point
·1868 words·9 mins·
Computer Vision
Video Understanding
🏢 South China University of Technology
TAPTRv2 enhances point tracking by introducing an attention-based position update, eliminating cost-volume reliance for improved accuracy and efficiency.