🏢 Nanjing University of Science and Technology
Progressive Exploration-Conformal Learning for Sparsely Annotated Object Detection in Aerial Images
·2177 words·11 mins·
Computer Vision
Object Detection
🏢 Nanjing University of Science and Technology
Progressive Exploration-Conformal Learning (PECL) improves sparsely annotated object detection in aerial images by adaptively selecting high-quality pseudo-labels, overcoming limitations of exis…
Predicting Label Distribution from Ternary Labels
·2190 words·11 mins·
AI Generated
Machine Learning
Label Distribution Learning
🏢 Nanjing University of Science and Technology
Boosting label distribution learning accuracy and efficiency, this research proposes using ternary labels instead of binary labels to predict label distributions, thus enhancing annotation efficiency …
MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space
·2330 words·11 mins·
Computer Vision
Image Enhancement
🏢 Nanjing University of Science and Technology
MambaLLIE, a novel implicit Retinex-aware low-light enhancer built on a global-then-local state space, significantly outperforms existing CNN- and Transformer-based methods.
Long-tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction
·2225 words·11 mins·
Computer Vision
Object Detection
🏢 Nanjing University of Science and Technology
Dynamic Rebalancing Contrastive Learning with Dual Reconstruction (2DRCL) pre-training significantly boosts object detection accuracy, especially for underrepresented classes.
IMAGPose: A Unified Conditional Framework for Pose-Guided Person Generation
·2991 words·15 mins·
AI Generated
Computer Vision
Image Generation
🏢 Nanjing University of Science and Technology
IMAGPose: a unified framework that generates high-fidelity person images from single or multiple source images and poses, addressing the limitations of existing methods.
Facilitating Multimodal Classification via Dynamically Learning Modality Gap
·1770 words·9 mins·
Multimodal Learning
Vision-Language Models
🏢 Nanjing University of Science and Technology
Researchers dynamically integrate contrastive and supervised learning to overcome the modality imbalance problem in multimodal classification, significantly improving model performance.
DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting
·2536 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Nanjing University of Science and Technology
DoFIT: A novel domain-aware framework significantly reduces catastrophic forgetting in federated instruction tuning by finely aggregating overlapping weights and using a proximal perturbation initiali…
DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain
·2252 words·11 mins·
loading
·
loading
Computer Vision
3D Vision
🏢 Nanjing University of Science and Technology
DCDepth achieves state-of-the-art monocular depth estimation by progressively predicting depth in the frequency domain via DCT, capturing local correlations and global context effectively.
Continuous Heatmap Regression for Pose Estimation via Implicit Neural Representation
·2522 words·12 mins·
AI Generated
Computer Vision
3D Vision
🏢 Nanjing University of Science and Technology
NerPE performs continuous heatmap regression via implicit neural representation, resolving the accuracy-limiting quantization errors in human pose estimation and achieving sub-pixel precision.