🏢 Nanjing University of Science and Technology

Progressive Exploration-Conformal Learning for Sparsely Annotated Object Detection in Aerial Images

26 September 2024 · 2177 words · 11 mins

Computer Vision · Object Detection · 🏢 Nanjing University of Science and Technology

Progressive Exploration-Conformal Learning (PECL) improves sparsely annotated object detection in aerial images by adaptively selecting high-quality pseudo-labels, overcoming limitations of exis…

Predicting Label Distribution from Ternary Labels

26 September 2024 · 2190 words · 11 mins

AI Generated · Machine Learning · Label Distribution Learning · 🏢 Nanjing University of Science and Technology

To boost the accuracy and efficiency of label distribution learning, this research proposes predicting label distributions from ternary labels instead of binary labels, thus enhancing annotation efficiency …

MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space

26 September 2024 · 2330 words · 11 mins

Computer Vision · Image Enhancement · 🏢 Nanjing University of Science and Technology

MambaLLIE, an implicit Retinex-aware low-light enhancer built on a global-then-local state space, significantly outperforms existing CNN- and Transformer-based methods.

Long-tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction

26 September 2024 · 2225 words · 11 mins

Computer Vision · Object Detection · 🏢 Nanjing University of Science and Technology

Dynamic Rebalancing Contrastive Learning with Dual Reconstruction (2DRCL) pre-training significantly boosts object detection accuracy, especially for underrepresented classes.

IMAGPose: A Unified Conditional Framework for Pose-Guided Person Generation

26 September 2024 · 2991 words · 15 mins

AI Generated · Computer Vision · Image Generation · 🏢 Nanjing University of Science and Technology

IMAGPose is a unified framework that generates high-fidelity person images from single or multiple source images and poses, addressing the limitations of existing methods.

Facilitating Multimodal Classification via Dynamically Learning Modality Gap

26 September 2024 · 1770 words · 9 mins

Multimodal Learning · Vision-Language Models · 🏢 Nanjing University of Science and Technology

Researchers dynamically integrate contrastive and supervised learning to overcome the modality imbalance problem in multimodal classification, significantly improving model performance.

DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting

26 September 2024 · 2536 words · 12 mins

AI Generated · Natural Language Processing · Large Language Models · 🏢 Nanjing University of Science and Technology

DoFIT, a domain-aware framework, significantly reduces catastrophic forgetting in federated instruction tuning by finely aggregating overlapping weights and using a proximal perturbation initiali…

DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain

26 September 2024 · 2252 words · 11 mins

Computer Vision · 3D Vision · 🏢 Nanjing University of Science and Technology

DCDepth achieves state-of-the-art monocular depth estimation by progressively predicting depth in the frequency domain via DCT, capturing local correlations and global context effectively.

Continuous Heatmap Regression for Pose Estimation via Implicit Neural Representation

26 September 2024 · 2522 words · 12 mins

AI Generated · Computer Vision · 3D Vision · 🏢 Nanjing University of Science and Technology

NerPE performs continuous heatmap regression via implicit neural representation, resolving the accuracy-limiting quantization errors in human pose estimation and achieving sub-pixel precision.