🏢 School of Computer Science and Engineering, Southeast University

Vision-Language Models are Strong Noisy Label Detectors

26 September 2024·2173 words·11 mins

Multimodal Learning · Vision-Language Models · 🏢 School of Computer Science and Engineering, Southeast University

Vision-language models effectively detect noisy labels, improving image classification accuracy with DEFT.

Multi-Label Open Set Recognition

26 September 2024·1580 words·8 mins

Machine Learning · Deep Learning · 🏢 School of Computer Science and Engineering, Southeast University

SLAN, a novel approach to multi-label open-set recognition, enriches sub-labeling information with structural data to identify unknown labels.

Multi-Instance Partial-Label Learning with Margin Adjustment

26 September 2024·3339 words·16 mins

AI Generated · Machine Learning · Semi-Supervised Learning · 🏢 School of Computer Science and Engineering, Southeast University

MIPLMA, a novel algorithm, enhances multi-instance partial-label learning by dynamically adjusting margins for attention scores and predicted probabilities, leading to superior performance.

Linearly Decomposing and Recomposing Vision Transformers for Diverse-Scale Models

26 September 2024·2125 words·10 mins

Computer Vision · Image Classification · 🏢 School of Computer Science and Engineering, Southeast University

Linearly decomposing and recomposing Vision Transformers efficiently creates diverse-scale models, reducing computational cost and improving flexibility across applications.

Initializing Variable-sized Vision Transformers from Learngene with Learnable Transformation

26 September 2024·2536 words·12 mins

AI Generated · Computer Vision · Image Classification · 🏢 School of Computer Science and Engineering, Southeast University

LeTs: Learnable Transformation efficiently initializes variable-sized Vision Transformers by learning adaptable transformations from a compact learngene module, outperforming from-scratch training.

Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition

26 September 2024·2302 words·11 mins

Machine Learning · Semi-Supervised Learning · 🏢 School of Computer Science and Engineering, Southeast University

CCL, a novel probabilistic framework, uses continuous contrastive learning to excel in long-tailed semi-supervised recognition, surpassing prior state-of-the-art methods by over 4%.

Cluster-Learngene: Inheriting Adaptive Clusters for Vision Transformers

26 September 2024·3088 words·15 mins

AI Generated · Computer Vision · Vision-Language Models · 🏢 School of Computer Science and Engineering, Southeast University

Cluster-Learngene efficiently initializes elastic-scale Vision Transformers by adaptively clustering and inheriting key modules from a large ancestry model, saving resources and boosting downstream ta…

A Motion-aware Spatio-temporal Graph for Video Salient Object Ranking

26 September 2024·2245 words·11 mins

Computer Vision · Video Understanding · 🏢 School of Computer Science and Engineering, Southeast University

A novel motion-aware spatio-temporal graph model surpasses existing methods in video salient object ranking by jointly optimizing multi-scale spatial and temporal features, thus accurately prioritizin…