Image Classification

Attention IoU: Examining Biases in CelebA using Attention Maps

25 March 2025·3919 words·19 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Princeton University

Attention-IoU reveals model biases by analyzing attention maps, offering insights beyond dataset labels and improving debiasing techniques.

CLS-RL: Image Classification with Rule-Based Reinforcement Learning

20 March 2025·2967 words·14 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Shanghai AI Laboratory

CLS-RL: Rule-based RL tackles catastrophic forgetting in MLLM image classification, outperforming SFT with better generalization and efficiency.

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

6 February 2025·4088 words·20 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Johns Hopkins University

Smaller image patches improve vision transformer performance, defying conventional wisdom and revealing a new scaling law for enhanced visual understanding.

iFormer: Integrating ConvNet and Transformer for Mobile Application

26 January 2025·7046 words·34 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Shanghai Jiao Tong University

iFormer: A new family of mobile hybrid vision networks that expertly blends ConvNeXt’s fast local feature extraction with the efficient global modeling of self-attention, achieving top-tier accuracy a…

MLLM-as-a-Judge for Image Safety without Human Labeling

31 December 2024·6596 words·31 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Meta AI

CLUE, a novel method, achieves zero-shot image safety judgment with MLLMs by objectifying safety rules, significantly reducing the need for human labeling.

Learned Compression for Compressed Learning

12 December 2024·2966 words·14 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 University of Texas at Austin

WaLLOC: a novel neural codec boosts compressed-domain learning by combining wavelet transforms with asymmetric autoencoders, achieving high compression ratios with minimal computation and uniform dime…

EMOv2: Pushing 5M Vision Model Frontier

9 December 2024·6258 words·30 mins

AI Generated 🤗 Daily Papers Computer Vision Image Classification 🏢 Tencent AI Lab

EMOv2 achieves state-of-the-art performance in various vision tasks using a novel Meta Mobile Block, pushing the 5M parameter lightweight model frontier.