Image Classification
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
·4088 words·20 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Johns Hopkins University
Smaller image patches improve vision transformer performance, defying conventional wisdom and revealing a new scaling law for enhanced visual understanding.
iFormer: Integrating ConvNet and Transformer for Mobile Application
·7046 words·34 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Shanghai Jiao Tong University
iFormer: A new family of mobile hybrid vision networks that expertly blends ConvNeXt’s fast local feature extraction with the efficient global modeling of self-attention, achieving top-tier accuracy a…
MLLM-as-a-Judge for Image Safety without Human Labeling
·6596 words·31 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Meta AI
Zero-shot image safety judgment is achieved using MLLMs and a novel method called CLUE, objectifying safety rules, and significantly reducing the need for human labeling.
Learned Compression for Compressed Learning
·2966 words·14 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 University of Texas at Austin
WaLLOC: a novel neural codec boosts compressed-domain learning by combining wavelet transforms with asymmetric autoencoders, achieving high compression ratios with minimal computation and uniform dime…
EMOv2: Pushing 5M Vision Model Frontier
·6258 words·30 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Tencent AI Lab
EMOv2 achieves state-of-the-art performance in various vision tasks using a novel Meta Mobile Block, pushing the 5M parameter lightweight model frontier.