Cross-Modal Retrieval
Semantic Feature Learning for Universal Unsupervised Cross-Domain Retrieval
·1809 words·9 mins·
Computer Vision
Cross-Modal Retrieval
🏢 Northwestern University
Universal Unsupervised Cross-Domain Retrieval (U2CDR) framework learns semantic features to enable accurate retrieval even when category spaces differ across domains.
NeuroBOLT: Resting-state EEG-to-fMRI Synthesis with Multi-dimensional Feature Mapping
·2012 words·10 mins·
Multimodal Learning
Cross-Modal Retrieval
🏢 Vanderbilt University
NeuroBOLT synthesizes resting-state fMRI signals from EEG recordings via multi-dimensional feature mapping.
Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures
·2736 words·13 mins·
AI Generated
Multimodal Learning
Cross-Modal Retrieval
🏢 Oregon State University
Unaligned multimodal mixtures’ shared components are identifiable under mild conditions using a distribution-matching approach, relaxing assumptions of existing methods.
How Molecules Impact Cells: Unlocking Contrastive PhenoMolecular Retrieval
·3595 words·17 mins·
Multimodal Learning
Cross-Modal Retrieval
🏢 University of Toronto
MolPhenix, a novel multi-modal model, drastically improves zero-shot molecular retrieval by leveraging a pre-trained phenomics model and a novel similarity-aware loss, achieving an 8.1x improvement ov…
Exploiting Descriptive Completeness Prior for Cross Modal Hashing with Incomplete Labels
·2505 words·12 mins·
Multimodal Learning
Cross-Modal Retrieval
🏢 Harbin Institute of Technology, Shenzhen
PCRIL, a novel prompt contrastive recovery approach, significantly boosts cross-modal hashing accuracy, especially when dealing with incomplete labels by progressively identifying promising positive c…
Empowering Visible-Infrared Person Re-Identification with Large Foundation Models
·2429 words·12 mins·
AI Generated
Multimodal Learning
Cross-Modal Retrieval
🏢 National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University
Large foundation models empower visible-infrared person re-identification by enriching infrared image representations with automatically generated textual descriptions, significantly improving cross-m…
Diffusion-Inspired Truncated Sampler for Text-Video Retrieval
·2366 words·12 mins·
Multimodal Learning
Cross-Modal Retrieval
🏢 Rochester Institute of Technology
Diffusion-Inspired Truncated Sampler (DITS) improves text-video retrieval by progressively aligning text and video embeddings and enhancing the structure of the CLIP embedding space, achieving state-of-the-art results.
An End-To-End Graph Attention Network Hashing for Cross-Modal Retrieval
·1722 words·9 mins·
Multimodal Learning
Cross-Modal Retrieval
🏢 Hebei Normal University
EGATH: End-to-End Graph Attention Network Hashing revolutionizes cross-modal retrieval by combining CLIP, transformers, and graph attention networks for superior semantic understanding and hash code g…