cross-modal-retrieval

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Python
969
3 years ago

Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine

Kotlin
482
2 years ago

New generation of CLIP with fine-grained discrimination capability, ICML 2025

Python
306
6 days ago

TOMM2020 Dual-Path Convolutional Image-Text Embedding with Instance Loss 🐾 https://arxiv.org/abs/1711.05535

MATLAB
296
9 months ago

[AAAI 2021] Code for "Similarity Reasoning and Filtration for Image-Text Matching"

Python
219
1 year ago

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)

Python
160
1 month ago

Deep Supervised Cross-modal Retrieval (CVPR 2019, PyTorch Code)

Python
144
6 years ago

[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

Python
140
1 year ago

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Python
136
1 year ago

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)

Python
135
2 years ago

Official PyTorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021)

Python
133
2 years ago

[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

Python
122
9 months ago

Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)

Python
120
10 months ago

PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction

Python
98
3 years ago

Unsupervised Contrastive Cross-modal Hashing (IEEE TPAMI 2023, PyTorch Code)

Python
65
2 years ago

[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval

Python
59
1 year ago

[CVPR 2020, Oral] "Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020.

Python
59
5 years ago