image-text-matching

NVlabs / GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

image-text-matching transformers zero-shot-learning semantic-segmentation

Python

770

3 years ago

slavabarkov / tidy

Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine

Android clip computer-vision deep-learning image-retrieval Kotlin natural-language-processing onnx quantization image-text-retrieval cross-modal-retrieval image-text-matching image-search semantic-search

Kotlin

482

2 years ago

Paranioar / Awesome_Matching_Pretraining_Transfering

The Paper List of Large Multi-Modality Model (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.

cross-modal-retrieval tutorial Awesome Lists image-text-matching image-text-retrieval large-language-models large-vision-language-models multimodal-pretraining parameter-efficient-fine-tuning vision-and-language multimodal-large-language-models llm text-to-image-generation text-to-image-synthesis text-to-video-generation

430

9 days ago

Paranioar / SGRAF

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

cross-modal-retrieval image-text-matching image-retrieval image-text-retrieval text-matching aaai

Python

219

1 year ago

woodfrog / vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)

image-text-matching cross-modal-retrieval vision-language PyTorch

Python

160

1 month ago

kywen1119 / DSRAN

Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

PyTorch image-text-matching cross-modal computer-vision

Python

3 years ago

naver-ai / eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

cross-modal-retrieval dataset deep-learning eccv2022 evaluation image-text-matching machine-learning vision-and-language

Python

2 years ago

jaisidhsingh / LoRA-CLIP

Easy wrapper for inserting LoRA layers in CLIP.

image-text-matching lora multimodal multimodal-deep-learning parameter-efficient-tuning vision-language-pretraining

Python

1 year ago

jaisidhsingh / CoN-CLIP

Implementation of the "Learn No to Say Yes Better" paper.

compositionality deep-learning image-text-matching multimodal PyTorch visual-language-models

Python

8 days ago

eric-ai-lab / ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

blip2 causality clip compositionality image-text-matching image-text-retrieval vision-and-language

Python

1 year ago

weiyx16 / CLIP-pytorch

A non-JIT version implementation / replication of CLIP of OpenAI in pytorch

clip PyTorch image-text-matching

Python

5 years ago

Paranioar / RCAR

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

cross-modal-retrieval image-text-matching image-retrieval image-text-retrieval text-matching tip

Python

1 year ago

MartinYuanNJU / SEMScene

Code implementation of paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval".

image-text-matching cross-modal-retrieval

Python

1 year ago

JinhaoLee / WCA

[ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models

vision-language-model deep-learning image-text-matching large-language-models visual-prompting zero-shot-classification

Python

1 year ago

alipay / PC2-NoiseofWeb

Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.

benchmark cross-modal-retrieval dataset image-text-matching image-text-retrieval multimodal-learning

Python

2 months ago

nhtlongcs / AIC2022-VER

Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding

image-text-matching retrieval PyTorch pytorch-lightning

Python

2 years ago

zabir-nabil / bangla-image-search

A dead-simple image search / retrieval and image-text matching system for Bangla using CLIP

clip deep-learning image-search image-search-engine search search-engine image-retrieval image-text-matching

Python

2 years ago

Paranioar / DBL

[TIP2024] The code of “Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching”

cross-modal-retrieval image-retrieval image-text-matching image-text-retrieval text-matching tip

Python

1 year ago

zabir-nabil / bangla-CLIP

CLIP (Contrastive Language–Image Pre-training) for Bangla.

clip image-retrieval image-text-matching

Python

1 year ago

cuiaiyu / Text-to-Image-ReIdentification

Unofficial code of paper "Improving description-based person re-identification by multi-granularity image-text alignment." by Niu et al. (partially implemented)

PyTorch re-identification image-text-matching

Jupyter Notebook

3 years ago