text-image-retrieval

Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]

Python
919
1 year ago

New generation of CLIP with fine-grained discrimination capability, ICML 2025

Python
276
22 days ago

The back-end of a cross-modal retrieval system, which will contain services such as semantic location, etc.

Python
65
3 years ago

PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k

Python
12
1 year ago

[ACMMM'25] Referring Expression Instance Retrieval and A Strong End-to-End Baseline

6
1 month ago

The LLM-Powered Video Search System is an advanced multimodal video search solution that leverages Large Language Models (LLMs) to enhance video retrieval through text, image, and metadata queries.

Jupyter Notebook
5
2 months ago

PIMA - A Novel Approach for Pill-Prescription Matching with GNN Assistance and Contrastive Learning

Jupyter Notebook
3
3 years ago

A search engine built on the OpenAI CLIP model that retrieves images matching textual queries.

Jupyter Notebook
1
1 year ago

Digimon Dataset for MultiModal Machine Learning

Python
0
2 years ago

VisAlign: Aligning Visual Representations with Textual Semantics for Image Similarity and Retrieval

Jupyter Notebook
0
3 months ago