text-image-retrieval

Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]

Python
895
9 months ago

The back-end of a cross-modal retrieval system, which will contain services such as semantic location, etc.

Python
64
3 years ago

A multimodal image search engine built on the GME model, capable of handling diverse input types. Whether you're querying with text, images, or both, it provides powerful and flexible image retrieval under arbitrary inputs. Perfect for research and demos.

Python
25
4 months ago

PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k

Python
10
1 year ago

PIMA - A Novel Approach for Pill-Prescription Matching with GNN Assistance and Contrastive Learning

Jupyter Notebook
3
2 years ago

The LLM-Powered Video Search System is an advanced multimodal video search solution that leverages Large Language Models (LLMs) to enhance video retrieval through text, image, and metadata queries.

Jupyter Notebook
3
3 months ago

A search engine built on the OpenAI CLIP model that retrieves images matching text queries.

Jupyter Notebook
1
1 year ago

Digimon Dataset for MultiModal Machine Learning

Python
0
2 years ago