#vision-language-transformer

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python
7888
8 months ago

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Jupyter Notebook
5196
8 months ago
Python
693
2 years ago

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python
563
1 year ago

[ICCV 2021 & TPAMI 2023] Vision-Language Transformer and Query Generation for Referring Segmentation

Python
353
3 years ago
Jupyter Notebook
31
3 years ago

[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

Python
24
1 year ago

A collection of VLM papers, blogs, and projects, with a focus on VLMs in autonomous driving and related reasoning techniques.

10
5 months ago

Mini-batch selective sampling for knowledge adaptation of VLMs for mammography.

Jupyter Notebook
1
6 months ago