# visual-question-answering

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Jupyter Notebook
5509
1 year ago

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Python
2536
1 year ago

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Jupyter Notebook
1457
3 years ago

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch

Python
1265
3 years ago

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Python
969
3 years ago

Bilinear attention networks for visual question answering

Python
545
2 years ago

Deep Modular Co-Attention Networks for Visual Question Answering

Python
454
5 years ago

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Jupyter Notebook
348
4 years ago

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Jupyter Notebook
335
6 days ago

A lightweight, scalable, and general framework for visual question answering research

Python
327
4 years ago

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

Python
276
4 months ago

Strong baseline for visual question answering

Python
241
3 years ago

[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.

Python
210
1 year ago

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

Python
196
1 year ago

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

Python
175
1 year ago

PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR'17

Python
163
7 years ago