Repository navigation

#

visual-commonsense-reasoning

Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)

Python
468
4 年前

This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation

Python
16
1 年前

Vision-Zephyr: a multimodal LLM for Visual Commonsense Reasoning—CLIP-ViT + Zephyr-7B with visual prompting; code, training scripts, and VCR evaluation.

Python
1
1 个月前