# visual-commonsense-reasoning
Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)
Python
468
4 years ago
This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation
Python
16
1 year ago
A list of research papers on knowledge-enhanced multimodal learning
7
3 years ago
Vision-Zephyr: a multimodal LLM for Visual Commonsense Reasoning that combines CLIP-ViT and Zephyr-7B with visual prompting; includes code, training scripts, and VCR evaluation.
Python
1
1 month ago