#large-vision-language-model

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python
3227
5 months ago

Mixture-of-Experts for Large Vision-Language Models

Python
2148
5 months ago

🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies

Python
194
5 days ago

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

Python
175
7 months ago
Python
82
6 months ago

[NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Python
68
4 months ago

✨✨ Latest advancements in VLA (Vision-Language-Action) models

56
5 days ago

🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".

Python
32
1 month ago

This is the official repository of LLAVIDAL

Python
13
1 month ago

[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.

Jupyter Notebook
4
1 month ago

Easy-to-use large vision language model pipeline for quantitative analysis

Python
2
2 months ago