#large-vision-language-model

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python
3367
10 months ago

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

Python
2247
3 months ago

🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies

Python
220
6 months ago

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"

Python
196
1 year ago
Python
103
1 year ago

✨✨ Latest advancements in VLA (Vision Language Action) models

86
6 months ago

[NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Python
76
10 months ago

🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".

Python
44
7 months ago

This is the official repository of LLAVIDAL

Python
19
9 days ago

[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.

Jupyter Notebook
5
6 months ago

Source code of our paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging", ACL 2025

Python
5
5 months ago

Easy-to-use large vision language model pipeline for quantitative analysis

Python
3
5 months ago
Python
3
2 months ago

Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model

Python
1
5 months ago