#large-vision-language-model

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python
3339
9 months ago

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

Python
2218
1 month ago

🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies

Python
217
4 months ago

[NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"

Python
188
1 year ago

Python
98
10 months ago

✨✨ Latest advancements in VLA (Vision-Language-Action) models

77
4 months ago

[NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

Python
75
9 months ago

🔎 Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".

Python
40
5 months ago

This is the official repository of LLAVIDAL.

Python
17
5 months ago

[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.

Jupyter Notebook
5
5 months ago

Source code for our ACL 2025 paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging"

Python
4
4 months ago

Easy-to-use large vision-language model pipeline for quantitative analysis

Python
3
4 months ago

Python
3
13 days ago

Official implementation of the TCSVT 2025 paper "DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model"

Python
0
3 months ago