gpt-4v

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

image-classification image-text-retrieval llm semantic-segmentation video-classification vision-language-model vit-22b vit-6b multi-modal gpt gpt-4v gpt-4o

Python

9284

720

13 days ago

open-compass / VLMEvalKit

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

gpt-4v large-language-models llava multi-modal openai vqa llm openai-api qwen gpt computer-vision PyTorch gpt4 ChatGPT clip vit evaluation claude gemini

Python

3122

513

7 days ago

ShareGPT4Omni / ShareGPT4Video

[NeurIPS 2024] An official implementation of "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions"

ChatGPT gpt gpt-4v large-language-models large-multimodal-models large-vision-language-models sora text-to-video

Python

1075

1 year ago

RLHF-V / RLAIF-V

[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

chatbot gpt-4v multimodal llava minicpm-v

Python

413

5 months ago

zli12321 / Vision-Language-Models-Overview

A frontier collection and survey of vision-language model papers and model GitHub repositories. Continuously updated.

blip2 claude clip deepseek gemini-pro gpt-4v llava multimodal-models reinforcement-learning world-models

389

9 days ago

tianyi-lab / HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

benchmark vlms gpt-4 gpt-4v llava benchmarks hallucination llm lmm large-language-models large-vision-language-models

Python

300

1 year ago

ShareGPT4Omni / ShareGPT4V

[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

ChatGPT gpt gpt-4v gpt4v instruction-tuning language-model large-language-models large-multimodal-models large-vision-language-models vision-language-model eccv2024

Python

236

1 year ago

davideuler / awesome-assistant-api

Try OpenAI Assistant API apps on Google Colab for free. Awesome Assistant API demos!

assistant ChatGPT dalle-3 function-calling gpt-4-turbo gpt-4v assistant-api Example

Jupyter Notebook

218

2 years ago

yachty66 / gpt_pdf_md

🚀 gpt_pdf_md: Convert PDF to Markdown with GPT-4V & more. Extract images, upload to Google Cloud, & generate Markdown with images. Python, GPT-4V Vision, Scala. Ideal for developers, researchers. PDF to Markdown, GPT-4V, image extraction, Python package

ai gpt-4v Markdown pdf Python

Scala

2 years ago