Repository navigation

llava

Website
Wikipedia

ollama / ollama

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

llama 大语言模型 llama2 Go ollama mistral gemma llama3 llava phi4 deepseek gemma3 qwen gemma3n gpt-oss

150512

12858

2 小时前

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

gpt-4 聊天机器人 ChatGPT llama multimodal llava foundation-models instruction-tuning multi-modality visual-language-learning llama-2 llama2 vision-language-model

Python

23330

2585

1 年前

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

CUDA inference llama llava 大语言模型 llm-serving moe PyTorch transformer vlm llama3 deepseek deepseek-v3 deepseek-r1 qwen3 llama4 blackwell llama5 openai kimi

Python

17021

2662

13 分钟前

Fanghua-Yu / SUPIR

SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.

深度学习 diffusion-models llava sdxl stable-diffusion super-resolution restoration PyTorch pytorch-lightning

Python

5187

442

3 个月前

InternLM / xtuner

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

baichuan chatglm2 internlm large-language-models llama2 大语言模型 llm-training peft qwen 聊天机器人 conversational-ai agent chatglm3 llava mixtral llama3 phi3

Python

4700

354

7 天前

yuanzhoulvpi2017 / zero_nlp

中文nlp解决方案(大模型、数据、模型、训练、推理)

bert 自然语言处理 transformers gpt2 chatglm-6b clip gpt PyTorch text-generation huggingface-transformers llama2 llama llava

Jupyter Notebook

3615

428

15 天前

SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

聊天机器人 gpt llama llamacpp 大语言模型 semantic-kernel llava multi-modal llama2 llama3 llama-cpp

3323

461

3 天前

open-compass / VLMEvalKit

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

gpt-4v large-language-models llava multi-modal openai vqa 大语言模型 openai-api qwen gpt 机器视觉 PyTorch gpt4 ChatGPT clip vit evaluation claude gemini

Python

2921

476

6 天前

om-ai-lab / OmAgent

Build multimodal language agents for fast prototype and production

large-language-models multimodal-agent vision-and-language agent workflow 聊天机器人 gpt4 大语言模型 multimodal rag vlm gpt gradio llama llava openai Python gemini

Python

2542

284

5 个月前

chenking2020 / FindTheChatGPTer

ChatGPT爆火，开启了通往AGI的关键一步，本项目旨在汇总那些ChatGPT的开源平替们，包括文本大模型、多模态大模型等，为大家提供一些便利

chatglm llama belle vicuna ChatGPT alpaca guanaco lora llava minigpt4 autogpt agi ceval baichuan llama2

2034

200

2 年前

Blaizzy / mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

llava 大语言模型 MLX vision-transformer apple-silicon idefics local-ai paligemma vision-framework vision-language-model florence2 molmo pixtral

Python

1581

166

7 小时前

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

聊天机器人 clip gpt-4 llama llava vicuna vision-language vision-language-pretraining

Python

1422

117

15 天前

unum-cloud / uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

huggingface-transformers language-vision multimodal PyTorch semantic-search transformer cross-attention vector-search bert 神经网络 pretrained-models multi-lingual clip openai contrastive-learning representation-learning clustering image-search llava

Python

1161

2 个月前

jhc13 / taggui

Tag manager and captioner for image datasets

image-captioning pyside6 stable-diffusion llava cogvlm florence-2

Python

1089

3 个月前

gokayfem / awesome-vlm-architectures

Famous Vision Language Models and Their Architectures

clip llava vlm multimodal blip cogvlm internlm kosmos vision-language-model Awesome Lists

Markdown

984

6 个月前

TinyLLaVA / TinyLLaVA_Factory

A Framework of Small-scale Large Multimodal Models

large-multimodal-models llama llava 自然语言处理 transformers vision-language

Python

875

4 个月前

NVlabs / EAGLE

Eagle: Frontier Vision-Language Models with Data-Centric Strategies

Demo gpt4 huggingface llama llama3 llava lmm mllm 大语言模型 large-language-models

Python

853

11 天前

mbzuai-oryx / LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

conversation llama3 llava 大语言模型 lmms phi3 vision-language llama-3-llava llama-3-vision llama3-llava phi-3-vision phi3-vision

Python

840

15 天前

PsyChip / machina

OpenCV+YOLO+LLAVA powered video surveillance system

camera llava ollama-api OpenCV Python rtsp yolo

Python

771

16 天前

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

aigc stable-diffusion clip image-to-text text-to-image controlnet multimodal text-to-video dit llava sora qwen2-vl minicpm-v

Python

687

218

13 天前