minicpm-v

OpenBMB / MiniCPM-V

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

minicpm minicpm-v multi-modal

Python

22024

1650

10 days ago

PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrained models and a diffusion model toolbox. Equipped with high performance and flexibility.

aigc stable-diffusion clip image-to-text text-to-image controlnet multimodal text-to-video dit llava sora qwen2-vl minicpm-v

Python

700

221

1 month ago

RLHF-V / RLAIF-V

[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

chatbot gpt-4v multimodal llava minicpm-v

Python

413

5 months ago

NetEase-Media / grps_trtllm

A higher-performance OpenAI LLM service than vLLM serve: a pure C++ high-performance OpenAI LLM service implemented with GRPS + TensorRT-LLM + Tokenizers.cpp, supporting chat and function calling, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.

llm openai tensorrt-llm chatglm llama3 qwen2 function-call ai-agent llama-index multi-modal deepseek-r1 phi qwq qwen2-vl minicpm-v internvl qwen3

Python

155

5 months ago

AXERA-TECH / ax-llm

Explore LLM model deployment based on AXera's AI chips

edge-computing huggingface llm transformer gemma2 llama3 minicpm minicpm-v qwen2 vlm

C++

116

2 months ago

1038lab / ComfyUI-MiniCPM

A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.

comfyui custom-nodes gguf llama-cpp stable-diffusion minicpm minicpm-v

Python

109

1 month ago