# minicpm-v

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python
22024
10 days ago

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Python
700
1 month ago

[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Python
413
5 months ago

Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.

Python
155
5 months ago
C++
116
2 months ago

A custom ComfyUI node for MiniCPM vision-language models, supporting v4, v4.5, and v4 GGUF formats, enabling high-quality image captioning and visual analysis.

Python
109
1 month ago

Colaboratory sample for MiniCPM-V2.6, a lightweight VLM

Jupyter Notebook
4
1 year ago

PicQ: Demo for MiniCPM-o 2.6 to answer questions about images using natural language.

Python
4
9 months ago

VidiQA: Demo for MiniCPM-V 2.6 to answer questions about videos using natural language.

Python
2
10 months ago