minicpm-v

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python
19252
2 months ago

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Python
621
2 days ago

[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness

Python
351
2 months ago

Higher performance OpenAI LLM service than vLLM serve: A pure C++ high-performance OpenAI LLM service implemented with GRPS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function call, AI agents, distributed multi-GPU inference, multimodal capabilities, and a Gradio chat interface.

Python
130
2 days ago
C++
100
20 days ago

PicQ: Demo for MiniCPM-o 2.6 to answer questions about images using natural language.

Python
4
3 months ago

Colaboratory sample for MiniCPM-V2.6, a lightweight VLM

Jupyter Notebook
3
8 months ago

VidiQA: Demo for MiniCPM-V 2.6 to answer questions about videos using natural language.

Python
2
5 months ago