#

serving

A flexible, high-performance serving system for machine learning models

C++
6327
4 days ago
Go
4989
6 days ago

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Go
4643
1 day ago

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

4377
1 year ago
Java
4348
2 months ago

The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.

Python
3579
3 days ago

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

Python
3522
5 days ago

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python
3284
9 days ago
ray-project/llm-applications

A comprehensive guide to building RAG-based LLM applications for production.

Jupyter Notebook
1832
1 year ago

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

Java
1684
5 days ago

DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/

Python
1599
6 months ago

A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)

C++
916
5 months ago

Generic and easy-to-use serving service for machine learning models

JavaScript
760
6 months ago

A unified end-to-end machine intelligence platform

Python
537
1 year ago

A high-performance inference system for large language models, designed for production environments.

C++
472
11 days ago