#serving

A flexible, high-performance serving system for machine learning models

C++
6313
1 hour ago
Go
4888
10 hours ago

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models

Go
4606
8 hours ago

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

4368
9 months ago
Java
4350
13 days ago

The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.

Python
3507
8 days ago

High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

Python
3455
11 hours ago

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python
2897
4 minutes ago
ray-project/llm-applications

A comprehensive guide to building RAG-based LLM applications for production.

Jupyter Notebook
1817
1 year ago

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

Java
1672
16 hours ago

DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/

Python
1595
4 months ago

A flexible, high-performance carrier for machine learning models (serving deployment framework for 『飞桨』 PaddlePaddle)

C++
914
3 months ago

Generic and easy-to-use serving service for machine learning models

JavaScript
760
5 months ago
C++
750
33 minutes ago

A unified end-to-end machine intelligence platform

Python
538
1 year ago

A high-performance inference system for large language models, designed for production environments.

C++
460
25 days ago