serving
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A flexible, high-performance serving system for machine learning models
AI + Data, online. https://vespa.ai
A Cloud Native Batch System (Project under CNCF)
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Serve, optimize and scale PyTorch models in production
⚡️ An easy-to-use and fast deep learning model deployment toolkit for ☁️ cloud, 📱 mobile, and 📹 edge. Covers 20+ mainstream image, video, text, and audio scenarios and 150+ SOTA models, with end-to-end optimization and multi-platform, multi-framework support.
Deploy high-performance AI models and inference pipelines on FastAPI with built-in batching, streaming and more.
Database system for AI-powered apps
TensorFlow template application for deep learning
A comprehensive guide to building RAG-based LLM applications for production.
DELTA is a deep-learning-based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.
A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)
Generic and easy-to-use serving service for machine learning models
A scalable inference server for models optimized with OpenVINO™
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.