inference-server

RamaLama is an open-source developer tool that simplifies the local serving of AI models from any source and facilitates their use for inference in production, all through the familiar language of containers.

Python · 2052 stars · updated 2 hours ago

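As a sketch of what that local serving looks like in practice: assuming `ramalama serve` exposes an OpenAI-compatible chat endpoint on localhost:8080 (the port, path, and model name below are assumptions, not documented defaults), a client call could look like this:

```python
# A minimal sketch, assuming `ramalama serve` exposes an
# OpenAI-compatible chat endpoint on localhost:8080; the port,
# path, and model name are assumptions, not documented defaults.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "granite",  # hypothetical model name
        "messages": [
            {"role": "user", "content": "Summarize what an inference server does."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```
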
A REST API for Caffe using Docker and Go

C++ · 419 stars · updated 7 years ago

A no-code object detection inference API using the YOLOv3 and YOLOv4 Darknet framework.

Python · 280 stars · updated 3 years ago

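No-code detection APIs like this are typically exercised by posting an image and reading back detections as JSON; a minimal sketch, where the route, port, and form field name are hypothetical:

```python
# Hypothetical call to a no-code detection API: POST an image,
# read back detections as JSON. The route, port, and form field
# name are illustrative assumptions.
import requests

with open("street.jpg", "rb") as f:
    resp = requests.post(
        "http://localhost:4343/detect",  # hypothetical route and port
        files={"image": f},
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())  # e.g. a list of {label, confidence, bounding box} entries
```
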
Work with LLMs in a local environment using containers

TypeScript · 249 stars · updated 4 hours ago

ONNX Runtime Server: a server that provides TCP and HTTP/HTTPS REST APIs for ONNX inference.

C++ · 166 stars · updated 7 days ago

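A minimal sketch of calling a REST inference endpoint like this from Python; the URL path, model name, and payload shape here are illustrative assumptions rather than the server's documented contract:

```python
# Hypothetical REST call to an ONNX inference endpoint. The path,
# model name, and payload layout are illustrative assumptions;
# the server's own API docs define the real contract.
import requests

payload = {"inputs": {"input": [[1.0, 2.0, 3.0, 4.0]]}}  # one 4-feature row
resp = requests.post(
    "http://localhost:8001/v1/models/my_model/predict",  # hypothetical path
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```
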
Serving AI/ML models in the open standard formats PMML and ONNX with both HTTP (REST API) and gRPC endpoints

Scala · 160 stars · updated 10 months ago

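Unlike tensor-oriented ONNX inputs, PMML models consume named feature fields, so scoring requests carry records rather than arrays; a sketch with a hypothetical endpoint, model name, and field names:

```python
# Sketch of scoring a PMML model over REST: PMML inputs are named
# feature fields rather than tensors, so the request carries records.
# The endpoint path, model name, and field names are hypothetical.
import requests

record = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}
resp = requests.post(
    "http://localhost:9090/v1/models/iris:predict",  # hypothetical path
    json={"records": [record]},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```
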
K3ai is a lightweight, fully automated AI-infrastructure-in-a-box solution that lets anyone experiment quickly with Kubeflow pipelines. It runs on anything from edge devices to laptops.

PowerShell · 101 stars · updated 4 years ago

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch). Includes a PyTorch -> ONNX -> TensorRT converter and inference pipelines (TensorRT, and Triton serving multiple formats). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX.

Python · 33 stars · updated 4 years ago
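
Clients usually talk to Triton through its client library; a minimal sketch using tritonclient's HTTP API, where localhost:8000 is Triton's default HTTP port and the model name, tensor names, shape, and FP32 dtype are assumptions for illustration:

```python
# Minimal Triton HTTP client sketch (pip install tritonclient[http]).
# localhost:8000 is Triton's default HTTP port; the model name,
# tensor names, shape, and FP32 dtype are assumptions for illustration.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request: one normalized image batch as a float32 tensor.
image = np.random.rand(1, 3, 608, 608).astype(np.float32)
inp = httpclient.InferInput("input", list(image.shape), "FP32")
inp.set_data_from_numpy(image)
out = httpclient.InferRequestedOutput("output")

result = client.infer(model_name="craft", inputs=[inp], outputs=[out])
print(result.as_numpy("output").shape)  # score maps for text detection
```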