#model-inference-service

bentoml/BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python
7995
1 day ago

CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification and reverse image search

Jupyter Notebook
64
22 days ago

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

Python
45
1 year ago

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

Jupyter Notebook
17
1 year ago

SPIRA Serving Predictor v1 by @daitamae and @vitorguidi

Python
2
2 years ago