local-inference
High-speed Large Language Model Serving for Local Deployment
C++
8178
2 months ago
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
Python
208
5 months ago
Tool for testing different large language models without code.
JavaScript
18
3 months ago
Local LLM Inference Library
Pascal
12
3 months ago
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
Python
5
1 year ago
Script that performs RAG and uses a local LLM for Q&A.
Python
0
7 months ago
Script that takes a .wav audio file, performs speech-to-text using OpenAI/Whisper, and then uses Llama3 to produce a summary and action points from the generated transcript.
Python
0
7 months ago