#distributed-inference
FlashInfer: Kernel Library for LLM Serving
Cuda
3586
16 hours ago
Simple, scalable AI model deployment on GPU clusters
Python
3366
11 hours ago
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
C++
997
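1 month ago
A distributed LLM inference program based on llama.cpp that lets multiple computers on a local network work together to run inference on large language models. It provides a cross-platform desktop application UI built with Electron.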
JavaScript
58
24 days ago
Source code of the paper "Private Collaborative Edge Inference via Over-the-Air Computation".
Python
4
7 months ago
Official implementation of the ACM MM paper "Identity-Aware Attribute Recognition via Real-Time Distributed Inference in Mobile Edge Clouds". A distributed inference model for pedestrian attribute recognition with re-identification (Re-ID) in an MEC-enabled camera monitoring system, with joint training of pedestrian attribute recognition and Re-ID.
Python
2
5 years ago