on-device-llms

prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters

C++
473
4 days ago