#on-device-llms
prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters
C++
997
1 month ago
[ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
Python
29
21 days ago
Multi-agent workflows with Llama3: A private on-device multi-agent framework
Python
3
1 year ago