
cpu-inference

Runs LLaMA at extremely high speed

C++
92
2 years ago

Portable LLM - A Rust library for LLM inference

Rust
9
1 year ago

eLLM infers Qwen3-480B on a CPU in real time

Rust
7
2 days ago

Wrapper for simplified use of Llama2 GGUF quantized models.

Python
7
2 years ago

Visual Basic .NET
4
9 months ago

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.

Shell
2
5 months ago
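The core measurement behind a benchmarking toolkit like the one above is generation throughput in tokens per second. A minimal sketch of that measurement, with a hypothetical `generate_tokens` stand-in for a real model (not from the repository):

```python
import time

def generate_tokens(n):
    # Hypothetical stand-in for a model's token stream (not from the repo):
    # simulates a fixed per-token latency of ~1 ms.
    for _ in range(n):
        time.sleep(0.001)
        yield "tok"

def tokens_per_second(n=50):
    # Time the full generation loop and divide token count by wall time.
    start = time.perf_counter()
    count = sum(1 for _ in generate_tokens(n))
    elapsed = time.perf_counter() - start
    return count / elapsed
```

Swapping `generate_tokens` for a real model's streaming API (and repeating the run to warm caches) gives the tokens/sec figure such toolkits report per CPU/GPU/hybrid configuration.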

gemma-2-2b-it.cs
This project implements int8 CPU inference in pure C#. It ports a Rust repository using Gemini 2.5 Pro Preview, and you can easily build and run it with the provided batch files. 🐙💻

C#
2
1 month ago
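The "int8 inference" named above generally means storing weights as 8-bit integers with a per-row scale and widening during the matrix product. A minimal Python sketch of symmetric int8 weight quantization (illustrative only; the actual C# port may differ):

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-row quantization: w[i] ≈ scale[i] * q[i],
    # with q in [-127, 127] so the representation is sign-symmetric.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matvec(q, scale, x):
    # int8 weights, float activations: widen to float32 for the
    # product, then apply the per-row scale once at the end.
    return (q.astype(np.float32) @ x) * scale.squeeze(1)

# Quick check: quantized matvec stays close to the float reference.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16)).astype(np.float32)
x = rng.standard_normal(16).astype(np.float32)
q, s = quantize_int8(W)
err = np.abs(int8_matvec(q, s, x) - W @ x).max()
```

This halves-to-quarters the memory footprint versus float32 weights at a small accuracy cost, which is what makes pure-CPU inference of small models like Gemma 2 2B practical.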

Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.

Go
2
2 years ago