cpu-inference
Running Llama 2 and other open-source LLMs with CPU inference locally for document Q&A
Runs LLaMA at extremely high speed
LLM inference in Fortran
The bare metal in my basement
eLLM infers Qwen3-480B on a CPU in real time
Wrapper for simplified use of Llama 2 GGUF quantized models (see the GGUF loading sketch after this list).
Simple large language model playground app
V-lang API wrapper for LLM inference with chatllm.cpp
VB.NET API wrapper for LLM inference with chatllm.cpp
C# API wrapper for LLM inference with chatllm.cpp
Nim API wrapper for LLM inference with chatllm.cpp
🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.
gemma-2-2b-it.cs: this project implements int8 CPU inference in pure C#. It ports a Rust repository using Gemini 2.5 Pro Preview, and you can easily build and run it with the provided batch files. 🐙💻
Simple bot that transcribes Telegram voice messages. Powered by go-telegram-bot-api & whisper.cpp Go bindings.
Rust API wrapper for LLM inference with chatllm.cpp
D-lang API wrapper for LLM inference with chatllm.cpp
gemma-2-2b-it int8 CPU inference in one file of pure C# (the int8 scheme is sketched after this list)
Llama 3.2 1B FP16 CPU inference in one file of pure VB.NET
Kotlin API wrapper for LLM inference with chatllm.cpp
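Several entries above wrap GGUF-quantized models behind a simpler interface. As a point of reference for the workflow such wrappers hide, here is how loading and prompting a Llama 2 GGUF file looks with the widely used llama-cpp-python bindings; this is a sketch of the underlying workflow, not the API of any wrapper listed here, and the model path is a placeholder:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# The path is a placeholder; any Llama 2 GGUF quantization (e.g. Q4_K_M) works.
llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf", n_ctx=2048, n_threads=8)

out = llm("Q: Why run LLM inference on a CPU? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```

A wrapper's job is mostly to pick sensible defaults for the constructor flags (context size, thread count, offload options) so that callers only supply a model path and a prompt.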
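The one-file int8 ports above (the C# entries) typically rest on the same basic idea: store each weight row as 8-bit integers plus one float scale, and dequantize on the fly inside the matrix-vector products that dominate CPU inference. Below is a minimal sketch of that scheme in Python/NumPy, assuming symmetric per-row quantization; it illustrates the general technique, not the exact code of any repository listed here:

```python
import numpy as np

def quantize_rows(w: np.ndarray):
    """Symmetric per-row int8 quantization: w ~= q * scale[:, None]."""
    scale = np.abs(w).max(axis=1) / 127.0        # one float scale per row
    scale = np.where(scale == 0.0, 1.0, scale)   # guard all-zero rows
    q = np.clip(np.round(w / scale[:, None]), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def matvec_int8(q: np.ndarray, scale: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Weight-only int8 matvec: dequantize during the product, rescale per row."""
    return (q.astype(np.float32) @ x.astype(np.float32)) * scale

# Round-trip check on random data: the difference is pure quantization noise.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)
x = rng.standard_normal(16).astype(np.float32)
q, s = quantize_rows(w)
print(np.max(np.abs(w @ x - matvec_int8(q, s, x))))
```

Production kernels usually quantize per block of 32 or 64 weights rather than per row and accumulate in int32, but the size/accuracy trade-off works the same way.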