ggml

ggml-org/llama.cpp

LLM inference in C/C++

C++
78441
3 hours ago

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Python
7559
1 day ago

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Rust
6102
10 months ago

Run LLaMA and other large language models offline on iOS and macOS using the GGML library.

Swift
1734
1 month ago

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

C++
1507
1 month ago

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

JavaScript
1286
5 months ago

Suno AI's Bark model in C/C++ for fast text-to-speech generation

C++
799
5 months ago

Whisper Dart is a cross-platform library for Dart and Flutter that converts audio to text (speech to text) by running inference with OpenAI models

C++
606
2 months ago

Run inference on MPT-30B using CPU

Python
575
2 years ago

Port of MiniGPT4 in C++ (4-bit, 5-bit, 6-bit, 8-bit, and 16-bit CPU inference with GGML)

C++
567
2 years ago

CLIP inference in plain C/C++ with no extra dependencies

C++
492
8 months ago

Large Language Models for All, 🦙 Cult and More. Stay in touch!

HTML
446
2 years ago

WIP text-to-speech library based on Suno AI's Bark, in C/C++ for fast inference

C++
403
1 year ago

Running any GGUF SLMs/LLMs locally, on-device in Android

Kotlin
260
1 day ago

C++
196
6 days ago