#ggml

ggml-org/llama.cpp

LLM inference in C/C++

C++
85153
2 hours ago

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Python
8404
13 hours ago

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Rust
6129
1 year ago

Run LLaMA and other large language models offline on iOS and macOS using the GGML library.

C
1850
11 days ago

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

C++
1538
5 months ago

Calculate tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

JavaScript
1342
9 months ago

Suno AI's Bark model in C/C++ for fast text-to-speech generation

C++
834
9 months ago

Whisper Dart is a cross-platform library for Dart and Flutter that converts audio to text (speech-to-text) by running inference on OpenAI models

C++
606
6 months ago

Run inference on MPT-30B using CPU

Python
575
2 years ago

Port of MiniGPT4 in C++ (4-bit, 5-bit, 6-bit, 8-bit, 16-bit CPU inference with GGML)

C++
567
2 years ago

CLIP inference in plain C/C++ with no extra dependencies

C++
514
2 months ago

Running any GGUF SLMs/LLMs locally, on-device in Android

Kotlin
457
3 days ago

Large Language Models for All, 🦙 Cult and More. Stay in touch!

HTML
445
2 years ago

WIP text-to-speech library based on Suno AI's Bark, in C/C++ for fast inference

C++
371
1 year ago