#gptq

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python
2378
2 days ago

Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.

Python
480
13 hours ago

Large Language Models for All, 🦙 Cult and More, Stay in touch !

HTML
446
2 years ago

Advanced Quantization Algorithm for LLMs/VLMs.

Python
431
2 days ago

Python
169
1 year ago

Jupyter Notebook
40
2 years ago

ChatSakura: Open-source multilingual conversational model.

Python
14
2 years ago

Private self-improvement coaching with open-source LLMs

Python
13
1 year ago

An OpenAI-compatible API that integrates LLM, Embedding, and Reranker.

Python
13
3 days ago

This repository is for profiling, extracting, visualizing and reusing generative AI weights to hopefully build more accurate AI models and audit/scan weights at rest to identify knowledge domains for risk(s).

Python
9
1 year ago

Run GGUF LLM models in the latest version of TextGen-webui

Jupyter Notebook
9
6 months ago

A.L.I.C.E (Artificial Labile Intelligence Cybernated Existence). A REST API for an AI companion, intended for building more complex systems

Python
8
2 months ago

Code for the NAACL paper "When Quantization Affects Confidence of Large Language Models?"

Jupyter Notebook
3
4 months ago

This project will develop a NEPSE chatbot using an open-source LLM, incorporating sentence transformers, vector database and reranking.

Jupyter Notebook
2
1 year ago

LLM quantization techniques: absmax, zero-point, GPTQ and GGUF

Jupyter Notebook
1
9 months ago