
# model-quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.

2225 stars · updated 7 months ago

A beginner's tutorial on model compression; the PDF can be downloaded from https://github.com/datawhalechina/awesome-compression/releases

328 stars · updated 4 months ago

C++ · 248 stars · updated 2 years ago

A list of papers, docs, and code about efficient AIGC, covering both language and vision. This repo aims to provide information for efficient AIGC research and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.

193 stars · updated 8 months ago

This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

Jupyter Notebook · 173 stars · updated 3 years ago
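For context, post-training quantization with TensorFlow Lite (the workflow such notebooks typically cover) comes down to a few converter calls. The sketch below is illustrative rather than code from the repository, and `saved_model_dir` is a placeholder path.

```python
import tensorflow as tf

# Minimal sketch of TensorFlow Lite post-training dynamic-range quantization.
# `saved_model_dir` is a placeholder for an already-trained SavedModel.
saved_model_dir = "path/to/saved_model"

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights to cut size and latency
tflite_quant_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)
```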

[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.

Jupyter Notebook · 77 stars · updated 3 years ago

[ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.

Python · 56 stars · updated 2 years ago

[NeurIPS 2023 Spotlight] This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.

Python · 50 stars · updated 1 year ago

The official implementation of the ICML 2023 paper OFQ-ViT

Python · 33 stars · updated 2 years ago

Chat with LLaMA 2 that also provides responses with reference documents retrieved from a vector database. The model runs locally using GPTQ 4-bit quantization.

Python · 30 stars · updated 2 years ago
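Loading a GPTQ 4-bit chat checkpoint locally with Hugging Face transformers looks roughly like the sketch below. The model ID is an example checkpoint from the Hub, not necessarily the one this repository uses; GPTQ loading assumes the optimum and auto-gptq packages are installed, and the vector-database retrieval step is omitted.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example only: a GPTQ 4-bit LLaMA 2 chat checkpoint from the Hugging Face Hub,
# not necessarily the one used by this repository.
model_id = "TheBloke/Llama-2-7B-Chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# transformers picks up the GPTQ config in the checkpoint (requires optimum + auto-gptq).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the retrieved documents about 4-bit quantization."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```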

A list of papers, docs, and code about diffusion quantization. This repo collects various quantization methods for diffusion models. PRs adding works (papers, repositories) missed by the repo are welcome.

8 stars · updated 3 months ago

PyTorch implementation of "BiDense: Binarization for Dense Prediction," a binary neural network for dense prediction tasks.

Python · 6 stars · updated 10 months ago

Enterprise multi-agent framework for secure, borderless data collaboration with zero-trust security and federated learning; lightweight and edge-ready.

Python · 5 stars · updated 6 months ago

Notebook from "A Hands-On Walkthrough on Model Quantization" blog post.

Jupyter Notebook · 4 stars · updated 1 year ago

Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).

Jupyter Notebook · 4 stars · updated 2 years ago
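As a concrete illustration of that description, the sketch below (not taken from the repository) quantizes a float32 tensor to int8 with a single symmetric scale and then dequantizes it to show the approximation error.

```python
import numpy as np

# Symmetric per-tensor int8 quantization of a float32 weight matrix (illustrative sketch).
weights = np.random.randn(4, 4).astype(np.float32)

scale = np.abs(weights).max() / 127.0                    # map the largest magnitude onto the int8 range
q_weights = np.round(weights / scale).astype(np.int8)    # low-precision (int8) representation
deq_weights = q_weights.astype(np.float32) * scale       # approximate float32 reconstruction

print("max abs error:", np.abs(weights - deq_weights).max())
```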

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.

Shell · 3 stars · updated 6 months ago

Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.

Jupyter Notebook · 3 stars · updated 8 months ago
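The underlying Hugging Face-to-GGUF flow that such a notebook automates is roughly the sketch below, assuming a local llama.cpp checkout. The script and binary names (`convert_hf_to_gguf.py`, `llama-quantize`) follow current llama.cpp and may differ across versions, and the model ID is only an example.

```python
import subprocess
from huggingface_hub import snapshot_download

# Rough sketch of the Hugging Face -> GGUF flow such a notebook automates.
# Assumes a local llama.cpp checkout; script/binary names follow current
# llama.cpp and may differ across versions.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # example model, not from the repo
local_dir = snapshot_download(repo_id=model_id)

# 1. Convert the Hugging Face checkpoint to a float16 GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", local_dir,
     "--outfile", "model-f16.gguf", "--outtype", "f16"],
    check=True,
)

# 2. Quantize the GGUF file, e.g. to 4-bit Q4_K_M.
subprocess.run(
    ["llama.cpp/llama-quantize", "model-f16.gguf", "model-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)
```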