#low-precision

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python
2378
2 days ago

Low Precision Arithmetic Simulation in PyTorch

Python
275
1 year ago

A script to convert floating-point CNN models into generalized low-precision ShiftCNN representation

Python
56
8 years ago

Low-precision (quantized) YOLOv5

Python
37
1 month ago

JAX Scalify: end-to-end scaled arithmetics

Python
16
6 months ago

Code for DNN feature map compression paper

C++
11
6 years ago

CUDA/HIP header-only library for writing vectorized and low-precision (16-bit, 8-bit) GPU kernels

C++
7
9 days ago