#low-precision
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Python
2378
2 days ago
Low Precision Arithmetic Simulation in PyTorch
Python
275
1 year ago
A script to convert floating-point CNN models into generalized low-precision ShiftCNN representation
Python
56
8 years ago
Low-precision (quantized) YOLOv5
Python
37
1 month ago
JAX Scalify: end-to-end scaled arithmetic
Python
16
6 months ago
Code for DNN feature map compression paper
C++
11
6 years ago
CUDA/HIP header-only library for writing vectorized and low-precision (16-bit, 8-bit) GPU kernels
C++
7
9 days ago
LinearCosine: Adding beats multiplying for lower-precision efficient cosine similarity
C++
0
6 months ago