# low-precision
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Python
2473
20 hours ago
Low Precision Arithmetic Simulation in PyTorch
Python
282
1 year ago
A script to convert floating-point CNN models into generalized low-precision ShiftCNN representation
Python
56
8 years ago
Low-precision (quantized) YOLOv5
Python
42
5 months ago
JAX Scalify: end-to-end scaled arithmetics
Python
16
10 months ago
Code for DNN feature map compression paper
C++
11
7 years ago
CUDA/HIP header-only library for low-precision (16 bit, 8 bit) and vectorized GPU kernel development
C++
11
5 days ago
LinearCosine: Adding beats multiplying for lower-precision efficient cosine similarity
C++
0
10 months ago