sparsity

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python
2378
2 days ago
Python
1974
9 hours ago

PaddleSlim is an open-source library for deep model compression and architecture search.

Python
1589
5 months ago

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Python
1530
2 months ago

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python
1238
2 days ago

More readable and flexible YOLOv5 with more backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.), extra modules (CBAM, DCN, and so on), and TensorRT

Python
673
8 months ago

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Python
438
9 months ago

Caffe for Sparse and Low-rank Deep Neural Networks

C++
379
5 years ago

An innovative library for efficient LLM inference via low-bit quantization

C++
350
8 months ago

Reference ImageNet implementation of SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On implicit filter level sparsity in Convolutional Neural Networks".

Python
338
5 years ago
Python
266
3 months ago

Always sparse. Never dense. But never say never. A Sparse Training repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, i.e. Sparse Evolutionary Training, to boost Deep Learning scalability on various aspects (e.g. memory and computational time efficiency, representation and generalization power).

Python
248
4 years ago

[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference

Python
239
4 years ago

Caffe for Sparse Convolutional Neural Network

C++
237
2 years ago

Learning both Weights and Connections for Efficient Neural Networks (https://arxiv.org/abs/1506.02626)

Jupyter Notebook
177
2 years ago

A research library for PyTorch-based neural network pruning, compression, and more.

Shell
160
2 years ago