sparsity

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Python
2474
1 day ago
Python
2266
2 hours ago

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python
1805
3 hours ago

PaddleSlim is an open-source library for deep model compression and architecture search.

Python
1602
1 month ago

A toolkit for optimizing Keras and TensorFlow ML models for deployment, including quantization and pruning.

Python
1548
9 days ago

More readable and flexible YOLOv5 with more backbones (gcn, resnet, shufflenet, mobilenet, efficientnet, hrnet, swin-transformer, etc.) and modules (cbam, dcn, and so on), plus TensorRT support

Python
677
1 year ago

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Python
467
1 year ago

Caffe for Sparse and Low-rank Deep Neural Networks

C++
381
5 years ago

An innovative library for efficient LLM inference via low-bit quantization

C++
349
1 year ago

Reference ImageNet implementation of SelecSLS CNN architecture proposed in the SIGGRAPH 2020 paper "XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera". The repository also includes code for pruning the model based on implicit sparsity emerging from adaptive gradient descent methods, as detailed in the CVPR 2019 paper "On implicit filter level sparsity in Convolutional Neural Networks".

Python
338
5 years ago
Python
270
7 months ago

Always sparse. Never dense. But never say never. A Sparse Training repository for the Adaptive Sparse Connectivity concept and its algorithmic instantiation, i.e. Sparse Evolutionary Training, to boost Deep Learning scalability on various aspects (e.g. memory and computational time efficiency, representation and generalization power).

Python
246
4 years ago

[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference

Python
239
4 years ago

Caffe for Sparse Convolutional Neural Network

C++
238
3 years ago

Sparse inference for transformer-based LLMs

Python
196
9 days ago

Learning both Weights and Connections for Efficient Neural Networks https://arxiv.org/abs/1506.02626

Jupyter Notebook
178
3 years ago