model-compression

microsoft/nni
Python
14260
1 year ago

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Python
3130
2 years ago

[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.

Python
3114
2 months ago

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.

Python
2913
2 years ago

Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).

2622
2 years ago

A curated list of neural network pruning resources.

2468
1 year ago

micronet, a model compression and deploy lib. compression: 1. quantization: quantization-aware-training (QAT), High-Bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), Low-Bit (≤2b) / Ternary and Binary (TWN/BNN/XNOR-Net); post-training-quantization (PTQ), 8-bit (tensorrt); 2. pruning: normal, regular and group convolutional channel pruning; 3. group convolution structure; 4. batch-normalization fuse for quantization. deploy: tensorrt, fp32/fp16/int8 (ptq-calibration), op-adapt (upsample), dynamic_shape

Python
2257
3 months ago

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.

2190
6 months ago

A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility

Python
1961
2 years ago

PyTorch implementation of various Knowledge Distillation (KD) methods.

Python
1711
4 years ago

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Python
1548
9 days ago

Jupyter Notebook
1291
9 months ago

Collection of recent methods on (deep) neural network compression and acceleration.

949
5 months ago

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Python
921
1 year ago

A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.

852
4 years ago

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.

Python
842
3 months ago