model-quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.

2061
2 months ago
C++
242
1 year ago

A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.

177
2 months ago

This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.

Jupyter Notebook
172
2 years ago

[WINNER! 🏆] Psychopathology FER Assistant. Because mental health matters. My project submission for #TFWorld TF 2.0 Challenge at Devpost.

Jupyter Notebook
78
2 years ago

[ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binarization.

Python
55
1 year ago

[NeurIPS 2023 Spotlight] This project is the official implementation of our accepted NeurIPS 2023 (spotlight) paper QuantSR: Accurate Low-bit Quantization for Efficient Image Super-Resolution.

Python
47
1 year ago

The official implementation of the ICML 2023 paper OFQ-ViT

Python
30
2 years ago

Chat with LLaMa 2, with responses that cite reference documents retrieved from a vector database. Uses a locally available model with GPTQ 4-bit quantization.

Python
29
1 year ago

PyTorch implementation of "BiDense: Binarization for Dense Prediction", a binary neural network for dense prediction tasks.

Python
6
5 months ago

Enterprise multi-agent framework for secure, borderless data collaboration with zero-trust and federated learning; lightweight and edge-ready.

Python
5
14 days ago

Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer (int8) instead of the usual 32-bit floating point (float32).

Jupyter Notebook
4
2 years ago

Automated Jupyter notebook solution for batch converting Large Language Models to GGUF format with multiple quantization options. Built on llama.cpp with HuggingFace integration.

Jupyter Notebook
2
3 months ago

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.

Shell
1
24 days ago

This project explores generating high-quality images using depth maps and conditioning techniques like Canny edges, leveraging Stable Diffusion and ControlNet models. It focuses on optimizing image generation across different aspect ratios and inference steps to balance speed and quality.

Python
0
6 months ago