
fp8

A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper, Ada, and Blackwell GPUs, providing better performance with lower memory utilization in both training and inference.

Python · 2367 stars · updated 2 days ago
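The FP8 training support described above is generally built on per-tensor scaling: an observed absolute maximum (amax) is mapped onto the FP8 representable range before casting. A minimal pure-Python sketch of that idea follows, assuming the E4M3 format; the function names are illustrative, not the library's actual API, and real implementations also round to the E4M3 bit pattern on-GPU.

```python
# Illustrative sketch of per-tensor FP8 scaling (assumption: E4M3 format).
E4M3_MAX = 448.0  # largest finite magnitude in FP8 E4M3


def compute_scale(amax: float) -> float:
    """Scale factor that maps the observed absolute maximum onto the FP8 range."""
    return E4M3_MAX / amax if amax > 0 else 1.0


def quantize(x: list[float], scale: float) -> list[float]:
    """Scale into the FP8 range and clamp; real code would also round to E4M3."""
    return [max(-E4M3_MAX, min(E4M3_MAX, v * scale)) for v in x]


def dequantize(xq: list[float], scale: float) -> list[float]:
    """Undo the scaling to recover values in the original range."""
    return [v / scale for v in xq]


x = [0.5, -3.0, 120.0]
s = compute_scale(max(abs(v) for v in x))
roundtrip = dequantize(quantize(x, s), s)  # close to x (no FP8 rounding modeled)
```

The scale is recomputed as new amax values are observed, which is why training recipes track amax history per tensor rather than rescaling every step from scratch.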

Microsoft Automatic Mixed Precision Library

Python · 593 stars · updated 7 months ago

An innovative library for efficient LLM inference via low-bit quantization

C++ · 350 stars · updated 8 months ago

Flux diffusion model implementation using quantized FP8 matmuls, with the remaining layers using faster half-precision accumulation; ~2x faster on consumer devices.

Python · 263 stars · updated 6 months ago
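The quantized-matmul pattern in the entry above stores weights in a low-precision form with a per-tensor scale and dequantizes them inside the matmul. The pure-Python sketch below shows only that quantize-with-scale / dequantize-on-the-fly structure; it does not model actual FP8 rounding or half-precision accumulation, and all names are hypothetical.

```python
# Hypothetical sketch: weights quantized with a per-tensor scale,
# dequantized inside the matmul (structure only, not real FP8 arithmetic).
E4M3_MAX = 448.0


def quantize_weights(w: list[list[float]]) -> tuple[list[list[int]], float]:
    """Map weights onto the FP8 range; round() is a crude stand-in for FP8 casting."""
    amax = max(abs(v) for row in w for v in row) or 1.0
    scale = E4M3_MAX / amax
    wq = [[round(v * scale) for v in row] for row in w]
    return wq, scale


def matmul_dequant(x: list[list[float]],
                   wq: list[list[int]],
                   scale: float) -> list[list[float]]:
    """Compute y = x @ (wq / scale), dequantizing weights on the fly."""
    rows, inner, cols = len(x), len(wq), len(wq[0])
    return [[sum(x[i][k] * wq[k][j] / scale for k in range(inner))
             for j in range(cols)] for i in range(rows)]
```

Keeping the accumulation in a higher precision than the quantized operands is what preserves accuracy in this pattern; on real hardware that accumulator choice (FP16 vs FP32) is also where the speed difference on consumer GPUs comes from.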

JAX Scalify: end-to-end scaled arithmetic

Python · 16 stars · updated 6 months ago

Cog Single GPU Quantized Implementation of Step-Video-T2V

Python · 1 star · updated 2 months ago

FP8 dtypes enumeration in Python

C++ · 0 stars · updated 1 year ago
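An FP8 dtype enumeration like the one the last entry describes might look like the sketch below: the two common FP8 formats (E4M3 and E5M2) with a few derived properties. The class and field names are assumptions for illustration, not that repository's actual API; the max-finite values are the standard ones for these formats.

```python
# Illustrative FP8 dtype enumeration (names are assumptions, not a real API).
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class FP8Spec:
    exponent_bits: int
    mantissa_bits: int
    max_finite: float  # largest representable finite magnitude


class FP8Format(Enum):
    E4M3 = FP8Spec(4, 3, 448.0)    # more precision, narrower range
    E5M2 = FP8Spec(5, 2, 57344.0)  # wider range, less precision

    @property
    def bits(self) -> int:
        # 1 sign bit + exponent bits + mantissa bits = 8 for both formats
        return 1 + self.value.exponent_bits + self.value.mantissa_bits
```

The precision/range trade-off between the two members is why E4M3 is typically used for forward activations and weights while E5M2 is used for gradients.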