#blackwell
A high-throughput and memory-efficient inference and serving engine for LLMs
Python
59432
10 hours ago
SGLang is a fast serving framework for large language models and vision language models.
Python
18602
3 hours ago
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
C++
11772
1 day ago
Prebuilt DeepSpeed wheels for Windows with NVIDIA GPU support. Supports GTX 10 through RTX 50 series. Compiled with PyTorch 2.7 and 2.8 and CUDA 12.8.
24
2 months ago
CUDA
1
3 months ago
Repository for the Campbells-Luggs-Blackwells family history website
HTML
0
3 years ago