#triton

Python
5549
11 hours ago

Quantized Attention achieves speedups of 2-5x and 3-11x over FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

Cuda
2238
15 days ago

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook
1581
2 years ago

A service for autodiscovery and configuration of applications running in containers

Go
1134
2 years ago

Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.

LLVM
845
2 years ago

🚀🚀🚀 A collection of awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), and AI Generated Content (AIGC), along with the related datasets and applications.

738
19 days ago

FlagGems is an operator library for large language models implemented in the Triton Language.

Python
650
1 hour ago

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Python
567
7 days ago

Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series

C
494
4 days ago

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

Python
386
5 months ago

🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

311
18 days ago

OpenDILab RL HPC OP Lib, including CUDA and Triton kernels

Python
226
1 year ago

SymGDB - symbolic execution plugin for gdb

Python
217
7 years ago

A performance library for machine learning applications.

Python
184
2 years ago

NVIDIA-accelerated, deep learned model support for image space object detection

C++
155
25 days ago