
triton

Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and video models.

Cuda
2470
7 days ago

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook
1582
2 years ago

A service for autodiscovery and configuration of applications running in containers

Go
1135
2 years ago

Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.

LLVM
853
2 years ago

🚀🚀🚀 A collection of awesome public projects about Large Language Models (LLM), Vision Language Models (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), and the related datasets and applications.

763
2 months ago

FlagGems is an operator library for large language models implemented in the Triton Language.

Python
690
4 days ago

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Python
576
2 months ago

Linux kernel module to support Turbo mode and the RGB keyboard on the Acer Predator notebook series

C
511
15 days ago

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

Python
418
7 months ago

🚀🚀🚀 This repository lists some awesome public CUDA, cuda-python, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR, PTX and High Performance Computing (HPC) projects.

334
2 months ago

OpenDILab RL HPC OP Lib, including CUDA and Triton kernels

Python
227
1 year ago

SymGDB - symbolic execution plugin for gdb

Python
217
7 years ago

A performance library for machine learning applications.

Python
184
2 years ago

NVIDIA-accelerated, deep learned model support for image space object detection

C++
158
2 months ago