#cuda-kernels

Samples for CUDA developers that demonstrate features in the CUDA Toolkit

C
7977
13 days ago
Python
6881
15 hours ago
xlite-dev/LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda
6311
17 hours ago

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust
4629
1 day ago

CUDA Kernel Benchmarking Library

Cuda
701
1 day ago

Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++.

C++
348
3 years ago

This is an archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010

C++
220
3 years ago

CUDA tutorials for maths and ML with examples, covering multi-GPU, fused attention, Winograd convolution, and reinforcement learning.

Cuda
189
2 months ago

Amplifier allows .NET developers to easily run complex applications with intensive mathematical computation on Intel CPU/GPU, NVIDIA, AMD without writing any additional C kernel code. Write your function in .NET and Amplifier will take care of running it on your favorite hardware.

C#
180
4 months ago

Some CUDA design patterns and a bit of template magic for CUDA

C++
156
2 years ago

Triton implementation of FlashAttention2 that adds Custom Masks.

Python
132
1 year ago

Spiking Neural Networks in C++ with strong GPU acceleration through CUDA

Cuda
129
5 years ago

High-speed GEMV kernels, with up to 2.7x speedup over the PyTorch baseline.

Cuda
113
1 year ago

Attention Kernels for Symmetric Power Transformers

Python
112
15 days ago