attention-mechanism

Implementation of Vision Transformer (ViT), a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python · 22,513 stars · updated 1 month ago
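ViT's key move is turning an image into a sequence of patch tokens that a plain transformer encoder can consume. A minimal sketch of that tokenisation step, assuming NumPy; the function name and sizes are illustrative, not taken from the repo:

```python
import numpy as np

def image_to_patches(img, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patch tokens."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0
    return (
        img.reshape(h // patch, patch, w // patch, patch, c)
           .transpose(0, 2, 1, 3, 4)          # group the patch grid together
           .reshape(-1, patch * patch * c)    # one flat token per patch
    )

tokens = image_to_patches(np.random.rand(32, 32, 3), 8)
# tokens.shape == (16, 192): 16 patch tokens, each of dimension 8*8*3
```

In the real model each flat patch is then linearly projected and given a positional embedding before entering the encoder.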

RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNNs and transformers: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.

Python · 13,523 stars · updated 12 days ago
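The "constant space, no KV cache" claim rests on replacing attention over all past tokens with a recurrence that folds history into a fixed-size state. A toy sketch in that spirit (not RWKV's actual WKV formula, which adds learned per-channel time decay and bonus terms); all names are illustrative:

```python
import numpy as np

def linear_attention_rnn(ks, vs, decay=0.9):
    """Decayed running weighted average of values: each step updates a
    fixed-size (numerator, denominator) state, so memory is O(1) in
    sequence length, unlike a transformer's growing KV cache."""
    num = np.zeros_like(vs[0], dtype=float)
    den = 0.0
    outs = []
    for k, v in zip(ks, vs):
        w = np.exp(k)                 # positive weight derived from the key
        num = decay * num + w * v     # state update, constant space
        den = decay * den + w
        outs.append(num / den)
    return np.array(outs)

outs = linear_attention_rnn([0.0, 1.0, -1.0],
                            [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)])
# outs[0] equals the first value exactly: the state starts empty
```

The same recurrence can be unrolled in parallel over the time axis during training, which is the "trained like a GPT transformer" half of the description.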

Implementation / replication of DALL-E, OpenAI's text-to-image transformer, in PyTorch

Python · 5,608 stars · updated 1 year ago

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Python · 5,229 stars · updated 3 days ago

A TensorFlow implementation of the Transformer from "Attention Is All You Need"

Python · 4,358 stars · updated 2 years ago
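The operation at the heart of "Attention Is All You Need" is scaled dot-product attention. A minimal single-head sketch, assuming NumPy and 2-D inputs; batching, masking, and multi-head projection are omitted:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)    # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                              # convex combination of values

out = scaled_dot_product_attention(
    np.random.rand(4, 8), np.random.rand(6, 8), np.random.rand(6, 8))
# out has shape (4, 8): one attended vector per query
```

The 1/sqrt(d_k) scaling keeps the dot products from pushing the softmax into near-one-hot regions as the key dimension grows.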

Graph Attention Networks (https://arxiv.org/abs/1710.10903)

Python · 3,340 stars · updated 3 years ago

PyTorch implementation of the Graph Attention Network model by Veličković et al. (2017, https://arxiv.org/abs/1710.10903)

Python · 3,021 stars · updated 2 years ago

Keras Attention Layer (Luong and Bahdanau scores).

Python · 2,810 stars · updated 1 year ago
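The two score functions this layer implements differ in how they compare a decoder state with the encoder states: Luong's is multiplicative, Bahdanau's is additive. A minimal NumPy sketch, with randomly initialised weights standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
h_t = rng.normal(size=d)           # current decoder hidden state
h_s = rng.normal(size=(5, d))      # 5 encoder hidden states

# Luong (multiplicative, "general" form): score_i = h_t^T W h_s_i
W = rng.normal(size=(d, d))
luong_scores = h_s @ W @ h_t                       # shape (5,)

# Bahdanau (additive): score_i = v^T tanh(W1 h_t + W2 h_s_i)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
v = rng.normal(size=d)
bahdanau_scores = np.tanh(h_t @ W1.T + h_s @ W2.T) @ v  # shape (5,)
```

Either score vector is then softmaxed into attention weights over the encoder states; Luong scoring is cheaper (one matmul), while Bahdanau's extra nonlinearity was the original 2014 formulation.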
gordicaleksa/pytorch-GAT

My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entropy histograms. I've supported both Cora (transductive) and PPI (inductive) examples!

Jupyter Notebook · 2,530 stars · updated 2 years ago
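For the GAT entries above: the paper's attention coefficient is e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax-normalised over each node's neighbours. A minimal single-head dense sketch, assuming NumPy; the dense adjacency and function name are illustrative (real implementations use sparse ops and multiple heads):

```python
import numpy as np

def gat_attention(h, adj, W, a, alpha=0.2):
    """One GAT head: h (N, F) node features, adj (N, N) adjacency with
    self-loops, W (F, F') projection, a (2F',) attention vector."""
    z = h @ W                                              # (N, F') projected features
    N = z.shape[0]
    pairs = np.concatenate(                                # [z_i || z_j] for all i, j
        [np.repeat(z, N, axis=0), np.tile(z, (N, 1))], axis=1)
    e = (pairs @ a).reshape(N, N)                          # raw scores e_ij
    e = np.where(e > 0, e, alpha * e)                      # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                         # mask non-neighbours
    e -= e.max(axis=1, keepdims=True)                      # stable softmax per node
    att = np.exp(e)
    att /= att.sum(axis=1, keepdims=True)
    return att @ z                                         # attention-weighted features

out = gat_attention(np.random.rand(4, 3), np.ones((4, 4)),
                    np.random.rand(3, 2), np.random.rand(4))
# out has shape (4, 2): one aggregated feature vector per node
```

Masking non-edges with a large negative value before the softmax is the standard trick for restricting attention to the graph's neighbourhood structure.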

Reformer, the efficient Transformer, in PyTorch

Python · 2,163 stars · updated 2 years ago

To eventually become an unofficial PyTorch implementation / replication of AlphaFold2, as details of the architecture get released

Python · 1,598 stars · updated 2 years ago

Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute, in PyTorch

Python · 1,531 stars · updated 4 years ago

Implementation of SoundStorm, efficient parallel audio generation from Google DeepMind, in PyTorch

Python · 1,490 stars · updated 6 months ago