
attention-mechanism

Implementation of the Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

Python · 23662 · Updated 2 days ago
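The ViT recipe above hinges on one step: turning an image into a sequence of flattened patch tokens that a plain transformer encoder can consume. A minimal sketch of that patchify step in NumPy (the 224×224 image and 16-pixel patch size are illustrative assumptions, not taken from the repository):

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened patches --
    the tokenization step that lets a standard transformer encoder
    process an image."""
    h, w, c = image.shape
    p = patch_size
    assert h % p == 0 and w % p == 0
    # (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C) -> (N, p*p*C)
    patches = image.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return patches.reshape(-1, p * p * c)

rng = np.random.default_rng(0)
img = rng.normal(size=(224, 224, 3))
tokens = patchify(img, 16)
print(tokens.shape)  # (196, 768): 14*14 patches, each 16*16*3 values
```

Each row of `tokens` is then linearly projected and fed to the encoder like a word embedding.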

RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of the RNN and the transformer: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embeddings.

Python · 13902 · Updated 11 hours ago
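The "linear time, constant space (no KV cache)" claim comes from expressing attention as a recurrence over a fixed-size state. The sketch below is a generic linear-attention recurrence in that spirit — not RWKV's actual update rule, which adds learned time-decay and token-shift terms:

```python
import numpy as np

def linear_attention_rnn(q, k, v, decay=0.9):
    # Simplified sketch, NOT RWKV's exact formulation: each step folds
    # outer(k_t, v_t) into a fixed-size state, so memory is O(d^2)
    # regardless of sequence length -- no growing KV cache.
    T, d = q.shape
    state = np.zeros((d, d))
    norm = np.zeros(d)
    out = np.empty_like(v)
    for t in range(T):
        state = decay * state + np.outer(k[t], v[t])
        norm = decay * norm + k[t]
        out[t] = q[t] @ state / (q[t] @ norm + 1e-8)
    return out

rng = np.random.default_rng(0)
T, d = 6, 4
q = np.abs(rng.normal(size=(T, d)))   # positive features keep the
k = np.abs(rng.normal(size=(T, d)))   # normaliser well-behaved
v = rng.normal(size=(T, d))
out = linear_attention_rnn(q, k, v)
print(out.shape)  # (6, 4)
```

Because the loop carries only `state` and `norm`, generation cost per token is constant, which is the property the description advertises.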

Implementation / replication of DALL-E, OpenAI's text-to-image transformer, in PyTorch

Python · 5627 · Updated 2 years ago

A concise but complete full-attention transformer with a set of promising experimental features from various papers

Python · 5518 · Updated 1 day ago

A TensorFlow implementation of the Transformer from "Attention Is All You Need"

Python · 4390 · Updated 2 years ago
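For reference, the attention formula from "Attention Is All You Need" — Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V — fits in a few lines of NumPy (the shapes below are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V, the core formula of the Transformer
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (T_q, T_k)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (5, 8)
```

Multi-head attention repeats this over several learned projections of Q, K, and V and concatenates the results.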

Graph Attention Networks (https://arxiv.org/abs/1710.10903)

Python · 3404 · Updated 3 years ago
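The GAT layer from the paper above scores each edge as e_ij = LeakyReLU(aᵀ[Wh_i ‖ Wh_j]), softmaxes the scores over each node's neighbourhood, and aggregates neighbour features with those weights. A single-head NumPy sketch (the feature sizes and toy adjacency matrix are made up for the demo):

```python
import numpy as np

def gat_layer(H, A, W, a, neg_slope=0.2):
    # Single-head graph attention (Velickovic et al., 2017):
    # e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax over neighbours.
    Wh = H @ W                              # (N, F_out)
    f_out = Wh.shape[1]
    src = Wh @ a[:f_out]                    # contribution of h_i
    dst = Wh @ a[f_out:]                    # contribution of h_j
    e = src[:, None] + dst[None, :]         # (N, N) raw edge scores
    e = np.where(e > 0, e, neg_slope * e)   # LeakyReLU
    e = np.where(A > 0, e, -np.inf)         # keep only real edges
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att = att / att.sum(axis=1, keepdims=True)
    return att @ Wh, att

# Toy 3-node graph with self-loops (adjacency chosen for illustration).
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                 # node features
W = rng.normal(size=(4, 2))                 # shared linear transform
a = rng.normal(size=(4,))                   # attention vector, 2 * F_out
H_new, att = gat_layer(H, A, W, a)
```

Masking non-edges with `-inf` before the softmax guarantees attention mass flows only along real edges.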

PyTorch implementation of the Graph Attention Network model by Veličković et al. (2017, https://arxiv.org/abs/1710.10903)

Python · 3054 · Updated 2 years ago

Keras Attention Layer (Luong and Bahdanau scores).

Python · 2812 · Updated 2 years ago

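The two score functions this layer implements differ only in how the decoder state h_t is compared against each encoder state h_s: Luong's score is a dot product, Bahdanau's is additive, vᵀtanh(W₁h_t + W₂h_s). A NumPy sketch (weight shapes are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def luong_attention(h_t, enc):
    # Luong (multiplicative) score: h_t^T h_s for each encoder state h_s
    return softmax(enc @ h_t)

def bahdanau_attention(h_t, enc, W1, W2, v):
    # Bahdanau (additive) score: v^T tanh(W1 h_t + W2 h_s)
    scores = np.tanh(h_t @ W1 + enc @ W2) @ v
    return softmax(scores)

rng = np.random.default_rng(0)
d, T = 8, 5
h_t = rng.normal(size=(d,))        # current decoder state
enc = rng.normal(size=(T, d))      # encoder states
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v = rng.normal(size=(d,))
w_luong = luong_attention(h_t, enc)
w_bahdanau = bahdanau_attention(h_t, enc, W1, W2, v)
```

Either weight vector is then used to take a convex combination of the encoder states (the context vector).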
gordicaleksa/pytorch-GAT

My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, the attention mechanism, and entropy histograms. Both the Cora (transductive) and PPI (inductive) examples are supported!

Jupyter Notebook · 2581 · Updated 3 years ago

Reformer, the efficient Transformer, in PyTorch

Python · 2179 · Updated 2 years ago

To eventually become an unofficial PyTorch implementation / replication of AlphaFold2, as details of the architecture get released

Python · 1614 · Updated 3 years ago

Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute

Python · 1532 · Updated 5 years ago

Implementation of SoundStorm, efficient parallel audio generation from Google DeepMind, in PyTorch

Python · 1527 · Updated 4 months ago