linear-attention
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). The current version is RWKV-7 "Goose". It combines the best of RNN and transformer: great performance, linear time, constant space (no KV cache), fast training, infinite ctx_len, and free sentence embedding. (A minimal sketch of the linear-attention recurrence behind these properties appears after the repository list below.)
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
[NeurIPS 2024] Official code of "LION: Linear Group RNN for 3D Object Detection in Point Clouds"
Explorations into the recently proposed Taylor Series Linear Attention
Implementation of Agent Attention in Pytorch
Semantic segmentation of remote sensing images
[NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
CUDA implementation of autoregressive linear attention, with all the latest research findings
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
Code for the paper "Cottention: Linear Transformers With Cosine Attention"
Implementation of: Hydra Attention: Efficient Attention with Many Heads (https://arxiv.org/abs/2209.07484)
RWKV Wiki website (archived, please visit official wiki)
[ICML 2024] Official implementation of "LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions."
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
LEAP: Linear Explainable Attention in Parallel for causal language modeling with O(1) path length and O(1) inference
🔍 Enhance your workflow with Houtini LM, an MCP server that offloads code analysis and documentation tasks to LM Studio, streamlining your development process.
SAUTE is a lightweight transformer-based architecture adapted for dialog modeling
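To illustrate why the linear-attention family above can run in linear time and constant memory at inference, here is a minimal sketch of the generic kernelized linear-attention recurrence, processed token by token with a fixed-size state. This is an illustrative toy, not RWKV's actual formulation or any listed repository's code; the `elu(x) + 1` feature map and the `linear_attention_recurrent` helper are assumptions chosen for the example.

```python
import torch

def linear_attention_recurrent(q, k, v):
    """Generic kernelized linear attention run as an RNN (illustrative toy).

    The state is a single (d_k, d_v) matrix plus a (d_k,) normalizer, so
    memory stays constant in sequence length: no KV cache is needed.
    """
    phi = lambda x: torch.nn.functional.elu(x) + 1   # positive feature map (an assumption)
    d_k, d_v = q.shape[-1], v.shape[-1]
    S = torch.zeros(d_k, d_v)   # running sum of outer products phi(k_t) v_t^T
    z = torch.zeros(d_k)        # running sum of phi(k_t) for normalization
    outputs = []
    for t in range(q.shape[0]):
        qt, kt, vt = phi(q[t]), phi(k[t]), v[t]
        S = S + torch.outer(kt, vt)                  # O(d_k * d_v) work per token
        z = z + kt
        outputs.append((qt @ S) / (qt @ z + 1e-6))   # causal attention output for token t
    return torch.stack(outputs)

# Toy usage: 8 tokens, head dimension 16; output has shape (8, 16).
T, d = 8, 16
q, k, v = torch.randn(T, d), torch.randn(T, d), torch.randn(T, d)
print(linear_attention_recurrent(q, k, v).shape)
```

The per-token cost and the state size are independent of sequence length, which is the constant-space, no-KV-cache behavior described for RWKV above.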