vit

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python
2233
5 hours ago

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

Jupyter Notebook
1875
1 year ago

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Jupyter Notebook
1185
1 year ago

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Python
591
3 months ago

A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

Python
528
3 years ago

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Python
293
4 years ago

i. A practical application of the Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over the spatial dimension (channel attention) and the *temporal dimension*. iii. Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.

Python
291
2 years ago

FFCS course registration made hassle free for VITians. Search courses and visualize the timetable on the go!

JavaScript
290
5 months ago

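PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, simsiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.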

Python
282
2 years ago

Official code for the papers "Reversible Column Networks" and "RevColv2"

Python
260
2 years ago

MoH: Multi-Head Attention as Mixture-of-Head Attention

Python
237
6 months ago

My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

Python
228
15 days ago