#deit

Python
5717
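2 months ago

PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.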

Python
284
2 years ago

Paddle Large Scale Classification Tools. Supports ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.

Python
153
2 years ago

(Unofficial) PyTorch implementation of Training Vision Transformers for Image Retrieval (El-Nouby, Alaaeldin, et al. 2021).

Python
51
2 years ago

[CVPR 2024] Code for our Paper "DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets"

Python
41
8 months ago

2D human pose estimation using transformers, implemented in PyTorch.

Python
33
4 years ago

PyTorch implementation of some vision transformers, trained on CIFAR-10.

Python
29
4 years ago

[CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

Python
14
1 year ago

Image classification tutorial: ConvNeXt reaches 98.8% on CIFAR10 and 92.4% on CIFAR100; ResNet18 reaches 95.6% on CIFAR10 and 79.1% on CIFAR100.

Python
11
3 months ago

A repository for DeiT-pytorch-model, which can be used to train on your own image dataset.

Python
6
2 years ago

This project analyzes several vision transformers, examining their distinctive properties and evaluating their performance on a common dataset. The study aims to identify the strengths and shortcomings of various transformer designs for computer vision tasks.

Jupyter Notebook
3
2 years ago

Final assignment in the NLP course at the Technion (IEM097215). In this assignment we propose a novel architecture to handle both Text-to-Image translation and Image-to-Text translation tasks on paired data, using a unified architecture of transformers and CNNs and enforcing cycle consistency.

Python
2
3 years ago

Image classification with DeiT model, including data preprocessing, k-fold CV, early stopping and model saving.

Python
1
1 year ago

This repository holds the downstream task of face mask classification, performed on a self-curated custom dataset with various state-of-the-art deep learning models such as ViT, BEiT, DeiT, LeViT, ConvNeXt, VGG16, EfficientNetV2, RegNet, and MobileNetV3.

Jupyter Notebook
1
4 years ago

Implementation of a Paper related to Vision Transformer

Python
0
2 years ago

Image captioning with pretrained encoder on MSCOCO.

Python
0
3 years ago

A repository for Agent-Attention-Models based on the PyTorch framework, which can be used to train on your own image datasets.

Python
0
2 years ago