video-recognition

Awesome video understanding toolkits based on PaddlePaddle. It provides video data annotation tools, lightweight RGB- and skeleton-based action recognition models, and practical applications for video tagging and sports action detection.

Python
1644
8 months ago

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

Python
512
5 years ago

Detects a car's license plate and recognizes its characters

Python
356
2 years ago

AutoVideo: An Automated Video Action Recognition System

Python
339
2 years ago

MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition

Jupyter Notebook
283
3 years ago

PyTorch implementation of Non-Local Neural Networks (https://arxiv.org/pdf/1711.07971.pdf)

Python
251
3 years ago

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

Python
196
1 year ago

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

Python
184
1 year ago

【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Python
153
1 year ago

YAPO e+ - Yet Another Porn Organizer (extended)

Python
149
3 years ago

[NeurIPS 2023] Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

Jupyter Notebook
119
2 years ago

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

Python
107
5 years ago

[Neurocomputing 2019] Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

C++
102
4 years ago

WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

Python
98
1 year ago

PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

Python
89
4 years ago

A search system that analyzes short video snippets (2–5 seconds) and finds highly accurate matches using keyframe-based perceptual hashing. A self-hosted video Shazam.

Java
59
10 hours ago