video-recognition

Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton-based action recognition models, and practical applications for video tagging and sport action detection.

Python
1636
6 months ago

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

Python
513
5 years ago

Detects the license plate of a car and recognizes its characters

Python
355
2 years ago

AutoVideo: An Automated Video Action Recognition System

Python
336
2 years ago

MoViNets PyTorch implementation: Mobile Video Networks for Efficient Video Recognition

Jupyter Notebook
276
3 years ago

PyTorch implementation of Non-Local Neural Networks (https://arxiv.org/pdf/1711.07971.pdf)

Python
251
3 years ago

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

Python
193
1 year ago

GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition?

Python
185
1 year ago

【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Python
151
1 year ago

YAPO e+ - Yet Another Porn Organizer (extended)

Python
150
3 years ago

[NeurIPS 2023] Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

Jupyter Notebook
119
2 years ago

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

Python
106
5 years ago

[Neurocomputing 2019] Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion

C++
104
4 years ago

WACV 2024 Papers: Discover cutting-edge research from WACV 2024, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!

Python
96
1 year ago

PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

Python
89
4 years ago

A search system that analyzes short video snippets (2–5 seconds) and finds highly accurate matches using keyframe-based perceptual hashing. A self-hosted Shazam for video.

Java
57
10 days ago