Computer Vision

Computer vision libraries and models for image understanding, generation, OCR, and object detection.

Repositories

A feature-rich web UI for Stable Diffusion, enabling text-to-image, image-to-image, outpainting, and inpainting. Supports extensions, LoRA, custom embeddings, and API access with a user-friendly Gradio interface.

Python
163.8k
4 months ago

Real-time face swapping and video deepfake tool that works with just a single image. Supports webcam streaming, video processing, and multiple GPU acceleration options including CUDA, CoreML, and DirectML.

Python
94.0k
8 days ago

OpenCV is an open-source computer vision and machine learning software library. It provides tools optimized for real-time image processing, object detection, video analysis, and AI model execution across multiple platforms and programming languages.

C++
89.3k
9 hours ago

An open-source OCR toolkit and Document AI engine that converts PDFs and images into LLM-ready structured data (JSON/Markdown). Features the state-of-the-art lightweight vision-language model PaddleOCR-VL for document parsing, PP-OCRv5 for text recognition in 100+ languages, and deep integration with RAG and agent ecosystems such as Dify and RAGFlow.

Python
83.3k
15 hours ago

Tesseract OCR engine with neural network (LSTM) support for 100+ languages. Includes a command-line tool and an API library for extracting text from images.

C++
74.9k
14 hours ago

Stable Diffusion is a latent text-to-image diffusion model that generates photo-realistic images from text prompts. Built on latent diffusion architecture with a CLIP text encoder, it enables high-quality image synthesis, image-to-image translation, and inpainting tasks.

Jupyter Notebook
73.1k
2 years ago

ultralytics/ultralytics

Ultralytics YOLO is a computer vision framework providing state-of-the-art models for object detection, segmentation, classification, tracking, and pose estimation. It is fast, accurate, and easy to use, with extensive deployment options.

Python
58.7k
8 hours ago

ultralytics/yolov5

YOLOv5 is a state-of-the-art computer vision model for real-time object detection, segmentation, and classification. Built on PyTorch, it offers exceptional speed, accuracy, and ease of use for both research and production deployment.

Python
57.5k
a day ago

A powerful yet simple Python library for face recognition with 99.38% accuracy on the LFW benchmark. Provides a simple API for face detection, facial feature analysis, and identity recognition, along with command-line tools.

Python
56.5k
2 years ago

FaceSwap is an open-source deepfake tool that uses deep learning to detect and swap faces in images and videos. It provides a complete workflow including face extraction, model training, and conversion with multiple model support and GPU acceleration.

Python
55.3k
24 days ago

Meta AI's Segment Anything Model (SAM) is a breakthrough foundation model for promptable image segmentation. It generates high-quality object masks from simple prompts like points or boxes, trained on 11M images with 1.1B masks, delivering exceptional zero-shot performance across diverse segmentation tasks.

Jupyter Notebook
54.4k
2 years ago