Computer Vision

Computer vision libraries and models for image understanding, generation, OCR, and object detection.

Repositories

AUTOMATIC1111 / stable-diffusion-webui

A feature-rich web UI for Stable Diffusion, enabling text-to-image, image-to-image, outpainting, and inpainting. Supports extensions, LoRA, custom embeddings, and API access with a user-friendly Gradio interface.

Python

163.8k

4 months ago

hacksider / Deep-Live-Cam

Real-time face swapping and video deepfake tool that works with just a single image. Supports webcam streaming, video processing, and multiple GPU acceleration options including CUDA, CoreML, and DirectML.

Python

94.0k

8 days ago

opencv / opencv

OpenCV is an open-source computer vision and machine learning software library. It provides tools optimized for real-time image processing, object detection, video analysis, and AI model execution across multiple platforms and programming languages.

C++

89.3k

9 hours ago

PaddlePaddle / PaddleOCR

An open-source OCR toolkit and Document AI engine that converts PDFs and images into LLM-ready structured data (JSON/Markdown). Features the SOTA lightweight vision-language model PaddleOCR-VL for document parsing, PP-OCRv5 for 100+ language text recognition, and deep integration with RAG and Agent ecosystems like Dify and RAGFlow.

Python

83.3k

15 hours ago

tesseract-ocr / tesseract

Tesseract OCR engine with neural network (LSTM) support for 100+ languages. Includes a command-line tool and an API library for text extraction from images.

C++

74.9k

14 hours ago

CompVis / stable-diffusion

Stable Diffusion is a latent text-to-image diffusion model that generates photo-realistic images from text prompts. Built on a latent diffusion architecture with a CLIP text encoder, it enables high-quality image synthesis, image-to-image translation, and inpainting.

Jupyter Notebook

73.1k

2 years ago

ultralytics / ultralytics

Ultralytics YOLO is a computer vision framework providing state-of-the-art object detection, segmentation, classification, tracking, and pose estimation models. Fast, accurate, and easy to use, with extensive deployment options.

Python

58.7k

8 hours ago

ultralytics / yolov5

YOLOv5 is a state-of-the-art computer vision model for real-time object detection, segmentation, and classification. Built on PyTorch, it offers exceptional speed, accuracy, and ease of use for both research and production deployment.

Python

57.5k

a day ago

ageitgey / face_recognition

A powerful yet simple Python library for face recognition with 99.38% accuracy on the Labeled Faces in the Wild (LFW) benchmark. Provides an easy API for face detection, facial feature analysis, and identity recognition, along with command-line tools.

Python

56.5k

2 years ago

deepfakes / faceswap

FaceSwap is an open-source deepfake tool that uses deep learning to detect and swap faces in images and videos. It provides a complete workflow covering face extraction, model training, and conversion, with support for multiple models and GPU acceleration.

Python

55.3k

24 days ago

facebookresearch / segment-anything

Meta AI's Segment Anything Model (SAM) is a breakthrough foundation model for promptable image segmentation. It generates high-quality object masks from simple prompts like points or boxes, trained on 11M images with 1.1B masks, delivering exceptional zero-shot performance across diverse segmentation tasks.

Jupyter Notebook

54.4k

2 years ago
