tesseract

tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)

tesseract tesseract-ocr OCR lstm machine-learning ocr-engine Hacktoberfest

C++

70048

10259

3 days ago

naptha / tesseract.js

Pure JavaScript OCR for more than 100 Languages 📖🎉🖥

tesseract WebAssembly OCR JavaScript deep-learning

JavaScript

37313

2343

1 month ago

ocrmypdf / OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

Python OCR pdf image-processing tesseract

Python

31363

2181

11 days ago

pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

mupdf xps pdf-documents epub OCR pdf fonts Python data-science extract-data table-extraction tesseract text-processing text-shaping

Python

8178

646

1 day ago

tesseract-ocr / tessdata

Trained models with fast variant of the "best" LSTM models + legacy models

OCR tesseract

7171

2388

2 years ago

aisingapore / TagUI

Free RPA tool by AI Singapore

rpa artificial-intelligence OpenCV tesseract natural-language-processing

JavaScript

6106

629

7 months ago

tebelorg / RPA-Python

Python package for doing RPA

rpa Python OpenCV tesseract tagui sikuli cross-platform

Python

5345

717

24 days ago

thiagoalessio / tesseract-ocr-for-php

A wrapper to work with Tesseract OCR inside PHP.

OCR tesseract PHP text-recognition image-to-text

PHP

3008

551

6 months ago

otiai10 / gosseract

Go package for OCR (Optical Character Recognition), using the Tesseract C++ library

Go tesseract tesseract-ocr OCR

2976

300

6 months ago

Dicklesworthstone / llm_aided_ocr

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

ai-assist llama2 large-language-models OCR tesseract ocr-correction

Python

2756

191

7 months ago

Goldziher / kreuzberg

Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.

OCR text-extraction async document-intelligence mcp pandoc Python rag table-extraction tesseract

HTML

2417

102

1 day ago

rmtheis / android-ocr

Experimental optical character recognition app

Android OCR tesseract optical-character-recognition

Java

2240

889

7 years ago

sirfz / tesserocr

A Python wrapper for the tesseract-ocr API

optical-character-recognition OCR tesseract python-library cython

Python

2117

255

2 months ago

Pulover / PuloversMacroCreator

Automation Utility - Recorder & Script Generator

autohotkey tesseract automation rpa robot

AutoHotkey

1869

257

3 years ago

ianzhao / textshot

Python tool for grabbing text via screenshot

Python Script python-script screenshot OCR ocr-recognition tesseract tesseract-ocr

Python

1772

256

9 months ago

Akylas / OSS-DocumentScanner

Document scanning app

Android document document-scanner OpenCV pdf scanner tesseract image-processing

C++

1503

3 hours ago

ryfeus / lambda-packs

Precompiled packages for AWS Lambda

aws-lambda phantomjs Serverless Amazon Web Services Tensorflow Keras NumPy pandas tesseract scikit-learn OpenCV pillow Selenium Python lightgbm spaCy hdf PyTorch

Python

1122

236

2 years ago

GauravSingh9356 / J.A.R.V.I.S

Personal assistant built using Python libraries. Its features include sending emails, optical text recognition, dynamic news reporting via API integration, a to-do list generator, opening any website by voice command, playing music, Wikipedia search, a dictionary with automatic spell checking, weather reporting (temperature, wind speed, humidity), YouTube search, Google Maps search, YouTube downloading, and more.

Python jarvis newsapi weather-api speech-recognition pyttsx3 webbrowser Hacktoberfest hactoberfest2021 tesseract tesseract-ocr optical-character-recognition ChatGPT OpenCV

Python

934

226

1 year ago