ocr-correction

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python

2733

188

6 months ago

Python 3 library for processing historical English

Python

1 year ago

Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"

Jupyter Notebook

2 years ago

This project corrects errors in OCR text for languages whose sentences are written without spaces.

Python

7 years ago

🐀 Clean up that ratty OCR

Python

2 years ago

A living digital archive of the correspondence of Catherine de’ Medici.

HTML

6 days ago