#ocr-correction

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python
2616
2 months ago
Python
67
8 months ago

Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models"

Jupyter Notebook
36
1 year ago

This project corrects errors in OCR text for languages whose sentences are written without spaces.

Python
6
6 years ago

🐀 Clean up that ratty OCR

Python
3
2 years ago