# document-parser

Marker-Inc-Korea/AutoRAG

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

Python
3832
2 months ago
Filimoa/open-parse
Python
2911
5 months ago

LAYRA is a ready-to-use visual RAG system with a complete web UI built with Next.js and FastAPI, preserving document layout, tables, paragraphs, and graphical elements without any structural fragmentation.

TypeScript
434
2 days ago

Parse PDFs into markdown using Vision LLMs

Python
345
2 months ago

A data extraction and orchestration framework for processing complex unstructured documents. It integrates AI-powered document pipelines (GenAI, LLM, VLLM) into your applications and supports tasks such as document cleanup, optical character recognition (OCR), classification, splitting, named entity recognition, and form processing.

Python
67
19 days ago

Tutorial on how to deskew (straighten) text images

Python
51
3 years ago

A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, or PDF with GROBID and LangChain, or listen to them as a podcast. Customize your own pipelines.

Python
50
1 month ago

An OCR-based document parser that extracts information from images of identity documents

TypeScript
21
3 years ago

An open-source Retrieval-Augmented Generation (RAG) framework that uses semantic search to retrieve the expected results and generates human-readable conversational responses with the help of an LLM (Large Language Model).

Python
20
9 months ago
Jupyter Notebook
17
3 years ago