pdf-converter

#1 Locally hosted web application that allows you to perform various operations on PDF files

Java
64457
3 hours ago

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

Python
42148
14 hours ago

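PDF补丁丁 (PDFPatcher): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more.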

C#
10246
13 days ago

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

Python
5578
7 days ago

This repo isn't maintained anymore, as PhantomJS was deprecated a long time ago. Please migrate to headless Chrome/Puppeteer.

JavaScript
3563
1 year ago
borb-pdf/borb

borb is a library for reading, creating, and manipulating PDF files in Python.

Python
3505
1 month ago

Open source Python library for converting PDF to DOCX.

Python
3061
3 months ago
xhtml2pdf/xhtml2pdf

A library for converting HTML into PDFs using ReportLab

Python
2327
12 hours ago

Converts binary PDF to JSON and text, for server-side PDF processing and command-line use. Zero dependencies.

Java
2130
4 days ago

Converts PDF, DOC, DOCX, XML, HTML, RTF, etc. to plain text

Go
1720
1 year ago

A high-quality PDF to Markdown tool based on large language model visual recognition.

Python
1555
23 days ago

A PDF to Markdown converter

JavaScript
1431
1 year ago

A self-hosted, drag-and-drop, NoSQL file conversion server and sharing tool that supports 445 file formats in 13 languages.

PHP
1287
1 year ago

C# .NET Core wrapper for the wkhtmltopdf library, which uses the WebKit engine to convert HTML pages to PDF.

C#
1165
5 years ago