#extract

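Extracts hard-coded subtitles from videos and generates srt files. No third-party API is required; text recognition runs locally. A deep-learning-based video subtitle extraction framework that covers subtitle region detection and subtitle content extraction. A GUI tool for extracting hard-coded subtitles (hardsub) from videos and generating srt files.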

Python
7031
2 months ago

SwiftSoup: Pure Swift HTML Parser, with best of DOM, CSS, and jquery (Supports Linux, iOS, Mac, tvOS, watchOS)

Swift
4762
24 days ago

PDFsam, a desktop application to split, merge, mix, rotate PDF files and extract pages

Java
3723
1 day ago

Camelot: PDF Table Extraction for Humans

Python
3681
2 years ago

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

Python
3484
1 hour ago

GUI and API library to work with Engine assets, serialized and bundle files

C#
2925
3 years ago

Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

Python
2540
3 days ago

A library to read, parse, export and make subsets of different types of font files.

PHP
1771
4 months ago
extractus/article-extractor

Extracts the main article from a given URL with Node.js

JavaScript
1690
2 months ago

A web interface to extract tabular data from PDFs

Python
1650
4 months ago

Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).

Java
1588
1 year ago

The extension provides refactoring tools for your React codebase

TypeScript
1470
2 years ago

A tool to view and extract the contents of a Windows Installer (.msi) file.

C#
1431
1 month ago
TypeScript
1428
11 days ago
JavaScript
1294
2 years ago

extrakto for tmux - quickly select, copy/insert/complete text without a mouse

Python
952
4 months ago