#extraction

axa-group/Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

JavaScript
5998
2 years ago

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Python
5478
5 hours ago

Extracts internal monitoring data from application logs for collection in a time-series database

Go
3955
12 days ago
C
3502
3 months ago

The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).

Java
3143
7 hours ago

Provides functions to read and write from/to an object or array using a simple string notation

PHP
2822
7 days ago
C#
2714
1 year ago

Extract files from any kind of container formats

Python
2350
2 days ago

node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!

HTML
1684
3 years ago

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

Python
1611
4 months ago

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

Rust
1216
8 months ago
Rich Text Format
1158
2 months ago

A C++ static library offering a clean and simple interface to the 7-zip shared libraries.

C++
744
17 days ago

A program to extract files from the RPA archive format.

Python
677
3 years ago

Stanford Open Information Extraction made simple!

Python
660
2 years ago

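A survey of the information extraction field by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers subtasks such as entity recognition, relation extraction, and attribute extraction, and surveys both academia and industry for each subtask.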

471
3 years ago

File Injector is a script that allows you to store any file in an image using steganography

Python
439
3 years ago

DataTool is a program that lets you extract models, maps, and files from Overwatch.

C#
424
3 days ago

PHP URI Template (RFC 6570) supports both URI expansion & extraction

PHP
411
9 months ago