#extraction

axa-group/Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

JavaScript
5951
1 year ago

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

Python
5201
21 hours ago

extract internal monitoring data from application logs for collection in a timeseries database

Go
3904
11 days ago
C
3422
9 months ago

The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).

Java
2917
1 day ago

Provides functions to read and write from/to an object or array using a simple string notation

PHP
2811
1 month ago
C#
2533
9 months ago

Extract files from any kind of container formats

Python
2301
5 days ago

node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!

HTML
1666
3 years ago

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

Python
1574
5 days ago
Rich Text Format
1128
10 months ago

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

Rust
1060
4 months ago

A C++ static library offering a clean and simple interface to the 7-zip shared libraries.

C++
701
7 hours ago

Stanford Open Information Extraction made simple!

Python
653
1 year ago

A program to extract files from the RPA archive format.

Python
650
3 years ago

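A survey of the information extraction field by the natural language processing research team at Beihang University's Advanced Innovation Center for Big Data. It covers subtasks such as entity recognition, relation extraction, and attribute extraction, and surveys both academia and industry for each subtask.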

466
3 years ago

File Injector is a script that allows you to store any file in an image using steganography

Python
433
2 years ago

PHP URI Template (RFC 6570) supports both URI expansion & extraction

PHP
409
5 months ago

DataTool is a program that lets you extract models, maps, and files from Overwatch.

C#
391
10 days ago

.net text extraction & export framework

C#
376
2 months ago