de-identification

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

Python
5270
2 days ago

ARX is a comprehensive open-source data anonymization tool that aims to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identification risks, and well-known privacy models such as k-anonymity, l-diversity, t-closeness, and differential privacy.

Java
668
8 months ago

MediaPipe-based library for redacting faces in videos and images.

C++
441
2 years ago

Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask, and synthesize PII in text.

Jupyter Notebook
82
3 months ago

A Python client for interacting with Private AI's API.

Python
22
5 days ago

A pipeline to identify (and optionally remove) certain sequences from raw genomic data. The default taxon to identify is Homo sapiens.

Nextflow
21
12 days ago

Masks identifiable information in health-related documents.

Python
20
3 years ago

CliniDeID automatically de-identifies clinical text notes according to the HIPAA Safe Harbor method. It accurately finds identifiers and tags or replaces them with realistic surrogates for better anonymity.

Java
10
2 years ago

An application of our De-identification Framework built with open-source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.

Python
9
4 years ago

A data de-identification library written in Go.

Go
8
1 year ago

Source code for the paper "Generating Synthetic Training Data for Supervised De-Identification of Electronic Health Records" in Future Internet (2021).

Python
6
4 years ago

A Python package for replacing identifiable strings across multiple files and folders at once.

Python
5
2 years ago