Repository navigation
de-identification
- Website
- Wikipedia
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-identification risks and it supports well-known privacy models, such as k-anonymity, l-diversity, t-closeness and differential privacy.
Mediapipe-based library to redact faces from videos and images
A curated list of resources related to privacy engineering
Examples scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
Deidentify people's names and gender specific pronouns
A python client used to interact with the Private AI's API
A pipeline to identify (and remove) certain sequences from raw genomic data. Default taxon to identify (and remove) is Homo sapiens. Removal is optional.
Masking identifiable information from health related documents.
CliniDeID automatically de-identifies clinical text notes according to the HIPAA Safe Harbor method. It accurately finds identifiers and tags or replaces them with realistic surrogates for better anonymity.
Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.
Named entity recognition framework
A pre-commit hook to check for PII in your code.
Source code for the paper "Generating Synthetic Training Data for Supervised De-Identification of Electronic Health Records" in Future Internet (2021).
An named-entity-recognition (NER) based anonymizer for archival documents metadata.
Python package to replace identifiable strings in multiple files and folders at once.
AWS Blueprint: Automate data masking workflow