pii
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
What's in your data? Extract schema, statistics and entities from datasets
Secure Vault for Customer PII/PHI/PCI/KYC Records
An AI-powered Personally Identifiable Information (PII) scanner.
A scanner that searches your filesystem, S3, MySQL, Redis, Google Cloud Storage, and Firebase storage for PII and sensitive data.
Scan databases and data warehouses for PII. Tag tables and columns in data catalogs like Amundsen and DataHub.
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
🚨 slog: Attribute formatting
Multi-cloud data tokenization solution using Dataflow and Cloud DLP
A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII)
KloudDB Shield is a comprehensive Postgres security tool: PII scanner, CIS benchmarks, SSL audit, 12+ features. Supports Postgres, RDS, Aurora, and MySQL.
🛡️ PII Guard is an LLM-powered tool that detects and manages Personally Identifiable Information (PII) in logs, designed to support data privacy and GDPR compliance
Scala library and compiler plugin that prevent inadvertent leakage of sensitive fields in `case classes` (such as credentials, personal data, and other confidential information)
Application and Python script to identify, remove, and/or recode personally identifiable information (PII) in field experiment datasets.
A package to build an end-to-end pipeline for detecting personally identifiable information from text.