record-linkage

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

C
4600
1 month ago

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.

Python
4381
2 months ago

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Python
1732
19 hours ago
J535D165/recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
1025
2 years ago

Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.

JavaScript
717
1 year ago

🆔 Command line tool for deduplicating CSV files

Python
430
6 years ago

🆔 Examples for using the dedupe library

Python
414
1 year ago

Super Fast String Matching in Python

Python
370
7 months ago

🔎 Finds fuzzy matches between CSV files

Python
189
6 months ago

PyTorch library for transforming entities such as companies and products into vectors to support scalable record linkage / entity resolution using approximate nearest neighbors.

Jupyter Notebook
156
3 years ago

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Python
128
6 months ago

Spark RDD with Lucene's query and entity linkage capabilities

Scala
128
1 month ago

Resources for tackling record linkage / deduplication / data matching problems

125
2 years ago

Record Linkage ToolKit (Find and link entities)

Python
110
2 years ago

List of entity resolution software and resources.

86
7 months ago

Python package for deduplication/entity resolution using active learning

Python
81
1 year ago

Python implementation of anonymous linkage using cryptographic linkage keys

Python
67
1 year ago