#record-linkage

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.

Python
4267
5 months ago

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

C
4226
15 hours ago

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Python
1558
3 days ago

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
997
1 year ago

Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.

JavaScript
713
10 months ago

🆔 Command line tool for deduplicating CSV files

Python
420
5 years ago

🆔 Examples for using the dedupe library

Python
411
8 months ago

Super Fast String Matching in Python

Python
368
1 month ago

🔎 Finds fuzzy matches between CSV files

Python
189
25 days ago

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

Jupyter Notebook
152
2 years ago

Spark RDD with Lucene's query and entity linkage capabilities

Scala
127
1 month ago

Resources for tackling record linkage / deduplication / data matching problems

123
1 year ago

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Python
118
16 days ago

Record Linkage ToolKit (Find and link entities)

Python
110
2 years ago

Python package for deduplication/entity resolution using active learning

Python
78
8 months ago

Python implementation of anonymous linkage using cryptographic linkage keys

Python
65
1 year ago

List of entity resolution software and resources.

63
2 months ago