#record-linkage

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

C
4548
1 month ago

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

Python
4358
22 days ago

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Python
1688
2 days ago

J535D165/recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
1022
1 year ago

Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.

JavaScript
716
1 year ago

🆔 Command line tool for deduplicating CSV files

Python
427
5 years ago

🆔 Examples for using the dedupe library

Python
413
1 year ago

Super Fast String Matching in Python

Python
369
5 months ago

🔎 Finds fuzzy matches between CSV files

Python
190
5 months ago

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

Jupyter Notebook
154
3 years ago

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Python
128
4 months ago

Spark RDD with Lucene's query and entity linkage capabilities

Scala
128
2 months ago

Resources for tackling record linkage / deduplication / data matching problems

125
1 year ago

Record Linkage ToolKit (Find and link entities)

Python
110
2 years ago

List of entity resolution software and resources.

84
6 months ago

Python package for deduplication/entity resolution using active learning

Python
81
1 year ago

Python implementation of anonymous linkage using cryptographic linkage keys

Python
65
1 year ago