Entity resolution

Created by Halbert L. Dunn

Released 1946

entity-resolution
Wikipedia

Related topics

Artificial intelligence, Natural language processing

Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining datasets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
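
As a rough illustration of the task described above, the following minimal sketch links records from two sources that share no common identifier by comparing normalized field values. It uses only Python's standard-library `difflib` and is not taken from any of the projects listed on this page. The sample records, the helper names (`normalize`, `similarity`, `link_records`), and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, fields, threshold=0.85):
    """Compare every left record with every right record (hypothetical helper).

    A pair is linked when the average per-field similarity reaches the
    threshold. The all-pairs loop is quadratic; real toolkits typically add
    a blocking or indexing step to avoid it on large datasets.
    """
    links = []
    for i, l_rec in enumerate(left):
        for j, r_rec in enumerate(right):
            score = sum(similarity(l_rec[f], r_rec[f]) for f in fields) / len(fields)
            if score >= threshold:
                links.append((i, j, round(score, 3)))
    return links


# Two made-up sources describing people, with no shared key.
crm = [
    {"name": "Jon Smith", "city": "New York"},
    {"name": "Maria Garcia", "city": "Austin"},
]
billing = [
    {"name": "John  Smith", "city": "new york"},
    {"name": "Wei Chen", "city": "Seattle"},
]

links = link_records(crm, billing, fields=("name", "city"))
for left_idx, right_idx, score in links:
    print(crm[left_idx]["name"], "<->", billing[right_idx]["name"], score)
```

In this sketch only the two Smith records are linked, despite the spelling and formatting differences. Averaging field similarities is the simplest possible scoring rule; probabilistic or learned models, as offered by several of the libraries below, usually weight fields by how informative they are.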

🆔 A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

Python
4266
5 months ago

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Python
1559
2 days ago

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
997
1 year ago

🆔 Command line tool for deduplicating CSV files

Python
420
5 years ago

🆔 Examples for using the dedupe library

Python
411
8 months ago

Recent trends of Entity Linking, Disambiguation, and Representation.

345
4 years ago

This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).

Python
277
1 year ago

An open source, high scalability toolkit in Java for Entity Resolution.

Java
218
1 year ago

ReFinED is an efficient and accurate entity linking (EL) system.

Python
214
4 months ago

🔎 Finds fuzzy matches between CSV files

Python
189
24 days ago

Entity resolution for Elasticsearch.

Java
159
3 months ago

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

Jupyter Notebook
152
2 years ago

Resources for tackling record linkage / deduplication / data matching problems

123
1 year ago

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Python
118
15 days ago