Entity resolution

Created by Halbert L. Dunn

Released 1946

entity-resolution
Wikipedia

Related topics

Artificial intelligence, natural language processing

Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). It is necessary when joining data sets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); identifiers can be missing or inconsistent because of differences in record shape, storage location, or curator style or preference.
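As a rough illustration of the task described above, the sketch below links records from two sources that share no common key by comparing normalized names with Python's standard-library `difflib.SequenceMatcher`. The record data, the helper names (`normalize`, `similarity`, `link_records`), and the 0.85 threshold are all hypothetical choices made for this example. The libraries listed on this page use more sophisticated techniques such as blocking, probabilistic models, and learned embeddings.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.85):
    """Pair each left record with its best-scoring right record,
    keeping the pair only if the score reaches the (assumed) threshold."""
    links = []
    for l_id, l_name in left:
        best_id, best_score = None, 0.0
        for r_id, r_name in right:
            score = similarity(l_name, r_name)
            if score > best_score:
                best_id, best_score = r_id, score
        if best_id is not None and best_score >= threshold:
            links.append((l_id, best_id))
    return links


# Hypothetical records from two sources with unrelated identifiers.
crm = [("c1", "Acme Corp."), ("c2", "Globex Corporation")]
billing = [("b7", "ACME Corp"), ("b9", "Initech LLC")]

links = link_records(crm, billing)
print(links)
```

This all-pairs comparison costs time proportional to the product of the two list sizes, so it only suits small inputs. Scalable tools typically restrict comparisons to candidate pairs first.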

🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.

Python
4358
22 days ago

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Python
1688
2 days ago
J535D165/recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
1022
1 year ago

🆔 Command line tool for deduplicating CSV files

Python
427
5 years ago

🆔 Examples for using the dedupe library

Python
413
1 year ago

Recent trends of Entity Linking, Disambiguation, and Representation.

346
4 years ago

This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).

Python
278
1 year ago

An open-source, highly scalable toolkit in Java for entity resolution.

Java
221
1 month ago

ReFinED is an efficient and accurate entity linking (EL) system.

Python
218
8 months ago

🔎 Finds fuzzy matches between CSV files

Python
190
5 months ago

Entity resolution for Elasticsearch.

Java
162
7 months ago

Construct knowledge graphs from unstructured data sources, then use graph algorithms for enhanced GraphRAG, and a BAML-based chat bot

Jupyter Notebook
160
4 days ago

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

Jupyter Notebook
154
3 years ago

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Python
128
4 months ago

Resources for tackling record linkage / deduplication / data matching problems

125
1 year ago

OpenRefine reconciliation services for VIAF, ORCID, and Open Library + framework for creating more.

Java
120
2 months ago