similarity

shibing624/text2vec

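text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.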

Python
4854
4 months ago

similarity: a text similarity calculation toolkit written in Java. It can be used for tasks such as text similarity calculation and sentiment analysis, and works out of the box.

Java
1548
3 months ago

kornelski/dssim

Image similarity comparison simulating human perception (multiscale SSIM in Rust)

Rust
1141
3 days ago

J535D165/recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Python
1025
2 years ago

A library implementing different string similarity and distance measures using Python.

Python
1019
3 years ago

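Similarities: a toolkit for similarity calculation and semantic search. It supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of records, is developed in Python 3, and works out of the box.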

Python
873
1 year ago

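A word similarity calculation method that combines the extended edition of the Tongyici Cilin synonym thesaurus with HowNet, giving broader vocabulary coverage and more accurate results.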

Python
737
4 years ago

Macropodus, a natural language processing (NLP) toolkit based on an Albert+BiLSTM+CRF deep learning architecture. It covers Chinese word segmentation (CWS), part-of-speech tagging (POS), named entity recognition (NER), new word discovery (Find), keyword extraction (Keyword), text summarization (Summarize), text similarity (Sim), a scientific calculator (Calculate), conversion between Chinese numerals and Arabic or Roman numerals (Chi2num), Traditional/Simplified Chinese conversion, and pinyin conversion.

Python
664
3 years ago

pHash - the open source perceptual hash library

C++
605
3 years ago

difPy - Python package for finding duplicate and similar images

Python
518
9 months ago

Computing the similarity of two sentences with Google's BERT algorithm. Covers semantic similarity and text similarity calculation.

Python
508
3 years ago

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.

Java
479
2 years ago

The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.

Rust
405
7 months ago

A set of functions and operators for executing similarity queries

C
388
4 months ago

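A word similarity calculation method based on the extended edition of the Harbin Institute of Technology (HIT) Tongyici Cilin synonym thesaurus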

Python
371
2 years ago

Chinese intelligent customer service chatbot demo with two parts, chit-chat and professional Q&A (FAQ), and support for custom components.

Python
313
3 years ago

An extension toolkit for Immich enabling advanced management capabilities through AI-powered similarity detection

Python
299
6 days ago

🦀📏 Rust library to compare strings (or any sequences). 25+ algorithms, pure Rust, common interface, Unicode support.

Rust
296
1 year ago