fasttext-embeddings

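Implementations of common NLP tasks, including new-word discovery and PyTorch-based word embeddings, Chinese text classification, named entity recognition, abstractive text summarization, sentence similarity judgment, triple extraction, pretrained models, and more.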

Python
533
2 years ago

Spanish word embeddings computed with different methods and from different corpora

360
6 years ago

Tools for shrinking fastText models (in gensim format)

Jupyter Notebook
179
1 year ago

Text to abstract art generation for the holidays!

Python
90
2 years ago

A monolingual and cross-lingual meta-embedding generation and evaluation framework

Python
80
3 years ago

Persian sentiment analysis ( آناکاوی سهش های فارسی | تحلیل احساسات فارسی )

Jupyter Notebook
55
3 years ago

Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

Python
35
7 months ago

Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"

Jupyter Notebook
31
20 days ago

Language Models for the legal domain in Spanish done @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

29
3 years ago

Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning

Jupyter Notebook
19
5 years ago

Repository for the free online book Oddly Satisfying Deep Learning from Scratch (link below!)

Jupyter Notebook
15
3 years ago

Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

Python
13
3 years ago

Romanian word embeddings. Here you can find pre-trained word embeddings. Current methods: CBOW, Skip-Gram, and FastText (from the Gensim library). The .vec and .model files are available for download (all in one archive).

12
4 months ago

FastText-based Offensive Language Detection in Tweets using OLID & SOLID datasets with LIME visualizations for interpretability.

Python
11
8 months ago

This project contains the code to use custom fastText embeddings with the flair framework.

Python
11
4 months ago

Biomedical word embeddings generated from Spanish biomedical corpora.

10
3 years ago