xlm-roberta

Python
3060
1 year ago

🤖 A PyTorch library of curated Transformer models and their composable components

Python
883
1 year ago

CINO: Pre-trained Language Models for Chinese Minority Languages

Python
242
2 years ago

This repository contains the official release of the model "BanglaBERT" and the associated downstream fine-tuning code and datasets introduced in the paper "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla", accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.

Python
241
2 years ago

Unattended Lightweight Text Classifiers with LLM Embeddings

Python
185
7 months ago

Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.

Rust
74
1 year ago

Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity

Python
40
3 years ago

Resources and tools for the tutorial "Hate speech detection, mitigation and beyond", presented at ICWSM 2021

Python
37
3 years ago

PyTorch implementation of sentiment analysis of long texts written in Serbian (an underused language), using the pretrained multilingual RoBERTa-based model XLM-R on a small dataset.

Jupyter Notebook
26
2 years ago

Sentiment analysis of tweets written in underused Slavic languages (Serbian, Bosnian and Croatian), using the pretrained multilingual RoBERTa-based model XLM-R on two different datasets.

Jupyter Notebook
22
4 years ago

An implementation of DropHead regularization for PyTorch transformers

Python
19
4 years ago

This is a PyTorch (+ Hugging Face Transformers) implementation of a "simple" text classifier built on BERT-based models. This lab shows how simple it is to use BERT for a sentence classification task, obtaining state-of-the-art results in a few lines of Python code.

Jupyter Notebook
19
4 years ago

Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning

Python
18
2 years ago

A comprehensive project that uses the XLM-RoBERTa model for intent detection, intended as a resource for developers looking to build and fine-tune intent detection models.

Jupyter Notebook
14
1 year ago

Improving Bilingual Lexicon Induction with Cross-Encoder Reranking (Findings of EMNLP 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

Python
13
2 years ago

Official repository of the ACL 2024 paper "Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!".

Python
10
5 months ago

Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models

Python
10
2 years ago

Notebooks to fine-tune the `bert-small-amharic`, `bert-mini-amharic`, and `xlm-roberta-base` models using an Amharic text classification dataset and the transformers library

Jupyter Notebook
10
1 year ago