gensim

dipanjanS/text-analytics-with-python

Learn how to process, classify, cluster, and summarize text data, and how to understand its syntax, semantics, and sentiment, with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python", published by Apress/Springer.

Jupyter Notebook
1679
5 years ago
kavgan/nlp-in-practice

Starter code to solve real-world text data problems. Includes: Gensim Word2Vec, phrase embeddings, text classification with logistic regression, word count with PySpark, simple text preprocessing, pre-trained embeddings, and more.

Jupyter Notebook
1175
5 years ago

Data repository for pretrained NLP models and NLP corpora.

Python
1031
7 years ago

Tutorial on training Chinese word vectors

Python
516
3 years ago

Fast word vectors with little memory usage in Python

Python
416
4 years ago

AraVec is a pre-trained distributed word representation (word embedding) open-source project that aims to provide the Arabic NLP research community with free-to-use, powerful word embedding models.

Jupyter Notebook
404
4 years ago

Log Anomaly Detection - Machine learning to detect abnormal event logs

Jupyter Notebook
334
2 years ago

ML-based projects such as spam classification, time series analysis, and text classification using Random Forest, deep learning, Bayesian methods, and XGBoost in Python

279
5 years ago

This repository contains an easy and intuitive approach to few-shot NER using most-similar expansion over spaCy embeddings. Now with entity scoring.

Python
244
2 years ago

Toolkit to obtain and preprocess German text corpora, train models, and evaluate them with generated test sets. Built with Gensim and TensorFlow.

Jupyter Notebook
239
1 year ago

Pre-trained Word2Vec Model for Turkish

Python
215
7 years ago

Technical and sentiment analysis to predict the stock market with machine learning models based on historical time series data and news article sentiment collected using APIs and web scraping.

Jupyter Notebook
202
5 months ago

Web-ify your word2vec: framework to serve distributional semantic models online

Python
201
6 months ago