categorical-features

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

C++
8359
17 hours ago

A lightweight decision tree framework for Python supporting regular algorithms (ID3, C4.5, CART, CHAID and regression trees) and some advanced techniques (Gradient Boosting, Random Forest and AdaBoost), with support for categorical features

Python
474
19 days ago

Tensorflow implementation of Product-based Neural Networks. An extended version is at https://github.com/Atomu2014/product-nets-distributed.

Python
375
5 years ago

This repository contains a notebook demonstrating a practical implementation of the so-called Entity Embedding for Encoding Categorical Features for Training a Neural Network.

Jupyter Notebook
74
6 years ago

Scikit-Learn compatible transformer that turns categorical variables into dense entity embeddings.

Jupyter Notebook
41
2 years ago
Jupyter Notebook
10
4 years ago

Predicting the ideological direction of Supreme Court decisions: ensemble vs. unified case-based model

Jupyter Notebook
8
7 years ago

glmdisc Python package: discretization, factor level grouping, interaction discovery for logistic regression

Python
6
1 year ago

A mixed attributes predictive algorithm implemented in Python.

Python
5
6 months ago

Category transformation

Python
3
3 years ago

This study creates machine learning models to predict the seriousness of car crashes using 2019 and 2020 crash reports from the publicly accessible database maintained by the Chicago Police Department. A car crash is considered serious if it results in an injury or the car is towed because of the crash. The models use categorical features that describe the conditions at the time of the crash and the crash causes to predict the target. The current focus is to classify whether a crash results in an injury. All machine learning models are trained, validated, and tested on randomly split 2019 crash reports. The best model, along with all the others, is then tested on the full set of 2020 crash reports.

Jupyter Notebook
3
4 years ago

Multimodal deep learning package that uses both categorical and text-based features in a single deep architecture for regression and binary classification use cases.

Python
3
5 years ago

Kaggle Categorical Feature Encoding Challenge II, private score 0.78795 (110th place)

Jupyter Notebook
2
5 years ago