data-cleansing

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also lets you run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

C++
414
19 days ago

Quizzes and assignment solutions for the Google Data Analytics Professional Certificate on Coursera. Also includes a few resources on the side that I found helpful.

246
3 years ago

Exploratory data analysis 📊 using Python 🐍 of a used car 🚘 database taken from Kaggle

Jupyter Notebook
227
7 years ago

A domain-specific probabilistic programming language for scalable Bayesian data cleaning

Julia
225
1 year ago

XGBoost, LightGBM, LSTM, Linear Regression, Exploratory Data Analysis

Jupyter Notebook
45
6 years ago

An SQL data cleaning project

32
3 years ago

This is a binary classification problem related to Autistic Spectrum Disorder (ASD) screening in adults. Given some attributes of a person, my model predicts whether that person is likely to have ASD, using several supervised learning techniques and a multi-layer perceptron.

Jupyter Notebook
26
7 years ago

Java
20
7 months ago

This repo was created to share the required and discussed files from the Online Internship training program on Data Science Using Python in May 2021.

Jupyter Notebook
14
4 years ago

Makes quick and dirty data mining easier in Sublime Text

Python
12
4 years ago

Data cleansing and clustering with Vector Quantization and Adaptive Resonance Theory

C
10
8 years ago

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

TypeScript
10
4 years ago

Data Cleaning is a Python package for data preprocessing. It cleans a CSV file and returns the cleaned data frame, handling imputation, duplicate removal, special-character replacement, and more.

Python
8
4 years ago