datacleaning

OpenRefine is a free, open source power tool for working with messy data and improving it

Java
11469
2 hours ago

Open-source low-code data preparation library in Python. Collect, clean, and visualize your data in Python with a few lines of code.

Python
2194
1 year ago
JavaScript
897
2 years ago

A natural language processing project that performs sentiment analysis by classifying tweets as positive or negative with machine learning models. Covers classification, text mining, text analysis, data analysis, and data visualization.

Jupyter Notebook
249
2 years ago

Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.

Python
124
8 days ago

This repository contains data and code used to get and clean data from https://github.com/CSSEGISandData/COVID-19 and https://www.worldometers.info/coronavirus/

Jupyter Notebook
99
5 years ago

An open-source Python package for cleaning raw text data

Python
70
2 years ago

Benchmark for bi-level optimization solvers

Python
48
3 months ago

Data and code for scraping and cleaning data on COVID-19 in India from https://www.mohfw.gov.in/ and https://www.covid19india.org/

Jupyter Notebook
41
5 years ago

Validate data in a database table using dbplyr

R
33
3 years ago

Deduplicates and cleans TongDaXin (通达信) data, then stores it in MongoDB for later research

Python
28
7 years ago

This repo contains 4 different projects that build various machine learning models for Kaggle competitions. They also cover exploratory data analysis, data cleaning, data visualization, data munging, feature selection, etc.

Jupyter Notebook
26
4 years ago

Predicts home prices in Bangalore. Built with Flutter, Flask, and Jupyter Notebook.

Jupyter Notebook
18
4 years ago

Personal finance database creation, SQL analysis, and Power BI dashboard

Jupyter Notebook
18
2 years ago

The project provides a real-world dataset focusing on supply chain analytics

Jupyter Notebook
18
2 years ago

Code to enable OpenRefine to run as an authenticated web service

JavaScript
17
11 years ago

Works on a dataset of high-entropy alloys used to design materials for additive manufacturing. Performs data analysis and builds machine learning models, including neural networks and gradient boosting, to make predictions useful for inventing advanced materials.

Jupyter Notebook
17
3 years ago