data-cleaning

Jupyter notebook and datasets from the pandas video series

Jupyter Notebook
2236
1 year ago
R
1422
8 months ago

Easy data preparation with the latest LLM-based operators and pipelines.

Python
1146
2 days ago

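An open-source educational chat model from ICALK, East China Normal University. An open-source Chinese-English educational dialogue large language model (general-purpose base model, GPU deployment, data cleaning). With thanks to: LLaMA, MOSS, BELLE, Ziya, vLLM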

Jupyter Notebook
837
1 month ago

Easy-to-use Python library of customized functions for cleaning and analyzing data.

Python
520
4 months ago

Schema-Inspector is a simple JavaScript object sanitization and validation module.

JavaScript
504
9 months ago

The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.

Python
453
3 months ago

Professional data validation for the R environment

R
425
2 months ago

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It can also run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

C++
414
19 days ago

Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!

Python
377
3 years ago