data-testing

Data validation made beautiful and powerful

Python
286
2 hours ago

re_data - fix data issues before your users & CEO discover them 😊

Python
101
1 year ago

A simple and easy-to-use data validation library for Python.

Python
77
2 months ago

DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset hygiene review, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.

Python
64
18 days ago

Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.

Python
64
3 years ago

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

Jupyter Notebook
13
2 years ago

This library is inspired by the Great Expectations library. It makes the various expectations found in Great Expectations available through Python's built-in unittest assertions.

Python
10
4 years ago

Spark Data Test - A PySpark-based automation testing utility to compare Spark DataFrames

Python
5
2 months ago

Data generation and validation tool for any data source

Scala
5
2 years ago

Software Testing in Open Source and Data Science: A talk delivered at the Data Umbrella speaker series

4
2 years ago