#datawarehouse

DataLinkDC/dinky

Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.

Java
3392
9 days ago

Postgres-native columnar storage extension

C
2933
2 months ago

Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks.

Rust
1543
10 months ago

From data warehouse to user profiling, from data construction to data application

580
3 years ago

A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)

535
1 month ago

An open-source columnar data format designed for fast, real-time analytics on big data.

Java
452
2 years ago

Free and open source schema versioning and database migration made natively with .NET/6. NEW THIS MAY 2022! v1.3.15 released!

C#
419
9 months ago

Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases

Python
229
3 years ago

Dataplane is an Airflow-inspired unified data platform with additional data mesh and RPA capabilities to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.

JavaScript
226
22 days ago

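Hydra (the nine-headed dragon) targets PB-scale knowledge-base data retrieval, intelligence systems, data platforms, and large-scale control and scheduling systems. It builds systematic capabilities for cloud computing resource management, unified task/service scheduling, data warehousing, microservices, and middle-platform infrastructure, using a large-scale distributed crawler search engine as the example implementation.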

Java
221
7 days ago

Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)

Go
173
1 day ago

A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.

TSQL
116
18 days ago

All of my individual learning materials, documents, and notes from the process of getting the Coursera IBM Data Engineer Professional Certificate specialization are stored in this repository.

Jupyter Notebook
86
2 years ago

A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.

67
3 months ago

Implementing an end-to-end tweet ETL/analysis pipeline.

Python
57
2 years ago

Accelerator to build a Microsoft Fabric modern data platform using pre-built reusable Fabric items and an orchestration ELT Framework

Python
55
5 days ago

AlphaSQL provides integrated type and schema checking and parallelization for sets of SQL files, mainly for BigQuery.

C++
53
3 years ago