# datalakehouse

Open Control Plane for Tables in Data Lakehouse

Java
370
9 days ago

Lakevision is a tool that provides insights into your Apache Iceberg-based Data Lakehouse.

Python
44
11 days ago

The complete source code repository for "바로 쓰는 오라클 클라우드 (Oracle Cloud, Ready to Use) - Build and Deploy Modern Apps with Oracle Cloud".

Jupyter Notebook
7
2 years ago

dbt project for the Data Lake of the Secretaria Municipal de Saúde (Municipal Health Department).

Shell
5
1 day ago

Connecting PrestoDB to external data sources such as MongoDB, Elasticsearch, MySQL, and Hadoop to manipulate big data.

2
2 years ago

My eleventh project, in which I build a data lakehouse using distributed computing on Databricks.

HTML
1
3 years ago

This repository provides a modular and easy-to-extend ETL pipeline that streams data from a PostgreSQL database into a StarRocks data warehouse using RisingWave as the real-time streaming computation layer.

1
5 months ago

A prototype of data lake catalog management built solely on arbitrary file systems.

Java
1
3 months ago

This project serves as a personal lab for developing and honing skills in distributed data processing and data lake architecture.

Java
1
3 months ago

Real-Time Healthcare Data Lakehouse for Predictive Analytics (Synthea, Faker, Kafka, Spark, Delta Lake, BigQuery)

Jupyter Notebook
0
1 month ago

A scalable and optimized data warehouse solution designed for efficient data integration, transformation, and analytics. This project demonstrates ETL workflows, dimensional modeling, and query performance tuning using modern data warehousing practices.

TSQL
0
1 day ago

Building a modern data warehouse with PostgreSQL, including ETL processes, data modeling, and analytics.

PLpgSQL
0
6 months ago

This repo runs a quick demo of how to spin up an Apache Iceberg application.

0
7 months ago

We will create a sample lakehouse using Docker, execute an ETL process with Spark, and then access the data in the Iceberg table format from the Nessie Catalog.

Jupyter Notebook
0
1 year ago