#apache-iceberg

risingwavelabs/risingwave

Real-time event streaming platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management.

Rust
8268
7 minutes ago
matanolabs/matano
Rust
1603
7 months ago

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Java
1092
3 days ago

Fastest open-source tool for replicating databases to a data lake in open table formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, MySQL and Oracle.

Go
1010
27 minutes ago

Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake

Rust
436
1 hour ago

The Control Plane for Apache Iceberg.

TypeScript
334
12 days ago

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.

Dockerfile
73
2 years ago

Compaction runtime for Apache Iceberg.

Rust
69
2 days ago

Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Jupyter Notebook
47
3 years ago
JavaScript
42
3 months ago

Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with AWS Glue Streaming and DMS

Python
33
6 months ago

DAIVI is a reference solution with IAC modules to accelerate development of Data, Analytics, AI and Visualization applications on AWS using the next generation Amazon SageMaker Unified Studio. The goal of the DAIVI solution is to provide engineers with sample infrastructure-as-code modules and application modules to build their data platforms.

HCL
32
16 days ago
Python
29
19 days ago

An open-source, community-driven REST catalog for Apache Iceberg!

Go
29
1 year ago

This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenarios using best practices. The code can be deployed into any Spark compatible engine like Amazon EMR Serverless or AWS Glue. A fully local developer environment is also provided.

Java
26
2 months ago

Streaming ETL job cases in AWS Glue that integrate Iceberg and create an in-place updatable data lake on Amazon S3.

Python
24
1 year ago