apache-iceberg

risingwavelabs/risingwave

Real-time event streaming platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management.

Rust
8409
3 hours ago
matanolabs/matano
Rust
1617
9 months ago

Fastest open-source tool for replicating databases to a data lake in open table formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, MySQL and Oracle.

Go
1137
11 hours ago

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Java
1107
11 days ago

Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake

Rust
497
3 days ago

The Control Plane for Apache Iceberg.

TypeScript
359
12 days ago

Compaction runtime for Apache Iceberg.

Rust
89
6 days ago

Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.

Dockerfile
74
2 years ago

Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Jupyter Notebook
47
3 years ago

DAIVI is a reference solution with IAC modules to accelerate development of Data, Analytics, AI and Visualization applications on AWS using the next generation Amazon SageMaker Unified Studio. The goal of the DAIVI solution is to provide engineers with sample infrastructure-as-code modules and application modules to build their data platforms.

HCL
33
15 days ago

Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with AWS Glue Streaming and DMS

Python
33
8 months ago

An open-source, community-driven REST catalog for Apache Iceberg!

Go
29
1 year ago

This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenarios using best practices. The code can be deployed into any Spark compatible engine like Amazon EMR Serverless or AWS Glue. A fully local developer environment is also provided.

Java
27
3 months ago

Streaming ETL job cases in AWS Glue that integrate Iceberg to create an in-place updatable data lake on Amazon S3

Python
25
1 year ago