datafusion
Apache DataFusion SQL Query Engine
The portable Python dataframe library
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
LakeSoul is an end-to-end, real-time, cloud-native Lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.
Distributed compute platform implemented in Rust, and powered by Apache Arrow.
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing.
Apache DataFusion Comet Spark Accelerator
LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.
ClickBench: a Benchmark For Analytical Databases
DuckDB-powered data lake analytics from Postgres
Apache Kafka® compatible broker with S3, PostgreSQL, Apache Iceberg and Delta Lake
Analytical database for data-driven Web applications 🪶
Next-generation decentralized data lakehouse and a multi-party stream processing network
Distributed pushdown cache for DataFusion
Unofficial Rust implementation of Apache Iceberg with integration for DataFusion
Batteries included CLI, TUI, and server implementations for DataFusion.
Query and transform data with PRQL
Postgres protocol frontend for DataFusion
Blazing-Fast Bioinformatic Operations on Python DataFrames