#parquet

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

C++
15845
1 hour ago

Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Go
3848
2 years ago
Rust
3339
2 months ago

Official Rust implementation of Apache Arrow

Rust
3105
2 hours ago

Apache Parquet Java

Java
2913
4 days ago
rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

Go
2278
9 minutes ago

Apache Parquet Format

Thrift
2011
2 days ago

Apache Drill is a distributed MPP query layer for self-describing data

Java
1989
12 days ago

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.

Python
1851
10 days ago

A large-scale entity and relation database supporting aggregation of properties

Java
1791
2 months ago
Mooncake-Labs/pg_mooncake
Rust
1622
1 hour ago

cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes

Rust
1425
7 months ago

Open-source Snowflake and Fivetran alternative bundled together

Go
1411
4 hours ago

Quilt is a data mesh for connecting people with actionable data

TypeScript
1346
6 hours ago

Query anything (GitHub, Notion, +40 more) with SQL and let LLMs (ChatGPT, Claude) connect to it using MCP

Go
1266
10 days ago
Rust
1138
5 days ago

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

Scala
1031
1 month ago

Fastest open-source tool for replicating databases to a data lake in open table formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, MySQL and Oracle.

Go
1009
8 hours ago

Simple Windows desktop application for viewing & querying Apache Parquet files

C#
957
2 days ago