
#

parquet

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

C++
15259
2 days ago

Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Go
3803
2 years ago
Rust
3286
1 month ago

Official Rust implementation of Apache Arrow

Rust
2888
3 days ago

Apache Parquet Java

Java
2786
3 days ago
rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

Go
2011
1 day ago

Apache Drill is a distributed MPP query layer for self-describing data

Java
1963
10 days ago

Apache Parquet Format

Thrift
1929
3 days ago

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.

Python
1831
1 year ago

A large-scale entity and relation database supporting aggregation of properties

Java
1781
2 months ago

Single-binary Postgres read replica optimized for analytics

Go
1351
3 days ago

cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes

Rust
1347
3 months ago

Quilt is a data mesh for connecting people with actionable data

TypeScript
1335
2 days ago

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

Scala
1020
19 days ago

Simple Windows desktop application for viewing & querying Apache Parquet files

C#
860
2 months ago

ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)

C#
825
4 days ago

Fastest open-source tool for replicating Databases to Data Lake in Open Table Formats like Apache Iceberg. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supporting Postgres, MongoDB and MySQL

Go
813
2 days ago