
parquet

QuestDB is a high-performance, open-source time-series database

time-series low-latency database SQL Grafana simd questdb tsdb Java PostgreSQL C++ time-series-database financial-analysis capital-markets market-data olap real-time-analytics tick-data kdb parquet

Java

16209

1486

3 hours ago

apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

arrow parquet

C++

16021

3872

1 day ago

multiprocessio / dsq

Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.

Go CSV JSON tsv excel openoffice-calc parquet SQL command-line-interface

3855

165

2 years ago

roapi / roapi

Create full-fledged APIs for slowly moving datasets without writing a single line of code.

SQL GraphQL arrow REST API analytics query columnar Rust in-memory-database datafusion blob-storage cloud-native parquet datasets s3 delta-lake

Rust

3347

204

3 months ago

dathere / qsv

Blazing-fast Data-Wrangling toolkit

CSV data-wrangling Open Data data-engineering ckan excel luau parquet PostgreSQL SQLite polars SQL geocode timeseries dcat metadata statistics sampling artificial-intelligence

Rust

3203

6 hours ago

apache / arrow-rs

Official Rust implementation of Apache Arrow

arrow parquet Rust

Rust

3156

1016

13 hours ago

apache / parquet-java

Apache Parquet Java

parquet apache parquet-java

Java

2951

1483

5 days ago

rilldata / rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

duckdb Svelte sveltekit data-visualization CSV parquet Go s3 gcs data-analysis SQL bi data business-analytics sql-editor

2357

149

1 day ago

apache / parquet-format

Apache Parquet Format

parquet apache

Thrift

2055

451

9 days ago

apache / drill

Apache Drill is a distributed MPP query layer for self describing data

Java big-data SQL hive hadoop jdbc parquet

Java

1991

987

18 days ago

uber / petastorm

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.

Tensorflow PyTorch deep-learning machine-learning pyspark parquet

Python

1859

285

19 days ago

gchq / Gaffer

A large-scale entity and relation database supporting aggregation of properties

accumulo graph graph-database hadoop big-data aggregation hbase parquet Apache Spark

Java

1791

363

4 months ago

Mooncake-Labs / pg_mooncake

Real-time analytics on Postgres tables

analytics columnstore delta-lake iceberg lakehouse parquet PostgreSQL

Rust

1728

1 day ago

BemiHQ / BemiDB

Open-source Snowflake and Fivetran alternative bundled together

analytics data-warehouse duckdb iceberg olap parquet PostgreSQL

1471

11 days ago

paradigmxyz / cryo

cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes

crypto Ethereum evm parquet Rust

Rust

1446

165

9 months ago

julien040 / anyquery

Query anything (GitHub, Notion, +40 more) with SQL and let LLMs (ChatGPT, Claude) connect to it using MCP

API business-intelligence CSV data-visualization database JSON MySQL parquet SQL SQLite analytics Go GitHub Notion salesforce artificial-intelligence ChatGPT claude large-language-models mcp

1360

2 days ago

quiltdata / quilt

Quilt is a data mesh for connecting people with actionable data

data data-engineering data-version-control data-versioning Python serialization parquet

TypeScript

1347

1 day ago

tonbo-io / tonbo

A portable embedded database using Arrow.

arrow big-data Rust embedded-database database htap lsm-tree offline-first parquet

Rust

1184

4 days ago

datazip-inc / olake

Fastest open-source tool for replicating databases to a data lake in open table formats like Apache Iceberg. ⚡ Efficient, quick, and scalable data ingestion for real-time analytics. Supports Postgres, MongoDB, MySQL, and Oracle.

cdc change-data-capture data-pipeline database elt lakehouse replication apache-iceberg parquet s3 Hacktoberfest

1137

122

11 hours ago

bigdatagenomics / adam

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

Apache Spark big-data Bioinformatics genomics parquet avro Scala Java Python R

Scala

1036

315

3 months ago