#etl-framework

Logstash - transport and process your logs, events, or other data

Java
14607
2 hours ago

Flow-based programming for JavaScript

JavaScript
3536
1 year ago

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

Jupyter Notebook
2237
12 days ago

This repository is a getting started guide to Singer.

Makefile
1315
11 days ago

Making data lakes work for time series

Python
1180
1 year ago

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

Python
860
2 years ago

ETL framework for .NET (parser/writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)

C#
830
2 months ago

SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).

Java
718
1 month ago

A simplified, lightweight ETL Framework based on Apache Spark

Scala
589
2 years ago

Extract, Transform, Load: any SQL database in 4 lines of code.

Python
556
6 years ago

A modern data marketplace that makes collaboration among diverse users (such as business users, analysts, and engineers) easier, increasing efficiency and agility in data projects on AWS.

Python
246
11 days ago

Bender - Serverless ETL Framework

Java
188
2 years ago

Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.

Python
175
5 months ago