etl-job
Implementing best practices for PySpark ETL jobs and applications.
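As a rough illustration of the structure such jobs tend to follow (not this repo's actual code), here is a minimal PySpark extract/transform/load skeleton; the paths and column names are placeholders:

```python
# Minimal PySpark ETL skeleton with placeholder paths and columns, illustrating the
# extract / transform / load separation that best-practice repos typically recommend.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def extract(spark: SparkSession, path: str) -> DataFrame:
    """Read the raw source data."""
    return spark.read.json(path)


def transform(df: DataFrame) -> DataFrame:
    """Apply business rules: drop duplicates, derive a date column."""
    return (
        df.dropDuplicates(["id"])
          .withColumn("event_date", F.to_date("event_ts"))
    )


def load(df: DataFrame, path: str) -> None:
    """Write the curated output partitioned by date."""
    df.write.mode("overwrite").partitionBy("event_date").parquet(path)


if __name__ == "__main__":
    spark = SparkSession.builder.appName("example_etl_job").getOrCreate()
    load(transform(extract(spark, "s3://raw-bucket/events/")), "s3://curated-bucket/events/")
    spark.stop()
```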
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Mass data processing with a complete ETL solution for .NET developers.
Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.
Terraform modules for provisioning and managing AWS Glue resources
This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.
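For illustration only (the repo itself may provision this differently, e.g. via CloudFormation or the console), a boto3 sketch of the same CloudWatch Logs → Kinesis Firehose → S3 pattern; all names and ARNs below are placeholders:

```python
# Hypothetical boto3 sketch of the CloudWatch Logs -> Kinesis Firehose -> S3 pattern.
# Stream names, bucket ARNs, and role ARNs are placeholders.
import boto3

firehose = boto3.client("firehose", region_name="us-east-1")
logs = boto3.client("logs", region_name="us-east-1")

# 1. Firehose delivery stream that buffers incoming records and writes them to S3.
firehose.create_delivery_stream(
    DeliveryStreamName="cw-logs-to-s3",
    DeliveryStreamType="DirectPut",
    S3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "BucketARN": "arn:aws:s3:::my-log-archive-bucket",
        "Prefix": "cloudwatch/",
        "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 300},
        "CompressionFormat": "GZIP",
    },
)

# 2. Subscription filter that streams a log group's events into the delivery stream.
logs.put_subscription_filter(
    logGroupName="/aws/lambda/my-app",
    filterName="to-firehose",
    filterPattern="",  # an empty pattern forwards every event
    destinationArn="arn:aws:firehose:us-east-1:123456789012:deliverystream/cw-logs-to-s3",
    roleArn="arn:aws:iam::123456789012:role/cwlogs-to-firehose-role",
)
```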
This repo guides you step by step through creating a star schema dimensional model.
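Purely as an illustration of the end result, not this repo's own walkthrough, a PySpark sketch that derives one dimension table and one fact table from a hypothetical flat sales extract:

```python
# Illustrative only: deriving a star schema (one dimension + one fact table) from a
# hypothetical flat sales extract with PySpark. Paths and column names are invented.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("star_schema_demo").getOrCreate()
flat = spark.read.parquet("s3://staging/sales_flat/")  # placeholder source

# Customer dimension: one row per customer, with a surrogate key.
dim_customer = (
    flat.select("customer_id", "customer_name", "customer_city")
        .dropDuplicates(["customer_id"])
        .withColumn("customer_key", F.monotonically_increasing_id())
)

# Fact table: measures plus the foreign key pointing at the customer dimension.
fact_sales = (
    flat.join(dim_customer.select("customer_id", "customer_key"), on="customer_id", how="left")
        .select("customer_key", "order_id", "order_date", "quantity", "amount")
)

dim_customer.write.mode("overwrite").parquet("s3://warehouse/dim_customer/")
fact_sales.write.mode("overwrite").parquet("s3://warehouse/fact_sales/")
```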
A Python PySpark project built with Poetry.
A declarative, SQL-like DSL for data integration tasks.
An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.
Airflow POC demo: 1) environment setup, 2) Airflow DAG, 3) Spark/ML pipeline. #DE
A data pipeline for a retail store built on AWS services: it collects data from the store's transactional (OLTP) database in Snowflake, transforms the raw data with Apache Spark (the ETL process) to meet business requirements, and lets data analysts build visualizations in Superset. Airflow is used to orchestrate the pipeline.
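A hedged sketch of what that orchestration could look like as an Airflow DAG; the task bodies are stubs and every name is invented:

```python
# Hypothetical Airflow DAG mirroring the described orchestration: extract from the
# Snowflake OLTP source, transform with Spark, then refresh the Superset datasets.
# Task implementations are stubs; IDs and schedules are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_from_snowflake(**_):
    """Pull incremental transactional data from Snowflake into a staging area."""
    ...


def transform_with_spark(**_):
    """Submit the Spark job that applies the business transformations."""
    ...


def refresh_superset(**_):
    """Trigger a refresh of the Superset datasets used by analysts."""
    ...


with DAG(
    dag_id="retail_etl_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_from_snowflake", python_callable=extract_from_snowflake)
    transform = PythonOperator(task_id="transform_with_spark", python_callable=transform_with_spark)
    refresh = PythonOperator(task_id="refresh_superset", python_callable=refresh_superset)

    extract >> transform >> refresh
```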
A PHP project that combines different ETL strategies to extract data from multiple databases, files, and services, transform it, and load it into multiple destinations.
A simple in-memory, configuration driven, data processing pipeline for Apache Spark.
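A rough sketch of the configuration-driven idea in general terms (this is not the repo's actual configuration format): each pipeline step is declared as data and dispatched to a small handler.

```python
# Rough illustration of a configuration-driven Spark pipeline. The step names,
# config layout, and paths here are invented for the example.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("config_driven_pipeline").getOrCreate()

pipeline_config = [
    {"step": "read", "format": "csv", "path": "data/input.csv", "options": {"header": "true"}},
    {"step": "filter", "condition": "amount > 0"},
    {"step": "select", "columns": ["id", "amount", "country"]},
    {"step": "write", "format": "parquet", "path": "data/output/"},
]


def run_pipeline(config):
    """Walk the declared steps in order, threading the DataFrame through each one."""
    df = None
    for step in config:
        if step["step"] == "read":
            df = spark.read.format(step["format"]).options(**step.get("options", {})).load(step["path"])
        elif step["step"] == "filter":
            df = df.filter(step["condition"])
        elif step["step"] == "select":
            df = df.select(*step["columns"])
        elif step["step"] == "write":
            df.write.mode("overwrite").format(step["format"]).save(step["path"])


run_pipeline(pipeline_config)
```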
Sentiment analysis of tweets using an ETL process and Elasticsearch.
Communications data processing (ETL) with Apache Flink.
A data pipeline from source to data warehouse using Taipei Metro Hourly Traffic data
An ETL pipeline that captures data from REST APIs (Remotive, Adzuna & GitHub) and RSS feeds (StackOverflow). The data collected from the APIs is stored on local disk; the files are preprocessed, and the ETL jobs are written in Spark and scheduled in Prefect to run every week. The transformed data is loaded into PostgreSQL.
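A hedged Prefect 2 sketch of that flow shape; the endpoints and paths are placeholders, and the Spark transform, PostgreSQL load, and weekly schedule are left as stubs that the repo's own deployment would fill in:

```python
# Hypothetical Prefect 2 sketch of the described flow: fetch postings from the APIs,
# stage them on local disk, then transform and load. URLs and paths are placeholders.
import json
import pathlib

import requests
from prefect import flow, task

STAGING_DIR = pathlib.Path("staging")


@task(retries=2)
def extract(source_name: str, url: str) -> pathlib.Path:
    """Pull raw postings from one REST API and stage them on local disk."""
    STAGING_DIR.mkdir(exist_ok=True)
    out = STAGING_DIR / f"{source_name}.json"
    out.write_text(json.dumps(requests.get(url, timeout=30).json()))
    return out


@task
def transform_and_load(raw_files: list) -> None:
    """Placeholder for the Spark preprocessing and the PostgreSQL load."""
    ...


@flow(name="job-postings-etl")
def weekly_etl():
    sources = {
        "remotive": "https://example.com/remotive-api",  # placeholder endpoint
        "adzuna": "https://example.com/adzuna-api",      # placeholder endpoint
    }
    raw = [extract(name, url) for name, url in sources.items()]
    transform_and_load(raw)


if __name__ == "__main__":
    weekly_etl()  # the weekly cron schedule would be attached via a Prefect deployment
```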