emr-cluster

BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, TensorFlow, Mathematics

Jupyter Notebook
138
4 years ago

Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS

HCL
74
2 months ago

Bits of code I use during live demos

Jupyter Notebook
31
4 months ago

The goal of this project is to offer a ready-to-use AWS EMR template using Spot Fleet and On-Demand Instances, so that you can focus on writing PySpark code.

Python
27
3 years ago

This is an ETL application on AWS that uses general open sales and customer data, available here: https://github.com/camposvinicius/data/blob/main/AdventureWorks.zip. The archive is a zipped file containing several CSV files, to which transformations are applied.

Smarty
17
3 years ago

Apache Spark TPC-DS benchmark setup with EMR launch setup

Smarty
16
3 years ago

Uses EMR clusters to export DynamoDB tables to S3 and generates import steps

Shell
11
3 years ago

A boilerplate for Spark projects with Docker support for local development and scripts for EMR support.

Scala
9
7 years ago

Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed Airflow: extract data from S3, transform it using Spark, and load the transformed data back to S3.

Python
9
4 years ago

This project demonstrates the use of Amazon Elastic MapReduce (EMR) for processing large datasets using Apache Spark. It includes a Spark script for ETL (Extract, Transform, Load) operations, AWS command line instructions for setting up and managing the EMR cluster, and a dataset for testing and demonstration purposes.

Python
7
1 year ago

A large-scale data framework that will enable us to store and analyze financial market data and drive future predictions for investment.

TSQL
7
5 years ago

Generic Python library for provisioning EMR clusters with YAML config files (Configuration as Code)

Python
3
2 years ago

Collection of code for submitting Spark/Hadoop/Hive/Pig tasks to EMR (AWS Elastic MapReduce) | #DE

Scala
3
5 years ago