#hadoop-mapreduce

Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.

Java
261
1 year ago

Uses MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and to learn the basics of HDFS.

Java
139
10 months ago

Big Data essentials: Hadoop, MapReduce, Spark. Explore tutorials and demos in Jupyter notebooks—most are self-contained and live, ready to run with a click.

Jupyter Notebook
81
23 days ago

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Python
56
2 years ago

K-Means algorithm implementation with Hadoop and Spark for the Cloud Computing course of the MSc AIDE at the University of Pisa.

Java
47
5 years ago

A collection of MapReduce problems and solutions.

Java
35
8 years ago

Shows how to install Hadoop on Google Colab and how to run MapReduce jobs in Hadoop.

Jupyter Notebook
33
5 years ago

Projects done in the Cloud Computing course.

Java
27
7 years ago

Hadoop MapReduce word counting with Java

Java
25
6 years ago

Chinese text mining | Public opinion analysis | Hadoop | Java | MapReduce

HTML
23
8 years ago

2021 Spring (Distributed Computing Systems): Distributed Systems and Programming

Java
16
4 years ago

Twitter + Flume + Hadoop (HDFS, MapReduce) + Neo4j + Python

JavaScript
15
3 years ago

I installed Hadoop on a virtual machine, and all assignments were performed on Ubuntu. Refer to this repo when completing the Hadoop assignments. A stable internet connection is recommended while working through them.

Rebol
13
2 years ago