hadoop-mapreduce

mahmoudparsian / data-algorithms-book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

hadoop-mapreduce Java distributed-computing Scala mapreduce Python machine-learning pyspark Apache Spark design-patterns

Java

1073

663

6 months ago

bytedance / CloudShuffleService

Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark, Flink, and MapReduce.

flink Apache Spark hadoop-mapreduce

Java

255

1 year ago

touero / ctenopharyngodon-idella

Use MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and learn the basics of HDFS.

FastAPI hadoop hadoop-mapreduce Java mapreduce Maven scraping

Java

140

6 months ago

groda / big_data

Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are self-contained and live—ready to run with a click.

big-data bigdata Apache Spark spark-sql Docker mapreduce pyspark hadoop Jupyter Notebook hadoop-hdfs hadoop-mapreduce

Jupyter Notebook

4 months ago

vim89 / datapipelines-essentials-python

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.

Apache Spark spark-sql Python pyspark etl etl-pipeline etl-framework XML xml-parsing datalake big-data hadoop hadoop-mapreduce hadoop-hdfs data-pipeline

Python

2 years ago

maniram-yadav / Big_DataHadoop_Projects

Big data projects implemented by Maniram Yadav

Apache Spark pig hadoop hdfs sqoop hive mapreduce big-data-analytics hadoop-mapreduce hadoop-hdfs flume

PigLatin

7 years ago

seraogianluca / k-means-mapreduce

K-Means algorithm implementation with Hadoop and Spark for the Cloud Computing course of the MSc AIDE at the University of Pisa.

machine-learning hadoop-mapreduce Apache Spark hadoop

Java

5 years ago

caizkun / mapreduce-examples

A collection of MapReduce problems and solutions

mapreduce hadoop-mapreduce

Java

8 years ago

anjalysam / Hadoop

Shows how to install Hadoop on Google Colab and how to run MapReduce jobs in Hadoop.

hadoop hadoop-mapreduce

Jupyter Notebook

5 years ago

absnaik810 / CloudComputing

Projects done in the Cloud Computing course.

hadoop hadoop-mapreduce hbase inverted-index NoSQL hdfs

Java

7 years ago

jmaister / wordcount

Hadoop MapReduce word counting with Java

hadoop-mapreduce Java Maven

Java

5 years ago

jyzhangchn / FBDP-project2

Chinese text mining | Public opinion analysis | Hadoop | Java | MapReduce

Java knn naive-bayes hadoop-mapreduce

HTML

7 years ago

arshdeepbahga / cloud-computing-solutions-architect-book-code

Source code for the examples in the book Cloud Computing Solutions Architect: A Hands-On Approach by Arshdeep Bahga and Vijay Madisetti

cloud-computing Amazon Web Services aws-lambda aws-s3 aws-dynamodb boto3 aws-ec2 aws-iot aws-sqs aws-apigateway aws-iam serverless-architectures hadoop-mapreduce Apache Spark storm flink spark-streaming MongoDB

CSS

6 years ago

benedekh / bigdata-projects

Student projects in the Big Data field.

bigdata big-data Apache Spark hadoop hadoop-mapreduce mapreduce

Java

2 months ago

MoustafaAMahmoud / BigDataInDepth

Data Engineering Course

hadoop hadoop-mapreduce Apache Spark distributed-systems Scala kafka

TeX

1 year ago

QiushiSun / Distributed-Computing-Systems

2021 Spring (Distributed Computing Systems): Distributed Systems and Programming

distributed-systems distributed-computing Apache Spark hadoop-mapreduce flink

Java

4 years ago

lucas91batista / twitter-hashtag-graph

Twitter + Flume + Hadoop (HDFS, MapReduce) + Neo4j + Python

Twitter hadoop hadoop-mapreduce hadoop-hdfs Neo4j

JavaScript

3 years ago

Keerthivasan13 / CSCI572-Information_Retrieval_And_Web_Search_Engines

Search Engine projects

information-retrieval scraping-websites crawling pagerank-algorithm hadoop-mapreduce hadoop apache solr lucene tika jsoup networkx search-engine autocomplete spellchecker PHP

Java

5 years ago

rajatgarg149 / BigData-Essentials-HDFS-SPARK-RDD

big-data Coursera Apache Spark mapreduce hadoop hadoop-mapreduce distributed-file-system

Jupyter Notebook

6 years ago

James-QiuHaoran / distributed-computing-platform-mapreduce

This repository contains a simple Hadoop-like (MapReduce) distributed computing platform implemented in Java. It extends a course project at UIUC that was awarded best Java implementation, and it is open-sourced for reference.

mapreduce hadoop hadoop-mapreduce distributed-computing distributed-file-system membership-management distributed-systems cloud-computing

Java

4 years ago