map-reduce

Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.

distributed-computing map-reduce Go distributed-systems

3538

292

1 month ago

numaproj / numaflow

Kubernetes-native platform to run massively parallel data/streaming jobs

Kubernetes stream-processing data-processing pipeline map-reduce Hacktoberfest

Rust

2244

141

2 days ago

Qihoo360 / poseidon

A search engine which can hold 100 trillion lines of log data.

poseidon search-engine Go big-data map-reduce

1987

434

8 years ago

JuliaFolds / Transducers.jl

Efficient transducers for Julia

Julia transducers parallel high-performance map-reduce distributed-computing iterators

Julia

440

9 days ago

tirthajyoti / Spark-with-Python

Fundamentals of Spark with Python (using PySpark), code examples

pyspark Apache Spark dataframe machine-learning big-data database map-reduce Python hdfs analytics hadoop distributed-computing parallel-computing SQL apache

Jupyter Notebook

350

273

3 years ago

tkf / ThreadsX.jl

Parallelized Base functions

Julia high-performance map-reduce transducers sorting-algorithms parallel

Julia

330

9 days ago

phelps-sg / python-bigdata

Data science and Big Data with Python

data-science Python hbase NumPy numerical-methods notebook-jupyter Apache Spark map-reduce

Jupyter Notebook

135

164

2 years ago

xarray-contrib / flox

Fast & furious GroupBy operations for dask.array

dask xarray map-reduce

Python

133

5 days ago

asavinov / prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

workflow data-processing map-reduce Apache Spark pandas Python feature-engineering data-science data-wrangling data-preprocessing data-preparation business-intelligence olap

Python

4 years ago