map-reduce
Fast, efficient, and scalable distributed map/reduce system with DAG execution, in memory or on disk, written in pure Go; runs standalone or distributed.
A search engine which can hold 100 trillion lines of log data.
Kubernetes-native platform to run massively parallel data/streaming jobs
Efficient transducers for Julia
Fundamentals of Spark with Python (using PySpark), code examples
Parallelized Base functions
Data science and Big Data with Python
Fast & furious GroupBy operations for dask.array
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and highly computational problems.
Data-parallelism on CUDA using Transducers.jl and for loops (FLoops.jl)
The core parallel and shared memory library used by Hack, Flow, and Pyre
RedisGears Python client
A lightweight modern map reduce framework brought to k8s
Python 2.7 code and learning notes for Spark 2.1.1
Inverted indexer, web crawler, sort, search, and Porter stemmer written in Python for information retrieval.
App Engine Datastore mapper in Go