streaming-algorithms

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.

Java
651
4 months ago
C++
198
4 years ago

Performant implementations of various streaming algorithms, including Count–min sketch, Top k, HyperLogLog, and reservoir sampling.

Rust
88
1 year ago

Estimating k-mer coverage histogram of genomics data

C++
77
2 years ago

Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering

C++
76
4 years ago

t-digest module for Redis

C
73
5 years ago

Streaming operations on NumPy arrays

Python
37
7 months ago

An online statistics library, written in Go

Go
34
5 months ago

A Set of Streaming Algorithms in C++, Python, and Go

C++
33
8 years ago

RiverText is a framework that standardizes the incremental word embedding methods proposed in the state of the art. Feel free to open an issue if you have any questions, or a pull request if you want to contribute to the project!

Python
22
6 months ago

Streaming, Memory-Limited, r-truncated SVD Revisited!

MATLAB
21
4 years ago

This is the codebase for Faucet, described in our manuscript: https://academic.oup.com/bioinformatics/article/34/1/147/4004871, by Roye Rozov, Gil Goldshlager, Eran Halperin, and Ron Shamir

C++
18
8 years ago

Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)

R
16
1 year ago

A simple, time-tested family of random hash functions in Python, based on CRC32 and xxHash, affine transformations, and the Mersenne Twister. 🎲

Python
9
3 years ago

Create MPEG2-TS encapsulated stream-segments.

C
8
8 years ago

Preview large text files online

HTML
6
3 years ago