streaming-algorithms

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.

Java
631
7 months ago
C++
195
4 years ago

Performant implementations of various streaming algorithms, including Count–min sketch, Top k, HyperLogLog, Reservoir sampling.

Rust
86
8 months ago

Estimating k-mer coverage histogram of genomics data

C++
78
1 year ago

Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering

C++
74
3 years ago

t-digest module for Redis

C
73
4 years ago

Streaming operations on NumPy arrays

Python
36
3 months ago

A Set of Streaming Algorithms in C++, Python, and Go

C++
33
8 years ago

An online statistics library, written in Go

Go
33
1 month ago

RiverText is a framework that standardizes the incremental word embedding methods proposed in the state of the art. Please feel welcome to open an issue if you have any questions, or a pull request if you want to contribute to the project!

Python
22
2 months ago

Streaming, Memory-Limited, r-truncated SVD Revisited!

MATLAB
21
3 years ago

This is the codebase for Faucet, described in our manuscript: https://academic.oup.com/bioinformatics/article/34/1/147/4004871, by Roye Rozov, Gil Goldshlager, Eran Halperin, and Ron Shamir

C++
18
8 years ago

Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)

R
16
8 months ago

A simple, time-tested family of random hash functions in Python, based on CRC32 and xxHash, affine transformations, and the Mersenne Twister. 🎲

Python
9
3 years ago

Create MPEG2-TS encapsulated stream-segments.

C
8
8 years ago

Python wrapper for Francesco Parrella's OnlineSVR C++ implementation, with a scikit-learn-compatible interface.

C++
6
7 months ago