massive-datasets

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

Java
1586
4 months ago

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

Python
219
3 days ago

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

Makefile
79
5 months ago

Command line tool to quickly generate a lot of files in a lot of directories

C++
6
3 years ago

Building a Bloom Filter on English dictionary words

Jupyter Notebook
4
4 years ago

The project is based on the analysis of the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model which predicts whether or not a transaction is illicit, using the attribute "Is Laundering" as a label to be predicted.

Jupyter Notebook
3
8 months ago

This repository contains a LaTeX file that generates a PDF document comprising comprehensive notes for the course "Algorithms for Massive Datasets"

TeX
2
8 months ago

gipa -- compression/decompression tool to package, compress, and encode massive archive files with floating-point data

Python
2
8 years ago

Building the PageRank algorithm on the web graph around Stanford.edu using the NetworkX Python library

Jupyter Notebook
2
4 years ago

TF-Package: Multiple-Input Multiple-Output Keras Data-Generator for massive and complex datasets

Python
1
2 years ago

Lets you open and manipulate massive text/data files that would otherwise be impossible to open on a computer, for example a text file of 20+ GB. It lets you work with the file by fetching only the lines you need, without freezing the computer by running out of memory.

Python
1
3 years ago

A series of SQL exercises working with databases, using Google BigQuery to scale to massive datasets, taught by educators on Kaggle.com

Jupyter Notebook
1
6 years ago

📺 Content Recommendation System for the Netflix Prize Challenge with Collaborative Filtering.

Jupyter Notebook
1
1 year ago

Calculate statistical measures of one column in big-data datasets with this simple Hadoop application

Java
1
8 years ago

Training on the MASSIVE dataset by Amazon (English-US, German-DE and Swahili-KE)

Python
0
2 years ago

MapReduce program to suggest new friends based on the count of mutual friends

Java
0
7 years ago

Lab assignments for the Analysis of Massive Data Sets course @ FER, University of Zagreb

C#
0
7 years ago