hadoop-hdfs

seaweedfs/seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, xDC replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding. Enterprise version is at seaweedfs.com.

Go
25541
2 hours ago

MorphL Community Edition uses big data and machine learning to predict user behaviors in digital products and services with the end goal of increasing KPIs (click-through rates, conversion rates, etc.) through personalization

Python
263
6 years ago

A tool for scale and performance testing of HDFS with a specific focus on the NameNode.

Java
132
2 years ago

Big Data essentials: Hadoop, MapReduce, Spark. Explore tutorials and demos in Jupyter notebooks—most are self-contained and live, ready to run with a click.

Jupyter Notebook
81
23 days ago

Learn how to use Spark SQL and the HSpark connector package to create and query data tables that reside in HBase region servers

69
3 years ago

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Python
56
2 years ago

Big data analysis of a travel website (partial Ctrip data): Hadoop course project (undergraduate coursework level)

Java
37
1 year ago

A fully functional Hadoop YARN cluster as a docker-compose deployment.

Shell
23
3 days ago

Ansible playbook for setting up Hadoop HDFS

Jinja
23
3 years ago

Open source data infrastructure platform. Designed for developers, built for speed.

TypeScript
22
3 years ago

Twitter + Flume + Hadoop (HDFS, MapReduce) + Neo4j + Python

JavaScript
15
3 years ago

Marathon on YARN

Java
13
2 years ago

A Java HDFS client example and a full Kerberos example for calling Hadoop commands directly from Java code or on your local machine.

Java
12
8 years ago

λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)

Java
11
5 months ago