sparksql

Scala
2164
2 days ago

Scala examples for learning to use Spark

Scala
445
5 years ago

Process Common Crawl data with Python and Spark

Python
442
3 months ago

An ad hoc query service based on the Spark SQL engine.

JavaScript
382
2 years ago

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy-to-use experience to help with the creation, editing, and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

C#
304
5 months ago

Sparkline BI Accelerator provides fast ad hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Scala
282
7 years ago

Jupyter Notebook
251
1 year ago

A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype

Java
199
5 years ago

Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs... ...

Java
114
8 years ago

PySpark functions and utilities with examples. Assists the ETL process of data modeling

Jupyter Notebook
104
5 years ago

Read SparkSQL parquet file as RDD[Protobuf]

Scala
93
7 years ago

A JupyterLab extension providing SQL formatting, auto-completion, and syntax highlighting for Spark SQL and Trino

Jupyter Notebook
90
2 months ago

Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL, and Azure Databricks, with visualization using ggplot2 and leaflet. Focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.

R
86
4 years ago

Type-class-based data cleansing library for Apache Spark SQL

Scala
78
6 years ago

Bulletproof Apache Spark jobs with fast root cause analysis of failures.

Scala
73
4 years ago

Merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.

Scala
57
4 years ago