#data-processing

onceupon/Bash-Oneliner

A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.

10565
6 months ago

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Go
7627
13 days ago
NVIDIA/DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++
5523
1 day ago

A lightweight data processing framework built on DuckDB and 3FS.

Python
4787
7 months ago
dashbitco/broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

Elixir
2577
3 days ago

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python
2389
4 years ago

Kubernetes-native platform to run massively parallel data/streaming jobs

Rust
2274
19 hours ago

Extract Transform Load for Python 3.5+

Python
1602
2 years ago

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

Jupyter Notebook
1396
6 months ago

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python
1364
4 days ago
allenai/dolma

Data and tools for generating and inspecting OLMo pre-training data.

Python
1321
10 days ago