
data-processing

onceupon/Bash-Oneliner

A collection of handy Bash One-Liners and terminal tricks for data processing and Linux system maintenance.

10525
5 months ago

Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.

Go
7529
18 hours ago
NVIDIA/DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++
5484
11 hours ago

A lightweight data processing framework built on DuckDB and 3FS.

Python
4762
6 months ago
dashbitco/broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

Elixir
2570
21 days ago

Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

Python
2388
4 years ago

Kubernetes-native platform to run massively parallel data/streaming jobs

Rust
2244
2 days ago

Extract Transform Load for Python 3.5+

Python
1600
2 years ago

Source code accompanying the book Data Science on the Google Cloud Platform, by Valliappa Lakshmanan (O'Reilly, 2017)

Jupyter Notebook
1391
5 months ago
allenai/dolma

Data and tools for generating and inspecting OLMo pre-training data.

Python
1298
1 day ago

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python
1144
2 days ago