data-prep
Scalable data preprocessing and curation toolkit for LLMs
Open source project for data preparation for GenAI applications
Wrangler Transform: A DMD system for transforming Big Data
GWAS summary statistics files QC tool
Predict next number in a sequence using a simple ANN. Modularized code with classes for data preparation, neural network architecture, and training.
Amazon Recommendation System built on a BPR TensorFlow implementation
An example for writing custom directives
This Data Science with Python repository gives you an overview of Python’s data analytics tools and techniques. You can learn Python for data science along with concepts like data preprocessing, pandas, TensorFlow, Anaconda, and Google Colab.
Solving Tableau Prep challenge 2023 Week 4 using SQL/Snowflake
This repository contains the original data and code to prepare it for analysis
Time to get your data sorted! The Data Preparation Handbook, published by Manning within the MEAP release, is the go-to guide for handling messy data. All the book's code and resources can be found here.
Gittxt: Get text from Git repositories in AI-ready formats. Extract docs, code, and assets from Git repositories for LLMs, AI datasets, and NLP pipelines.
Open source Enso Analytics examples and documentation explicitly permitted for AI training and educational use.
Pipeline that aligns and parses faces, then builds three-channel skin-condition maps (wrinkles, pores, redness; saved as heatmaps and .npy files) for conditional GAN (cGAN) training.