training-data

diffgram/diffgram

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

Python
1877
9 months ago

skweak: A software toolkit for weak supervision applied to NLP tasks

Python
926
1 year ago

A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.

Python
508
5 months ago

Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.

Python
481
9 hours ago

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

Python
451
1 month ago

A lightweight web application for brushing labels onto time series data; useful for building training sets.

JavaScript
176
2 years ago

Augmenty is an augmentation library based on spaCy for augmenting texts.

Python
156
1 year ago

Aubo i5 Dual Arm Collaborative Robot - RealSense D435 - 3D Object Pose Estimation - ROS

C++
123
3 years ago

Natural Language Data Augmentation Tool for Conversational Systems

Python
115
3 years ago

Generating training data from the Carla driving simulator in the KITTI dataset format

Python
109
6 years ago

Collection of casual conversations that can be used with the Rasa Stack

Python
85
5 years ago

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.

49
2 years ago

Convert all files in a Git repository to .txt files. Useful for training LLMs on your codebase.

Python
42
8 months ago

COVID-19 cough files for training AI models

Python
40
5 years ago