#training-data

diffgram/diffgram

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

Python
1863
5 months ago

skweak: A software toolkit for weak supervision applied to NLP tasks

Python
922
8 months ago

A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.

Python
505
20 days ago

Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.

Python
468
10 months ago

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

Python
407
20 days ago

A lightweight web application for brushing labels onto time series data; useful for building training sets.

JavaScript
173
2 years ago

Augmenty is an augmentation library based on spaCy for augmenting texts.

Python
153
1 year ago

Aubo i5 Dual Arm Collaborative Robot - RealSense D435 - 3D Object Pose Estimation - ROS

C++
120
3 years ago

Natural Language Data Augmentation Tool for Conversational Systems

Python
115
2 years ago

Generating training data from the Carla driving simulator in the KITTI dataset format

Python
110
6 years ago

Collection of casual conversations that can be used with the Rasa Stack

Python
85
5 years ago

SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.

48
1 year ago

COVID-19 Coughs files for training AI models

Python
41
5 years ago

Convert all files in git repository to .txt files. Useful for training LLMs on your codebase.

Python
38
4 months ago