dataset
A collective list of free APIs
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
Faker is a Python package that generates fake data for you.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
An MNIST-like fashion product database, with benchmarks.
Open source annotation tool for machine learning practitioners.
Large Scale Chinese Corpus for NLP
Techniques for deep learning with satellite & aerial imagery
Curated list of Machine Learning, NLP, Vision, Recommender Systems Project Ideas
PHP-ML - Machine Learning library for PHP
Documentation on how to access and use the Quick, Draw! Dataset.
A powerful tool for creating fine-tuning datasets for LLMs
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
This repository contains compatibility data for Web technologies as displayed on MDN
esProc SPL is a JVM-based programming language designed for structured data computation, serving as both a data analysis tool and an embedded computing engine.
TFDS is a collection of datasets ready to use with TensorFlow, JAX, ...
SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard