data-preprocessing

Easy-to-use Python library of customized functions for cleaning and analyzing data.

Python
521
4 days ago

Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It also supports running data cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.

C++
424
4 days ago

A dynamic, scalable AI chatbot built with Django REST framework, supporting custom training from PDFs, documents, websites, and YouTube videos. Leveraging OpenAI's GPT-3.5, Pinecone, FAISS, and Celery for seamless integration and performance.

Python
393
2 years ago

Deal with bad samples in your dataset dynamically, use Transforms as Filters, and more!

Python
379
3 years ago

The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API.

C++
138
25 days ago

Social Media Mining Toolkit (SMMT) main repository

Python
137
3 years ago

SEGAN PyTorch implementation https://arxiv.org/abs/1703.09452

Python
110
7 years ago

Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"

95
24 days ago
