ingestion-pipeline

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

Python
2940
17 hours ago

Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition) and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database

Python
268
3 years ago

AzLogDcrIngestPS - Unleashing the power of Log Ingestion API with Azure LogAnalytics custom table v2, Azure Data Collection Rules and Azure Data Ingestion Pipeline

PowerShell
32
3 months ago

Google Cloud Storage connector, pre-processor and model for predicting user search intent based on keywords

Python
25
5 years ago

Google Analytics connector, pre-processor and model for predicting churning users for digital publishers.

Python
10
6 years ago

My experiments with Apache Spark for Humans ⭐

Java
6
2 years ago

Extract, transform and load unstructured data into Clarifai's AI platform

Python
6
18 days ago

DataStax or Cassandra Ingest from Relational Databases with StreamSets

PLSQL
4
6 years ago

Sample Azure Data Factory pipeline for ingesting Data Packages directly from the Download API of the Ordnance Survey Data Hub into Azure Storage.

4
3 years ago

Data stack for WeLearn LPI projects. This pipeline can collect, vectorize and store data from various sources.

HTML
2
21 hours ago

Created a data pipeline using Sqoop to ingest data from SQL Server into a Hive table, and used Hive for feature engineering and analysis.

Shell
2
5 years ago

The multinational retail data centralisation project is a data warehousing project that focuses on ingesting data from disparate sources to create a centralised warehouse.

Python
1
2 years ago

Explores the design and implementation of a modern, adaptable data infrastructure using microservices.

1
1 month ago

A question answering (Q/A) chatbot for insurance documents. Powered by Retrieval Augmented Generation (RAG), LlamaIndex and LangGraph. Inspired by my Upgrad_IIITB PG Course.

Jupyter Notebook
1
4 months ago

Automating the ingestion of Excel files into Azure Data Studio (SQL Server)

Jupyter Notebook
1
3 years ago

A question answering (Q/A) chatbot for insurance documents. Powered by Retrieval Augmented Generation (RAG), LlamaIndex and LangChain. Inspired by my Upgrad_IIITB PG Course.

Jupyter Notebook
1
4 months ago

Transform an incoming AWS WorkMail email with an Excel attachment to CSV and save it to an S3 bucket

Python
0
2 years ago