#pdf-processing

TypeScript
863
2 years ago
Python
785
1 year ago
Python
131
8 months ago

Official Python client library for the Nutrient Document Web Services API - PDF processing, OCR, watermarking, and document manipulation with automatic Office format conversion

Python
54
1 month ago

This library provides a type-safe and ergonomic interface for document processing operations, including conversion, merging, compression, watermarking, and text extraction, using the Nutrient DWS Processor API.

TypeScript
35
1 month ago

Anthropic's Contextual Retrieval implementation with visual chunk comparison. Preview context enrichment before/after embedding.

HTML
24
10 days ago

Python scripts that convert PDF files to text, split them into chunks, and store their vector representations in a Chroma DB using GPT4All embeddings. A separate script queries the Chroma DB for similarity search based on user input.

Python
23
2 years ago

Local, privacy-friendly resume analysis: convert, classify, and get advice using TF‑IDF, Logistic Regression, and sentence-transformer embeddings.

Python
16
11 days ago

LangGraphRAG: A terminal-based Retrieval-Augmented Generation system using LangGraph. Features include message history caching, query transformation, and vector database retrieval. Ideal for NLP researchers and developers working on advanced conversational AI and information retrieval systems.

Python
14
1 year ago

An npm package built on top of pdf-lib that provides functionality such as merging, rotating, and splitting PDFs, downloading them to disk, and many more...

JavaScript
14
2 years ago

A powerful, multi-modal Telegram bot leveraging cutting-edge AI technologies including Gemini, DeepSeek, OpenRouter, and 50+ AI models for comprehensive conversational assistance, media processing, and collaborative features with MCP (Model Context Protocol) integration.

Python
8
7 days ago

📚 AI-Powered Book EPUB Knowledge Extractor & Summarizer. Transform your PDF books into structured knowledge effortlessly! This tool leverages AI to analyze books page by page, extracting key insights, definitions, and concepts, and organizes them into Markdown summaries for easier study.

Python
8
6 days ago

An all-in-one GUI management toolkit built with PyQt6, offering a suite of tools for file synchronization, media organization, PDF merging, code formatting, and more.

Python
6
7 months ago

AI-powered, RAG-based tool for summarizing research papers, extracting insights from them, and answering questions about them with high accuracy.

HTML
6
6 months ago

The Document Summarizer leverages Hugging Face’s facebook/bart-large-cnn model to transform lengthy documents into concise summaries. Built with ReactJS (Vite) for the frontend and Flask for the backend, it supports PDF and text files, offering real-time summarization for researchers, students, and professionals.

JavaScript
4
10 months ago

Local RAG app with zero-config Docker setup. FastAPI + Streamlit + Qdrant + Ollama. Just run `docker-compose up --build`! 🚀

Python
4
2 months ago