ai-scraping
The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
Python scraper based on AI
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
Lightweight library for scraping websites with LLMs
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
Oxylabs AI Studio Python SDK
Crawl a website starting from a URL, find relevant pages, and extract data – all guided by your natural language prompt.
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.
AI web scraper built with Crawl4AI for extracting structured lead data from websites.
How-to guides on web crawling or scraping
Python, JavaScript, and Rust libraries for the Spider Cloud API.
All scraper resources available here! Give us stars 🌟
Fastest and cheapest distributed residential proxy network.
Extract Google Maps business leads and enrich contact details using AI & web scraping
ScrapeGraphAI is a Python-based web-scraping framework that pairs large-language-model reasoning with a graph-style pipeline engine to turn websites (or local XML/HTML/JSON/Markdown files) into structured data with just a handful of lines of code.
AI Scraper: scrape and extract data from websites in any format (CSV, JSON, HTML...) using Selenium or Crawl4AI, with the Ollama or SambaNova API and a Streamlit chatbot UI
A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications