ai-scraping

The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥

TypeScript
49397
8 hours ago
D4Vinci/Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

Python
6453
13 minutes ago

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.

TypeScript
1956
19 hours ago

A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama

Python
1750
9 days ago
Python
1208
15 hours ago

🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.

Jupyter Notebook
507
3 months ago

⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...

JavaScript
80
8 months ago

[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.

Go
73
5 days ago

AI web scraper built with Crawl4AI for extracting structured leads data from websites.

Python
41
6 months ago

Python, JavaScript, and Rust libraries for the Spider Cloud API.

Python
18
17 days ago

All Scrapers Resource Available Here! Give Us Stars🌟

TypeScript
14
1 month ago

Fastest and cheapest distributed residential proxy network.

TypeScript
9
2 days ago

ScrapeGraphAI is a Python-based web-scraping framework that pairs large-language-model reasoning with a graph-style pipeline engine to turn websites (or local XML/HTML/JSON/Markdown files) into structured data with just a handful of lines of code.

Python
4
2 months ago

AI Scraper: scrape and extract data from websites in any format (CSV, JSON, HTML...) using Selenium or Crawl4AI, with the Ollama or SambaNova API and a Streamlit chatbot UI.

Python
3
3 months ago

A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications.

Go
3
7 months ago

Extract Google Maps business leads and enrich contact details using AI and web scraping.

Python
2
2 months ago

AI-powered web scraper using JavaScript/TypeScript.

TypeScript
2
2 months ago

Use LLaMA 3 and Python to extract structured data from websites like Amazon, leveraging LLM-powered parsing for resilient, AI-driven web scraping.

2
4 months ago