webscraping

The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥

AI crawler Markdown scraper html-to-markdown LLM scraping web-crawler ai-scraping webscraping web-scraping web-data web-data-extraction ai-agents data-extraction ai-crawler ai-search web-scraper web-search

TypeScript

61342

4964

6 hours ago

huginn / huginn

Create agents that monitor and act on your behalf. Your agents are standing by!

automation notifications scraper webscraping feedgenerator RSS agent monitoring feed twitter-streaming huginn X (Twitter)

Ruby

47566

4116

15 hours ago

assafelovic / gpt-researcher

LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.

AI Python agent automation research search webscraping LLM deepresearch mcp mcp-server

Python

23695

3132

4 days ago

getmaxun / maxun

⚡ Easiest no code web data extraction platform • Instantly turn any website into API or spreadsheet ⚡

automation no-code scraper web-automation web-scraper web-scraping API browser browser-automation Playwright self-hosted robotic-process-automation rpa no-code-web-scraper agents data-extraction webscraping Hacktoberfest hacktoberfest-accepted

TypeScript

13668

1108

1 day ago

pystardust / ani-cli

A cli tool to browse and play anime

Shell CLI Anime posix steamdeck Termux webscraping fzf Linux macOS rofi terminal Windows

Shell

9927

642

20 days ago

D4Vinci / Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

crawler crawling crawling-python Playwright Python scraping selectors stealth-game web-scraper web-scraping web-scraping-python webscraping xpath automation AI ai-scraping data data-extraction mcp mcp-server

Python

7418

417

3 days ago

lorien / awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing.

web-scraping captcha-recaptcha crawling crawling-python scraping scraping-framework scraping-python scraping-tool webscraping crawler spider

Makefile

7355

825

9 months ago

alirezamika / autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

scraping scraper scrape webscraping crawler web-scraping AI Python webautomation automation machine-learning

Python

6982

711

4 months ago

niespodd / browser-fingerprinting

Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

Bot detection Chromium stealth-game Puppeteer scraper webscraping Web automation chromium-browser bot-detection chromedriver fingerprinting crawler recaptcha spider browser-fingerprinting

JavaScript

4878

263

1 year ago

jaypyles / Scraperr

Self-hosted webscraper.

Open Source self-hosted webscraper Docker helm Kubernetes Playwright Python scraping web-scraper web-scrapers web-scraping webscraping

TypeScript

4279

203

3 months ago

daijro / camoufox

🦊 Anti-detect browser

antidetect antidetect-browser fingerprint Firefox Playwright webscraping Network scraping

C++

3497

297

7 months ago

scrapoxy / scrapoxy

Scrapoxy is a super proxies manager that orchestrates all your proxies into one place, rather than spreading management across multiple scrapers. It manages IP rotation and fingerprinting, and smartly routes traffic to avoid bans.

antibot proxies webscraping

TypeScript

2369

260

2 months ago