headless-chrome

A high-level browser automation library.

JavaScript
19598
1 year ago
apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

TypeScript
17481
1 day ago

🖥 Chrome automation made simple. Runs locally or headless on AWS Lambda.

TypeScript
13231
6 years ago

Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.

HTML
7053
1 year ago

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

Python
5532
2 days ago
JavaScript
3613
1 month ago

Headless chrome/chromium automation library (unofficial port of puppeteer)

Python
3573
4 years ago

Puppeteer Pool, run a cluster of instances in parallel

TypeScript
3386
1 year ago

Puppeteer example scripts for running Headless Chrome from Node.

JavaScript
3053
5 years ago

🤖 A Node queue API for generating PDFs using headless Chrome. Comes with a CLI, S3 storage, and webhooks for notifying subscribers about generated PDFs.

JavaScript
2625
1 year ago
playwright-community/playwright-go

Playwright for Go, a browser automation library to control Chromium, Firefox, and WebKit with a single API.

Go
2602
10 days ago
JavaScript
2260
9 days ago

Run Lighthouse in CI, as a web service, using Docker. Pass/Fail GH pull requests.

JavaScript
2231
5 years ago