website-scraper

website-scraper / node-website-scraper

Download website to local directory (including all css, images, js, etc.)

JavaScript website-scraper scraper Node.js Hacktoberfest

JavaScript

1638

288

8 days ago

goclone-dev / goclone

Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.

Go cloning crawler website-scraper

1616

327

5 hours ago

z0m31en7 / Uscrapper

Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.

OSINT Python web-scraping website-scraper information-extraction information-gathering osint-python osint-tool reconnaissance webscraping websites Selenium webcrawler darkweb tor

Python

708

9 months ago

josephlimtech / linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Puppeteer Node.js scraper scraping scraping-websites website-scraper JSON linkedin crawler crawling spider Express linkedin-scraper linkedin-profile

TypeScript

674

171

1 year ago

website-scraper / website-scraper-puppeteer

Plugin for website-scraper which returns html for dynamic websites using puppeteer

Node.js JavaScript scraper website-scraper Puppeteer Chrome Chromium Hacktoberfest

JavaScript

344

2 days ago

Kooboo / Kooboo

CMS, WebSite, Application and Ecommerce Development Tool Using JavaScript

CMS Development JavaScript website-builder kooboo website-development website-scraper magento shopify templates WordPress

337

103

4 months ago

OSINT-TECHNOLOGIES / dpulse

DPULSE - Tool for complex approach to domain OSINT

data-gathering information-gathering Cybersecurity infosectools intelligence intelligence-gathering OSINT osint-tool web-scraping webscraping website-scraper osint-tools pentest pentest-tool pentesting

Python

147

1 month ago

html2rss / html2rss-web

🕸 Generates RSS feeds of any website & serves to the web! Automatic scraping. Ready to use configs. Write your own. Rolling Docker releases for speedy updates.

Ruby Docker scraper RSS feed builder website-scraper rss-feed rolling-release rss-aggregator roda

Ruby

108

2 days ago

erlange / wbm-dl

Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.

internet-archive wayback-machine internet C# website-scraper command-line-tool command-line-app command-line-parser console console-application console-app

3 years ago

xarantolus / Collect

A server to collect & archive websites that also supports video downloads

self-hosted webinterface archive video-downloader website-scraper web-archiving

TypeScript

3 years ago

LexiestLeszek / scrapeGPT

ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.

crawler huggingface large-language-models LLM ollama proxy rag retrieval-augmented-generation robots-txt scraper Telegram website-scraper

Python

2 years ago

MLArtist / WebScraper

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.

crawling-python crawler scraper scraping scrapper website-scraper robots-txt user-agent beautifulsoup beautifulsoup4

Python

2 months ago

shurco / goClone

🌱 goClone - clone websites in seconds

cloning crawler Go scraping website-scraper scrapper Hacktoberfest scraping-websites crawling

8 days ago

CRAKZOR / linkedin-post-automator

Automatically curates and posts content to LinkedIn. It can optionally use web scraping to gather data, which is then fed to ChatGPT to craft engaging LinkedIn posts.

ChatGPT API linkedin linkedin-scraper openai-api website-scraper chatgpt-bot

Python

4 months ago