webscraper

Web Scraper in Go, similar to BeautifulSoup

Go
2195
1 year ago

Self-hosted webscraper.

TypeScript
1412
5 months ago

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Pascal
722
2 months ago

A class that uses scraped proxies to make HTTP GET/POST requests (Python requests)

Python
390
4 years ago

An AI assistant tool that integrates coding, writing, and reading functions. For better alternatives see https://monica.im/desktop

TypeScript
313
2 years ago

Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object

Python
264
1 year ago

Lego AI Parser is an open-source application that uses OpenAI to parse the visible text of HTML elements.

Python
233
10 months ago

📚 This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an LLM-friendly input with a simple prefix http://127.0.0.1:3000/https://website-to-scrape.com/

TypeScript
186
6 months ago

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for .NET Core with Entity Framework Core output. It is designed like other robust crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

C#
176
2 years ago

RSS feed builder created with Bun🥖 and Hono🔥 that builds feeds from webpages, email folders, and REST API calls.

TypeScript
163
3 days ago

SQL-based DSL web scraper/screen scraper

C#
154
4 years ago

Financial Web Scraper & Sentiment Classifier

Python
152
5 years ago

A Python command-line tool for scraping and downloading subtitles from AppleTV and iTunes movie pages.

Python
146
1 month ago

Spotify scraper that extracts all available information from Spotify and downloads MP3s with the song's cover art

Python
145
10 months ago

A PHP crawler that finds email addresses on the internet

PHP
135
4 years ago