article-extractor

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python
4763
23 days ago

extractus/article-extractor

Extracts the main article from a given URL with Node.js

JavaScript
1748
1 month ago

Readability / HTML Content / Article Extractor & Web Scraping library written in PHP

PHP
462
2 years ago

SmartReader is a library to extract the main content of a web page, based on a port of the Readability library by Mozilla

C#
173
22 days ago

An article extractor in Rust

Rust
133
4 years ago

Reddit bot to preview and post hyperlinks as comments

Python
102
3 years ago

The best HTML-to-Markdown library: ESM-native, with useful utilities, and simple, lightweight, epic quality.

TypeScript
74
6 months ago

Extract an article or news story by URL or HTML, parse the title and content, and output in Markdown format.

Python
50
1 year ago

This is a small and easy-to-use desktop application that allows exporting Web of Science API Expanded and InCites API data in Excel/CSV/JSON/XML with a configurable and flexible data export structure.

Vue
40
7 months ago

This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.

33
3 years ago

Involution King Fun Book (IKFB, Chinese: 快卷, 卷王快乐本) is an integrated management system for papers and literature. Powered by Electron.

Vue
32
3 years ago

[Spring Boot in Practice] Quickly build your own technical article blog in 10 minutes

Kotlin
30
7 years ago

A client-side post search tool for DCInside.

Python
20
2 years ago

📚 A collection of useful Natural Language Processing utilities: text language detection, sentence splitting, and extraction of the main content from an HTML document

Python
20
3 years ago