web-archiving

ArchiveBox/ArchiveBox

🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

Python
24794
3 months ago

Core Python Web Archiving Toolkit for replay and recording of web archives

JavaScript
1543
13 days ago

Rhizome-Conifer/conifer

Collect and revisit web pages.

Python
1513
7 months ago

A High-Fidelity Web Archiving Extension for Chrome and Chromium based browsers!

TypeScript
1050
25 days ago

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)

JavaScript
936
3 months ago

Automatically archive links to videos, images, and social media content from Google Sheets (and more).

Python
865
9 days ago

Free web archiving and sharing service based on Cloudflare.

TypeScript
864
2 months ago

Run a high-fidelity browser-based web archiving crawler in a single Docker container

TypeScript
852
19 days ago

Serverless replay of web archives directly in the browser

TypeScript
826
1 month ago

InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS

Python
642
3 months ago

Wayback Machine API interface & a command-line tool

Python
544
1 year ago

Indelible links

JavaScript
476
8 days ago

Webrecorder Player for Desktop (OSX/Windows/Linux), built with Electron and Webrecorder.

JavaScript
448
5 years ago

JavaScript
445
6 years ago

Streaming WARC/ARC library for fast web archive IO

Python
428
8 months ago

A Tool To Push Web Resources Into Web Archives

Python
420
2 years ago

WarcDB: Web crawl data as SQLite databases.

Python
404
1 year ago

🐋 Web Archiving Integration Layer: One-Click User Instigated Preservation

Roff
377
5 months ago

Official ArchiveBox browser extension: automatically/manually preserve your browsing history using ArchiveBox.

JavaScript
346
4 months ago

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!

TypeScript
314
1 day ago