- Google Images Download: Python script for downloading images.
- flickr_download: Simple script to download a Flickr set.
- FicSave: Online fanfiction downloader. Source code is available; the website itself is now offline.
- FanFicFare: Tool for making eBooks from stories on fanfiction and other web sites.
- Discord-Channel-Scraper: Discord server archival (JSON output; downloads attachments and emoji).
- gallery-dl: Download image galleries and collections from pixiv, exhentai, danbooru, and more.
- floatplane_ripper: Script to rip all videos from Floatplane.
- comics-downloader: Command-line tool to download comics and manga in pdf/epub/cbz/cbr from supported sites.
- ChanThreadWatch: Saves threads from *chan-style boards and checks for updates until the thread dies.
- BBCSoundDownloader: Bulk downloader for BBC's Sound Effects library.
- ytdl-sub: Automate downloading and metadata generation with YouTubeDL.
- Youtube-DL: A command-line program to download videos from YouTube and a few hundred more sites.
- you-get: Dumb downloader that scrapes the web.
- wpull: Wget-compatible web downloader and crawler.
- wget2: Successor of GNU Wget; works multi-threaded.
- wget: Utility for non-interactive download of files from the Web.
- Suck-It: Recursively visits and downloads a website's content to your disk (multi-threaded).
- rsync: An open-source utility that provides fast incremental file transfer.
- Rclone: A command-line program to sync files and directories to and from various cloud storage providers.
- Plowshare: Command-line tool to manage file-sharing sites.
- news-crawl: Crawler for news feeds, based on StormCrawler, that produces WARC files.
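The recursive downloaders above (wget's mirror mode, wpull, Suck-It) all share one core step: extracting the links from each fetched HTML page and resolving them against the page's URL before queueing them. A minimal standard-library sketch of just that step follows; the `extract_links` helper and `LinkExtractor` class are illustrative names of my own, not part of any listed tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags in an HTML page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL,
                    # as a recursive crawler must before queueing them.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list[str]:
    """Return all absolute link targets found in `html`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

A real crawler adds on top of this a fetch loop, a visited-set, and scope rules (e.g. wget's `--no-parent`), but the link-resolution core is the same.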
- httpie: A tool similar to curl and wget but designed to be user-friendly; useful for web scraping with shell scripts, though be aware that you are adding a dependency by doing so.
- Horahora: Video hosting website and video archival manager for Niconico, Bilibili, and YouTube.
- curl: Tool and library for transferring data with URL syntax, supporting many protocols.
- CrowLeer: Powerful C++ web crawler based on libcurl.
- aria2: A lightweight multi-protocol & multi-source command-line download utility.
- annie: YouTube-DL alternative written in Golang.
- wikiteam: Set of tools for archiving wikis.
- webrecorder: An integrated platform for creating high-fidelity, ISO-compliant web archives in a user-friendly interface, providing access to archived content, and sharing collections.
- wail: Web Archiving Integration Layer: one-click, user-instigated preservation.
- HTTrack: Download a website from the Internet to a local directory.
- Heritrix: Extensible, web-scale, archival-quality web crawler.
- grab-site: The archivist's web crawler: WARC output, a dashboard for all crawls, and dynamic ignore patterns.
- Collect: A server to collect & archive websites that also supports video downloads.
- Browsertrix Crawler: A simplified (Chrome) browser-based high-fidelity crawling system, designed to run a complex, customizable browser-based crawl in a single Docker container.
- ArchiveBox: The open-source self-hosted web archive. Takes browser history/bookmarks/Pocket/Pinboard/etc. and saves HTML, JS, PDFs, media, and more.

Note: This is only a first draft/brainstorm; I will try to organize the list into more useful sections in the future.
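Several of the tools above (grab-site, webrecorder, news-crawl, Heritrix) store captures in the WARC format: each record is a block of named headers, a blank line, the payload, and two trailing CRLFs. The simplified sketch below shows that layout with a round-trip writer and parser; real archives should be written with a proper library such as warcio, and the function names here (`make_warc_record`, `parse_warc_record`) are my own, not an API from any listed tool.

```python
import uuid
from datetime import datetime, timezone


def make_warc_record(target_uri: str, payload: bytes) -> bytes:
    """Build one simplified WARC/1.0 'resource' record:
    header block, blank line, payload, two trailing CRLFs."""
    headers = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"WARC-Target-URI: {target_uri}\r\n"
        f"WARC-Date: {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}\r\n"
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return headers.encode("utf-8") + payload + b"\r\n\r\n"


def parse_warc_record(record: bytes):
    """Split a record back into (header dict, payload bytes)."""
    head, _, rest = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    # lines[0] is the "WARC/1.0" version line; the rest are "Name: value".
    fields = dict(line.split(": ", 1) for line in lines[1:])
    return fields, rest[: int(fields["Content-Length"])]
```

Because `Content-Length` delimits the payload, records can simply be concatenated (and usually gzip-compressed per record) into one `.warc.gz` file, which is what makes the format convenient for large crawls.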