Save the main content of a web page as a local Markdown file, optionally downloading its images, for offline reading and long-term note-taking.
- Clip a single URL or multiple URLs
- Batch import URLs from a text file (one URL per line; blank lines and # comments are allowed)
- Extract main content and convert to Markdown (static HTML first)
- Optionally download images and rewrite their links as relative paths
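The batch file format described above (one URL per line, with blank lines and # comments allowed) can be read with a few lines of Python. This is a minimal sketch, not the tool's actual code: the function name `parse_url_lines` is hypothetical, and it assumes a comment is a line whose first non-space character is `#`.

```python
from typing import Iterable, List


def parse_url_lines(lines: Iterable[str]) -> List[str]:
    """Return the URLs from an iterable of lines, skipping blanks and # comments."""
    urls = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and whole-line comments (assumed comment rule).
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls


sample = """\
# reading list
https://example.com/

https://www.python.org/
"""
print(parse_url_lines(sample.splitlines()))
```

Only whole-line comments are stripped in this sketch, because a `#` inside a URL is a legitimate fragment separator and should not be cut off.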
Install dependencies:

    python -m pip install -r requirements-web-clipper.txt

Single URL:

    python web_clipper.py "https://example.com/"

Multiple URLs:

    python web_clipper.py "https://example.com/" "https://www.python.org/"

Batch import:

    python web_clipper.py --input urls.txt

By default, outputs go to clippings/, one directory per page:

    clippings/
      <title__hash>/
        index.md
        assets/
          img_...
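The `<title__hash>` directory name in the layout above could be produced roughly as follows. This is a sketch under loud assumptions: the README does not say how the title is sanitised or what is hashed, so the slug rule, the SHA-1 prefix of the URL, the 8-character length, and the function name `page_dir_name` are all hypothetical.

```python
import hashlib
import re


def page_dir_name(title: str, url: str, max_len: int = 60) -> str:
    """Build a '<title>__<hash>' style directory name (hypothetical scheme)."""
    # Replace runs of characters that are unsafe in file names with one dash.
    slug = re.sub(r"[^\w\-]+", "-", title.strip()).strip("-")[:max_len] or "untitled"
    # Assumption: a short SHA-1 prefix of the URL keeps same-titled pages apart.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}__{digest}"


print(page_dir_name("Example Domain", "https://example.com/"))
```

Including a hash derived from the URL means two pages with the same title still get separate directories, which is one plausible reason for the suffix.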
- --out-dir <dir>: output directory (default: clippings)
- --no-images: do not download images (images are downloaded by default)
- --timeout <sec>: request timeout in seconds (default: 25)
- --fail-fast: stop immediately on the first error (by default, processing continues)
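For readers who want to see how the options above fit together, here is a minimal `argparse` sketch that accepts the same flags and defaults. It is an illustration only: the flag names and defaults come from this README, while the help strings, the `float` type for the timeout, and the function name `build_parser` are assumptions rather than the tool's real implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented command-line options (illustrative)."""
    parser = argparse.ArgumentParser(prog="web_clipper.py")
    parser.add_argument("urls", nargs="*", help="one or more URLs to clip")
    parser.add_argument("--input", help="text file with one URL per line")
    parser.add_argument("--out-dir", default="clippings", help="output directory")
    parser.add_argument("--no-images", action="store_true", help="do not download images")
    # Assumption: the timeout is parsed as a float number of seconds.
    parser.add_argument("--timeout", type=float, default=25, help="request timeout in seconds")
    parser.add_argument("--fail-fast", action="store_true", help="stop on the first error")
    return parser


args = build_parser().parse_args(["--input", "urls.txt", "--timeout", "10", "--no-images"])
print(args.out_dir, args.timeout, args.no_images, args.fail_fast)
```

Both boolean flags are opt-in switches here, which matches the documented defaults: images are downloaded and errors do not stop the run unless the flag is given.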
- Limited support for heavily JS-rendered pages (this tool focuses on static HTML extraction)
- Anti-bot protections / auth walls / paywalls may prevent full content extraction
Licensed under GNU AGPLv3. See LICENSE.