website-downloader

by: pskill9

MCP server to download entire websites

Created: 23/12/2024
Tags: web, downloader

📌Overview

Purpose: This MCP server downloads entire websites while preserving their structure for offline use.

Overview: The Website Downloader MCP Server exposes a tool that uses wget to download a website and convert its links and resource references so the copy works locally.

Key Features:

  • Recursive Downloading: The server downloads entire websites recursively, capturing all necessary resources such as CSS files and images, which ensures a complete offline browsing experience.

  • Link Conversion: It converts links to be usable in a local environment, maintaining the integrity of the website's navigation post-download.

  • Domain Restriction: Downloads are restricted to the same domain, which keeps the crawl from following links to external sites and keeps the download focused on relevant content.

  • Website Structure Preservation: The original structure of the website is preserved throughout the downloading process, allowing users to access content in the same way it appears online.


Website Downloader MCP Server

This MCP server provides a tool to download entire websites using wget. It preserves the website structure and converts links to work locally.

Prerequisites

The server requires wget to be installed on your system.

Installing wget

macOS

Using Homebrew:

brew install wget

Linux (Debian/Ubuntu)

sudo apt-get update
sudo apt-get install wget

Linux (Red Hat/Fedora)

sudo dnf install wget

Windows

  1. Using Chocolatey:
choco install wget
  2. Or download the binary from: https://eternallybored.org/misc/wget/
    • Download the latest wget.exe
    • Place it in a directory that's in your PATH (e.g., C:\Windows\System32)
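Once wget is installed, it can be useful to confirm that Node can actually find it on the PATH before starting the server. The following is a minimal sketch, not part of the server itself; the function name isWgetAvailable is hypothetical, and it assumes a GNU wget build that accepts --version.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: returns true when `wget --version` can be started
// and exits with status 0. Assumes GNU wget; other builds may behave differently.
function isWgetAvailable(): boolean {
  const result = spawnSync("wget", ["--version"], { stdio: "ignore" });
  // `error` is set when the binary cannot be spawned (for example, not on PATH).
  return result.error === undefined && result.status === 0;
}

console.log(isWgetAvailable() ? "wget found" : "wget not found on PATH");
```

If this reports that wget was not found, recheck the installation steps above for your platform.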

Usage

The server provides a tool called download_website with the following parameters:

  • url (required): The URL of the website to download
  • outputPath (optional): The directory where the website should be downloaded. Defaults to the current directory.
  • depth (optional): Maximum depth level for recursive downloading. Defaults to unlimited depth. Set to 0 to download only the specified page, 1 to also follow the links on that page, and so on.
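The parameter rules above can be sketched as a small validation step. This is an illustration only, not the server's actual code: the names ResolvedDownloadArgs and resolveDownloadArgs are hypothetical, and using "." for the current directory and Infinity for unlimited depth is an assumption about how the documented defaults might be represented.

```typescript
// Hypothetical resolved form of the download_website arguments.
interface ResolvedDownloadArgs {
  url: string;
  outputPath: string; // "." stands in for "the current directory" (assumption)
  depth: number;      // Infinity stands in for "no depth limit" (assumption)
}

function resolveDownloadArgs(input: Record<string, unknown>): ResolvedDownloadArgs {
  // url is the only required parameter.
  const url = input["url"];
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("url is required and must be a non-empty string");
  }

  // outputPath is optional and falls back to the current directory.
  let outputPath = ".";
  const rawOutputPath = input["outputPath"];
  if (rawOutputPath !== undefined) {
    if (typeof rawOutputPath !== "string") {
      throw new Error("outputPath must be a string when provided");
    }
    outputPath = rawOutputPath;
  }

  // depth is optional; when given it must be a non-negative integer.
  let depth = Infinity;
  const rawDepth = input["depth"];
  if (rawDepth !== undefined) {
    if (
      typeof rawDepth !== "number" ||
      !isFinite(rawDepth) ||
      rawDepth < 0 ||
      Math.floor(rawDepth) !== rawDepth
    ) {
      throw new Error("depth must be a non-negative integer when provided");
    }
    depth = rawDepth;
  }

  return { url: url, outputPath: outputPath, depth: depth };
}

console.log(resolveDownloadArgs({ url: "https://example.com" }));
```

Rejecting bad input before wget is invoked keeps error messages close to the parameter that caused them.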

Example

{
  "url": "https://example.com",
  "outputPath": "/path/to/output",
  "depth": 2
}
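An MCP client normally wraps these arguments in a request for you. As a rough sketch of what such a request might look like, assuming the standard MCP tools/call JSON-RPC method, the example arguments could be packaged as follows. The helper name buildToolCall and the request id are hypothetical.

```typescript
// Shape of a JSON-RPC request that invokes an MCP tool (sketch).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps tool arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: name, arguments: args },
  };
}

// The example arguments from above, addressed to the download_website tool.
const request = buildToolCall(1, "download_website", {
  url: "https://example.com",
  outputPath: "/path/to/output",
  depth: 2,
});

console.log(JSON.stringify(request, null, 2));
```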

Features

The website downloader:

  • Downloads recursively, with infinite depth by default
  • Includes all page requisites (CSS, images, etc.)
  • Converts links to work locally
  • Adds appropriate extensions to files
  • Restricts downloads to the same domain
  • Preserves the website structure
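Each feature above corresponds to a standard wget option. The sketch below shows one plausible mapping from the tool parameters to a wget argument list; it is an assumption about how such a server could be built, not a copy of this server's actual command line. The names DownloadOptions and buildWgetArgs are hypothetical. Note that wget treats --level=0 as infinite recursion, so this sketch handles a depth of 0 by leaving out --recursive entirely.

```typescript
// Hypothetical options type mirroring the download_website parameters.
interface DownloadOptions {
  url: string;
  outputPath?: string;
  depth?: number;
}

// Builds a wget argument list covering the listed features (sketch).
function buildWgetArgs(opts: DownloadOptions): string[] {
  const host = new URL(opts.url).hostname;
  const args: string[] = [];

  if (opts.depth === undefined) {
    // No depth given: recurse without a limit.
    args.push("--recursive", "--level=inf");
  } else if (opts.depth > 0) {
    args.push("--recursive", "--level=" + opts.depth);
  }
  // depth === 0: no --recursive, so only the given page is fetched.

  args.push(
    "--page-requisites",  // CSS, images, and other files the page needs
    "--convert-links",    // rewrite links so they work locally
    "--adjust-extension", // add suitable extensions such as .html
    "--domains=" + host   // only follow links on the start URL's host
  );

  if (opts.outputPath !== undefined) {
    args.push("--directory-prefix=" + opts.outputPath);
  }

  args.push(opts.url);
  return args;
}

console.log(buildWgetArgs({ url: "https://example.com", depth: 2 }).join(" "));
```

Building the arguments as an array, rather than as one shell string, lets them be passed to a process spawner without shell quoting problems.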

Installation

  1. Build the server:
npm install
npm run build
  2. Add to MCP settings:
{
  "mcpServers": {
    "website-downloader": {
      "command": "node",
      "args": ["/path/to/website-downloader/build/index.js"]
    }
  }
}