📌Overview

Purpose: Firecrawl MCP Server provides a robust implementation of the Model Context Protocol (MCP) that brings Firecrawl's web scraping capabilities to LLM clients.

Overview: The Firecrawl MCP Server offers advanced web scraping capabilities, including support for JavaScript rendering, URL discovery, and deep research. It efficiently manages tasks with rate limiting, automatic retries, and credit usage monitoring, making it suitable for both cloud and self-hosted environments.

Key Features:

  • Comprehensive Scraping Tools: Advanced tools for scraping, batch processing, and deep research, supporting data extraction from single or multiple URLs.

  • Robust Error Handling: Implements automatic retries with exponential backoff for transient errors and features comprehensive logging for performance metrics and credit tracking.

  • Customizable Configuration: Offers a range of configurable environment variables to manage API keys, retries, and credit monitoring thresholds, providing flexibility for different deployment scenarios.


Firecrawl MCP Server

A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.


Features

  • Scrape, crawl, search, extract, deep research, and batch scrape support
  • Web scraping with JS rendering
  • URL discovery and crawling
  • Web search with content extraction
  • Automatic retries with exponential backoff
  • Efficient batch processing with built-in rate limiting
  • Credit usage monitoring for cloud API
  • Comprehensive logging system
  • Support for cloud and self-hosted Firecrawl instances
  • Mobile/Desktop viewport support
  • Smart content filtering with tag inclusion/exclusion

Installation

Running with npx

env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Manual Installation

npm install -g firecrawl-mcp

Running on Cursor

Requires Cursor version 0.45.6+

For Cursor v0.45.6

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add New MCP Server"
  4. Enter:
    • Name: "firecrawl-mcp" (or preferred name)
    • Type: "command"
    • Command: env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp

For Cursor v0.48.6

  1. Open Cursor Settings
  2. Go to Features > MCP Servers
  3. Click "+ Add new global MCP server"
  4. Enter the following:
{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}

For Windows users experiencing issues, try:

cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"

Replace your-api-key with your Firecrawl API key, which you can obtain from https://www.firecrawl.dev/app/api-keys.

After setup, refresh the MCP server list. The Composer Agent will automatically use Firecrawl MCP when appropriate; you can also explicitly request it for web scraping tasks.

Running on Windsurf

Add this to your ./codeium/windsurf/model_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Running with SSE Local Mode

To run the server using Server-Sent Events (SSE) locally instead of the default stdio transport:

env SSE_LOCAL=true FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Use URL: http://localhost:3000/sse
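
For reference, a client can connect to that endpoint with the MCP TypeScript SDK (@modelcontextprotocol/sdk). This is a minimal sketch, not part of this project: the client name and version are placeholders, and exact SDK APIs may differ slightly between versions.

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

// Connect to the locally running firecrawl-mcp SSE endpoint.
const transport = new SSEClientTransport(new URL("http://localhost:3000/sse"));
const client = new Client({ name: "example-client", version: "1.0.0" });

await client.connect(transport);

// List the tools the server exposes (firecrawl_scrape, firecrawl_crawl, ...).
const { tools } = await client.listTools();
console.log(tools.map((tool) => tool.name));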

Installing via Smithery (Legacy)

To install Firecrawl for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude

Configuration

Environment Variables

Required for Cloud API:

  • FIRECRAWL_API_KEY: Your Firecrawl API key (required for the cloud API; optional when self-hosting with FIRECRAWL_API_URL)

Optional for Self-Hosted Instances:

  • FIRECRAWL_API_URL: Custom API endpoint for a self-hosted instance (e.g. https://firecrawl.your-domain.com)

Optional Configuration:

Retry Configuration:

  • FIRECRAWL_RETRY_MAX_ATTEMPTS (default: 3)
  • FIRECRAWL_RETRY_INITIAL_DELAY (ms, default: 1000)
  • FIRECRAWL_RETRY_MAX_DELAY (ms, default: 10000)
  • FIRECRAWL_RETRY_BACKOFF_FACTOR (default: 2)

Credit Usage Monitoring:

  • FIRECRAWL_CREDIT_WARNING_THRESHOLD (default: 1000)
  • FIRECRAWL_CREDIT_CRITICAL_THRESHOLD (default: 100)

Configuration Examples

Cloud API with custom retries and credit monitoring:

export FIRECRAWL_API_KEY=your-api-key

export FIRECRAWL_RETRY_MAX_ATTEMPTS=5
export FIRECRAWL_RETRY_INITIAL_DELAY=2000
export FIRECRAWL_RETRY_MAX_DELAY=30000
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3

export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500

Self-hosted instance example:

export FIRECRAWL_API_URL=https://firecrawl.your-domain.com
export FIRECRAWL_API_KEY=your-api-key # if authentication needed

export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500

Usage with Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",
        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
        "FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
        "FIRECRAWL_RETRY_MAX_DELAY": "30000",
        "FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",
        "FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
        "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
      }
    }
  }
}

System Configuration Defaults

const CONFIG = {
  retry: {
    maxAttempts: 3,     // total attempts before giving up
    initialDelay: 1000, // ms before the first retry
    maxDelay: 10000,    // ms cap on any single retry delay
    backoffFactor: 2,   // delay multiplier per attempt
  },
  credit: {
    warningThreshold: 1000, // credit level that triggers a warning
    criticalThreshold: 100, // credit level that triggers a critical alert
  },
};

Retry Behavior:

  • Automatically retries requests that fail due to rate limits, using exponential backoff.

Credit Usage Monitoring:

  • Tracks API credit usage and raises warning or critical alerts when the configured thresholds are crossed, helping you avoid service interruption.
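
As an illustration of how these defaults interact, the n-th retry delay works out to initialDelay * backoffFactor^(n - 1), capped at maxDelay. The following is a minimal sketch built on the CONFIG defaults above, not the server's actual implementation:

// Illustrative only: derives retry delays from the CONFIG defaults above.
// backoffDelay(1) = 1000ms, backoffDelay(2) = 2000ms, never exceeding 10000ms.
function backoffDelay(attempt: number): number {
  const { initialDelay, maxDelay, backoffFactor } = CONFIG.retry;
  return Math.min(initialDelay * Math.pow(backoffFactor, attempt - 1), maxDelay);
}

async function withRetries<T>(fn: () => Promise<T>): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= CONFIG.retry.maxAttempts) throw err; // out of attempts
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
}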

Rate Limiting and Batch Processing

  • Automatic rate limit handling with exponential backoff
  • Parallel processing for batch operations (see the sketch below)
  • Smart request queuing and throttling
  • Automatic retries for transient errors
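
The server's exact queuing logic is internal, but the batching pattern looks roughly like this: run at most a fixed number of scrapes concurrently, with each worker pulling the next URL as it finishes. A hedged sketch, where the concurrency limit is a made-up placeholder rather than the server's real setting:

// Illustrative concurrency-limited batch runner; not the server's actual code.
async function processBatch<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency = 5 // hypothetical limit
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each runner claims the next unprocessed item until the queue is drained.
  const run = async () => {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  };
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, run)
  );
  return results;
}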

Available Tools

1. Scrape Tool (firecrawl_scrape)

Scrape content from a single URL with advanced options.

Example arguments:

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true,
    "waitFor": 1000,
    "timeout": 30000,
    "mobile": false,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "skipTlsVerification": false
  }
}

2. Batch Scrape Tool (firecrawl_batch_scrape)

Scrape multiple URLs efficiently with rate limiting and parallel processing.

Example arguments:

{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

Response includes an operation ID for status checking.


3. Check Batch Status (firecrawl_check_batch_status)

Check the status of a batch operation.

{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}
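
Putting the two tools together, a client queues the batch and then polls the returned operation ID until the work completes. This sketch reuses the MCP TypeScript SDK client from the SSE example above; the ID value and response handling are illustrative.

// Queue the batch; the response text contains an operation ID such as "batch_1".
const queued = await client.callTool({
  name: "firecrawl_batch_scrape",
  arguments: {
    urls: ["https://example1.com", "https://example2.com"],
    options: { formats: ["markdown"], onlyMainContent: true },
  },
});

// Poll for completion using the ID parsed from the queued response.
const status = await client.callTool({
  name: "firecrawl_check_batch_status",
  arguments: { id: "batch_1" },
});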

4. Search Tool (firecrawl_search)

Search the web and optionally extract content from search results.

{
  "name": "firecrawl_search",
  "arguments": {
    "query": "your search query",
    "limit": 5,
    "lang": "en",
    "country": "us",
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

5. Crawl Tool (firecrawl_crawl)

Start an asynchronous crawl with advanced options.

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}

6. Extract Tool (firecrawl_extract)

Extract structured data using LLM capabilities.

Example arguments:

{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "systemPrompt": "You are a helpful assistant that extracts product information",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false,
    "includeSubdomains": false
  }
}

Example response:

{
  "content": [
    {
      "type": "text",
      "text": {
        "name": "Example Product",
        "price": 99.99,
        "description": "This is an example product description"
      }
    }
  ],
  "isError": false
}

Options:

  • urls: array of URLs to extract from
  • prompt: custom prompt for extraction
  • systemPrompt: system instruction
  • schema: JSON schema for structured output
  • allowExternalLinks, enableWebSearch, includeSubdomains: flags for extraction behavior

Extraction uses a managed LLM on the cloud API, or your own configured LLM when self-hosted.


7. Deep Research Tool (firecrawl_deep_research)

Conduct deep web research using crawling, search, and LLM analysis.

Example:

{
  "name": "firecrawl_deep_research",
  "arguments": {
    "query": "how does carbon capture technology work?",
    "maxDepth": 3,
    "timeLimit": 120,
    "maxUrls": 50
  }
}

Arguments:

  • query (required): Research topic
  • maxDepth (optional): Recursive depth (default 3)
  • timeLimit (optional): Time limit in seconds (default 120)
  • maxUrls (optional): Max URLs to analyze (default 50)

Returns an LLM-generated final analysis and may include structured activities and sources.


8. Generate LLMs.txt Tool (firecrawl_generate_llmstxt)

Generate a standardized llms.txt file for a domain, describing how LLMs should interact with the site.

Example:

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

Arguments:

  • url (required): base site URL
  • maxUrls (optional): max URLs (default 10)
  • showFullText (optional): include llms-full.txt contents

Returns the generated llms.txt and optionally the full text file.


Logging System

Includes logging for:

  • Operation status and progress
  • Performance metrics
  • Credit usage monitoring
  • Rate limit tracking
  • Error conditions

Example logs:

[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...

Error Handling

  • Automatic retries for transient errors
  • Rate limit backoff handling
  • Detailed error messages
  • Credit usage warnings
  • Network resilience

Example error response:

{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Run tests: npm test
  4. Submit a pull request

License

MIT License - see LICENSE file for details