mcp-DEEPwebresearch
by: qpd-v
Enhanced MCP server for deep web research
📌Overview
Purpose: The MCP Deep Web Research Server aims to facilitate advanced web research by integrating intelligent search and content extraction capabilities.
Overview: This server is a specialized Model Context Protocol (MCP) server designed for real-time web research, focusing on efficient content extraction and insightful data analysis. It is a fork of an existing project, enhanced to better navigate the complexities of deep web information retrieval.
Key Features:
-
Intelligent Search Queue System: Manages batch search operations with built-in rate limiting, progress tracking, and automatic retries to ensure efficient and organized web research.
-
Enhanced Content Extraction: Employs TF-IDF scoring, keyword proximity analysis, and readability scoring, enabling thorough analysis and clean formatting of extracted web content.
MCP Deep Web Research Server (v0.3.0)
A Model Context Protocol (MCP) server for advanced web research.
Latest Changes
- Added
visit_page
tool for direct webpage content extraction - Optimized performance within MCP timeout limits
- Reduced default
maxDepth
andmaxBranching
parameters - Improved page loading efficiency
- Enhanced error handling for timeouts
- Reduced default
This project is a fork of mcp-webresearch, enhanced for deep web research.
Features
-
Intelligent Search Queue System
- Batch search operations with rate limiting
- Queue management with progress tracking
- Error recovery and automatic retries
-
Enhanced Content Extraction
- TF-IDF based relevance scoring
- Keyword proximity analysis
- Readability scoring
- Improved HTML structure parsing
-
Core Features
- Google search integration
- Research session tracking
- Markdown conversion
Prerequisites
- Node.js >= 18
Installation
Global Installation (Recommended)
npm install -g mcp-deepwebresearch
Local Project Installation
npm install mcp-deepwebresearch
Claude Desktop Integration
Add the following entry to your claude_desktop_config.json
:
Windows
{
"mcpServers": {
"deepwebresearch": {
"command": "mcp-deepwebresearch",
"args": []
}
}
}
macOS
{
"mcpServers": {
"deepwebresearch": {
"command": "mcp-deepwebresearch",
"args": []
}
}
}
First-time Setup
After installation, run:
npx playwright install chromium
Usage
Start a chat with Claude and send a prompt for web research. Use the agentic-research
prompt for deeper web research.
Tools
-
deep_research
- Performs comprehensive research.
- Returns findings, progress, and timing.
-
parallel_search
- Multiple Google searches in parallel.
-
visit_page
- Extract content from a webpage.
Prompts
agentic-research
Guided prompt for thorough web research with instructions on topic exploration and citation.
Configuration Options
The server can be configured through environment variables:
MAX_PARALLEL_SEARCHES
: Default 5SEARCH_DELAY_MS
: Default 200MAX_RETRIES
: Default 3TIMEOUT_MS
: Default 55000LOG_LEVEL
: Default 'info'
Error Handling
Common Issues
-
Rate Limiting
- Solution: Adjust
SEARCH_DELAY_MS
orMAX_PARALLEL_SEARCHES
.
- Solution: Adjust
-
Network Timeouts
- Solution: Ensure requests complete within the timeout.
-
Browser Issues
- Solution: Check Playwright installation.
Development
Setup
pnpm install
pnpm build
Testing
pnpm test
Code Quality
pnpm lint
Contributing
- Fork the repository.
- Create a feature branch.
- Commit changes.
- Push to the branch.
- Open a Pull Request.
Coding Standards
- Follow TypeScript best practices.
- Document new features and APIs.
Requirements
- Node.js >= 18
- Playwright
Verified Platforms
- macOS
- Windows
License
MIT