# deep-research-mcp-server

by: ssdeanx

An MCP Deep Research Server that uses Gemini to power a Research AI Agent.
## 📌 Overview
Purpose: To provide a lightweight yet powerful tool for conducting deep, iterative research using advanced AI language models and web scraping.

Overview: Deep-research is an AI-powered research assistant designed to navigate complex research tasks effectively. By combining web scraping with advanced language processing via Gemini LLMs, it lets users gather insights and generate reports from a concise, understandable codebase.
Key Features:

- MCP Integration: Enables seamless functionality within AI agent ecosystems as a Model Context Protocol tool.
- Iterative Deep Dive: Facilitates in-depth exploration of topics through adaptive query refinement and result processing.
- Gemini-Powered Queries: Utilizes Gemini LLMs to create intelligent, targeted search queries tailored to user research goals.
- Depth & Breadth Control: Allows customizable parameters to dictate the thoroughness and scope of research inquiries.
- Smart Follow-up Questions: Automatically generates relevant follow-up questions to enhance understanding and guide further exploration.
- Comprehensive Markdown Reports: Produces structured reports summarizing findings in a readily usable Markdown format.
- Concurrent Processing for Speed: Optimizes research workflows by processing multiple search queries and results in parallel.
## Deep Research
Your AI-Powered Research Assistant. Conduct iterative, deep research using search engines, web scraping, and Gemini LLMs, all within a lightweight and understandable codebase.
This tool utilizes Firecrawl for efficient web data extraction and Gemini for advanced language understanding and report generation.
## Goals
To provide the simplest yet most effective implementation of a deep research agent, designed to be easily understood, modified, and extended, with a target codebase under 500 lines of code.
## Key Features
- MCP Integration: Integrates seamlessly as a Model Context Protocol (MCP) tool.
- Iterative Deep Dive: Explores topics through iterative query refinement.
- Gemini-Powered Queries: Leverages Gemini LLMs for smart search queries.
- Configurable Parameters: Control depth and breadth of research.
- Smart Follow-up Questions: Generates follow-up questions for clarity.
- Comprehensive Markdown Reports: Produces detailed reports.
- Concurrent Processing: Maximizes speed through parallel processing.
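The iterative deep dive, depth/breadth control, and concurrent processing features above can be sketched as a single loop. This is an illustrative sketch only, not the project's actual code: the helper names (`generateSerpQueries`, `processQuery`, `deepResearch`) are assumptions, and the real implementation calls Gemini and Firecrawl where the stand-ins below return placeholder data.

```typescript
// Hypothetical sketch of an iterative, concurrent research loop.
interface Learning {
  query: string;
  insights: string[];
}

// Stand-in for a Gemini call that turns a topic into `breadth` search queries.
async function generateSerpQueries(topic: string, breadth: number): Promise<string[]> {
  return Array.from({ length: breadth }, (_, i) => `${topic} (angle ${i + 1})`);
}

// Stand-in for Firecrawl search + Gemini summarization of one query.
async function processQuery(query: string): Promise<Learning> {
  return { query, insights: [`insight about ${query}`] };
}

// `depth` controls how many refinement rounds run; `breadth` controls how many
// queries are explored in parallel at each round (via Promise.all).
async function deepResearch(topic: string, depth: number, breadth: number): Promise<Learning[]> {
  let learnings: Learning[] = [];
  let current = topic;
  for (let round = 0; round < depth; round++) {
    const queries = await generateSerpQueries(current, breadth);
    // Concurrent processing: all queries in a round are handled in parallel.
    const results = await Promise.all(queries.map(processQuery));
    learnings = learnings.concat(results);
    // Refine the next round using what was just learned (greatly simplified).
    current = results[0]?.insights[0] ?? current;
  }
  return learnings;
}
```

With `depth: 2` and `breadth: 3`, this sketch runs two refinement rounds of three parallel queries each, accumulating six learnings.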
## Persona Agents in Deep Research

### What are Persona Agents?

In `deep-research`, persona agents guide the behavior of Gemini language models by defining specific roles, skills, and communication styles, leading to:
- Focused Output: Aligns LLM responses with desired expertise.
- Consistency: Maintains tone and style throughout the process.
- Enhanced Performance: Optimizes output for specific tasks.
Examples of Personas:
- Research Strategist: For strategic query generation.
- Research Assistant: Focuses on analysis and key learnings.
- Query Refiner: Guides users towards effective questions.
By leveraging persona agents, `deep-research` aims for more targeted and high-quality research outcomes.
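One way a persona can be encoded is as a structured system prompt handed to the LLM. The sketch below is an assumption for illustration: the persona fields, the example skills, and the `buildPersonaPrompt` helper are not taken from the project's source.

```typescript
// Hypothetical persona definition; the skills/style text is illustrative.
interface Persona {
  role: string;
  skills: string[];
  style: string;
}

const RESEARCH_STRATEGIST: Persona = {
  role: "Research Strategist",
  skills: ["query decomposition", "source prioritization"],
  style: "precise, neutral, citation-oriented",
};

// Renders a persona into a system prompt for the LLM call.
function buildPersonaPrompt(persona: Persona, task: string): string {
  return [
    `You are a ${persona.role}.`,
    `Skills: ${persona.skills.join(", ")}.`,
    `Communication style: ${persona.style}.`,
    `Task: ${task}`,
  ].join("\n");
}
```

The same shape would cover the Research Assistant and Query Refiner personas by swapping in different role, skill, and style text.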
## Requirements
- Node.js (v22.x recommended)
- API keys for:
  - Firecrawl API
  - Gemini API (o3 mini model, knowledge cutoff: August 2024)
## Setup

1. Clone the repository:

   ```shell
   git clone [your-repo-link-here]
   ```

2. Install dependencies:

   ```shell
   npm install
   ```

3. Set up environment variables: Create a `.env.local` file and add your API keys.

4. Build the project:

   ```shell
   npm run build
   ```
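The setup steps above mention a `.env.local` file holding the API keys. A minimal sketch of its contents follows; the variable names (`FIRECRAWL_KEY`, `GEMINI_API_KEY`) are assumptions, so match whatever names the code actually reads.

```shell
# .env.local — variable names are illustrative, values are placeholders
FIRECRAWL_KEY=fc-your-firecrawl-key
GEMINI_API_KEY=your-gemini-api-key
```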
## Usage

### As MCP Tool

Start the MCP server:

```shell
node --env-file .env.local dist/mcp-server.js
```
Invoke the `deep-research` tool using the following parameters:

- `query` (required)
- `depth` (optional, 1-5)
- `breadth` (optional, 1-5)
Example:

```javascript
const result = await mcp.invoke("deep-research", {
  query: "Explain the principles of blockchain technology",
  depth: 2,
  breadth: 4
});
```
### Standalone CLI Usage

Run `deep-research` directly:

```shell
npm run start "your research query"
```
## License
MIT License - Free and Open Source. Use it freely!
## Recent Improvements (v0.2.0)

### Enhanced Research Validation
- Added academic input/output validation.
- Minimum input validation: 10 characters + 3 words.
- Output validation based on citation density and recent sources.
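The minimum-input rule above (10 characters and at least 3 words, read here as both conditions together) can be sketched as a small check. The function name is illustrative, not the project's actual API.

```typescript
// Hypothetical input validator mirroring the "10 characters + 3 words" rule.
function isValidResearchInput(query: string): boolean {
  const trimmed = query.trim();
  const wordCount = trimmed.split(/\s+/).filter(Boolean).length;
  return trimmed.length >= 10 && wordCount >= 3;
}
```

For example, `"ai"` fails both conditions, while `"principles of blockchain technology"` passes.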
### Gemini Integration Upgrades
- Integrated Gemini 2.0 Flash for faster processing.
- Improved error handling for API calls.
### Code Quality Improvements
- Added concurrent processing pipeline and enhanced type safety.
- Optimized dependencies (30% smaller).
### Performance
- Achieved substantial performance improvements in research cycles and error reduction.