MCP HubMCP Hub
web-agent-master

google-search

by: web-agent-master

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.

210created 04/03/2025
Visit
Playwright
Node.js

πŸ“ŒOverview

Purpose: To provide a Playwright-based Node.js tool that allows users to execute Google searches directly and bypass anti-scraping mechanisms seamlessly.

Overview: This tool enables local execution of Google searches, functioning either as a command-line interface or as a Model Context Protocol (MCP) server. It offers real-time search capabilities that can be integrated into AI assistants, ensuring that results are retrieved efficiently and without restrictions imposed by external APIs.

Key Features:

  • Local SERP API Alternative: Eliminates the need for paid API services by executing searches directly on the user's machine.

  • Advanced Anti-Bot Detection Bypass Techniques: Employs intelligent methods like browser fingerprint management and state preservation to mimic real user behavior, minimizing the risk of being blocked by search engines.

  • MCP Server Integration: Integrates smoothly with AI assistants such as Claude, providing them with immediate access to search functionalities without requiring additional API keys.

  • Completely Open Source and Free: Fully customizable and extensible, allowing users to modify the code per their needs without any usage restrictions.


Google Search Tool

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches and extract results. It can be used as a command-line tool or as a Model Context Protocol (MCP) server, providing real-time search capabilities to AI assistants like Claude.

Key Features

  • Local SERP API Alternative: Executes searches locally without relying on paid services.
  • Advanced Anti-Bot Detection Bypass Techniques:
    • Intelligent browser fingerprint management
    • Automatic saving and restoring of browser state
    • Smart mode switching between headless and headed modes
    • Randomization of device and locale settings
  • MCP Server Integration: Allows real-time search capabilities without additional API keys.
  • Completely Open Source: Freely customizable and extensible code.

Technical Features

  • Developed with TypeScript for type safety.
  • Browser automation based on Playwright.
  • Command-line parameter support.
  • JSON format output with search results.
  • Supports both headless and headed modes for debugging.
  • Detailed logging and robust error handling.

Installation

git clone https://github.com/web-agent-master/google-search.git
cd google-search
npm install   # Or use yarn or pnpm
npm run build # Compiles TypeScript code
npm link      # Links package globally for MCP functionality

Windows Environment Notes

Special adaptations ensure proper functionality in Windows environments, including:

  • .cmd files for command-line tools
  • Log file storage in the system temporary directory
  • Cross-platform file path handling

Usage

Command Line Tool

google-search "search keywords"

google-search --limit 5 --timeout 60000 --no-headless "search keywords"

Command Line Options

  • -l, --limit <number>: Result count limit (default: 10)
  • -t, --timeout <number>: Timeout in milliseconds (default: 60000)
  • --no-headless: Show browser interface.
  • -V, --version: Display version number.
  • -h, --help: Display help information.

MCP Server Integration

This project provides MCP server functionality for AI assistants:

pnpm build

Integration with Claude Desktop

  1. Edit the Claude Desktop configuration file to add server configuration.
  2. Restart Claude after modifying the configuration.
{
  "mcpServers": {
    "google-search": {
      "command": "npx",
      "args": ["google-search-mcp"]
    }
  }
}

Project Structure

google-search/
β”œβ”€β”€ package.json          # Project configuration
β”œβ”€β”€ tsconfig.json         # TypeScript configuration
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ index.ts          # Entry file
β”‚   β”œβ”€β”€ search.ts         # Search functionality implementation
β”‚   β”œβ”€β”€ mcp-server.ts     # MCP server implementation
β”‚   └── types.ts          # Type definitions
β”œβ”€β”€ dist/                 # Compiled JavaScript files
β”œβ”€β”€ bin/                  # Executable files
β”œβ”€β”€ README.md             # Project documentation
└── .gitignore            # Git ignore file

Technology Stack

  • TypeScript: Development language.
  • Node.js: Runtime environment.
  • Playwright: Browser automation.
  • Commander: Command line argument parsing.
  • Model Context Protocol (MCP): For assistant integration.

Development Guide

Run commands in the project root directory:

pnpm install             # Install dependencies
pnpm run postinstall     # Install Playwright browsers
pnpm build               # Compile TypeScript code

Error Handling

Built-in mechanisms provide friendly error messages and useful information during failures.

Notes

  • Tool is intended for learning and research purposes.
  • Adhere to Google's terms of service.
  • Proxies may be required in some regions for Google access.
  • State files containing browser data should be secured.

Comparison with Commercial SERP APIs

This project offers several advantages over paid search APIs:

  • Completely Free: No API fees.
  • Local Execution: No third-party dependency.
  • Privacy Protection: No query logging by third parties.
  • Customizability: Fully open source.
  • No Usage Limits: No restrictions on API calls.