📌Overview

Purpose: Provide a Playwright-based Node.js tool that runs Google searches locally and works around the search engine's anti-scraping mechanisms.

Overview: This tool runs Google searches on the local machine, either as a command-line interface or as a Model Context Protocol (MCP) server. It gives AI assistants real-time search capabilities without depending on an external search API.

Key Features:

Local SERP API Alternative: Eliminates the need for paid API services by executing searches directly on the user's machine.
Advanced Anti-Bot Detection Bypass Techniques: Uses browser fingerprint management and saved browser state to mimic real user behavior, reducing the risk of being blocked by search engines.
MCP Server Integration: Gives AI assistants such as Claude direct access to search without requiring additional API keys.
Completely Open Source and Free: Fully customizable and extensible; users can modify the code to fit their needs, with no usage restrictions.

Google Search Tool

A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches and extract results. It can be used directly as a command-line tool or as a Model Context Protocol (MCP) server to provide real-time search capabilities to AI assistants like Claude.

Key Features

Local SERP API alternative: no need for paid API services; searches executed locally.
Advanced anti-bot detection bypass:
- Intelligent browser fingerprint management simulating real user behavior.
- Automatic saving and restoration of browser state to reduce verification frequency.
- Automatic headless/headed mode switching.
- Randomization of device and locale settings.
Raw HTML retrieval of search result pages for analysis and debugging.
Automatic full-page screenshot capture when saving HTML content.
MCP server integration to support AI assistants like Claude without additional API keys.
Completely open source and free, with no usage restrictions.

Technical Features

Developed with TypeScript for type safety.
Browser automation based on Playwright supporting multiple browsers.
Command-line support for search keywords.
MCP server support for AI integration.
Returns search results with title, link, and snippet.
Option to retrieve raw HTML of result pages.
JSON format output.
Supports both headless and headed modes.
Detailed logging and robust error handling.
Browser state saving and restoration to avoid anti-bot detection.
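
As a rough illustration of the device/locale randomization and state restoration listed above, the sketch below builds a plain options object of the kind Playwright's `browser.newContext()` accepts (`locale`, `userAgent`, `viewport`, `storageState`). The helper name `buildContextOptions`, the device list, and the locale list are hypothetical and not taken from the project's source; the random source is injectable so the behavior can be checked deterministically.

```typescript
// Hypothetical sketch: names, device profiles, and locales are illustrative only.
interface DeviceProfile {
  userAgent: string;
  viewport: { width: number; height: number };
}

interface ContextOptions extends DeviceProfile {
  locale: string;
  storageState?: string; // path to a previously saved state file, if one exists
}

const DEVICE_PROFILES: readonly DeviceProfile[] = [
  {
    userAgent:
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    viewport: { width: 1280, height: 720 },
  },
  {
    userAgent:
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    viewport: { width: 1440, height: 900 },
  },
];

const LOCALES: readonly string[] = ["en-US", "en-GB", "de-DE"];

// Pick one element using a random number in [0, 1).
function pick<T>(items: readonly T[], rng: () => number): T {
  const index = Math.min(items.length - 1, Math.floor(rng() * items.length));
  return items[index] as T;
}

function buildContextOptions(
  stateFile: string,
  stateFileExists: boolean,
  rng: () => number = Math.random
): ContextOptions {
  const device = pick(DEVICE_PROFILES, rng);
  const options: ContextOptions = {
    userAgent: device.userAgent,
    viewport: device.viewport,
    locale: pick(LOCALES, rng),
  };
  // Only restore state when a file was saved by an earlier run.
  if (stateFileExists) {
    options.storageState = stateFile;
  }
  return options;
}
```

With Playwright, an object like this could be passed to `browser.newContext(options)`, and `context.storageState({ path: stateFile })` could write the state back after a search. How the tool itself wires these calls together is not shown in this document.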

Installation

git clone https://github.com/web-agent-master/google-search.git
cd google-search

# Install dependencies
npm install
# or yarn
yarn
# or pnpm
pnpm install

# Compile TypeScript
npm run build
# or yarn
yarn build
# or pnpm
pnpm build

# Link package globally (required for MCP functionality)
npm link
# or yarn
yarn link
# or pnpm
pnpm link

Windows Environment Notes

.cmd files are included for Windows Command Prompt and PowerShell compatibility.
Log files are stored in the system temp directory (instead of the Unix /tmp directory).
Windows-specific process signal handling ensures the server shuts down properly.
Cross-platform file path handling supports Windows path separators.

Usage

Command Line Tool

# Basic search
google-search "search keywords"

# With options
google-search --limit 5 --timeout 60000 --no-headless "search keywords"

# Using npx
npx google-search-cli "search keywords"

# Development mode
pnpm dev "search keywords"

# Debug mode (shows browser)
pnpm debug "search keywords"

# Get raw HTML of search results
google-search "search keywords" --get-html

# Get and save HTML
google-search "search keywords" --get-html --save-html

# Get and save HTML to specific file
google-search "search keywords" --get-html --save-html --html-output "./output.html"

Command Line Options

-l, --limit <number>: Limit number of results (default: 10)
-t, --timeout <number>: Timeout in ms (default: 60000)
--no-headless: Show browser UI (for debugging)
--remote-debugging-port <number>: Remote debugging port (default: 9222)
--state-file <path>: Browser state file (default: ./browser-state.json)
--no-save-state: Do not save browser state
--get-html: Get raw HTML instead of parsed results
--save-html: Save HTML to file (with --get-html)
--html-output <path>: Specify HTML output file (with --get-html and --save-html)
-V, --version: Show version
-h, --help: Show help
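
When the CLI is driven from another script, the options above can be assembled programmatically. The following is a minimal sketch, assuming a hypothetical `buildCliArgs` helper and `CliOptions` type (neither is part of the tool); only flags documented in the list above are emitted, and options are placed before the query as in the "With options" example.

```typescript
// Hypothetical helper: maps a typed options object to documented CLI flags.
interface CliOptions {
  limit?: number;
  timeout?: number;
  headless?: boolean; // false adds --no-headless
  stateFile?: string;
  saveState?: boolean; // false adds --no-save-state
  getHtml?: boolean;
  saveHtml?: boolean; // only meaningful together with getHtml
  htmlOutput?: string; // only meaningful with getHtml and saveHtml
}

function buildCliArgs(query: string, options: CliOptions = {}): string[] {
  const args: string[] = [];
  if (options.limit !== undefined) args.push("--limit", String(options.limit));
  if (options.timeout !== undefined) args.push("--timeout", String(options.timeout));
  if (options.headless === false) args.push("--no-headless");
  if (options.stateFile !== undefined) args.push("--state-file", options.stateFile);
  if (options.saveState === false) args.push("--no-save-state");
  if (options.getHtml) {
    args.push("--get-html");
    if (options.saveHtml) {
      args.push("--save-html");
      if (options.htmlOutput !== undefined) {
        args.push("--html-output", options.htmlOutput);
      }
    }
  }
  // The search keywords go last, as a single argument.
  args.push(query);
  return args;
}
```

The resulting array could be handed to Node's `child_process.execFile("google-search", args, callback)`, which avoids shell quoting problems with multi-word queries.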

Output Example

{
  "query": "deepseek",
  "results": [
    {
      "title": "DeepSeek",
      "link": "https://www.deepseek.com/",
      "snippet": "DeepSeek-R1 is now live and open source, rivaling OpenAI's Model o1..."
    },
    {
      "title": "deepseek-ai/DeepSeek-V3",
      "link": "https://github.com/deepseek-ai/DeepSeek-V3",
      "snippet": "We present DeepSeek-V3, a strong Mixture-of-Experts language model."
    }
  ]
}
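
For consumers of this JSON, a small type and validation step can catch unexpected output early. The sketch below mirrors the fields shown in the example (`query`, `results`, `title`, `link`, `snippet`); the names `SearchResult`, `SearchResponse`, and `parseSearchResponse` are assumptions made for illustration and may differ from the definitions in the project's `types.ts`.

```typescript
// Illustrative types based on the output example; not copied from the project.
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

interface SearchResponse {
  query: string;
  results: SearchResult[];
}

function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["title"] === "string" &&
    typeof record["link"] === "string" &&
    typeof record["snippet"] === "string"
  );
}

// Parse the CLI's JSON output and verify that it has the expected shape.
function parseSearchResponse(json: string): SearchResponse {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const record = data as Record<string, unknown>;
  const query = record["query"];
  const results = record["results"];
  if (typeof query !== "string" || !Array.isArray(results) || !results.every(isSearchResult)) {
    throw new Error("Unexpected search response shape");
  }
  return { query, results: results as SearchResult[] };
}
```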

HTML Output Example

With --get-html:

{
  "query": "playwright automation",
  "url": "https://www.google.com/",
  "originalHtmlLength": 1291733,
  "cleanedHtmlLength": 456789,
  "htmlPreview": "<!DOCTYPE html><html lang=\"zh-CN\">..."
}

With --get-html --save-html:

{
  "query": "playwright automation",
  "url": "https://www.google.com/",
  "originalHtmlLength": 1292241,
  "cleanedHtmlLength": 458976,
  "savedPath": "./google-search-html/playwright_automation-2025-04-06T03-30-06-852Z.html",
  "screenshotPath": "./google-search-html/playwright_automation-2025-04-06T03-30-06-852Z.png",
  "htmlPreview": "<!DOCTYPE html><html lang=\"zh-CN\">..."
}
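
The saved file names in the example appear to combine the query with an ISO timestamp in which `:` and `.` are replaced by `-`. The sketch below reproduces that pattern under this assumption; `buildHtmlFileName` is a hypothetical name, and the rule for sanitizing the query (runs of non-alphanumeric characters become `_`) is a guess based on the single example, not the tool's documented behavior.

```typescript
// Hypothetical reconstruction of the naming pattern seen in the example output.
function buildHtmlFileName(
  query: string,
  timestamp: Date,
  extension: "html" | "png"
): string {
  // Assumption: runs of non-alphanumeric characters collapse to one underscore.
  const safeQuery = query.trim().replace(/[^a-zA-Z0-9]+/g, "_");
  // toISOString() yields e.g. 2025-04-06T03:30:06.852Z; ':' and '.' are
  // replaced because ':' is not allowed in Windows file names.
  const safeTimestamp = timestamp.toISOString().replace(/[:.]/g, "-");
  return `${safeQuery}-${safeTimestamp}.${extension}`;
}
```

Using the same base name with both extensions keeps the HTML file and its screenshot paired, as in the `savedPath` and `screenshotPath` fields above.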

MCP Server

Enables AI assistants like Claude to use Google search in real time. Build the project before configuring the server:

pnpm build

Integration with Claude Desktop

Edit Claude Desktop config file:

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json (e.g., C:\Users\username\AppData\Roaming\Claude\claude_desktop_config.json)

Add MCP server configuration and restart Claude.

Example config:

{
  "mcpServers": {
    "google-search": {
      "command": "npx",
      "args": ["google-search-mcp"]
    }
  }
}

Windows alternatives:

Using cmd.exe with npx:

{
  "mcpServers": {
    "google-search": {
      "command": "cmd.exe",
      "args": ["/c", "npx", "google-search-mcp"]
    }
  }
}

Using node with full script path:

{
  "mcpServers": {
    "google-search": {
      "command": "node",
      "args": ["C:/path/to/your/google-search/dist/src/mcp-server.js"]
    }
  }
}

(Replace path with actual install location.)
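
The platform-specific choices above can also be expressed in code, for example in a setup script that writes the config file. This is a minimal sketch: `buildMcpConfig` and `claudeConfigPath` are hypothetical helpers, the platform strings follow Node's `process.platform` values, and the home and `%APPDATA%` directories are passed in rather than read from the environment.

```typescript
// Hypothetical helpers mirroring the config examples and file locations above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

type HostPlatform = "darwin" | "win32";

function buildMcpConfig(platform: HostPlatform): ClaudeDesktopConfig {
  const entry: McpServerEntry =
    platform === "win32"
      ? { command: "cmd.exe", args: ["/c", "npx", "google-search-mcp"] }
      : { command: "npx", args: ["google-search-mcp"] };
  return { mcpServers: { "google-search": entry } };
}

function claudeConfigPath(
  platform: HostPlatform,
  homeDir: string,
  appData: string
): string {
  return platform === "win32"
    ? `${appData}\\Claude\\claude_desktop_config.json`
    : `${homeDir}/Library/Application Support/Claude/claude_desktop_config.json`;
}
```

An existing config file may already list other servers, so a real script would merge the `google-search` entry into `mcpServers` rather than overwrite the file.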

After setup, you can request searches in Claude, e.g., "search for the latest AI research".

Project Structure

google-search/
├── package.json         # Configuration and dependencies
├── tsconfig.json        # TypeScript config
├── src/
│   ├── index.ts         # Entry file
│   ├── search.ts        # Search implementation with Playwright
│   ├── mcp-server.ts    # MCP server
│   └── types.ts         # Type definitions
├── dist/                # Compiled JS files
├── bin/
│   └── google-search    # CLI entry script
├── README.md            # Documentation
└── .gitignore           # Git ignore file

Technology Stack

TypeScript for development
Node.js runtime
Playwright for browser automation
Commander for CLI argument parsing
Model Context Protocol (MCP) for AI integration
MCP SDK for server implementation
Zod for schema validation
pnpm for package management

Development Guide

Run in project root:

pnpm install              # Install dependencies
pnpm run postinstall      # Install Playwright browsers
pnpm build                # Compile TypeScript
pnpm clean                # Clean compiled output

CLI Development

pnpm dev "search keywords"    # Development mode
pnpm debug "search keywords"  # Debug mode with UI
pnpm start "search keywords"  # Run compiled
pnpm test                     # Run tests

MCP Server Development

pnpm mcp           # Run MCP server in dev mode
pnpm mcp:build     # Run compiled MCP server

Error Handling

Friendly error messages when the browser fails to start
Network errors are caught and returned automatically
Detailed logs when result parsing fails
Graceful exit with an informative message on timeout

Notes

General

For learning and research only.
Comply with Google's terms and policies.
Avoid frequent requests to prevent blocking.
Some regions may require a proxy to access Google.
Playwright browsers auto-download on first use.
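
One way to follow the advice about request frequency is to enforce a minimum gap between searches in the calling script. The sketch below is an illustration only: `delayBeforeNextSearch` is a hypothetical helper, and the interval and jitter values are left to the caller because this document does not state what request rate is safe.

```typescript
// Hypothetical helper: returns how long to wait (in ms) before the next search.
function delayBeforeNextSearch(
  lastSearchAt: number | null, // timestamp of the previous search, or null if none
  now: number, // current timestamp, e.g. Date.now()
  minIntervalMs: number, // minimum gap to keep between searches
  maxJitterMs: number, // upper bound for the random extra wait
  rng: () => number = Math.random
): number {
  if (lastSearchAt === null) return 0;
  const remaining = minIntervalMs - (now - lastSearchAt);
  // Enough time has already passed, so no wait is needed.
  if (remaining <= 0) return 0;
  // Add random jitter so requests are not evenly spaced.
  return remaining + Math.floor(rng() * maxJitterMs);
}
```

A caller could pass the result to `setTimeout` before launching the next `google-search` invocation.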

State Files

State files contain cookies and storage data; keep them secure.
Reusing a state file improves search success and reduces bot verification prompts.

MCP Server

Requires Node.js v16 or higher.
Use absolute paths when configuring the Claude Desktop MCP server.
Ensure Claude Desktop is updated.

Windows-Specific

May require admin rights to install Playwright browsers.
Run terminal as administrator if permission issues arise.
Allow Windows Firewall access for browsers.
Browser state saved in user home directory as .google-search-browser-state.json.
Logs stored in system temp directory under google-search-logs.

Comparison with Commercial SERP APIs

Advantages over paid APIs (e.g., SerpAPI):

Completely free, with no API fees.
Local execution with no dependency on third-party services.
Protects privacy: queries are not logged by an API provider.
Fully open source, customizable, and extensible.
No API-imposed usage limits or rate quotas (Google itself may still block overly frequent requests).
Native MCP integration for AI assistants like Claude.