google-search
by: web-agent-master
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.
Overview
Purpose: Provide a Playwright-based Node.js tool that runs Google searches directly and works around anti-scraping mechanisms.
Overview: This tool runs Google searches locally, either as a command-line interface or as a Model Context Protocol (MCP) server. It gives AI assistants real-time search results without the fees or quotas of an external search API.
Key Features:
- Local SERP API Alternative: Eliminates the need for paid API services by executing searches directly on the user's machine.
- Advanced Anti-Bot Detection Bypass Techniques: Employs intelligent methods like browser fingerprint management and state preservation to mimic real user behavior, minimizing the risk of being blocked by search engines.
- MCP Server Integration: Integrates smoothly with AI assistants such as Claude, providing them with immediate access to search functionalities without requiring additional API keys.
- Completely Open Source and Free: Fully customizable and extensible, allowing users to modify the code per their needs without any usage restrictions.
Google Search Tool
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches and extract results. It can be used as a command-line tool or as a Model Context Protocol (MCP) server, providing real-time search capabilities to AI assistants like Claude.
Key Features
- Local SERP API Alternative: Executes searches locally without relying on paid services.
- Advanced Anti-Bot Detection Bypass Techniques:
  - Intelligent browser fingerprint management
  - Automatic saving and restoring of browser state
  - Smart mode switching between headless and headed modes
  - Randomization of device and locale settings
- MCP Server Integration: Allows real-time search capabilities without additional API keys.
- Completely Open Source: Freely customizable and extensible code.
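The device and locale randomization listed above can be pictured with a small sketch. It is not the project's implementation: the function name `pickFingerprint`, the candidate lists, and the injectable random source are assumptions made for illustration. The device names follow the style of Playwright's built-in device descriptors.

```typescript
// Hypothetical sketch of randomized device/locale selection.
// The candidate lists are illustrative assumptions, not the tool's real configuration.
interface Fingerprint {
  deviceName: string;
  locale: string;
}

const DEVICE_NAMES: readonly string[] = [
  "Desktop Chrome",
  "Desktop Edge",
  "Desktop Firefox",
  "Desktop Safari",
];
const LOCALES: readonly string[] = ["en-US", "en-GB", "de-DE", "fr-FR"];

// Pick one element using a random source that returns a number in [0, 1).
function pickOne<T>(items: readonly T[], random: () => number): T {
  const index = Math.min(items.length - 1, Math.floor(random() * items.length));
  return items[index];
}

// Choose a device name and a locale independently of each other.
function pickFingerprint(random: () => number = Math.random): Fingerprint {
  return {
    deviceName: pickOne(DEVICE_NAMES, random),
    locale: pickOne(LOCALES, random),
  };
}

console.log(pickFingerprint());
```

In a Playwright script, values like these would typically be passed to `browser.newContext`, for example by spreading `devices[deviceName]` and setting `locale`. Injecting the random source keeps the selection testable.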
Technical Features
- Developed with TypeScript for type safety.
- Browser automation based on Playwright.
- Command-line parameter support.
- JSON format output with search results.
- Supports both headless and headed modes for debugging.
- Detailed logging and robust error handling.
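To make the JSON output feature concrete, here is a minimal sketch of how search results might be typed and serialized. The field names (`title`, `link`, `snippet`, `query`, `results`) and the helper `toJsonOutput` are assumptions for illustration; the tool's actual output schema may differ.

```typescript
// Hypothetical result shape; the field names are assumptions for illustration only.
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

interface SearchResponse {
  query: string;
  results: SearchResult[];
}

// Serialize a response as pretty-printed JSON, keeping at most `limit` results.
function toJsonOutput(response: SearchResponse, limit: number): string {
  const trimmed: SearchResponse = {
    query: response.query,
    results: response.results.slice(0, Math.max(0, limit)),
  };
  return JSON.stringify(trimmed, null, 2);
}

// Placeholder data, not real search output.
const sample: SearchResponse = {
  query: "search keywords",
  results: [
    { title: "Example Domain", link: "https://example.com/", snippet: "Placeholder result." },
    { title: "Example Two", link: "https://example.org/", snippet: "Another placeholder." },
  ],
};

console.log(toJsonOutput(sample, 1));
```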
Installation
git clone https://github.com/web-agent-master/google-search.git
cd google-search
npm install # Or use yarn or pnpm
npm run build # Compiles TypeScript code
npm link # Links package globally for MCP functionality
Windows Environment Notes
Special adaptations ensure proper functionality in Windows environments, including:
- .cmd files for command-line tools
- Log file storage in the system temporary directory
- Cross-platform file path handling
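The last two points can be sketched with Node.js built-ins. This is an illustration under assumptions, not the project's code: the function name `getLogFilePath`, the `google-search-logs` directory, and the default file name are hypothetical.

```typescript
import * as os from "os";
import * as path from "path";

// Build a log file path inside the OS temporary directory.
// The directory and file names are hypothetical, chosen for illustration.
function getLogFilePath(fileName: string = "google-search.log"): string {
  // path.join uses the separator of the current platform ("\" on Windows, "/" elsewhere).
  return path.join(os.tmpdir(), "google-search-logs", fileName);
}

console.log(getLogFilePath());
```

Relying on `os.tmpdir()` and `path.join` avoids hard-coding either `/tmp` or a Windows-style path.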
Usage
Command Line Tool
google-search "search keywords"
google-search --limit 5 --timeout 60000 --no-headless "search keywords"
Command Line Options
- -l, --limit <number>: Result count limit (default: 10).
- -t, --timeout <number>: Timeout in milliseconds (default: 60000).
- --no-headless: Show browser interface.
- -V, --version: Display version number.
- -h, --help: Display help information.
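When calling the command from another program, the options above map naturally onto an argument list. The sketch below is a hypothetical helper (`buildCliArgs` and `CliOptions` are not part of the project) that uses only the flags documented in this section.

```typescript
// Options mirror the documented flags; the interface itself is a hypothetical helper.
interface CliOptions {
  limit?: number;     // --limit <number>
  timeout?: number;   // --timeout <number>, in milliseconds
  headless?: boolean; // false adds --no-headless
}

// Build the argument list for the google-search command.
function buildCliArgs(query: string, options: CliOptions = {}): string[] {
  const args: string[] = [];
  if (options.limit !== undefined) args.push("--limit", String(options.limit));
  if (options.timeout !== undefined) args.push("--timeout", String(options.timeout));
  if (options.headless === false) args.push("--no-headless");
  args.push(query); // the query is the final positional argument
  return args;
}

const exampleArgs = buildCliArgs("search keywords", { limit: 5, timeout: 60000, headless: false });
console.log(exampleArgs);
```

Keeping the query as a single array element means it needs no shell quoting when the list is handed to a process-spawning API such as Node's `child_process.execFile`. On Windows, where the command is a `.cmd` wrapper, a shell may be required.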
MCP Server Integration
This project provides MCP server functionality for AI assistants. Build the project first:
pnpm build
Integration with Claude Desktop
- Edit the Claude Desktop configuration file to add the server configuration shown below.
- Restart Claude Desktop after modifying the configuration.
{
  "mcpServers": {
    "google-search": {
      "command": "npx",
      "args": ["google-search-mcp"]
    }
  }
}
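If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The following sketch shows one way to do that on an in-memory object. The types and the function `addGoogleSearchServer` are hypothetical; only the `google-search` entry itself comes from the configuration above.

```typescript
// Minimal shape for the part of the Claude Desktop config shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

// Return a copy of the config with the google-search entry added, keeping other servers.
function addGoogleSearchServer(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "google-search": { command: "npx", args: ["google-search-mcp"] },
    },
  };
}

// "other" is a made-up existing server used only to show that it survives the merge.
const existingConfig: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "node", args: ["other-server.js"] } },
};
console.log(JSON.stringify(addGoogleSearchServer(existingConfig), null, 2));
```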
Project Structure
google-search/
├── package.json # Project configuration
├── tsconfig.json # TypeScript configuration
├── src/
│   ├── index.ts # Entry file
│   ├── search.ts # Search functionality implementation
│   ├── mcp-server.ts # MCP server implementation
│   └── types.ts # Type definitions
├── dist/ # Compiled JavaScript files
├── bin/ # Executable files
├── README.md # Project documentation
└── .gitignore # Git ignore file
Technology Stack
- TypeScript: Development language.
- Node.js: Runtime environment.
- Playwright: Browser automation.
- Commander: Command line argument parsing.
- Model Context Protocol (MCP): For assistant integration.
Development Guide
Run commands in the project root directory:
pnpm install # Install dependencies
pnpm run postinstall # Install Playwright browsers
pnpm build # Compile TypeScript code
Error Handling
Built-in mechanisms provide friendly error messages and useful information during failures.
Notes
- This tool is intended for learning and research purposes.
- Adhere to Google's terms of service.
- Proxies may be required in some regions for Google access.
- State files containing browser data should be secured.
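One way to act on the last note is to restrict the state file's permissions when writing it. The sketch below is an assumption about how this could be done, not a description of the tool's behavior: `writeStateFile` and the file name are hypothetical, and POSIX permission bits are largely ignored on Windows.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Write a browser state file that only the owner can read or write.
// Hypothetical helper; the tool's real state handling may differ.
function writeStateFile(filePath: string, state: unknown): void {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(state), { mode: 0o600 });
  // writeFileSync applies `mode` only when it creates the file, so tighten existing files too.
  fs.chmodSync(filePath, 0o600);
}

// Demo: write a placeholder state into a fresh temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "state-demo-"));
const demoPath = path.join(demoDir, "browser-state.json");
writeStateFile(demoPath, { cookies: [] });
console.log(`state written to ${demoPath}`);
```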
Comparison with Commercial SERP APIs
This project offers several advantages over paid search APIs:
- Completely Free: No API fees.
- Local Execution: No third-party dependency.
- Privacy Protection: No query logging by third parties.
- Customizability: Fully open source.
- No Usage Limits: No restrictions on API calls.