In an era of information overload, quickly accessing accurate search results has become the foundation for many work and research tasks. However, traditional methods of obtaining search engine results often face limitations—either they depend on paid APIs or struggle with anti-scraping mechanisms. The tool we’ll explore today solves these problems: it’s a Node.js tool built on Playwright that enables local Google searches, bypasses anti-scraping restrictions, and even provides real-time search capabilities for AI assistants.
What Problems Does This Tool Solve?
If you frequently need to retrieve Google search results in bulk, you’ve likely encountered these frustrations: paid SERP (Search Engine Results Page) APIs are costly and have call limits; custom-built scrapers get detected repeatedly, leading to temporary account bans; and finding a suitable tool to help AI assistants access real-time information feels impossible.
This Google search tool is designed specifically to address these issues. It runs entirely locally, no third-party API services required. By simulating real user behavior, it avoids anti-scraping detection. Plus, it integrates seamlessly with AI assistants like Claude, giving AI the ability to search in real time.
Core Features: Why It’s Worth Trying
1. Local Alternative to Paid SERP APIs
You no longer need to pay for search engine result APIs or worry about hitting call limits. All search operations happen on your local device, so data acquisition costs are nearly zero—and you have full control over the process.
2. Intelligent Bypass of Anti-Robot Detection
Google’s anti-scraping mechanisms are growing more sophisticated, and ordinary automation tools get flagged easily. This tool uses a combination of techniques to counter this:
- Dynamically manages browser fingerprints, making each request appear to come from a real user on a different device
- Automatically saves and restores browser state (e.g., cookies, login data) to reduce repeated verifications
- Switches from headless mode to headed mode automatically when verification pages appear, making it easy to complete manual checks
- Randomizes device models, region settings, and other parameters to lower the risk of being labeled a robot
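The randomization idea in the list above can be sketched in a few lines. This is a hypothetical illustration, not the tool's actual code: the arrays and the `pickFingerprint()` helper are invented for the example, and a real implementation would feed values like these into Playwright's browser context options.

```typescript
// Illustrative sketch of per-session fingerprint randomization.
// The arrays and pickFingerprint() helper are hypothetical, not the tool's code.
interface Fingerprint {
  deviceName: string;
  locale: string;
  timezoneId: string;
}

const DEVICES = ["Desktop Chrome", "Desktop Firefox", "Desktop Safari"];
const LOCALES = ["en-US", "en-GB", "de-DE"];
const TIMEZONES = ["America/New_York", "Europe/London", "Europe/Berlin"];

// Pick a random element from a list.
function pick<T>(items: T[]): T {
  return items[Math.floor(Math.random() * items.length)];
}

// Assemble a randomized fingerprint so consecutive runs look like different users.
function pickFingerprint(): Fingerprint {
  return {
    deviceName: pick(DEVICES),
    locale: pick(LOCALES),
    timezoneId: pick(TIMEZONES),
  };
}

const fp = pickFingerprint();
console.log(fp.deviceName, fp.locale, fp.timezoneId);
```

Because each session draws fresh values, two consecutive runs rarely present the same combination of device, locale, and timezone.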
3. Comprehensive Result Processing Capabilities
Beyond extracting structured data like titles, links, and snippets, it also:
- Retrieves the raw HTML of search results pages (automatically removing CSS and JavaScript for easier analysis)
- Captures full-page screenshots automatically to preserve visual records
- Outputs results in JSON format for easy post-processing
4. Seamless Integration with AI Assistants
Through a Model Context Protocol (MCP) server, it can directly provide real-time search capabilities to AI assistants like Claude. This means when AI needs up-to-date information, you won’t have to search manually—it will call this tool automatically to get results.
5. Fully Open-Source and Free
All code is transparent and accessible. You can modify features or extend compatibility to other search engines based on your needs, with no usage restrictions.
Technical Specifications: How It Works
This tool is built on a modern tech stack that balances stability and scalability:
- Programming Language: TypeScript, which provides type safety to reduce code errors
- Browser Automation: Powered by Playwright, supporting multiple browser engines (Chromium, Firefox, WebKit)
- Command-Line Support: Search keywords and custom parameters can be entered directly via the command line
- Output Format: Defaults to JSON, including key information like search queries, titles, links, and snippets
- Operating Modes: Supports headless mode (runs in the background) and headed mode (displays the browser interface for debugging)
- Logging System: Provides detailed logs to simplify troubleshooting
- State Management: Saves and restores browser state to minimize interception by anti-scraping mechanisms
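The state-management idea above can be sketched with plain Node.js file I/O. The real tool persists full browser state through Playwright; the file format, path, and helper names below are assumptions made for illustration only.

```typescript
// Sketch of the state-file idea: persist session data between runs so
// verifications aren't repeated. File format and names are illustrative only;
// the actual tool saves Playwright browser state, not this simplified shape.
import { writeFileSync, readFileSync, existsSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface BrowserState {
  cookies: { name: string; value: string }[];
}

const statePath = join(tmpdir(), "google-search-state-demo.json");

// Save the current session state to disk.
function saveState(state: BrowserState): void {
  writeFileSync(statePath, JSON.stringify(state));
}

// Restore a previously saved state, or return null on first run.
function loadState(): BrowserState | null {
  if (!existsSync(statePath)) return null;
  return JSON.parse(readFileSync(statePath, "utf8")) as BrowserState;
}

saveState({ cookies: [{ name: "session", value: "example" }] });
const restored = loadState();
console.log(restored?.cookies[0].name);
```

Reusing such a file across runs is what lets the tool skip repeated verification checks.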
Installation Guide: How to Deploy It on Your Device
Whether you use Windows, Mac, or Linux, follow these steps to install the tool:
Basic Installation Steps
1. Clone the Repository
Open your terminal (Command Prompt or PowerShell for Windows), and run the following commands:
git clone https://github.com/web-agent-master/google-search.git
cd google-search
2. Install Dependencies
Three package managers are supported: npm, yarn, and pnpm. Choose the one you use regularly:
# Using npm
npm install
# Or using yarn
yarn
# Or using pnpm (recommended for efficiency)
pnpm install
3. Compile TypeScript Code
The tool is written in TypeScript, so it must be compiled to JavaScript first:
# Using npm
npm run build
# Or using yarn
yarn build
# Or using pnpm
pnpm build
4. Link Globally (Optional but Recommended)
To use the command-line tool from any directory, run the link command:
# Using npm
npm link
# Or using yarn
yarn link
# Or using pnpm
pnpm link
Special Notes for Windows Users
Windows users don’t need to worry about compatibility—this tool includes Windows-specific optimizations:
- Provides .cmd files to ensure normal operation in Command Prompt and PowerShell
- Automatically stores log files in the system's temporary directory (instead of the /tmp directory used on Linux)
- Optimizes process signal handling so the server shuts down properly
- Supports Windows path separators (\); no manual path conversion required
If you encounter browser installation failures on first run, try launching the terminal as an administrator and re-running the installation commands.
Usage Guide: From Basic Searches to Advanced Features
Using It as a Command-Line Tool
The most straightforward way to use the tool is via the command line. Here’s the basic syntax:
# Simple search
google-search "your search keyword"
# Example: Search for "latest artificial intelligence research"
google-search "latest artificial intelligence research"
Customizing Search Parameters
To adjust result limits, timeout duration, and other settings, pass options alongside the query. The options referenced in this guide include:
- --limit: caps the number of results returned
- --timeout: sets the page-load timeout
- --no-headless: displays the browser window (useful for debugging)
- --state-file: specifies where the browser state file is saved
- --get-html: returns the raw HTML of the results page instead of parsed results
- --save-html: saves the page HTML (and a screenshot) to disk
Development and Debugging Modes
If you need to modify the code or troubleshoot issues, these commands are useful:
# Run in development mode (real-time compilation)
pnpm dev "search keyword"
# Debug mode (displays browser interface to observe operations)
pnpm debug "search keyword"
# Run temporarily with npx (no global installation needed)
npx google-search-cli "search keyword"
Example Output
By default, the tool returns structured results in JSON format:
{
"query": "deepseek",
"results": [
{
"title": "DeepSeek",
"link": "https://www.deepseek.com/",
"snippet": "DeepSeek-R1 is now live and open source, rivaling OpenAI's Model o1. Available on web, app, and API. Click for details. Into ..."
},
{
"title": "deepseek-ai/DeepSeek-V3",
"link": "https://github.com/deepseek-ai/DeepSeek-V3",
"snippet": "We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token."
}
// More results...
]
}
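The JSON structure above can be consumed with a few lines of TypeScript. The interfaces below simply mirror the documented fields; the sample string is abbreviated from the example output:

```typescript
// Interfaces mirroring the tool's documented JSON output.
interface SearchResult {
  title: string;
  link: string;
  snippet: string;
}

interface SearchResponse {
  query: string;
  results: SearchResult[];
}

// Abbreviated sample matching the documented output format.
const raw = JSON.stringify({
  query: "deepseek",
  results: [
    { title: "DeepSeek", link: "https://www.deepseek.com/", snippet: "DeepSeek-R1 is now live..." },
    { title: "deepseek-ai/DeepSeek-V3", link: "https://github.com/deepseek-ai/DeepSeek-V3", snippet: "We present DeepSeek-V3..." },
  ],
});

const response: SearchResponse = JSON.parse(raw);
// Extract just the links for downstream processing.
const links = response.results.map((r) => r.link);
console.log(links);
```

Piping the tool's stdout into a script like this is all the post-processing most workflows need.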
When using the --get-html parameter, it returns HTML-related information:
{
"query": "playwright automation",
"url": "https://www.google.com/",
"originalHtmlLength": 1291733,
"cleanedHtmlLength": 456789,
"htmlPreview": "<!DOCTYPE html><html itemscope=\"\" itemtype=\"http://schema.org/SearchResultsPage\" lang=\"en\"><head>..."
}
When combined with --save-html, it also shows the save path and screenshot path:
{
"query": "playwright automation",
"url": "https://www.google.com/",
"originalHtmlLength": 1292241,
"cleanedHtmlLength": 458976,
"savedPath": "./google-search-html/playwright_automation-2025-04-06T03-30-06-852Z.html",
"screenshotPath": "./google-search-html/playwright_automation-2025-04-06T03-30-06-852Z.png",
"htmlPreview": "<!DOCTYPE html><html itemscope=\"\" itemtype=\"http://schema.org/SearchResultsPage\" lang=\"en\">..."
}
Using It as an MCP Server to Provide Search Capabilities for AI Assistants
Through the Model Context Protocol (MCP), this tool can enable AI assistants like Claude to perform Google searches directly. Below are the steps to integrate it with Claude Desktop:
Prerequisites
First, ensure you’ve completed the project build:
pnpm build
Configuring Claude Desktop
1. Locate the Configuration File
- Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json (you can enter %APPDATA%\Claude directly in the File Explorer address bar to access this folder)
2. Add Server Configuration
Open the configuration file, add the following content (choose the option that matches your system), then restart Claude:
Option 1: General Configuration (Recommended for Mac/Linux)
{
  "mcpServers": {
    "google-search": {
      "command": "npx",
      "args": ["google-search-mcp"]
    }
  }
}
Option 2: Windows cmd.exe Configuration
{
  "mcpServers": {
    "google-search": {
      "command": "cmd.exe",
      "args": ["/c", "npx", "google-search-mcp"]
    }
  }
}
Option 3: Direct Node Call for Windows (Recommended for Better Compatibility)
{
  "mcpServers": {
    "google-search": {
      "command": "node",
      "args": ["C:/your-installation-path/google-search/dist/mcp-server.js"]
    }
  }
}
Note: Replace C:/your-installation-path with the actual directory where you installed the tool.
After configuration, enter a command like “Search for 2024 AI industry reports” in Claude. The AI will automatically call this tool to retrieve the latest results.
Project Structure: Understanding the Tool’s Components
The tool has a clear code structure, making it easy to understand and modify for secondary development:
google-search/
├── package.json # Project configuration and dependency management
├── tsconfig.json # TypeScript compilation configuration
├── src/
│ ├── index.ts # Command-line parsing and main logic entry
│ ├── search.ts # Core search functionality (Powered by Playwright)
│ ├── mcp-server.ts # MCP server implementation code
│ └── types.ts # Type definitions (ensures code type safety)
├── dist/ # Compiled JavaScript files
├── bin/ # Executable scripts (command-line entry)
└── README.md # Project documentation
Tech Stack Breakdown: The Core Technologies Behind It
The tool relies on mature technologies, each playing a critical role:
- TypeScript: Provides static type checking to reduce runtime errors and improve code maintainability
- Node.js: Serves as the runtime environment, enabling cross-platform operation on local devices
- Playwright: Handles browser automation, simulates user actions, and supports multiple browser engines
- Commander: Parses command-line parameters and processes user input options (e.g., --limit, --timeout)
- Model Context Protocol (MCP): Implements the communication protocol with AI assistants, allowing the tool to be called by AI
- MCP SDK: Simplifies MCP server development and enables quick integration with AI assistants
- Zod: Validates data to ensure input/output formats meet expectations, improving tool stability
- pnpm: An efficient package manager that saves disk space and speeds up dependency installation
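As a rough illustration of the flag parsing that Commander handles for this tool, here is a hand-rolled sketch. The function and its defaults are hypothetical, invented for the example; the actual tool delegates this to Commander:

```typescript
// Hand-rolled sketch of CLI flag parsing (the real tool uses Commander).
// Defaults are hypothetical, not the tool's actual values.
interface CliOptions {
  query?: string;
  limit: number;
  timeout: number;
}

function parseArgs(argv: string[]): CliOptions {
  const opts: CliOptions = { limit: 10, timeout: 30000 }; // hypothetical defaults
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--limit") opts.limit = Number(argv[++i]);
    else if (arg === "--timeout") opts.timeout = Number(argv[++i]);
    else opts.query = arg; // non-flag token is treated as the search query
  }
  return opts;
}

const parsed = parseArgs(["playwright automation", "--limit", "5"]);
console.log(parsed);
```

Commander additionally generates help text and validates inputs, which is why the tool uses it instead of a loop like this.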
Development Guide: How to Modify and Extend the Tool
If you have development experience and want to customize the tool to your needs, refer to these common commands:
Basic Development Commands
# Install dependencies (run after first cloning the project)
pnpm install
# Install Playwright browsers (required for automation)
pnpm run postinstall
# Compile TypeScript code (run after making modifications)
pnpm build
# Clean compiled output (run before recompiling if needed)
pnpm clean
Debugging and Testing
# Run in development mode (auto-recompiles when code changes)
pnpm dev "search keyword"
# Debug mode (displays browser interface to observe operations)
pnpm debug "search keyword"
# Run compiled code (to verify final results)
pnpm start "search keyword"
# Run tests (to ensure functionality works as expected)
pnpm test
MCP Server Development
# Run MCP server in development mode (supports hot updates)
pnpm mcp
# Run compiled MCP server (for production use)
pnpm mcp:build
Error Handling: What to Do When Issues Arise
The tool includes a comprehensive error-handling system that provides clear prompts for common problems:
- If the browser fails to launch, it shows the specific cause (e.g., port in use, browser not installed)
- If the network connection is interrupted, it prompts you to check your network status and retries automatically
- If search result parsing fails, it logs detailed records (including raw HTML) to simplify troubleshooting of page structure changes
- In case of timeouts, it exits gracefully and suggests possible causes (e.g., slow network, delayed verification page handling)
If you’re frequently blocked by Google, try these solutions:
- Reduce request frequency to simulate the search intervals of a real user
- Enable state files (on by default) and specify the save path with --state-file
- Avoid fixed device parameters; let the tool randomize configurations automatically
Important Notes: Must-Read Before Use
Compliance and Usage Guidelines
- This tool is for learning and research purposes only. When using it, comply with Google's Terms of Service and local laws and regulations
- Do not send requests too frequently, as this may overload Google's servers and lead to account or IP restrictions
- Accessing Google may require a proxy in some regions. The tool itself does not provide proxy functionality; you need to configure this separately
State File Management
- State files contain browser data like cookies and local storage, which are critical for bypassing anti-scraping measures
- Keep state files secure and do not share them with others (they may contain personal login information)
- If you encounter persistent verification issues, try deleting the state file so the tool rebuilds the browser environment
System Requirements
- Node.js version 16 or higher (version 18+ recommended for better compatibility)
- The first run automatically downloads browsers (several hundred MB); ensure a stable internet connection
- Minimum configuration: 2 GB RAM and a modern 64-bit CPU
Comparison with Commercial SERP APIs: Why Choose a Local Tool?
In short, if you need to retrieve search results frequently over the long term, or have requirements for data privacy and customization, this local tool is the better choice.
Frequently Asked Questions (FAQ)
Do I need a VPN to use this tool?
Yes, since it requires access to Google Search, your environment must support connecting to Google services. The tool itself does not include proxy functionality—you need to configure a working proxy in advance.
Why does a browser window pop up during a search?
There are two possible reasons: either you used the --no-headless parameter (debug mode), or you encountered a Google verification page (e.g., CAPTCHA). The tool automatically switches to headed mode so you can complete the verification manually, after which it continues the search.
Can I use it to crawl large volumes of results in bulk?
Not recommended. While the tool can bypass some anti-scraping mechanisms, Google has strict limits on high-frequency requests. Sending a large number of requests frequently may lead to temporary IP blocks. It’s best to control search frequency and wait a few minutes between searches.
What should I do if Windows says “command not found”?
This may happen if you didn't run npm link (or yarn link / pnpm link), or if the system environment variables haven't been updated. Solutions:
- Re-run the link command (requires administrator privileges)
- Call the tool directly using the relative path: node ./dist/index.js "search keyword"
How do I update the tool to the latest version?
Navigate to the project directory and run these commands:
git pull
pnpm install
pnpm build
Can it retrieve results from search engines other than Google?
Currently, the tool only supports Google Search. However, since the code is open-source, you can modify the URL and result parsing rules in src/search.ts to adapt it to other search engines (e.g., Bing, Baidu).
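As a rough sketch of such an adaptation, the engine-specific parts could be factored into a config object. The URLs are real search endpoints, but the selector names and the overall structure are illustrative assumptions, not the actual contents of src/search.ts:

```typescript
// Hypothetical refactor: engine-specific settings pulled into one config map
// so another engine can be plugged in. Selector names are illustrative.
interface EngineConfig {
  searchUrl: string;
  resultSelector: string; // CSS selector for one result item (assumed)
}

const ENGINES: Record<string, EngineConfig> = {
  google: { searchUrl: "https://www.google.com/search?q=", resultSelector: "div.g" },
  bing: { searchUrl: "https://www.bing.com/search?q=", resultSelector: "li.b_algo" },
};

// Build the search URL for a given engine and query.
function buildSearchUrl(engine: string, query: string): string {
  const cfg = ENGINES[engine];
  if (!cfg) throw new Error(`Unsupported engine: ${engine}`);
  return cfg.searchUrl + encodeURIComponent(query);
}

console.log(buildSearchUrl("bing", "playwright automation"));
```

With a structure like this, supporting a new engine mostly means adding an entry to the map and writing a matching result parser.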
This tool provides a flexible, low-cost solution for users who need to retrieve Google search results frequently. Whether used as a standalone command-line tool or integrated with AI assistants to enhance real-time information access, it meets basic needs. If you have development skills, you can further extend its functionality by modifying the code to better fit your specific use case.