How I Automated SEO Data Analysis with a Local AI Agent: A Step-by-Step Guide
The Core Question This Article Answers: As an indie developer managing multiple websites, how can you break free from tedious manual data exports and build a fully automated SEO data analysis system using a local AI Agent?
As an indie developer managing multiple websites, I used to be trapped in a mechanical loop every week: logging into Google Search Console, Google Analytics, and Bing Webmaster Tools, exporting data, copying and pasting it into spreadsheets, and then staring blankly at rows of numbers. This repetitive drudgery not only consumed time but also dulled my sensitivity to the data itself.
Until one day, I realized: I have a locally running AI Agent (OpenClaw), so why not let it handle this repetitive labor? So, I spent an afternoon building a fully localized SEO data query system. Now, I just say one sentence to the AI, and it automatically analyzes the SEO performance of all my websites and provides actionable optimization suggestions.
The Ideal Workflow: From Manual Labor to Intelligent Dialogue
The Core Question This Section Answers: How does this automated system operate in a real-world scenario, and what kind of efficiency gains can it deliver?
My morning workflow has become incredibly relaxed. I wake up, brew a cup of coffee, and casually ask the AI: “Check my SEO performance for the last week.” In just 30 seconds, the AI delivers a complete analysis report covering core data from Google Search Console, Google Analytics, and Bing Webmaster.
For example, the AI might tell me that total clicks increased by 12% and impressions rose by 8%. It would also keenly identify that while a specific keyword has 3,200 impressions, the click-through rate (CTR) is only 1.2%. Based on this, the AI directly suggests: “Optimize the meta description to highlight ‘Free Online Solver’.”
No logging into backends, no exporting CSVs, no manual data comparison. The AI completes all the analysis work automatically, even discovering optimization opportunities I might have overlooked. This is the workflow I want: tools serving people, not people serving tools.
Technical Architecture: The Triumph of Minimalism
The Core Question This Section Answers: What is the technical architecture behind this system, and why was this specific approach chosen?
The design philosophy of the entire system is minimalism: no cloud services, no databases, and no complex middleware. For indie developers, lower maintenance costs equal higher value.
The workflow is simple and direct:

1. The AI Agent receives my natural language request.
2. The Agent calls a local Node.js script via the exec tool.
3. The script calls the official Google or Bing APIs directly.
4. The API returns plain-text results to the AI.
5. The AI analyzes the data and provides recommendations.
All code resides on a local Mac mini, and key files are stored locally, ensuring zero reliance on third-party services. This guarantees data security and privacy. The file structure is clear and concise, divided into a scripts directory and a credentials directory:
```
~/.openclaw/workspace/
├── scripts/
│   ├── gsc-report.cjs              # Google Search Console query
│   ├── ga4-report.cjs              # Google Analytics 4 query
│   └── bing-report.cjs             # Bing Webmaster query
└── credentials/
    ├── gcp-service-account.json    # Google service account key
    └── bing-webmaster-api-key.txt  # Bing API key
```
Three script files and two key files constitute the core of the system.
Configuration Guide: Building a Local SEO Automation System from Scratch
The Core Question This Section Answers: What are the specific installation and configuration steps, and what are the critical details for each?
The entire configuration process takes about 60 minutes, with most time spent on various authorizations in the Google Cloud Console. To facilitate operations, I have broken it down into 6 clear steps.
Step 1: Google Cloud Console Configuration (15 mins)
Google API access requires a Service Account. This is a “robot account” designed for programmatic access, distinct from human user logins.
1. **Create a project:** Open the Google Cloud Console and create a new project (e.g., named `openclaw-seo`).
2. **Create a service account:** Navigate to "IAM & Admin" -> "Service Accounts" and click "Create Service Account". The name is arbitrary; roles can be skipped for now, since permissions are granted separately on each platform later.
3. **Generate a key:** Once created, click the service account -> "Keys" -> "Add Key" -> "Create new key" and select the "JSON" format. Download the JSON file and save it to `~/.openclaw/workspace/credentials/gcp-service-account.json`.
4. **Enable APIs:** Go to "APIs & Services" -> "Library", then search for and enable both the "Google Search Console API" and the "Google Analytics Data API". Both must be enabled, or the scripts will return errors during execution.
Critical detail: the downloaded JSON file contains a `client_email` field, which looks something like `openclaw-seo-reader@openclaw-seo.iam.gserviceaccount.com`. Note this address down; you will need it for GSC and GA4 authorization later.
Step 2: Google Search Console Authorization (10 mins)
There is a common point of confusion here: GSC permissions are not set in Google Cloud, but on the Search Console website itself.
1. Open Google Search Console.
2. Select your site (note: this must be done for each site individually).
3. In the left navigation bar, select "Settings" -> "Users and permissions" -> "Add user".
4. Enter the `client_email` noted earlier.
5. Select "Restricted" permissions, as the script only needs read access.
Insight & Lesson Learned: If you have 10 sites, you must repeat this operation 10 times. There is currently no method for global batch authorization. While tedious, this is a necessary step for data security.
Step 3: Google Analytics 4 Authorization (10 mins)
The GA4 authorization process is similar to GSC, but there is a massive pitfall to avoid.
1. Open Google Analytics.
2. Click "Admin" in the bottom left and select the target property.
3. In the "Property" column, find "Property Access Management".
4. Click the "+" in the top right -> "Add users", enter the `client_email`, and select the "Viewer" role.
Major pitfall: the GA4 Property ID and the Data Stream ID are two completely different things!

- **Property ID:** found under "Property" -> "Property details". It is a string of pure digits, e.g., 489888837.
- **Data Stream ID:** found under "Data Streams". It is also numeric, which makes the two easy to confuse.

The script requires the Property ID, formatted as `properties/489888837`. On my first attempt, I mistakenly used the Data Stream ID and spent half a day debugging script errors. Double-check which ID you are using here.
Step 4: Bing Webmaster Tools Configuration (5 mins)
Compared to Google’s complex authorization, Bing’s method is straightforward. It doesn’t require a Service Account, just an API Key.
1. Open Bing Webmaster Tools.
2. Click "Settings" (the gear icon) in the top right -> "API access".
3. Click "Generate API Key".
4. Copy the generated API key and save it to `~/.openclaw/workspace/credentials/bing-webmaster-api-key.txt`. The file should contain just the plain-text key.
Step 5: Installing Dependencies & Avoiding Pitfalls (5 mins)
Initialize npm in the scripts directory and install the necessary dependency libraries.
```
cd ~/.openclaw/workspace/scripts
npm init -y
npm install googleapis google-auth-library
```
Core technical pitfall: do not add `"type": "module"` to your `package.json`!

If your project is configured for ES Modules, Node.js will treat all `.js` files as ES modules, causing a "require is not defined" error when using CommonJS `require` syntax.

Solution: name your script files with the `.cjs` (CommonJS) suffix instead of `.js`. This forces Node.js to use CommonJS mode regardless of the `package.json` configuration, which is why my scripts are named `gsc-report.cjs`.
Step 6: Script Implementation & Core Logic (15 mins)
This is the core part where we write three query scripts to connect to the data sources.
GSC Script: gsc-report.cjs
This script queries keyword rankings, clicks, impressions, CTR, and top pages.
Core logic:
- Use the `googleapis` library to connect to the Google Search Console API.
- Read the local service account JSON key for authentication.
- Configure the site list. Common pitfall: the resource name format must match Search Console exactly:
  - Domain property: `sc-domain:example.com`.
  - URL-prefix property: `https://example.com/` (note the trailing slash).
- Call the `searchanalytics.query` method to query the data.
Insight: if the format is incorrect, the API returns a "403 User does not have sufficient permission" error. This message is misleading: you might assume the permissions are wrong, when the real problem is the resource name. I recommend calling `searchconsole.sites.list()` first to list all authorized sites and confirm the correct format.
GA4 Script: ga4-report.cjs
This script queries users, sessions, page views, traffic sources, etc.
Core logic:
- Use `google.analyticsdata` to connect to the GA4 API.
- Configure the Property ID (again: pure numbers).
- Call the `properties.runReport` method, setting date ranges and dimensions (e.g., `sessionDefaultChannelGroup`, `country`, `pagePath`).
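The `runReport` parameters are just a JSON structure, so they are worth sketching separately from the network call. A sketch assuming a trailing-days window; `buildRunReportRequest` is my own helper name, and the property ID is the article's example:

```javascript
// ga4-request.cjs — assemble the parameters for properties.runReport.
// `buildRunReportRequest` is a hypothetical helper, not part of the API.
function buildRunReportRequest(propertyId, days) {
  return {
    property: `properties/${propertyId}`, // Property ID: pure digits
    dateRanges: [{ startDate: `${days}daysAgo`, endDate: 'today' }],
    dimensions: [
      { name: 'sessionDefaultChannelGroup' },
      { name: 'country' },
      { name: 'pagePath' },
    ],
    metrics: [
      { name: 'totalUsers' },
      { name: 'sessions' },
      { name: 'screenPageViews' },
    ],
  };
}

const req = buildRunReportRequest('489888837', 7);
console.log(JSON.stringify(req, null, 2));
```

If you accidentally pass a Data Stream ID here, the API rejects the request, which is exactly the pitfall from Step 3.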
Bing Script: bing-report.cjs
The Bing API doesn’t need OAuth; it uses direct HTTPS requests, making the code the simplest.
Core logic:
- Use the native Node.js `https` module.
- Read the local API key file.
- Call the Bing Webmaster API endpoints (e.g., `GetRankAndTrafficStats`, `GetQueryStats`).
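Since the Bing calls are plain HTTPS GETs, most of the script is just URL assembly. A sketch; the base URL reflects Bing's JSON endpoint as I understand it (verify against the current Bing Webmaster API docs), and the key below is a placeholder:

```javascript
// bing-url.cjs — assemble a Bing Webmaster API request URL.
// Base URL is an assumption from Bing's JSON API; check current docs.
const BASE = 'https://ssl.bing.com/webmaster/api.svc/json';

function bingApiUrl(method, apiKey, siteUrl) {
  // URLSearchParams handles the percent-encoding of siteUrl for us.
  const params = new URLSearchParams({ apikey: apiKey, siteUrl });
  return `${BASE}/${method}?${params}`;
}

// 'test-key' is a placeholder, not a real credential.
console.log(bingApiUrl('GetQueryStats', 'test-key', 'https://example.com'));
```

In `bing-report.cjs`, this URL would be fed to `https.get` with the key read from `bing-webmaster-api-key.txt`.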
Making the AI Agent Understand and Execute Tasks
The Core Question This Section Answers: Once scripts are configured, how do we make the AI Agent truly “understand” this data and provide value?
Once configured, the AI Agent can directly call these scripts via the exec tool. But it’s not just about executing commands; it’s about the AI’s secondary processing of the data.
When I say: “Check the SEO performance for the slitherlinks site for the last 7 days,” the AI internally executes the following:
1. Call exec: `node ~/.openclaw/workspace/scripts/gsc-report.cjs slitherlinks`.
2. Call exec: `node ~/.openclaw/workspace/scripts/ga4-report.cjs slitherlinks 7`.
3. After receiving the plain-text data, the AI synthesizes an analysis.
The AI can do more than just display data; it can:
- **Compare changes:** analyze week-over-week data fluctuations.
- **Discover opportunities:** identify keywords with low CTR but high impressions; these are "low-hanging fruit" for optimization.
- **Anomaly alerts:** detect pages with traffic drops and alert immediately.
Through this combination of “Script + AI Analysis,” cold numbers transform into actionable advice.
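The "low CTR, high impressions" heuristic above is simple enough to express in code, so the agent does not have to rediscover it from raw numbers every time. A sketch with illustrative thresholds (1,000 impressions, 3% CTR) and fabricated sample rows:

```javascript
// opportunities.cjs — flag keywords with high impressions but low CTR.
// Thresholds are illustrative defaults; tune them per site.
function findOpportunities(rows, { minImpressions = 1000, maxCtr = 0.03 } = {}) {
  return rows
    .filter((r) => r.impressions >= minImpressions && r.ctr <= maxCtr)
    .sort((a, b) => b.impressions - a.impressions); // biggest upside first
}

// Sample data shaped like GSC searchanalytics rows (fabricated numbers).
const rows = [
  { query: 'slitherlink solver', impressions: 3200, ctr: 0.012 },
  { query: 'logic puzzles', impressions: 500, ctr: 0.05 },
  { query: 'free online solver', impressions: 1500, ctr: 0.08 },
];
console.log(findOpportunities(rows).map((r) => r.query));
```

Here only the first row qualifies: it has plenty of impressions but a CTR well under the threshold, which is precisely the meta-description-rewrite candidate from the opening example.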
Pitfalls Record: Headaches and Solutions
The Core Question This Section Answers: What are the most likely technical obstacles during implementation, and how can they be quickly located and resolved?
I encountered several issues during configuration. These pitfalls were rarely due to complex code but rather inconsistencies in platform rules.
1. **`require` is not defined**
   - Cause: `package.json` was set to `"type": "module"`, causing a Node.js module-resolution conflict.
   - Solution: change the file suffix to `.cjs` to force CommonJS mode.
2. **403 insufficient permission**
   - Cause: incorrect GSC resource name format, not an actual permission issue.
   - Solution: call the API's site-list method to confirm whether the resource name uses the `sc-domain:` format or the `https://` format.
3. **GA4 Property ID error**
   - Cause: confusing the "Property ID" with the "Data Stream ID".
   - Solution: go to the "Property details" page in GA4 and confirm the pure numeric ID.
4. **API not enabled**
   - Cause: forgot to enable the corresponding API in the Google Cloud Console.
   - Solution: go to the Library and enable the Google Search Console API and the Google Analytics Data API.
5. **New site has no data**
   - Cause: forgot to authorize the service account for the new site.
   - Solution: manually add the service account email in the GSC/GA4 backend for every new site.
Personal reflection: the most frustrating issue was the GSC resource name format. I spent an hour debugging and combing through permission documentation, only to find the problem was the difference between `sc-domain:` and `https://`. The lesson: when debugging APIs, checking parameter formats first is often more efficient than auditing permission logic.
Security & Advanced Usage: Building a Robust System
Although the system is fully localized, security awareness cannot be relaxed.
- **Key management:** the `credentials/` directory must be in `.gitignore`. Never upload it to GitHub.
- **Principle of least privilege:** the service account has read-only permissions, so even if the key leaks, attackers cannot modify your website data.
- **File permissions:** restrict access to the key files with `chmod 600` so that only the current user can read and write them.
If you want to further automate, you can use cron to set up scheduled tasks:
```
crontab -e
# Add the following line to run automatically at 8 AM daily
0 8 * * * node /Users/yourname/.openclaw/workspace/scripts/gsc-report.cjs >> /tmp/seo-report.log 2>&1
```
Combined with the AI Agent’s active inquiry features, you can even have it proactively report yesterday’s traffic anomalies every morning.
Conclusion
Setting up the entire process takes about an hour. Once configured, you possess a fully automated SEO data analysis system. This is not just an efficiency tool; it is a shift in working style: from data porter to data decision-maker.
For indie developers, time is the most precious resource. Rather than jumping between multiple backends, spend a little time building your own automation system and let AI become your SEO analyst.
Practical Summary / Checklist
Preparation
- [ ] Confirm Node.js is installed locally.
- [ ] Have a Google account and a Bing Webmaster account ready.
- [ ] Create the local working directory `~/.openclaw/workspace/`.
Key Configuration Steps
1. **Google Cloud:** create a service account, download the JSON key, and enable the GSC and GA Data APIs.
2. **GSC authorization:** add the service account email to the user permissions of each site (permission: Restricted).
3. **GA4 authorization:** add the service account email to the property permissions (permission: Viewer) and record the Property ID.
4. **Bing configuration:** generate an API key and save it as a text file.
5. **Dependency install:** `npm install googleapis google-auth-library`.
6. **Script writing:** use the `.cjs` suffix and the correct GSC resource name formats.
Core Pitfalls to Avoid
- **File suffix:** always use `.cjs` to avoid `require` errors.
- **Resource format:** distinguish between the `sc-domain:` and `https://` GSC resource formats.
- **ID distinction:** GA4 uses the Property ID (numeric), not the Data Stream ID.
One-Page Summary (One-Pager)
Core Value: Automating multi-platform SEO data query and analysis via Local AI Agent + Node.js scripts, eliminating manual backend logins.
Technical Architecture:
- **Input:** a natural language instruction (e.g., "Check performance for last week").
- **Processing:** AI Agent -> exec tool -> Node.js script -> official API.
- **Output:** plain-text data -> AI analysis -> actionable advice.
Key Challenges:
- Creation and cross-platform authorization of Google service accounts.
- Correct matching of GSC resource name formats.
- Correct identification of GA4 Property IDs.
Use Cases: Indie developers managing multiple websites, operations personnel needing frequent SEO monitoring, and tech enthusiasts pursuing data privacy and localized deployment.
Frequently Asked Questions (FAQ)
Q1: Does this system have to run on a Mac?
A: No. The article uses a Mac mini as an example, but it essentially relies on the Node.js environment. Therefore, it can run on Windows, Linux, or any device supporting Node.js.
Q2: Why use a Service Account instead of my personal Google account password?
A: Service Accounts are designed specifically for programmatic access and offer higher security. They support fine-grained permission control (like read-only) and avoid exposing personal account passwords in code, while also supporting long-term stable automated operation.
Q3: If I have 50 websites, do I need to authorize manually 50 times?
A: Yes. Currently, Google’s permission system is resource-dimension based. The Service Account must be explicitly added to the access list of each site or property. There is no one-click global authorization available at this time.
Q4: Why should the scripts be named with the .cjs suffix?
A: This is to ensure compatibility with Node.js module systems. If your project is configured for ES Modules ("type": "module"), standard .js files cannot use the require syntax. The .cjs suffix forces Node.js to use CommonJS mode, avoiding module resolution conflicts.
Q5: The GSC script returns a 403 permission error, but I confirmed authorization. What should I do?
A: This is usually not a permission issue but a resource name format issue. Check the site URL format in your code: Domain properties require sc-domain:example.com, while URL-prefix properties require https://example.com/ (including trailing slash). A format mismatch will directly return a 403 error.
Q6: How does the AI Agent know which script to call and when?
A: This depends on the AI Agent’s tool configuration. In OpenClaw, by defining the exec tool, the AI can understand the intent of “querying SEO data” and automatically construct the correct command line instructions based on parameters (like site name, time range).
Q7: What is the cost of this solution?
A: Completely free. Google Search Console API, Google Analytics Data API, and Bing Webmaster API all offer free quotas, which are more than sufficient for the query volume of personal websites. The only cost is the electricity consumption of your local computer.