
Gemini CLI Extensions: Transform Your Terminal into an AI-Powered Control Tower

Gemini CLI Extensions let you speak plain English to the shell and watch databases, design files, payment ledgers and Kubernetes clusters bend to your will.
Below you’ll learn what the framework is, why Google built it, how to install your first extension, how to write one, and what safety guard-rails matter in production.


What Exactly Are Gemini CLI Extensions?

Core question: “What is this new framework Google dropped in October 2025 and why should engineers care?”

In short, Extensions are packaged adapters that teach the open-source Gemini CLI how to talk to external tools—Postman, Figma, BigQuery, Stripe, your home-grown Jenkins, anything that exposes an API or CLI. Each extension ships:

  • One or more MCP (Model Context Protocol) servers
  • A playbook (GEMINI.md) describing when and how to invoke tools
  • Optional custom slash-commands and context snippets
  • An allow-list/block-list for fine-grained safety

Install one, and the model immediately understands new verbs: “list my Kubernetes pods in the staging cluster,” “charge this customer ten dollars,” or “generate a scatter-plot of yesterday’s sales in Looker.”

Author’s reflection: After living inside bash/zsh for fifteen years, this is the first plugin system that feels less like “adding aliases” and more like “hiring an intern who already read every man page.”


Why Extensions > Stand-alone Scripts or Plain MCP

Core question: “Can’t I already glue tools together with Python, Ansible or plain MCP servers?”

You can, but three pain-points vanish with Extensions:

  1. Zero-shot comprehension
    The playbook gives step-by-step reasoning examples; the model needs no extra prompt engineering.

  2. One-line distribution
    gemini extensions install <github-url> clones, npm-installs dependencies and registers MCP servers. No manual systemd units, no Dockerfiles.

  3. Built-in context layering
    Context flows hierarchically: system prompt → extension playbook → local GEMINI.md → user prompt. You can override or enrich behaviour without touching code.
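Because a project-local GEMINI.md sits between the extension playbook and the user prompt, you can tighten an extension's behaviour without forking it. The cluster name and limit below are made-up illustrations:

```markdown
# Project overrides

When the GKE extension is used in this repository:
- Default to the `staging` cluster unless the prompt names another one.
- Never scale any node pool above 20 nodes without confirmation.
```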

Scenario: Your SRE team already runs an Ansible playbook to resize GKE node-pools. Wrapping it in an extension means any on-call engineer—half-asleep at 3 a.m.—can type: “Scale the payments pool to ten nodes because CPU is red-lining” and watch the agent call your playbook with correct variables, approval gates and Slack notifications.


Installing Your First Extension in Under Five Minutes

Core question: “How do I go from zero to ‘Hello World’ with a real extension?”

We’ll use the Looker extension as the guinea pig because it showcases data, visualisation and export in one shot.

Step 1: install the CLI

npm install -g @google/gemini-cli   # requires Node ≥ 20
gemini --version                    # 0.4.x or newer

Step 2: pull the extension

gemini extensions install \
  https://github.com/gemini-cli-extensions/looker

The command clones the repo into ~/.gemini/extensions/looker, runs npm ci, then hot-registers the MCP server—no restart required.

Step 3: export credentials

export LOOKER_BASE_URL="https://acme.looker.com"
export LOOKER_CLIENT_ID="abcdef"
export LOOKER_CLIENT_SECRET="xyz123"

Step 4: chat with your data

$ gemini
> Generate a bar chart of daily revenue for the last week and export it as PNG

The agent:

  • Calls looker:run_query with the “daily_revenue” Looker model
  • Converts JSON result to Vega-Lite spec
  • Renders PNG locally
  • Writes ./daily_revenue.png
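To make the middle step concrete, here is a hand-written sketch of turning JSON rows into a Vega-Lite bar-chart spec; the field names and helper function are illustrative, not the extension's actual code:

```javascript
// Sketch: shape Looker-style JSON rows into a Vega-Lite v5 bar-chart spec.
// Field names (date, revenue) are hypothetical examples.
function toVegaLiteBar(rows, xField, yField) {
  return {
    $schema: 'https://vega.github.io/schema/vega-lite/v5.json',
    data: { values: rows },
    mark: 'bar',
    encoding: {
      x: { field: xField, type: 'ordinal' },
      y: { field: yField, type: 'quantitative' },
    },
  };
}

const rows = [
  { date: '2025-10-01', revenue: 1200 },
  { date: '2025-10-02', revenue: 1750 },
];
const spec = toVegaLiteBar(rows, 'date', 'revenue');
```

Any Vega-Lite renderer (for example the vega-lite npm package) can then rasterise such a spec to PNG.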

Author’s reflection: The first time I saw a PNG pop into my folder after a plain-English sentence, I felt like I’d jumped a decade ahead of cron jobs and gnuplot scripts.


Anatomy of an Extension: Files That Make the Magic

Core question: “What’s under the hood once the extension lands on my disk?”

| File / Folder | Purpose |
|---|---|
| gemini-extension.json | Manifest: declares MCP servers, context file name, blocked tools, custom commands |
| GEMINI.md | Playbook: natural-language instructions plus worked examples for the model |
| src/*.js or server.py | MCP server entry points that implement list_tools and call_tool |
| contexts/*.md | Optional extra context snippets injected on demand |
| commands/*.json | Slash-command shortcuts, e.g. /deploy-prod |

The CLI discovers these files at start-up, merges them into its prompt context and surfaces every listed tool to the LLM.
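As an illustration, a slash-command file under commands/ might look like the sketch below. The exact schema is defined by the CLI, so treat these keys as assumptions and check the official template before relying on them:

```json
{
  "name": "deploy-prod",
  "description": "Deploy the current branch to production",
  "prompt": "Run the production deployment workflow for the current git branch, asking for confirmation before any destructive step."
}
```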


Writing a Minimal Extension: From Zero to Internal Tool in 30 Lines

Core question: “How hard is it to wrap my bespoke API into an extension?”

Let’s imagine your company exposes a tiny REST endpoint that returns office-room temperature. You want engineers to ask: “What’s the temp in the Kyoto meeting room?”

Project layout

temp-ext/
 ├─ gemini-extension.json
 ├─ GEMINI.md
 └─ server.js

1. Manifest (gemini-extension.json)

{
  "name": "temp-ext",
  "version": "1.0.0",
  "mcpServers": {
    "office-iot": {
      "command": "node",
      "args": ["server.js"],
      "env": {"PORT": "3000"}
    }
  },
  "contextFileName": "GEMINI.md"
}

2. Playbook (GEMINI.md)

# Office IoT Extension

Available tools:
- get_temperature(room_name: string) -> float (Celsius)
- set_temperature(room_name: string, target: float) -> bool

Example usage:
> What is the temperature in Kyoto?
  The agent calls get_temperature("Kyoto") and returns the value.

3. MCP server (server.js)

#!/usr/bin/env node
// ESM module: set "type": "module" in package.json (or rename the file server.mjs)
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
  ListToolsRequestSchema,
  CallToolRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
import axios from 'axios';

const server = new Server(
  { name: 'office-iot', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

// Advertise the tools this server exposes
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'get_temperature',
      description: 'Fetch current temperature for a room',
      inputSchema: {
        type: 'object',
        properties: {
          room: { type: 'string' },
        },
        required: ['room'],
      },
    },
  ],
}));

// Dispatch tool calls to the internal REST endpoint
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'get_temperature') {
    const { room } = request.params.arguments;
    const res = await axios.get(`https://iot.acme.com/temp/${room}`);
    return { content: [{ type: 'text', text: String(res.data.celsius) }] };
  }
  throw new Error('Tool not found');
});

// MCP servers speak JSON-RPC over stdio; connect a transport rather than listening on a port
const transport = new StdioServerTransport();
await server.connect(transport);

4. Install locally

cd temp-ext
gemini extensions install --path .

Done. Open the chat and ask: “What’s the temperature in Kyoto?”

Author’s reflection: The boilerplate is intentionally boring—JSON plus a 40-line Node server. That’s the point; boring means maintainable by any intern who knows REST.


Production-grade Safety Patterns

Core question: “How do I stop the AI from running rm -rf / or charging a million dollars?”

Google ships three primitives:

  1. Tool allow-lists
    "allowedTools": ["get_temperature"] ignores any other function, even if the model begs.

  2. User confirmation gates
    "requireConfirmation": ["set_temperature", "scale_deployment"] prompts the engineer before side-effectful calls.

  3. Sandbox & audit
    Run extensions inside gVisor containers or your own Kubernetes pod with seccomp; the MCP SDK forwards structured audit logs to stdout—pipe them to Splunk or Cloud Logging.
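Put together in a manifest, the first two primitives look roughly like this (key names are taken from the examples above; verify the exact schema against your CLI version):

```json
{
  "name": "temp-ext",
  "version": "1.0.0",
  "allowedTools": ["get_temperature", "set_temperature"],
  "requireConfirmation": ["set_temperature"]
}
```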

Scenario: A fintech startup wraps Stripe’s refund API. They set "requireConfirmation": ["create_refund"] and route logs to Splunk. Engineers can type “Refund $50 to customer X,” but the CLI pauses for a typed “y” and leaves an immutable log entry—SOC 2 auditors happy.


Chaining Extensions: Real-world Combos That Save Hours

Core question: “Can I mix multiple extensions into one sentence?”

Absolutely. The model keeps context across turns, so you can orchestrate pipelines like:

  1. Postman → Harness → Slack
    “Import this OpenAPI file into Postman, run the collection against staging, then notify #deploys if all tests pass.”

  2. Figma → Flutter → Firebase → Chrome DevTools
    “Pull the latest frame from Figma, scaffold a Flutter screen, host it on Firebase Hosting, open Chrome DevTools lighthouse and tell me the LCP score.”

  3. BigQuery → Looker → Nano Banana (image gen)
    “Query yesterday’s top 20 SKUs by revenue, visualise in Looker, then generate a banana-themed celebratory image and post it to Slack #sales.”

Each verb triggers the relevant MCP server; the LLM decides order, argument mapping and error retries.

Author’s reflection: During beta I watched a developer chain six extensions to turn “idea” into “deployed marketing page” in nine minutes—something that used to take us an entire sprint ceremony.


Performance & Resource Footprint

Core question: “Will my laptop melt if I install 30 extensions?”

Cold-start timing (measured on M2 MacBook Air, Node 20):

| Phase | Median |
|---|---|
| Clone + npm ci (first run) | 8–12 s |
| MCP registration | 200 ms |
| Tool call round-trip (localhost) | 30 ms |
| Memory per idle MCP server | 12 MB |

The CLI keeps servers warm for five minutes; after that, idle servers receive SIGTERM. You can cap parallelism with extensionPoolSize in ~/.geminiconfig.json.
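Capping the pool is a one-line setting. The file name and key come from the paragraph above; the surrounding JSON structure is an assumption:

```json
{
  "extensionPoolSize": 8
}
```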


Troubleshooting Quick-scan

| Symptom | Likely Cause | Fix |
|---|---|---|
| "Tool not found" | Missing env var or typo in function name | Run /mcp to list registered tools |
| MCP server exits | Port conflict | Change the port in the manifest |
| Model calls wrong API | Ambiguous playbook | Add an example dialogue to GEMINI.md |
| Confirmation loop | requireConfirmation triggered in script mode | Run interactively, or drop the tool from requireConfirmation |

Action Checklist / Implementation Steps

  1. Install Node ≥ 20 and npm i -g @google/gemini-cli
  2. Pick one partner extension (Looker, Stripe, GKE) and run:
    gemini extensions install <github-url>
  3. Export required credentials; verify with /mcp inside the chat
  4. Type a real workflow sentence; inspect returned files or logs
  5. Fork the official template, tweak for an internal API; install with --path
  6. Add "requireConfirmation" and audit logging before production
  7. Share your extension URL with the team; collect GitHub stars ⭐

One-page Overview

Gemini CLI Extensions turn plain-English prompts into multi-tool workflows by packaging MCP servers, instruction playbooks and safety rules into one-command installs. Launch partners include Looker, Stripe, Figma, Harness, Postman, Shopify, Snyk, Elastic and several Google Cloud services. You can write your own extension in any language that speaks JSON-RPC, distribute it via GitHub, and keep side-effects in check with confirmation gates and audit logs. Install → export credentials → chat → ship.


FAQ

Q1: Do extensions phone home to Google?
A: No. MCP traffic stays on localhost; only the LLM calls (if you use the hosted Gemini API) leave your box.

Q2: Can extensions run on ARM or Windows?
A: Yes. Node-based servers work everywhere Node runs. For compiled binaries, ship multi-arch Docker images and reference them in the manifest.

Q3: Is there a private-extension registry?
A: Not yet. Google says Q4-2025 enterprise release will include on-prem registries; for now use private Git repos and personal-access tokens.

Q4: How do I update an extension?
A: gemini extensions update <name> pulls the latest Git tag and reruns npm/pip install.

Q5: What if two extensions define the same tool name?
A: The CLI prefixes each tool with its server ID internally; you won’t see collisions, but you can alias for clarity in GEMINI.md.

Q6: Are there rate limits?
A: The CLI itself imposes no limit, but downstream APIs (Stripe, Looker, etc.) enforce their own quotas; handle HTTP 429 in your MCP server.
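A minimal backoff helper for those 429s might look like the sketch below (illustrative, not part of any SDK; the flaky call is a stand-in for a real API request):

```javascript
// Retry an async call with exponential backoff when it throws an
// error carrying status === 429; rethrow anything else immediately.
async function withRetry(fn, maxAttempts = 3, baseDelayMs = 10) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err.status !== 429 || attempt >= maxAttempts) throw err;
      // Back off: 10 ms, 20 ms, 40 ms, ... (tune delays for real APIs)
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}

// Demo stand-in: fails twice with 429, then succeeds.
let calls = 0;
async function flaky() {
  calls++;
  if (calls < 3) {
    const err = new Error('Too Many Requests');
    err.status = 429;
    throw err;
  }
  return 'ok';
}
```

Wrapping each downstream request in such a helper keeps a single quota blip from surfacing as a tool failure to the model.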

Q7: Can I disable an extension without uninstalling?
A: Yes. gemini extensions disable <name> keeps files but removes the server from the pool.


Happy building—your terminal is now an AI control tower.
