Unlock Gemini’s Power: Local API Proxy with OpenAI Compatibility
Introduction: Bridging Gemini to Your Applications
Have you ever wanted to integrate Google’s powerful Gemini AI into your applications but found official API limits too restrictive? Meet GeminiCli2API, an innovative solution that transforms Google’s Gemini CLI into a local API service with full OpenAI compatibility. This open-source project creates a seamless bridge between Gemini’s advanced capabilities and your existing tools.
Core innovation: by leveraging Gemini CLI’s authentication, this proxy bypasses API limitations while exposing standard OpenAI endpoints.
Project Architecture: Three Core Components
1. 💎 Native Gemini Proxy Service (gemini-api-server.js)

- Function: direct interface with the Google Cloud Code Assist API
- Key capabilities:
  - Automated OAuth authentication and token refresh
  - API key validation via headers or URL parameters
  - Role normalization fixes for consistent interactions
  - Complete endpoint implementation (listModels, generateContent)
  - Configurable logging system
2. 🔄 OpenAI-Compatible Service (openai-api-server.js)

- Function: translates OpenAI API calls to Gemini format
- Unique value:
  - Real-time format conversion between the two APIs
  - Full streaming support ("stream": true)
  - Compatible with /v1/models and /v1/chat/completions
  - Multiple authentication methods
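The translation step the OpenAI-compatible service performs can be sketched roughly like this. This is an illustrative reconstruction, not the project's actual internals; the function and variable names are hypothetical.

```javascript
// Hypothetical sketch of the OpenAI -> Gemini message translation step.
function openAiToGemini(messages) {
  const systemParts = [];
  const contents = [];
  for (const msg of messages) {
    if (msg.role === "system") {
      // Gemini carries system text in a separate system_instruction field
      systemParts.push({ text: msg.content });
    } else {
      contents.push({
        // Gemini uses "model" where OpenAI uses "assistant"
        role: msg.role === "assistant" ? "model" : "user",
        parts: [{ text: msg.content }],
      });
    }
  }
  const request = { contents };
  if (systemParts.length > 0) request.system_instruction = { parts: systemParts };
  return request;
}
```

The key differences it illustrates: OpenAI puts system prompts in the message list, while Gemini separates them out, and the assistant role is renamed to "model".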
3. ⚙️ Core Logic Module (gemini-core.js)

- Shared functionality:
  - Central authentication management
  - API request handling
  - Response processing
  - Logging implementation
graph LR
A[Client Request] --> B{API Format}
B -->|OpenAI| C[openai-api-server.js]
B -->|Native Gemini| D[gemini-api-server.js]
C --> E[gemini-core.js]
D --> E
E --> F[Google Cloud APIs]
F --> E
E --> G[Response Conversion]
G --> H[Client]
Key Advantages: Why Choose This Solution?
✅ Overcome API Limitations
- Utilize Gemini CLI’s higher daily request quotas
- Bypass restrictive official API usage limits
- Maintain free access to Gemini’s advanced models
✅ Seamless OpenAI Integration
- Connect any OpenAI-compatible client without code changes
- Use existing tools like LobeChat or NextChat immediately
- Zero migration effort for developers
✅ Enhanced Control and Visibility
- Capture all prompts through configurable logging
- Choose between console or file-based logging
- Monitor token expiration in real time
✅ Extensible Architecture
- Modular design for custom enhancements
- Implement global system prompts
- Add response caching mechanisms
- Integrate content filtering layers
Installation Guide: Getting Started
Prerequisites
- Node.js version ≥ 18.0.0
- Clone the repository: git clone https://github.com/your-repo/geminicli2api.git
Dependency Installation
cd geminicli2api
npm install
This installs key dependencies, including google-auth-library and uuid.
Launching Services and API Calls
Starting the Native Gemini Service
# Default startup (localhost:3000)
node gemini-api-server.js
# Network-wide access (Docker compatible)
node gemini-api-server.js 0.0.0.0
# Enable prompt logging to console
node gemini-api-server.js --log-prompts console
Calling Native Gemini APIs
# List available models
curl "http://localhost:3000/v1beta/models?key=123456"
# Generate content with system instruction
curl "http://localhost:3000/v1beta/models/gemini-2.5-pro:generateContent" \
-H "Content-Type: application/json" \
-H "x-goog-api-key: 123456" \
-d '{
"system_instruction": { "parts": [{ "text": "You are a cat named Neko." }] },
"contents": [{ "parts": [{ "text": "What'\''s your name?" }] }]
}'
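The same call can be made from Node.js (v18+ ships a global fetch). This is a sketch assuming the proxy is running at localhost:3000 with the "123456" key from the curl examples above; buildGenerateContentRequest is an illustrative helper name.

```javascript
// Build the request for the native generateContent endpoint.
function buildGenerateContentRequest(model, systemText, userText, apiKey) {
  return {
    url: `http://localhost:3000/v1beta/models/${model}:generateContent`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "x-goog-api-key": apiKey,
      },
      body: JSON.stringify({
        system_instruction: { parts: [{ text: systemText }] },
        contents: [{ parts: [{ text: userText }] }],
      }),
    },
  };
}

// Usage, with the proxy running:
// const { url, options } = buildGenerateContentRequest(
//   "gemini-2.5-pro", "You are a cat named Neko.", "What is your name?", "123456");
// const data = await (await fetch(url, options)).json();
```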
Launching OpenAI-Compatible Service
node openai-api-server.js --port 8000 --api-key your_secret_key
Calling OpenAI-Compatible Endpoints
# Standard chat completion
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_secret_key" \
-d '{
"model": "gemini-2.5-pro",
"messages": [
{"role": "system", "content": "You are a cat named Neko."},
{"role": "user", "content": "What'\''s your name?"}
]
}'
# Streaming response
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_secret_key" \
-d '{
"model": "gemini-2.5-flash",
"messages": [
{"role": "user", "content": "Write a 5-line poem about space"}
],
"stream": true
}'
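A streaming response arrives as server-sent events: each chunk is a "data: {...}" line in the OpenAI chunk format, and the stream ends with "data: [DONE]". A minimal client-side sketch of collecting the text deltas (exact chunk fields can vary by server):

```javascript
// Join the content deltas from an OpenAI-style SSE response body.
function extractStreamText(sseText) {
  const pieces = [];
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) pieces.push(delta);
  }
  return pieces.join("");
}
```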
Authentication Workflow Explained
- Copy the authorization link from the terminal output when first launching
- Complete browser authentication on any device
- Paste the redirect URL back into the terminal
- Credentials are stored at:
  - Windows: C:\Users\USERNAME\.gemini\oauth_creds.json
  - macOS/Linux: ~/.gemini/oauth_creds.json
- Tokens automatically refresh before expiration for uninterrupted service
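The "refresh before expiry" decision can be sketched as follows. Credentials produced by google-auth-library carry an expiry_date timestamp in milliseconds; the 60-second buffer here is an illustrative choice, not necessarily the project's value.

```javascript
// Decide whether the stored OAuth token should be refreshed now.
const REFRESH_BUFFER_MS = 60 * 1000;

function needsRefresh(creds, now = Date.now()) {
  // Missing or incomplete credentials always trigger a refresh.
  if (!creds || !creds.access_token || !creds.expiry_date) return true;
  // Refresh slightly early to avoid racing the expiry deadline.
  return creds.expiry_date - now < REFRESH_BUFFER_MS;
}
```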
Advanced Implementation Scenarios
1. 🔌 Integrate AI Chat Clients
- Point LobeChat to http://localhost:8000/v1
- Configure NextChat with your API key
- Connect VS Code extensions directly to local Gemini
2. 🔍 Build Private Prompt Libraries
# Log all prompts to file
node openai-api-server.js --log-prompts file
Sample log output:
[2025-07-21T14:30:15Z] SYSTEM: You are a senior Python developer
[2025-07-21T14:30:17Z] USER: How to optimize this sorting algorithm?
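A hypothetical helper reproducing the log-line shape shown above; the real formatting lives inside the server's logging code.

```javascript
// Format a prompt log line as [ISO-timestamp] ROLE: text.
function formatLogLine(role, text, date = new Date()) {
  // Drop the milliseconds from the ISO timestamp to match the sample output.
  const ts = date.toISOString().replace(/\.\d{3}Z$/, "Z");
  return `[${ts}] ${role.toUpperCase()}: ${text}`;
}
```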
3. 🛠️ Custom Development
Extend gemini-core.js with, for example:

// Add a global system prompt
const defaultSystemPrompt = {
  parts: [{ text: "Always respond using Markdown with a professional but conversational tone" }]
};

// Implement simple response caching
const responseCache = new Map();

function cacheResponse(prompt, response) {
  responseCache.set(prompt, response);
}

function checkCache(prompt) {
  return responseCache.get(prompt);
}
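One way the caching idea above might be wired around the upstream call; cachedGenerate and callUpstream are hypothetical names, and a production version would also need eviction and cache-size limits.

```javascript
// Memoize generation results by prompt string around the upstream API call.
const generationCache = new Map();

async function cachedGenerate(prompt, callUpstream) {
  if (generationCache.has(prompt)) return generationCache.get(prompt);
  const response = await callUpstream(prompt);
  generationCache.set(prompt, response);
  return response;
}
```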
Technical Implementation Details
Authentication Flow
sequenceDiagram
participant User
participant Server
participant GoogleAuth
User->>Server: Launch service
Server->>GoogleAuth: Generate auth URL
GoogleAuth-->>Server: Return URL
Server-->>User: Display auth link
User->>GoogleAuth: Visit in browser
GoogleAuth-->>User: Return redirect URL
User->>Server: Paste redirect URL
Server->>GoogleAuth: Exchange token
GoogleAuth-->>Server: Return access token
Server->>User: Authentication success
Request Handling Sequence
1. Receive the client request
2. Validate the API key (header or URL parameter)
3. Convert OpenAI format to Gemini format (if needed)
4. Add missing role markers
5. Process system instructions
6. Forward the request to Google APIs
7. Transform and return the response
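The "add missing role markers" step can be sketched as follows: Gemini expects each content entry to carry a role ("user" or "model"), so entries without one are filled in here by assuming a user-first alternating conversation. Names and the alternation heuristic are illustrative, not the project's exact logic.

```javascript
// Fill in missing role fields on a Gemini contents array.
function normalizeRoles(contents) {
  return contents.map((entry, i) => {
    if (entry.role) return entry; // keep explicit roles untouched
    return { ...entry, role: i % 2 === 0 ? "user" : "model" };
  });
}
```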
Frequently Asked Questions
❓ Is this project authorized by Google?
The solution uses Google’s official OAuth authentication flow through Gemini CLI, operating within Google’s API terms of service.
❓ Are multimodal features supported?
The current version focuses on text interactions. Multimodal support is planned for future development (marked as TODO in the source).
❓ How to secure API keys?
- Pass credentials via environment variables
- Avoid hardcoding keys in the --api-key parameter
- The logging system automatically filters sensitive keys
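The environment-variable suggestion can look like this at launch; GEMINI_PROXY_KEY is an illustrative variable name, not one the project reads itself.

```shell
# Keep the key out of scripts and version control: store it in an
# environment variable and pass it through at launch.
export GEMINI_PROXY_KEY="your_secret_key"
node openai-api-server.js --port 8000 --api-key "$GEMINI_PROXY_KEY"
```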
❓ What’s the maximum request throughput?
Performance depends on:
- Local hardware resources
- Network bandwidth
- Google API rate limits

For production use, implement load balancing.
❓ Why include OpenAI compatibility?
Many AI tools (Chatbot UI, LangChain etc.) natively support OpenAI API standards. This compatibility layer enables immediate integration without modification.
Conclusion: Expanding Gemini’s Accessibility
GeminiCli2API solves three critical challenges: overcoming API limits, enabling ecosystem compatibility, and providing enterprise-grade control. Whether you’re an individual developer exploring AI or an organization building monitored AI workflows, this project delivers a robust foundation.
Licensed under GNU GPLv3. Special acknowledgment to Google’s Gemini CLI team for their foundational work.
Next steps:
- Clone the repository and test functionality
- Connect your preferred AI client
- Extend with custom features
- Contribute to community development
Ready to experience Gemini through a new lens? Begin your API journey today!