Building a Robust Serverless AI Proxy with Cloudflare Workers

In today’s fast-paced digital landscape, developers and data scientists need seamless, reliable access to state-of-the-art AI models. Yet, regional restrictions, API key security concerns, and latency issues often stand in the way. Enter Cloudflare Workers: a serverless solution that empowers you to deploy an edge-based AI proxy, bridging the gap between your users and Google’s Gemini and Imagen models. This post walks you through setting up a secure, high-performance Cloudflare Worker that forwards requests to Gemini for text generation and Imagen for image creation—no VPN required.


Table of Contents

  1. Why Use Cloudflare Workers for AI Proxy?
  2. Key Benefits of an Edge-Based AI Proxy
  3. Architecture Overview
  4. Prerequisites and Account Setup
  5. Step-by-Step Deployment Guide
  6. Deep Dive: Module Breakdown
  7. Customizing Your AI Proxy
  8. Performance Optimization Tips
  9. Security Best Practices for API Proxies
  10. Comparing API Forwarding vs. Traditional VPN Proxies
  11. Use Cases and Real-World Examples
  12. Open Source License and Contribution Guide
  13. Conclusion and Next Steps

Why Use Cloudflare Workers for AI Proxy?

Cloudflare Workers offer a serverless execution environment that runs JavaScript at the edge—right where your users are. This architecture delivers:

  • Low Latency: Requests are routed through the nearest Cloudflare data center, reducing round-trip time for AI model inference.
  • Scalability: Automatically scales with traffic spikes without manual server management.
  • Cost Efficiency: Generous free tier for development, with pay-as-you-go pricing for production workloads.
  • Security: Environment variables ensure API keys never leak to client code.

By deploying an AI proxy on Workers, you gain a centralized, secure gateway to Google’s AI services. Whether you’re a startup building a chatbot or an enterprise integrating generative AI features, this pattern accelerates development.


Key Benefits of an Edge-Based AI Proxy

  1. Global Accessibility
    • Regions with Google API restrictions can route requests through the proxy, provided your usage complies with the applicable terms of service.
  2. API Key Safety
    • Store sensitive keys in encrypted environment variables, away from browser exposure.
  3. Flexible Front End
    • A lightweight HTML/JavaScript interface allows seamless user interaction without complex frameworks.
  4. Real-Time Streaming
    • Display partial model outputs as they arrive, enhancing the user experience for long-form content generation.
  5. Unified Endpoint
    • One URL handles both text and image generation requests, simplifying client integration.

Architecture Overview

The request flow, as a Mermaid diagram:

flowchart LR
    A[Client Browser] -->|HTTPS Request| B[Cloudflare Worker]
    B --> C{Route Request}
    C -->|Text Generation| D[Google Gemini API]
    C -->|Image Generation| E[Google Imagen API]
    D -->|Stream/JSON| B
    E -->|Base64/Image| B
    B -->|CORS Headers| A

  • Cloudflare Worker acts as both a reverse proxy and API orchestrator.
  • Routes incoming requests based on payload type (text vs. image).
  • Manages API keys securely and injects required headers.
  • Streams model responses back to clients with minimal overhead.

Prerequisites and Account Setup

Before you begin:

  1. Cloudflare Account: Sign up at Cloudflare and enable Workers.

  2. Google AI Access: Obtain an API key from Google AI Studio or Vertex AI.

  3. Wrangler CLI (Optional): Install wrangler for local development and deployment.

    npm install -g wrangler
    
  4. Basic JavaScript Knowledge: Familiarity with modern ES modules and fetch API.
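
If you take the Wrangler route, the day-to-day workflow after installation looks like this (commands as of Wrangler v3; check the docs for your installed version):

wrangler login      # authenticate with your Cloudflare account
wrangler dev        # run the Worker locally with live reload
wrangler deploy     # publish the Worker to Cloudflare's edge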


Step-by-Step Deployment Guide

1. Create Your Cloudflare Worker

  1. Log in to the Cloudflare dashboard.
  2. Navigate to Workers & Pages > Create application.
  3. Choose Worker and name it serverless-ai-proxy.
  4. Click Create to initialize your project.

2. Configure Environment Variables

Under the Worker’s Settings tab:

  • Go to Variables > Environment Variables.

  • Add a variable:

    • Name: GEMINI_API_KEY
    • Value: Your Google AI API key
    • Encrypt the value to secure it.

This keeps your API credentials safe and separate from client-side code.
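
Prefer the CLI? Wrangler can store the same credential as an encrypted secret (this assumes a Wrangler-managed project):

wrangler secret put GEMINI_API_KEY
# Wrangler prompts for the value and uploads it encrypted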

3. Implement Core Modules

In the Cloudflare online editor, create the following files:

  • worker.js
  • request-handler.js
  • api-handler.js
  • html-renderer.js
  • utils.js

Paste the modular code (detailed in Deep Dive: Module Breakdown) into each.
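
worker.js itself can stay tiny. A minimal sketch, assuming request-handler.js exports its fetch handler as the default export (as shown in the Deep Dive):

// worker.js: entry point that delegates every request to the router module
import handler from './request-handler.js';

export default handler;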

4. Build a Simple HTML Front End

The html-renderer.js module generates a minimal interface:

  • Dropdown for model selection
  • Theme toggle (light/dark)
  • Textarea for prompts
  • Buttons for text/image requests
  • Response display area

Having no external dependencies ensures quick load times and easy customization.

5. Enable CORS and Streaming

  • Configure CORS headers (Access-Control-Allow-Origin: *) in request-handler.js.
  • Leverage Cloudflare’s streaming Fetch API to push partial responses to clients.

This setup ensures broad compatibility with single-page apps and frameworks like React or Vue.
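
A minimal sketch of that CORS plumbing; the request-handler.js shown in the Deep Dive folds these same headers into every response:

// Shared CORS headers; replace '*' with an explicit origin allow-list in production
const CORS_HEADERS = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type'
};

// Browsers send an OPTIONS preflight before cross-origin POSTs;
// answer it early, before any routing logic runs.
function handleOptions() {
  return new Response(null, { status: 204, headers: CORS_HEADERS });
}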

6. Save, Deploy, and Test

  • Click Save and Deploy in the Cloudflare dashboard.

  • Your worker URL (e.g., https://serverless-ai-proxy.yourdomain.workers.dev) is now live.

  • Test text generation:

    curl -X POST https://serverless-ai-proxy.yourdomain.workers.dev \
         -H "Content-Type: application/json" \
         -d '{"model":"gemini-text-v1","prompt":"Explain edge computing in simple terms."}'
    
  • Test image generation:

    curl -X POST https://serverless-ai-proxy.yourdomain.workers.dev \
         -H "Content-Type: application/json" \
         -d '{"model":"imagen-v1","prompt":"A futuristic city skyline at dusk"}'
    

Deep Dive: Module Breakdown

request-handler.js

Handles routing, CORS, and basic error fallbacks.

import { handleText, handleImage } from './api-handler.js';
import { htmlTemplate } from './html-renderer.js';
import { errorResponse } from './utils.js';

const CORS_HEADERS = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type'
};

export default {
  // env carries the bindings (e.g. GEMINI_API_KEY) configured in the dashboard
  async fetch(request, env) {
    if (request.method === 'OPTIONS') {
      return new Response(null, { status: 204, headers: CORS_HEADERS });
    }
    if (request.method === 'GET') {
      return new Response(htmlTemplate(), { headers: { 'Content-Type': 'text/html' } });
    }
    let body;
    try {
      body = await request.json();
    } catch {
      return errorResponse('Request body must be valid JSON', 400);
    }
    if (typeof body.model !== 'string') {
      return errorResponse('Missing "model" field', 400);
    }
    const handler = body.model.startsWith('imagen') ? handleImage : handleText;
    return handler(body, env, CORS_HEADERS);
  }
};
  • Answers CORS preflights and serves HTML on GET
  • Validates the JSON payload before routing
  • Delegates text/image requests to api-handler.js, passing env along

api-handler.js

Communicates with Google APIs. Note that in module-syntax Workers the environment bindings arrive via the fetch handler's env parameter, so the key is read per request rather than from a global.

// Placeholder endpoints: substitute the actual Gemini/Imagen URLs for
// whichever Google API surface (AI Studio or Vertex AI) you are using.
const GEMINI_URL = 'https://api.google.com/v1/gemini';
const IMAGEN_URL = 'https://api.google.com/v1/imagen';

export async function handleText(body, env, corsHeaders) {
  const response = await fetch(GEMINI_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${env.GEMINI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ prompt: body.prompt, model: body.model })
  });
  // Pass the upstream body through untouched so partial output streams to the client
  return new Response(response.body, {
    status: response.status,
    headers: { ...corsHeaders, 'Content-Type': 'application/json' }
  });
}

export async function handleImage(body, env, corsHeaders) {
  const response = await fetch(IMAGEN_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${env.GEMINI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ prompt: body.prompt, model: body.model })
  });
  const json = await response.json();
  return new Response(JSON.stringify({ image: json.data }), {
    status: response.status,
    headers: { ...corsHeaders, 'Content-Type': 'application/json' }
  });
}
  • Reads the API key from env per request, never exposing it to clients
  • Streams text responses; parses and re-wraps image responses

html-renderer.js

Generates the front-end template, shown here trimmed to the essentials; hooks like the theme toggle are left for customization.

export function htmlTemplate() {
  return `
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>AI Proxy Demo</title>
  <style>/* basic styles */</style>
</head>
<body>
  <h1>AI Proxy Interface</h1>
  <select id="modelSelect">
    <option value="gemini-text-v1">Gemini Text</option>
    <option value="imagen-v1">Imagen Image</option>
  </select>
  <textarea id="prompt" placeholder="Enter your prompt"></textarea>
  <button id="sendBtn">Send</button>
  <pre id="output"></pre>
  <script>
    document.getElementById('sendBtn').onclick = async () => {
      const model = document.getElementById('modelSelect').value;
      const prompt = document.getElementById('prompt').value;
      const res = await fetch(location.href, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ model, prompt })
      });
      const data = await res.json();
      document.getElementById('output').textContent = JSON.stringify(data, null, 2);
    };
  </script>
</body>
</html>`;
}
  • Self-contained HTML/JS
  • No external CSS or frameworks
  • Easy to fork and customize

utils.js

Helper functions (e.g., logging, error handling).

export function logRequest(request) {
  console.log(`[${new Date().toISOString()}] ${request.method} ${request.url}`);
}

export function errorResponse(message, status = 500) {
  return new Response(JSON.stringify({ error: message }), {
    status,
    headers: { 'Content-Type': 'application/json' }
  });
}

Customizing Your AI Proxy

Modifying Themes and Layout

  • Tweak the CSS inside the htmlTemplate string in html-renderer.js.
  • Add dark/light toggle logic in JavaScript.
  • Integrate Tailwind CSS via CDN for rapid prototyping.

Adding or Restricting Models

  • In api-handler.js, define an allowedModels array of permitted model IDs.
  • Validate body.model against that list to prevent abuse, as sketched below.
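
A minimal validation sketch, using the errorResponse helper from utils.js; the allowedModels list here is hypothetical and should mirror the models you actually enable:

import { errorResponse } from './utils.js';

// Hypothetical allow-list; extend as you enable more models
const allowedModels = ['gemini-text-v1', 'imagen-v1'];

export function validateModel(body) {
  if (!allowedModels.includes(body.model)) {
    return errorResponse(`Unsupported model: ${body.model}`, 400);
  }
  return null; // null signals the model is permitted
}

The router can call validateModel(body) before delegating and return the result whenever it is non-null.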

Integrating Persistent Storage (KV, R2)

  • Use Cloudflare KV to store user session data, logs, or cached outputs (a sketch follows).
  • Leverage R2 object storage for caching large image outputs.
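
A sketch of KV-backed caching for image output, assuming a hypothetical KV namespace bound to the Worker as PROXY_CACHE and the handleImage function from api-handler.js:

// Return a cached image for this prompt when available; otherwise call
// Imagen via handleImage and keep the result for an hour (expirationTtl).
export async function cachedImage(body, env, corsHeaders) {
  const key = `imagen:${body.model}:${body.prompt}`;
  let image = await env.PROXY_CACHE.get(key);
  if (!image) {
    const response = await handleImage(body, env, corsHeaders);
    image = (await response.json()).image;
    await env.PROXY_CACHE.put(key, image, { expirationTtl: 3600 });
  }
  return new Response(JSON.stringify({ image }), {
    headers: { ...corsHeaders, 'Content-Type': 'application/json' }
  });
}

Note that KV values are capped at 25 MB, so very large images belong in R2 instead.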

Performance Optimization Tips

  1. Minify Code: Enable minification in Worker settings.
  2. Cache Static Assets: Use Cache-Control headers for the HTML front end.
  3. Edge Caching: Cache responses for frequently repeated prompts at the edge (see the sketch below).
  4. Batch Requests: For high-throughput scenarios, batch multiple prompts in one request.
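
One way to implement edge caching is the Workers Cache API. A sketch that caches the HTML shell, assuming ctx is forwarded from the fetch handler (POST responses to unique prompts are generally poor cache candidates):

import { htmlTemplate } from './html-renderer.js';

// Serve the HTML front end from the edge cache when possible
async function serveHtml(request, ctx) {
  const cache = caches.default;
  let response = await cache.match(request);
  if (!response) {
    response = new Response(htmlTemplate(), {
      headers: {
        'Content-Type': 'text/html',
        // Let Cloudflare and browsers keep the page for ten minutes
        'Cache-Control': 'public, max-age=600'
      }
    });
    // Populate the cache without delaying the response
    ctx.waitUntil(cache.put(request, response.clone()));
  }
  return response;
}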

Security Best Practices for API Proxies

  • Rotate API Keys: Regularly revoke and issue new keys.
  • Rate Limiting: Throttle client requests to deter abuse (a naive sketch follows this list).
  • Input Validation: Sanitize prompts to prevent injection attacks.
  • Logging & Monitoring: Stream logs to an external service for audit.
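
A deliberately naive rate limiter on KV, reusing the hypothetical PROXY_CACHE binding; KV writes are eventually consistent, so treat this as a soft limit and reach for Durable Objects or Cloudflare's built-in rate limiting when you need hard guarantees:

// Allow roughly `limit` requests per client IP per 60-second window
async function isRateLimited(request, env, limit = 30) {
  const ip = request.headers.get('CF-Connecting-IP') || 'unknown';
  const key = `ratelimit:${ip}`;
  const count = parseInt((await env.PROXY_CACHE.get(key)) || '0', 10);
  if (count >= limit) return true;
  // Non-atomic read-modify-write; good enough as a deterrent
  await env.PROXY_CACHE.put(key, String(count + 1), { expirationTtl: 60 });
  return false;
}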

Comparing API Forwarding vs. Traditional VPN Proxies

Feature      | AI Proxy (This Project)           | VPN/Network Proxy
------------ | --------------------------------- | ------------------------
Target Scope | Specific AI endpoints (Gemini)    | Any internet endpoint
Layer        | Application (HTTP/HTTPS)          | Network (TCP/UDP)
Compliance   | Compliant with Cloudflare's terms | Often blocked or risky
Security     | Encrypted API keys, controlled    | Exposes all traffic
Use Case     | AI model access, data exchange    | Bypassing geo-blocks
Complexity   | Low setup, modular code           | High (servers, routing)

Use Cases and Real-World Examples

  • Chatbots & Virtual Assistants
    Deploy conversational agents that leverage Gemini’s natural language capabilities.
  • Content Generation Platforms
    Integrate text and image generation for blogs, marketing, or social media content.
  • Education & Prototyping
    Rapidly test AI prompts in classrooms or hackathons without infrastructure overhead.
  • Enterprise Automation
    Embed AI-driven insights into dashboards, CRMs, or analytics tools.

Open Source License and Contribution Guide

This project is released under the GNU General Public License v3.0 (GPLv3). You are free to:

  • Use, modify, and redistribute the code.
  • Contribute via Pull Requests on the discover branch.
  • Report issues or request features through GitHub Issues.

Please ensure any derivative works maintain the same open-source licensing.


Conclusion and Next Steps

Implementing a serverless AI proxy with Cloudflare Workers streamlines global access to advanced models like Gemini and Imagen. By combining edge computing, secure key management, and a customizable front end, you gain:

  • Rapid AI integration without server overhead
  • Scalability to meet unpredictable traffic
  • Compliance with security and regional policies

Ready to take your AI applications to the next level? Clone the repository on GitHub, customize the modules to your needs, and deploy your own serverless-ai-proxy. Empower your users with low-latency, unrestricted AI experiences—right from the edge.


Happy coding and AI innovating!