From Idea to Production: How to Deploy Your First LLM App with a Full CI/CD Pipeline

Why This Guide Matters
Every week, developers ask me: “How do I turn this AI prototype into a real-world application?” Many have working demos in Jupyter notebooks or Hugging Face Spaces but struggle to deploy them as scalable services. This guide bridges that gap using a real-world example: a FastAPI-based image generator powered by Replicate’s Flux model. Follow along to learn how professionals ship AI applications from local code to production.
Core Functionality Explained
In a Nutshell
User submits a text prompt → FastAPI processes the request → Calls Replicate’s image generation API → Returns the generated image.
Endpoint: /generate-image
Request Format: JSON payload with a prompt field.

Local Testing:
git clone https://github.com/JesseQin123/fastapi-cicd.git
cd fastapi-cicd
pip install -r requirements.txt
python app/main.py
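Once the server is running, a request can be sent from Python as well as from curl. The following is a minimal sketch using only the standard library. It assumes the endpoint accepts the JSON payload described above and that the app listens on http://localhost:8000 (the port used later in the Dockerfile); the function names are illustrative, not part of the sample repository.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST request for /generate-image with a JSON body."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate-image",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    """Send the request and decode the JSON response."""
    request = build_generate_request(prompt, base_url)
    # Image generation can be slow, so allow a generous timeout (assumed value).
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling generate_image("a lighthouse at dusk") against a running server should return the JSON response from the endpoint, which in this guide contains an image_url field.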
Why Automate Your Deployment?
If you can already test locally with curl, why bother with Docker, GitHub Actions, or Kubernetes? Here’s why:
4 Key Benefits
- Environment Consistency: Eliminate “it works on my machine” issues. Docker ensures identical environments across development, testing, and production.
- Safety Nets: Automated workflows prevent human errors like an accidental git push --force on critical branches.
- Auto-Scaling: Handle traffic spikes effortlessly. Kubernetes spins up new pods when your app trends on Hacker News.
- Instant Rollbacks: Bad deployment? Argo CD reverts to stable versions faster than you can say “downtime.”
Architecture Overview

How It Works
- Code Hosting: GitHub repository stores the source code.
- CI Pipeline: GitHub Actions builds and tests Docker images.
- Image Registry: Push images to Docker Hub for storage.
- GitOps Deployment: Argo CD monitors Kubernetes manifests in Git.
- Cluster Management: Kubernetes orchestrates pods and services.
- Public Access: LoadBalancer provides a stable public IP.
Tech Stack Breakdown

Component | Role | Alternatives |
---|---|---|
FastAPI | High-performance API builder | Flask, Django |
Replicate | Hosted AI inference | AWS SageMaker |
Docker | Containerization | Podman |
GitHub Actions | CI/CD automation | GitLab CI, Jenkins |
Kubernetes | Container orchestration | Docker Swarm |
Argo CD | GitOps deployment | FluxCD |
Step-by-Step Deployment Guide
1. Set Up Local Development
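- Create a Python virtual environment:
python -m venv venv
source venv/bin/activate
- Install dependencies:
pip install fastapi uvicorn replicate python-dotenv
- Core API code:
# app/main.py
import replicate
from dotenv import load_dotenv
from fastapi import FastAPI
from pydantic import BaseModel

# Load REPLICATE_API_TOKEN from .env; the replicate client reads it from the environment
load_dotenv()

app = FastAPI()

class GenerateRequest(BaseModel):
    # Request body: {"prompt": "..."}
    prompt: str

# Plain def: replicate.run blocks, so FastAPI runs this handler in a threadpool
@app.post("/generate-image")
def generate_image(request: GenerateRequest):
    output = replicate.run(
        "stability-ai/stable-diffusion:...",  # replace with the full model identifier
        input={"prompt": request.prompt},
    )
    # str() handles both URL strings and the file objects newer replicate clients may return
    return {"image_url": str(output[0])}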
Security Tip: Always add .env to .gitignore to avoid exposing API keys.
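A missing token usually surfaces only when the first request to Replicate fails. One way to catch it earlier is a small startup check, sketched here with the standard library. The helper name require_env is hypothetical; REPLICATE_API_TOKEN is the variable name used throughout this guide.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to .env or export it before starting the app"
        )
    return value
```

Calling require_env("REPLICATE_API_TOKEN") once at import time makes the process exit immediately with a readable message, which is easier to diagnose than a failed request later.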
2. Dockerize the Application
Dockerfile:
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy into /app/app so that "app.main:app" resolves from the working directory
COPY ./app ./app
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Build and Run:
docker build -t yourusername/fastapi-flux:latest .
docker run -p 8000:8000 -e REPLICATE_API_TOKEN=your_token yourusername/fastapi-flux:latest
3. Configure GitHub Actions CI
.github/workflows/ci.yml:
name: CI Pipeline
on: [push]
jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          # Quote the version; an unquoted 3.10 is parsed as the number 3.1
          python-version: "3.10"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Build Docker image
        run: docker build -t yourusername/fastapi-flux:${{ github.sha }} .
      # A step cannot combine "uses" and "run", so login and push are separate steps
      - name: Log in to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}
      - name: Push to Docker Hub
        run: docker push yourusername/fastapi-flux:${{ github.sha }}
4. Kubernetes Deployment
k8s/deployment.yaml:
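apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fastapi
  template:
    metadata:
      labels:
        app: fastapi
    spec:
      containers:
        - name: fastapi
          # The CI workflow tags images with the commit SHA; use a tag that has actually been pushed
          image: yourusername/fastapi-flux:latest
          ports:
            - containerPort: 8000
          # Every key in the Secret becomes an environment variable in the container
          envFrom:
            - secretRef:
                name: replicate-secret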
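The Deployment refers to a Secret named replicate-secret, which has to exist in the cluster before the pods can start. The sketch below generates a matching Secret manifest as JSON, which kubectl accepts alongside YAML. It assumes the only key needed is REPLICATE_API_TOKEN; the function name is illustrative and "your_token" is a placeholder.

```python
import base64
import json


def build_secret_manifest(token: str, name: str = "replicate-secret") -> dict:
    """Build a Kubernetes Secret whose key becomes an env var via envFrom."""
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        # Values under "data" must be base64-encoded.
        "data": {"REPLICATE_API_TOKEN": encoded},
    }


manifest = build_secret_manifest("your_token")  # placeholder value
print(json.dumps(manifest, indent=2))
```

The output can be piped to kubectl apply -f - rather than saved to a file. Base64 is an encoding, not encryption, so the generated manifest should not be committed to Git.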
k8s/service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: fastapi-service
spec:
  type: LoadBalancer
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    app: fastapi
5. Argo CD for GitOps Automation
Setup:
- Install Argo CD via Helm.
- Create an Application resource pointing to your Git repo’s k8s/ directory.
- Enable auto-sync for seamless deployments.
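The Application resource from the steps above can be sketched as a small Python script that emits the manifest as JSON. Several values here are assumptions rather than details from the sample repository: the application name fastapi-flux, the argocd namespace (the usual install namespace), the default project, and deployment into the default namespace of the same cluster Argo CD runs in.

```python
import json


def build_argocd_application(
    repo_url: str, path: str = "k8s", name: str = "fastapi-flux"
) -> dict:
    """Build an Argo CD Application that syncs one directory of a Git repo."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "targetRevision": "HEAD", "path": path},
            "destination": {
                # In-cluster API server address, i.e. the cluster Argo CD runs in.
                "server": "https://kubernetes.default.svc",
                "namespace": "default",
            },
            # "automated" turns on auto-sync; selfHeal reverts manual drift,
            # prune deletes resources that were removed from Git.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


app = build_argocd_application("https://github.com/JesseQin123/fastapi-cicd.git")
print(json.dumps(app, indent=2))
```

Writing the manifest by hand in YAML works equally well; generating it is mainly useful when the same structure is repeated for several environments.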
Key Features:
- Visualize deployment status
- One-click rollbacks
- Automatic configuration drift correction

Security Best Practices
- Secrets Management
  - Use GitHub Secrets for CI credentials.
  - Store runtime secrets in Kubernetes Secrets or AWS Secrets Manager.
- Image Scanning
  Use Trivy to scan container images and Dependabot to flag vulnerable dependencies.
- Health Checks
  Add liveness and readiness probes to the container spec. The path must match a route the app actually serves; the sample code in step 1 does not define /health, so add one first.
  livenessProbe:
    httpGet:
      path: /health
      port: 8000
    initialDelaySeconds: 5
    periodSeconds: 10
- Monitoring
  Track key metrics with Prometheus + Grafana:
  - Request success rate
  - Latency
  - Resource utilization
- Environment Isolation
  Use Kubernetes namespaces for dev/staging/prod separation.
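Before relying on the probe in a cluster, it can help to check the health route locally the way an HTTP probe would. Kubernetes treats any status code from 200 up to but not including 400 as success. The sketch below mirrors that rule with the standard library; it assumes the app serves a /health route, and the function name and default timeout are illustrative.

```python
import urllib.error
import urllib.request


def probe_ok(url: str, timeout_s: float = 1.0) -> bool:
    """Return True if the URL answers with a status in the range [200, 400)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # HTTP errors (4xx/5xx), refused connections, and timeouts all count as failure.
        return False
```

For example, probe_ok("http://localhost:8000/health") should return True once the container is up. If it returns False locally, the liveness probe will fail in the cluster too and the pod will be restarted repeatedly.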
Next Steps for Optimization
- Custom Models: Fine-tune models using LoRA or RLHF.
- API Security: Add JWT authentication or rate limiting.
- Caching: Cache generated images in Redis or Cloudflare R2.
- Frontend Integration: Build a React/Vue dashboard to manage prompts.
- Advanced Traffic Routing: Implement canary deployments with Istio.
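To make the caching idea concrete, here is a minimal sketch that uses an in-memory dictionary as a stand-in for Redis. The class name, the key scheme (SHA-256 of the normalized prompt), and the generate callback are all assumptions for illustration; in the real app the callback would wrap the call to Replicate.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """In-memory stand-in for Redis: maps a prompt hash to an image URL."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._store: Dict[str, str] = {}

    @staticmethod
    def key_for(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_generate(self, prompt: str) -> str:
        key = self.key_for(prompt)
        if key not in self._store:
            # Cache miss: call the (expensive) generator once and remember the result.
            self._store[key] = self._generate(prompt)
        return self._store[key]
```

One design consequence is worth noting: image models are usually non-deterministic, so caching by prompt means repeat requests get the same image instead of a fresh variation. Whether that is acceptable depends on the product.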
Troubleshooting Common Issues
Issue | Likely Cause | Solution |
---|---|---|
Image build fails | Dependency conflicts | Pin versions in requirements.txt |
Pods in CrashLoopBackOff | Missing secrets | Verify that the Secret referenced by envFrom exists in the namespace |
Timeout errors | Slow model inference | Raise timeouts on the client and on any proxy or load balancer in front of FastAPI |
Image pull errors | Registry permissions | Configure imagePullSecrets |
Argo CD out of sync | Network policies | Check cluster firewall rules |
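For the timeout row in particular, one option is to bound the slow call inside the app so the endpoint can return a clear error instead of hanging. This is a sketch of that idea using only the standard library, not something the sample repository is known to do; run_with_timeout is a hypothetical helper.

```python
import concurrent.futures
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_timeout(fn: Callable[[], T], timeout_s: float) -> T:
    """Run a blocking call in a worker thread and stop waiting after timeout_s."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"call did not finish within {timeout_s} seconds") from None
    finally:
        # Do not wait for the worker; a timed-out call keeps running in the background.
        executor.shutdown(wait=False)
```

In a request handler, the blocking model call would be passed in as fn and the TimeoutError translated into an HTTP error response. The abandoned call still consumes a thread until it finishes, so this limits how long the client waits, not how much work the server does.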
Ready to deploy? Clone the sample repository, follow the steps, and share a screenshot when your first pod goes green! 🚀