Sim Studio in 10 Minutes: Build, Host, and Run Your Own AI-Agent Pipeline—No Code, Full Control

Can I really sketch an AI workflow on a canvas, feed it my own documents, and keep everything offline on my GPU laptop?
Yes—Sim Studio ships one repo you can run in the cloud or self-host through four paths: npm one-liner, Docker Compose, dev container, or manual install. Pick one, and your first agent is live before the coffee finishes dripping.


Table of Contents

  1. Cloud Route: fastest public preview
  2. Self-Hosted Playbook: four rigor levels
  3. Knowledge Base in Practice: PDF → vectors → answers
  4. Local LLM Options: Ollama vs. vLLM
  5. Troubleshooting Field Guide
  6. Author’s Reflection: Why I Moved Back to Dev Containers
  7. Action Checklist / Implementation Steps
  8. One-Page Overview
  9. FAQ – Eight Common Questions

1. Cloud Route: 3-Minute First Agent

Core question: What is the absolute fastest way to see an agent execute?
Answer: Open sim.ai, click “New Workflow”, upload a file, hit Run.

1.1 Step-by-Step Snapshot

  1. Sign in with GitHub—no credit card.
  2. Choose template “Q&A Bot with Knowledge”.
  3. Drop in a PDF (the cloud limit is 50 MB per file).
  4. Connect four blocks: HTTP Input → Knowledge Retrieval → LLM → HTTP Output.
  5. Press Run; a chat window slides in; ask “What is the pricing?”; receive a one-sentence answer plus the page number.

1.2 Behind the Curtain

  • The PDF is chunked with overlapping windows, embedded via text-embedding-ada-002, stored in a pgvector index.
  • Retrieval uses cosine similarity; top-4 chunks are injected into the prompt.
  • The cloud account includes managed Postgres, object storage, and a job queue—no infra thinking required.
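
To make the embedding step concrete, the call below shows what one chunk going through OpenAI's embeddings endpoint looks like. Sim wires this up for you; this is only an illustration of the underlying API, with a sample chunk as input.

# One chunk in, one 1536-dimension vector out (illustrative direct call; Sim handles this itself)
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-ada-002", "input": "Pricing starts at $49 per seat per month."}'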

Author’s reflection: I used the cloud template during a live customer call. The demo deployed faster than I could explain the architecture—an instant credibility win.


2. Self-Hosted Playbook: Four Rigor Levels

Core question: How do I keep data on-prem while keeping the same canvas experience?
Answer: Pick one of four officially documented paths; each gives identical UI but progressively more control.

Path           | One-Line Command                                | When to Use
npm            | npx simstudio                                   | Laptop spike, zero local setup
Docker Compose | docker compose -f docker-compose.prod.yml up -d | Production server
Dev Container  | VS Code “Reopen in Container”                   | Team with mixed laptops
Manual         | bun install + Postgres                          | Full transparency, kernel-level tweaks

Below, every path keeps the same order: prerequisites → command → verification → mini-story.

2.1 npm One-Liner – “I Need a Demo in Ten”

Prereq: Docker Desktop running.
Command:

npx simstudio

Verification: Terminal prints → http://localhost:3000.
Story: A product manager once asked me for an AI summary bot five minutes before a call. I ran the command, dragged three nodes, and had a working Slack slash command before the Zoom link went live.

Author’s reflection: The npm path downloads :latest images—great for speed, risky for repeatability. After a surprise breaking change, I now pin tags in CI.
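
One mitigation, sketched below with an illustrative version number: pin the npm package, and for anything that has to be repeatable move to the Compose path in 2.2, where image tags sit explicitly in the file.

# Run a specific release of the CLI instead of whatever is latest (version illustrative;
# whether this also pins the images it pulls is worth verifying for your release)
npx simstudio@0.2.0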

2.2 Docker Compose – Server Room Favorite

Prereq: Git, Docker 20+, 8 GB RAM.
Commands:

git clone https://github.com/simstudioai/sim.git
cd sim
docker compose -f docker-compose.prod.yml up -d

Verification:

curl http://localhost:3000/health  # → {"status":"ok"}

Story: Our security team mandates in-house registry. We mirrored the Sim image to Harbor, changed the compose file tag, and rolled it through ArgoCD with zero UI difference but full audit trail.
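
The mirroring itself is a three-command affair. The source image reference, registry hostname, project, and tag below are all placeholders; copy the real image: line from docker-compose.prod.yml and your Harbor naming scheme.

# Mirror the upstream image into the in-house registry (all names and tags are placeholders)
docker pull ghcr.io/simstudioai/simstudio:latest
docker tag  ghcr.io/simstudioai/simstudio:latest harbor.example.internal/ai/simstudio:2025-06
docker push harbor.example.internal/ai/simstudio:2025-06
# Then edit docker-compose.prod.yml so the image: line points at the Harbor tag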

2.3 Dev Container – Goodbye “Works on My Machine”

Prereq: VS Code + Remote-Containers extension.
Steps:

  1. Clone repo; open folder.
  2. Accept “Reopen in Container” prompt.
  3. Inside container:
bun run dev:full   # starts Next.js + socket server

Story: A new intern pushed a workflow that used Node 20 builtins. My laptop had Node 18; the graph failed only for me. Switching to the dev container locked everyone to the same Bun and Node binaries—no more phantom bugs.
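
The same container definition also works outside VS Code, which is handy for CI. A minimal sketch using the reference devcontainers CLI:

# Build and start the project's dev container from a terminal (no VS Code required)
npm install -g @devcontainers/cli
devcontainer up --workspace-folder .
devcontainer exec --workspace-folder . bun run dev:full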

2.4 Manual Install – Maximum Tinker Factor

Prereq:

  • Bun runtime
  • Node.js ≥ 20 (sandboxed code execution)
  • PostgreSQL ≥ 12 with pgvector extension

Commands (abridged but functional):

# 1. DB
docker run --name simstudio-db \
  -e POSTGRES_PASSWORD=simpass \
  -e POSTGRES_DB=simstudio \
  -p 5432:5432 -d pgvector/pgvector:pg17

# 2. App
git clone https://github.com/simstudioai/sim.git
cd sim && bun install
cd apps/sim && cp .env.example .env
# Edit .env: DATABASE_URL, BETTER_AUTH_SECRET, BETTER_AUTH_URL, ENCRYPTION_KEY
bunx drizzle-kit migrate --config=./drizzle.config.ts
bun run dev          # terminal 1
bun run dev:sockets  # terminal 2
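
For reference, a filled-in .env that matches the database container above might look like the sketch below. The secrets are placeholders you generate yourself, and the exact variable set may differ between releases.

# apps/sim/.env (example values only; generate real secrets with: openssl rand -hex 32)
DATABASE_URL=postgresql://postgres:simpass@localhost:5432/simstudio
BETTER_AUTH_URL=http://localhost:3000
BETTER_AUTH_SECRET=replace-with-openssl-rand-hex-32-output
ENCRYPTION_KEY=replace-with-openssl-rand-hex-32-output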

Story: I wanted to swap the sandbox driver from E2B to a home-grown Firecracker micro-VM. Manual install let me mount my own kernel—impossible inside the pre-built image.


3. Knowledge Base in Practice: PDF → Vectors → Answers

Core question: How do I make the agent answer ONLY from my documents?
Answer: Upload to a Knowledge Collection, wire the “Knowledge Retrieval” node, set top-k ≤ 4.

3.1 Detailed Walk-Through

  1. Sidebar → Knowledge → New Collection → name “HR Handbook”.
  2. Drag handbook_2025.pdf (82 pages).
  3. In workflow canvas:

    • Add Knowledge Retrieval → select collection → top-k 4 → max chunks 1000 tokens.
    • Add LLM → system prompt:

      Answer using the following context. Cite page numbers.
      Context: {{knowledge}}
      Question: {{userQuestion}}
      
  4. Connect: HTTP Input → Knowledge Retrieval → LLM → HTTP Output.
  5. Run; type “How many annual leave days?”; receive “15 days (page 38).”
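
Once deployed, the same workflow is callable from the command line. The endpoint path and payload field below are assumptions for illustration only; copy the real URL and field names from the HTTP Input block's details panel.

# Hypothetical endpoint and payload; take the real values from the HTTP Input block
curl -X POST http://localhost:3000/api/workflows/<workflow-id>/execute \
  -H "Content-Type: application/json" \
  -d '{"userQuestion": "How many annual leave days?"}'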

3.2 Mechanics Recap

  • Parser: pdf-parse extracts text with position metadata.
  • Chunker: sliding 512-token windows, 64-token overlap.
  • Embedder: text-embedding-ada-002 1536-dim vectors stored in pgvector.
  • Retrieval: cosine similarity + metadata filtering by doc ID.
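
Conceptually, the retrieval step boils down to a nearest-neighbour query in pgvector. The sketch below uses made-up table and column names (Sim's real schema differs) and a truncated vector literal where the embedded question would go.

# Illustrative only: table, columns, and the vector literal are placeholders
docker exec -it simstudio-db psql -U postgres -d simstudio -c "
  SELECT chunk_text, page_number
  FROM knowledge_chunks
  WHERE document_id = 'hr-handbook-2025'
  ORDER BY embedding <=> '[0.12, -0.03, ...]'   -- <=> is pgvector's cosine-distance operator
  LIMIT 4;"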

Author’s reflection: I initially set top-k 10 and watched the LLM ignore the middle chunks. Dialing back to 4 improved accuracy and saved ~30 % tokens—cheap and effective.


4. Local LLM Options: Ollama vs. vLLM

Core question: Can I run this stack without an OpenAI API key?
Answer: Yes—use Ollama for laptops or vLLM for high-QPS clusters; both speak OpenAI-compatible endpoints.

4.1 Ollama Route – GPU Laptop Friendly

# Pulls gemma3:4b (~3 GB) automatically
docker compose -f docker-compose.ollama.yml --profile setup up -d

Access localhost:3000; model dropdown shows gemma3:4b; latency ~600 ms on RTX 4060.
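
If the dropdown stays empty, hit the Ollama API directly before blaming Sim; both endpoints below are part of Ollama's standard HTTP API.

# List pulled models; gemma3:4b should appear
curl -s http://localhost:11434/api/tags
# One-shot generation to confirm the model actually answers
curl -s http://localhost:11434/api/generate -d '{"model": "gemma3:4b", "prompt": "ping", "stream": false}'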

Story: At a client site with an air-gap requirement, we plugged a mobile workstation into the local switch; the demo ran entirely on the internal network, and IT gave us a thumbs-up before lunch.

4.2 vLLM Route – Data-Center Scale

If you already run vLLM, export:

VLLM_BASE_URL=http://vllm-cluster:8000
VLLM_API_KEY=optional

Sim treats it as OpenAI—no further code change.
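
A quick smoke test from any machine that can reach the cluster confirms the wiring; the model name is whatever your vLLM deployment serves and is purely illustrative here.

# vLLM exposes the OpenAI chat-completions API; model name is deployment-specific
curl http://vllm-cluster:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "ping"}]}'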

Story: During the Double-11 shopping surge, QPS leapt from 200 to 2000. We horizontally scaled vLLM to four A100s and updated the load balancer in Sim’s UI; the switch took one minute with zero downtime.


5. Troubleshooting Field Guide

Symptom          | Root Cause                 | One-Line Fix
Model list empty | Container localhost ≠ host | OLLAMA_URL=http://host.docker.internal:11434
Port clash       | 3000 taken                 | npx simstudio -p 3100
Migration fails  | Postgres not healthy       | Wait 10 s then re-run migrate
Chat hangs       | Socket port blocked        | Open port 3002 in firewall

Author’s reflection: On native Linux I forgot extra_hosts and spent two hours reading Ollama logs before spotting the DNS mismatch. Docker Desktop hides this; bare metal does not.
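
For the record, the bare-metal fix is small. The container name below is an assumption, and the bridge IP is Docker's default; adjust both to your setup.

# Native Linux has no host.docker.internal by default; map it when the container starts:
#   docker run --add-host=host.docker.internal:host-gateway ...   (compose: an extra_hosts entry)
# Or point Sim at the default docker0 bridge address instead:
OLLAMA_URL=http://172.17.0.1:11434
# Verify from inside the container (container name assumed):
docker exec simstudio-app curl -s http://host.docker.internal:11434/api/tags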


6. Author’s Reflection: Why I Moved Back to Dev Containers

npx gave me speed, but a CI pipeline broke after Bun’s point-release changed SQL generation order. Locking the entire toolchain inside a dev container eliminated “it works here” surprises. Moral: prototype fast, lock envs early.


7. Action Checklist / Implementation Steps

  1. Pick hosting style:

    • ≤ 1 user, spike demo → npx simstudio
    • Production → docker compose -f docker-compose.prod.yml up -d
    • Team dev → Dev Container
  2. Ensure ports 3000, 3002, 5432 are free (or remap).
  3. Set required env vars: DATABASE_URL, BETTER_AUTH_SECRET, BETTER_AUTH_URL, ENCRYPTION_KEY.
  4. Run migrations before first launch.
  5. Upload documents to Knowledge; keep top-k ≤ 4 for best accuracy.
  6. For offline: start Ollama profile or point VLLM_BASE_URL to cluster.
  7. Commit workflow JSON to Git for version control.

8. One-Page Overview

  • Sim Studio = visual canvas + runtime + queue + knowledge base in one repo.
  • Cloud: 3-minute onboarding, fully managed.
  • Self-host: four paths from npx to bare-metal; identical UI.
  • Knowledge: auto-chunk → embed → retrieve; answers cite pages.
  • Local LLM: plug Ollama for laptops, vLLM for scale; both via OpenAI-compatible API.
  • Troubleshoot: check OLLAMA_URL, port 3002, DB health.

9. FAQ – Eight Common Questions

Q1: Is Sim Studio open-source?
A: Yes, Apache 2.0; Copilot sub-service needs an API key from sim.ai.

Q2: Which models work?
A: Any OpenAI-compatible endpoint—OpenAI, Groq, Ollama, vLLM, etc.

Q3: Must I use PostgreSQL?
A: Yes; the pgvector extension is required for embeddings.

Q4: File size limits?
A: Cloud 50 MB per file; self-host configurable via MAX_FILE_SIZE.

Q5: Can I export workflows?
A: Yes, as JSON, including node positions and credential references.

Q6: Does it support cron jobs?
A: Yes, Trigger.dev backend enables scheduled flows.

Q7: Multi-user real-time editing?
A: Partial—optimistic locking for now; live collaboration is on the roadmap.

Q8: Where are the logs?
A: apps/sim/logs inside the container, or via docker logs simstudio-app.