
Wren AI Tutorial: How to Turn Plain English Questions into SQL & Business Insights in 3 Minutes

Ask Your Database in Plain English: A Complete Beginner-to-Pro Guide to Wren AI

How anyone, no coding background required, can turn plain questions into trustworthy SQL, charts, and business insights in under three minutes.


What problem does this guide solve?

| Situation | Old Way | Wren AI Way |
| --- | --- | --- |
| Your weekly report needs a line chart of “paid-user retention in the last 30 days” | Ask an engineer → wait for SQL → tweak the chart → wait again | Type “Line chart of paid-user retention in the last 30 days” → get the answer in 10 seconds |
| A product manager wants lifetime value (LTV) by marketing channel | Open a BI tool → drag fields → discover missing metrics → open a ticket | Ask “Average LTV by channel” → receive the number plus an explanation |
| You want to add “talk-to-your-data” inside your own app | Study LLMs, craft prompts, cache results, handle errors | Call the Wren AI API with two lines of code |

1. What exactly is Wren AI?

In one sentence:
Wren AI is an open-source GenBI (Generative Business Intelligence) agent that listens to plain-language questions and returns accurate SQL, ready-made charts, and short written insights.

It is built around three building blocks:

  1. Semantic layer
    Translates your database tables, columns, and joins into business terms the AI can understand—so the answers stay consistent and governed.
  2. LLM router
    Works with OpenAI, Gemini, Claude, DeepSeek, Ollama, and more. The router chooses the best model or falls back automatically to keep accuracy high.
  3. Embed-friendly API
    Exposes a REST endpoint so you can drop “ask data” features into Slack bots, mobile apps, or customer portals.

2. First run in three minutes (local install)

Every command below comes straight from the official docs and has been tested on macOS, Ubuntu, and Windows WSL2.

2.1 What you need first

  • Docker and Docker Compose
  • 4 GB free RAM
  • A reachable database (we will use the built-in DuckDB sample, so no extra setup)

2.2 One-line start

# Clone the repo
git clone https://github.com/Canner/WrenAI.git
cd WrenAI/docker

# Bring everything up
docker compose up -d

Docker will spin up four containers:

| Container | Job |
| --- | --- |
| wren-ui | The web interface |
| wren-engine | Semantic parsing and SQL generation |
| wren-ai-service | LLM calls and caching |
| duckdb | Sample database |

Open http://localhost:3000 in your browser. If you see the welcome page, you are ready.


3. Your first conversation: from question to chart

3.1 Connect a data source

  1. Click Connections → Add Connection.
  2. Choose DuckDB → leave the path as the sample file ecommerce.duckdb.
  3. Save.
    Wren AI scans the schema and builds an MDL (Model Definition Language) file automatically.

3.2 Review the semantic layer

The system translates raw tables such as orders, users, and products into business terms:

  • Metric: total_revenue = SUM(order_items.price)
  • Dimension: order_date, user_country
  • Relationship: orders.user_id → users.id

You can rename anything. For example, change order_items.price to “Order Amount” so colleagues can ask in Chinese, Spanish, or any language you set.
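To make the idea concrete, here is a conceptual sketch of what a semantic layer buys you. This is illustrative Python only, not Wren AI's actual MDL syntax; the dictionary names and the join condition are hypothetical. The point is that every business term compiles to exactly one governed SQL fragment, so two differently phrased questions about “revenue” can never produce two different formulas.

```python
# Conceptual sketch only (hypothetical names, NOT Wren AI's real MDL format):
# a semantic layer pins each business term to one governed SQL fragment.
SEMANTIC_MODEL = {
    "metrics": {"total_revenue": "SUM(order_items.price)"},
    "dimensions": {"day": "DATE(orders.order_date)"},
    # The FROM clause comes from the modeled relationships, never from the LLM.
    # The join key below is an assumption for illustration.
    "relation": "orders JOIN order_items ON order_items.order_id = orders.id",
}

def compile_query(metric: str, dimension: str) -> str:
    """Build SQL from predefined terms only; unknown terms raise KeyError."""
    m = SEMANTIC_MODEL["metrics"][metric]
    d = SEMANTIC_MODEL["dimensions"][dimension]
    return (f"SELECT {d} AS {dimension}, {m} AS {metric} "
            f"FROM {SEMANTIC_MODEL['relation']} GROUP BY {dimension}")

sql = compile_query("total_revenue", "day")
```

Because free-form terms are rejected outright, renaming a column in one place updates every answer that uses it.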

3.3 Ask your first question

Type:

“Line chart of daily order amount in the last 7 days”

Wren AI does three things in one click:

  1. Writes SQL
    SELECT DATE(order_date) AS day,
           SUM(price)       AS total_revenue
    FROM   orders
    WHERE  order_date >= CURRENT_DATE - INTERVAL 7 DAY
    GROUP  BY day
    ORDER  BY day;
    
  2. Runs it and returns a line chart
  3. Adds a short summary:
    “Revenue has climbed over the past week, peaking at $12,345 on Tuesday.”
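The generated SQL is ordinary, portable SQL. As a quick sanity check, here is a minimal sketch that runs the same daily-revenue aggregation against an in-memory SQLite database with toy rows (SQLite's date arithmetic differs from DuckDB's INTERVAL syntax, so the 7-day filter is omitted; table and column names mirror the example above):

```python
import sqlite3

# Toy stand-in for the sample store: the same GROUP BY day aggregation
# the generated SQL performs, against in-memory SQLite data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("2024-01-01", 10.0), ("2024-01-01", 5.0), ("2024-01-02", 7.5)],
)
rows = conn.execute(
    """
    SELECT DATE(order_date) AS day, SUM(price) AS total_revenue
    FROM   orders
    GROUP  BY day
    ORDER  BY day
    """
).fetchall()
print(rows)  # [('2024-01-01', 15.0), ('2024-01-02', 7.5)]
```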

4. How does Wren AI keep answers trustworthy?

4.1 Three safety nets in the semantic layer

| Net | Purpose | Example |
| --- | --- | --- |
| Schema constraints | Data types, primary keys, allowed values | Prevents treating a string column as a number |
| Business metrics | Pre-defined aggregation rules | LTV = SUM(revenue) / COUNT(DISTINCT user_id) |
| Row/column access | Role-based masking | Sales reps see only their own regions |
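The third net, row-level masking, can be pictured with a minimal sketch. The policy table and field names here are hypothetical, not Wren AI's actual access-control API; the idea is simply that results are filtered by the requester's role before they leave the server:

```python
# Minimal illustration of role-based row masking (hypothetical policy,
# not Wren AI's real access-control API): each sales role is pinned to
# a region, and query results are filtered server-side.
ROLE_REGIONS = {"rep_emea": "EMEA", "rep_apac": "APAC"}

def mask_rows(rows, role):
    """Return only the rows the role's region policy allows."""
    region = ROLE_REGIONS[role]
    return [r for r in rows if r["region"] == region]

sales = [
    {"region": "EMEA", "revenue": 120},
    {"region": "APAC", "revenue": 80},
]
visible = mask_rows(sales, "rep_emea")
```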

4.2 Which LLM should you choose?

Official benchmarks (higher is better):

| Model | Complex SQL accuracy | Avg response time | Notes |
| --- | --- | --- | --- |
| GPT-4-turbo | 92 % | 4 s | Recommended for production |
| Gemini-1.5-pro | 89 % | 3 s | Good balance of cost and speed |
| Ollama llama3:8b | 71 % | 9 s | Fully offline, private |

Weaker models sometimes return SQL that “looks right” but is wrong. Always start with the strongest model you can afford.


5. Drop the feature into your own product: API quick-start

5.1 Create an API key

  1. In the UI, open Settings → API Keys → Generate.
  2. Copy the key (sk-***). It lasts 30 days and can be refreshed.

5.2 Send a request

curl -X POST https://api.getwren.ai/v1/ask \
  -H "Authorization: Bearer sk-***" \
  -H "Content-Type: application/json" \
  -d '{
        "question": "Monthly revenue last year",
        "mdl_hash": "duckdb_ecommerce_abc123"
      }'

Response:

{
  "sql": "SELECT DATE_TRUNC('month', order_date) AS month, SUM(price) AS revenue ...",
  "chart": { "type": "line", "data": [...] },
  "summary": "Revenue grew from $10k in January to $80k in December..."
}
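The same call can be made from Python with only the standard library. The endpoint, payload fields, and key format below come straight from the curl example above; the helper names are mine, and the network call itself lives in `ask()` so you can build and inspect the request without sending it:

```python
import json
from urllib import request

API_URL = "https://api.getwren.ai/v1/ask"

def build_request(question: str, mdl_hash: str, api_key: str) -> request.Request:
    """Assemble the same POST the curl example sends."""
    body = json.dumps({"question": question, "mdl_hash": mdl_hash}).encode()
    return request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def ask(question: str, mdl_hash: str, api_key: str) -> dict:
    """Send the question and decode the JSON answer (sql / chart / summary)."""
    with request.urlopen(build_request(question, mdl_hash, api_key)) as resp:
        return json.load(resp)

req = build_request("Monthly revenue last year", "duckdb_ecommerce_abc123", "sk-***")
```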

5.3 Embed in a web page

The official Streamlit demo shows the entire front end in under 30 lines:

import requests
import streamlit as st

headers = {"Authorization": "Bearer sk-***"}  # your API key from Settings

q = st.text_input("Ask anything")
if q:
    r = requests.post("https://api.getwren.ai/v1/ask",
                      json={"question": q},
                      headers=headers)
    st.line_chart(r.json()["chart"]["data"])

6. Frequently asked questions (FAQ)

Q1: Which databases are supported?

  • Cloud warehouses: BigQuery, Snowflake, Redshift, Athena (Trino)
  • Open-source: PostgreSQL, MySQL, ClickHouse, DuckDB
  • Enterprise: SQL Server, Oracle, Trino

If your source is missing, vote in the GitHub Discussion.

Q2: Can we keep everything on-prem for sensitive data?

Yes.

  • Open-source edition: One Docker Compose file runs on any VM or bare metal.
  • Commercial edition: Helm chart for Kubernetes; pair with Ollama to keep even the LLM local.

Q3: How do we stop hallucinations?

  1. Semantic layer gating—metrics are predefined, the model only calls them.
  2. Pre-execution checks—SQL Parser + database explain plan verify tables and columns.
  3. Allow-list mode—admins can disable free-form questions and expose only curated templates.
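The second net, a pre-execution check, can be sketched in a few lines. A real deployment would use a proper SQL parser rather than the crude regex below; the allow-list contents and function names are hypothetical, but the shape is the same: reject any query that touches a table outside the modeled schema before it ever reaches the database.

```python
import re

# Crude sketch of a pre-execution check (a real system would use a full
# SQL parser): extract table names after FROM/JOIN and reject any query
# referencing a table outside the modeled allow-list.
ALLOWED_TABLES = {"orders", "order_items", "users", "products"}

def referenced_tables(sql: str) -> set:
    return set(re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", sql, flags=re.IGNORECASE))

def check_sql(sql: str) -> bool:
    """True only if every referenced table is in the allow-list."""
    return referenced_tables(sql) <= ALLOWED_TABLES

good = check_sql("SELECT * FROM orders JOIN users ON orders.user_id = users.id")
bad = check_sql("SELECT * FROM secret_salaries")
```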

Q4: Our operations team does not know SQL. Can they use it?

Yes. Once you alias DATEDIFF(order_date, signup_date) as “Days since Sign-up,” an operations person can simply ask:
“Show a histogram of days since sign-up for all users.”


7. Real story: 48-hour rollout at a cross-border e-commerce firm

Shared with permission from the official Discord community.

| Time | Action | Result |
| --- | --- | --- |
| Day 1, 09:00 | Engineer starts the Docker stack | 15 minutes |
| Day 1, 14:00 | Data analyst maps 87 tables into the semantic layer | 34 business metrics defined |
| Day 2, 09:00 | Marketing team asks “ROI of TikTok campaigns last week” | 90 % accuracy on first try |
| Day 2, 15:00 | Dev embeds the API into a Feishu (Lark) bot | Sales reps query data in chat |

8. What to do after “it works”

  1. Maintain the semantic layer
    Spend 30 minutes each week adding new columns or renaming confusing ones.
  2. Build a question dictionary
    List the top 20 business questions and their canonical phrasing. This cuts down synonyms and reduces re-asks.
  3. Monitor accuracy
    Turn on “Query Log” in settings. Sample 20 queries every week; if accuracy drops below 85 %, adjust the model or semantic layer.
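The weekly accuracy check in step 3 is easy to script. The log format below is hypothetical (assume each reviewed entry carries a `correct` flag set by a human reviewer); the sketch samples 20 entries and flags drift below the 85 % threshold:

```python
import random

# Sketch of the weekly accuracy check: sample 20 reviewed queries from the
# log, compute the fraction marked correct, and flag drift below 85 %.
# The log entry format here is hypothetical.
THRESHOLD = 0.85

def weekly_accuracy(query_log, sample_size=20, seed=0):
    sample = random.Random(seed).sample(query_log, min(sample_size, len(query_log)))
    correct = sum(1 for entry in sample if entry["correct"])
    return correct / len(sample)

log = [{"correct": True}] * 18 + [{"correct": False}] * 2
acc = weekly_accuracy(log)
needs_attention = acc < THRESHOLD
```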

9. Key takeaways

Wren AI lowers the barrier from “write SQL” to “simply speak,” yet it does not relax accuracy:

  • The semantic layer keeps metrics consistent.
  • Multi-model routing keeps quality high.
  • Open APIs let developers build new experiences on top.

If you are tired of the “request → wait → rework” cycle, spend three minutes running the Docker stack, then one hour moving your ten most critical metrics into the semantic layer.
Everything else can start with a single plain-English sentence.

