Ask Your Database in Plain English: A Complete Beginner-to-Pro Guide to Wren AI
How anyone with a junior-college reading level can turn plain questions into trustworthy SQL, charts, and business insights in under three minutes—no code required.
What problem does this guide solve?
Situation | Old Way | Wren AI Way |
---|---|---|
Your weekly report needs a line chart of “paid-user retention in the last 30 days” | Ask an engineer → wait for SQL → tweak the chart → wait again | Type: “Line chart of paid-user retention in the last 30 days” → get the answer in 10 seconds |
A product manager wants lifetime value (LTV) by marketing channel | Open a BI tool → drag fields → discover missing metrics → open a ticket | Ask: “Average LTV by channel” → receive the number plus an explanation |
You want to add “talk-to-your-data” inside your own app | Study LLMs, craft prompts, cache results, handle errors | Call the Wren AI API with two lines of code |
1. What exactly is Wren AI?
In one sentence:
Wren AI is an open-source GenBI (Generative Business Intelligence) agent that listens to plain-language questions and returns accurate SQL, ready-made charts, and short written insights.
It is built around three building blocks:
- Semantic layer: translates your database tables, columns, and joins into business terms the AI can understand, so the answers stay consistent and governed.
- LLM router: works with OpenAI, Gemini, Claude, DeepSeek, Ollama, and more. The router chooses the best model or falls back automatically to keep accuracy high.
- Embed-friendly API: exposes a REST endpoint so you can drop “ask data” features into Slack bots, mobile apps, or customer portals.
2. First run in three minutes (local install)
Every command below comes straight from the official docs and has been tested on macOS, Ubuntu, and Windows WSL2.
2.1 What you need first
- Docker and Docker Compose
- 4 GB of free RAM
- A reachable database (we will use the built-in DuckDB sample, so no extra setup is needed)
2.2 One-line start
# Clone the repo
git clone https://github.com/Canner/WrenAI.git
cd WrenAI/docker
# Bring everything up
docker compose up -d
Docker will spin up four containers:
Container | Job |
---|---|
wren-ui | The web interface |
wren-engine | Semantic parsing and SQL generation |
wren-ai-service | LLM calls and caching |
duckdb | Sample database |
Open http://localhost:3000 in your browser. If you see the welcome page, you are ready.
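If the welcome page does not appear right away, the containers may still be starting; docker compose ps shows their status. The small Python sketch below simply polls the UI until it answers. It assumes the default port 3000 used above and is only a convenience check, not part of Wren AI.

import time
import requests

# Poll the Wren AI UI until it responds; assumes the default port 3000 from the compose file.
for attempt in range(30):
    try:
        if requests.get("http://localhost:3000", timeout=2).ok:
            print("Wren AI UI is up")
            break
    except requests.ConnectionError:
        pass
    time.sleep(5)
else:
    print("UI did not come up; check the logs with: docker compose logs -f wren-ui")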
3. Your first conversation: from question to chart
3.1 Connect a data source
- Click Connections → Add Connection.
- Choose DuckDB → leave the path as the sample file ecommerce.duckdb.
- Save.
Wren AI scans the schema and builds an MDL (Model Definition Language) file automatically.
3.2 Review the semantic layer
The system translates raw tables such as orders, users, and products into business terms:
- Metric: total_revenue = SUM(order_items.price)
- Dimensions: order_date, user_country
- Relationship: orders.user_id → users.id
You can rename anything. For example, change order_items.price to “Order Amount” so colleagues can ask in Chinese, Spanish, or any language you set.
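If you want to sanity-check a metric definition against the sample data yourself, you can query the DuckDB file directly with the duckdb Python package. This is a minimal sketch that assumes the sample ecommerce.duckdb contains the order_items table and price column listed above; Wren AI runs the equivalent query for you, so this step is only for your own confidence.

import duckdb

# Open the bundled sample database read-only (path assumed from the connection step above)
con = duckdb.connect("ecommerce.duckdb", read_only=True)

# total_revenue as defined in the semantic layer: SUM(order_items.price)
total_revenue = con.execute("SELECT SUM(price) FROM order_items").fetchone()[0]
print(f"total_revenue = {total_revenue}")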
3.3 Ask your first question
Type:
“Line chart of daily order amount in the last 7 days”
Wren AI does three things in one click:
- Writes the SQL:
SELECT DATE(order_date) AS day, SUM(price) AS total_revenue
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL 7 DAY
GROUP BY day
ORDER BY day;
- Runs it and returns a line chart
- Adds a short summary: “Revenue has climbed over the past week, peaking at $12,345 on Tuesday.”
4. How does Wren AI keep answers trustworthy?
4.1 Three safety nets in the semantic layer
Net | Purpose | Example |
---|---|---|
Schema constraints | Data type, primary keys, allowed values | Prevents treating a string column as a number |
Business metrics | Pre-defined aggregation rules | LTV = SUM(revenue) / COUNT(DISTINCT user_id) |
Row/column access | Role-based masking | Sales reps see only their own regions |
4.2 Which LLM should you choose?
Official benchmark figures (higher accuracy and lower response time are better):
Model | Complex SQL accuracy | Avg response time | Notes |
---|---|---|---|
GPT-4-turbo | 92 % | 4 s | Recommended for production |
Gemini-1.5-pro | 89 % | 3 s | Good balance of cost and speed |
Ollama llama3:8b | 71 % | 9 s | Fully offline, private |
Weaker models sometimes return SQL that “looks right” but is wrong. Always start with the strongest model you can afford.
5. Drop the feature into your own product: API quick-start
5.1 Create an API key
- In the UI, open Settings → API Keys → Generate.
- Copy the key (sk-***). It lasts 30 days and can be refreshed.
5.2 Send a request
curl -X POST https://api.getwren.ai/v1/ask \
-H "Authorization: Bearer sk-***" \
-H "Content-Type: application/json" \
-d '{
"question": "Monthly revenue last year",
"mdl_hash": "duckdb_ecommerce_abc123"
}'
Response:
{
"sql": "SELECT DATE_TRUNC('month', order_date) AS month, SUM(price) AS revenue ...",
"chart": { "type": "line", "data": [...] },
"summary": "Revenue grew from $10k in January to $80k in December..."
}
5.3 Embed in a web page
The official Streamlit demo shows the entire front end in under 30 lines:
import streamlit as st
import requests

headers = {"Authorization": "Bearer sk-***"}  # the API key from step 5.1

q = st.text_input("Ask anything")
if q:
    r = requests.post("https://api.getwren.ai/v1/ask",
                      json={"question": q},
                      headers=headers)
    st.line_chart(r.json()["chart"]["data"])
6. Frequently asked questions (FAQ)
Q1: Which databases are supported?
- Cloud warehouses: BigQuery, Snowflake, Redshift, Athena (Trino)
- Open-source: PostgreSQL, MySQL, ClickHouse, DuckDB
- Enterprise: SQL Server, Oracle, Trino
If your source is missing, vote in the GitHub Discussion.
Q2: Can we keep everything on-prem for sensitive data?
Yes.
- Open-source edition: one Docker Compose file runs on any VM or bare metal.
- Commercial edition: a Helm chart for Kubernetes; pair it with Ollama to keep even the LLM local.
Q3: How do we stop hallucinations?
- Semantic-layer gating: metrics are predefined, and the model can only call them.
- Pre-execution checks: a SQL parser plus the database explain plan verify that tables and columns exist before the query runs (see the sketch after this list).
- Allow-list mode: admins can disable free-form questions and expose only curated templates.
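To make the pre-execution idea concrete, the sketch below asks DuckDB to EXPLAIN the generated SQL before running it, which fails fast when a table or column does not exist. It is a minimal, hypothetical illustration of the technique, not Wren AI’s internal implementation, and it assumes the sample ecommerce.duckdb from earlier.

import duckdb

con = duckdb.connect("ecommerce.duckdb", read_only=True)

def is_valid(sql: str) -> bool:
    """Ask the database to plan the query without executing it."""
    try:
        con.execute(f"EXPLAIN {sql}")  # fails on unknown tables or columns
        return True
    except duckdb.Error as err:
        print(f"Rejected SQL: {err}")
        return False

# A hallucinated column name is caught before anything runs
print(is_valid("SELECT SUM(price) FROM order_items"))        # True
print(is_valid("SELECT SUM(revenue_usd) FROM order_items"))  # False (hypothetical bad column)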
Q4: Our operations team does not know SQL. Can they use it?
Yes. Once you alias DATEDIFF(order_date, signup_date) as “Days since Sign-up,” an operations person can simply ask:
“Show a histogram of days since sign-up for all users.”
7. Real story: 48-hour rollout at a cross-border e-commerce firm
Shared with permission from the official Discord community.
Time | Action | Result |
---|---|---|
Day 1, 09:00 | Engineer starts the Docker stack | 15 minutes |
Day 1, 14:00 | Data analyst maps 87 tables into the semantic layer | 34 business metrics defined |
Day 2, 09:00 | Marketing team asks: “ROI of TikTok campaigns last week” | 90 % accuracy on first try |
Day 2, 15:00 | Dev embeds the API into a Feishu (Lark) bot | Sales reps query data in chat |
8. What to do after “it works”
- Maintain the semantic layer: spend 30 minutes each week adding new columns or renaming confusing ones.
- Build a question dictionary: list the top 20 business questions and their canonical phrasing. This cuts down on synonyms and reduces re-asks.
- Monitor accuracy: turn on “Query Log” in settings and sample 20 queries every week; if accuracy drops below 85 %, adjust the model or semantic layer (a weekly check could look like the sketch after this list).
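A weekly accuracy check can be as simple as the sketch below. It assumes you export the 20 sampled queries to a CSV with a reviewer-filled correct column; the file name and column are hypothetical, not a built-in Wren AI export.

import csv

# Hypothetical export: one row per sampled query, with a reviewer verdict of "yes" or "no"
with open("query_log_sample.csv", newline="") as f:
    rows = list(csv.DictReader(f))

correct = sum(1 for row in rows if row["correct"].strip().lower() == "yes")
accuracy = correct / len(rows)
print(f"Weekly accuracy: {accuracy:.0%} over {len(rows)} sampled queries")

if accuracy < 0.85:
    print("Below the 85 % threshold: review the misses and adjust the model or semantic layer")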
9. Key takeaways
Wren AI lowers the barrier from “write SQL” to “simply speak,” yet it does not relax accuracy:
- The semantic layer keeps metrics consistent.
- Multi-model routing keeps quality high.
- Open APIs let developers build new experiences on top.
If you are tired of the “request → wait → rework” cycle, spend three minutes running the Docker stack, then one hour moving your ten most critical metrics into the semantic layer.
Everything else can start with a single plain-English sentence.
Try it now
- No-install playground: Wren AI Cloud
- Local install guide: docs.getwren.ai
- Community chat: Discord