Getting Started with the Tavily MCP Load Balancer
A practical guide for developers who want to spread API traffic across many keys without touching a single line of load-balancing logic
By the end of this guide you will be able to:
- Spin up a local load balancer in under ten minutes
- Add, remove, or disable Tavily API keys without downtime
- Call search, crawl, extract, and map endpoints through either SSE or plain stdio
- Read real-time dashboards that tell you which key is healthy, which is resting, and which has retired itself
Table of Contents
- Why Multiple API Keys Matter
- What the Tavily MCP Load Balancer Actually Does
- Ten-Minute Quick-Start: Install, Configure, Launch
- The Five Built-In Tools—With Copy-Paste JSON Examples
- Monitoring & Diagnostics: One Glance Is Enough
- Frequently Asked Questions (FAQ)
- Architecture at 1,000 Feet: Round-Robin, Fail-Over, Health Checks
- Developer Tips: Hot Reload, Debugging, Testing
- Next Steps & Further Reading
1. Why Multiple API Keys Matter
Imagine ordering a ride during rush hour with only one car in the entire fleet. The same bottleneck happens when you rely on a single Tavily API key:
- Rate limits return HTTP 429
- Single-point failure brings your service down the moment the key expires
- Zero elasticity prevents you from handling traffic spikes
A pool of keys plus a load balancer turns that single car into a coordinated fleet:
- Requests are distributed so no key is overwhelmed
- A dead key is automatically parked until it recovers
- New keys can join the fleet at runtime without configuration gymnastics
2. What the Tavily MCP Load Balancer Actually Does
In plain English, the load balancer sits between your application and the Tavily servers and performs three jobs:
Job | How It Helps You |
---|---|
Round-robin scheduling | Sends each request to the next available key in line |
Fail-over | Removes a key from rotation after five consecutive errors and tries to revive it later |
Real-time metrics | Exposes JSON and CLI dashboards so you can see which keys are active, disabled, or recently revived |
3. Ten-Minute Quick-Start: Install, Configure, Launch
3.1 Install Dependencies
git clone <repository-url>
cd tavily-mcp-loadbalancer
npm install
3.2 Provide Your Keys
Copy the template file and open it in your editor:
cp .env.example .env
# open .env
Add keys as a comma-separated list:
TAVILY_API_KEYS=tvly-dev-key1,tvly-dev-key2,tvly-dev-key3
Only one key? You can use TAVILY_API_KEY=tvly-dev-single, but you will lose the balancing benefits.
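To make the two variables concrete, here is a minimal sketch of how a comma-separated key list could be parsed, with the single-key variable as a fallback. The helper name parseApiKeys and the trimming and de-duplication steps are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical helper: the real server may parse its environment differently.
type Env = Record<string, string | undefined>;

function parseApiKeys(env: Env): string[] {
  // Prefer the comma-separated list; fall back to the single-key variable.
  const raw = env.TAVILY_API_KEYS || env.TAVILY_API_KEY || "";
  const keys = raw
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
  // Keep only the first occurrence of each key so none is counted twice.
  return keys.filter((key, position) => keys.indexOf(key) === position);
}
```

In a Node.js process this would be called as parseApiKeys(process.env). Ignoring empty entries means a trailing comma in .env does not create a blank key.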
3.3 Launch the Server
The fastest route:
npm run build-and-start
Terminal output:
✅ Server listening on http://0.0.0.0:60002
SSE endpoint: /sse
Message endpoint: /message
Open http://localhost:60002/sse in any browser to confirm the connection is alive.
4. The Five Built-In Tools—With Copy-Paste JSON Examples
All examples can be sent to either:
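- The stdio interface: node dist/index.js
- The HTTP interface: POST http://localhost:60002/message (0.0.0.0 is the address the server binds to; clients should connect via localhost or the host's real address)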
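An MCP client library normally builds the wire message for you. If you send requests by hand, the name/arguments objects shown in this section are, assuming the server follows standard MCP framing, the params of a JSON-RPC 2.0 tools/call request. The sketch below shows that wrapping; the function name toJsonRpc is made up for illustration.

```typescript
// The tool-call payload in the shape used by the examples in this guide.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope used by MCP for tool invocations.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

// Hypothetical helper: wraps a tool call in the envelope.
function toJsonRpc(id: number, call: ToolCall): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: call };
}

// Example: the stats tool takes no arguments.
const statsRequest = toJsonRpc(1, { name: "tavily_get_stats", arguments: {} });
const wireMessage = JSON.stringify(statsRequest);
```

Over stdio, each message is written as a single line of JSON. Over SSE, clients usually open the /sse stream first and then POST messages to the message endpoint it announces, so a bare POST without an active session may be rejected.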
4.1 tavily-search: Plain Web Search
Task: Find recent articles on “OpenAI GPT-4”.
{
  "name": "tavily-search",
  "arguments": {
    "query": "OpenAI GPT-4 latest paper",
    "search_depth": "basic",
    "topic": "general",
    "max_results": 5
  }
}
4.2 tavily-extract: Pull Clean Content from URLs
Task: Convert a long blog post into Markdown.
{
  "name": "tavily-extract",
  "arguments": {
    "urls": ["https://example.com/long-blog"],
    "extract_depth": "basic"
  }
}
4.3 tavily-crawl: Spider a Small Site
Task: Retrieve the first two levels of a documentation site.
{
  "name": "tavily-crawl",
  "arguments": {
    "url": "https://docs.example.com",
    "max_depth": 2,
    "limit": 50
  }
}
4.4 tavily-map: Build a Mini-Sitemap
Task: Audit SEO by collecting all reachable links.
{
  "name": "tavily-map",
  "arguments": {
    "url": "https://example.com",
    "max_depth": 3
  }
}
4.5 tavily_get_stats: Health Overview of the Key Pool
{
  "name": "tavily_get_stats",
  "arguments": {}
}
5. Monitoring & Diagnostics: One Glance Is Enough
5.1 CLI Dashboard
Run the helper script:
./manage.sh stats
Sample output:
📊 API Key Pool Report
========================================
Total keys: 3
Active keys: 2
🔑 Key: tvly-dev-T...
Status: 🟢 Active
Errors: 0/5
Last used: 2024-07-30T14:25:00.000Z
🔑 Key: tvly-dev-Y...
Status: 🔴 Disabled
Errors: 5/5
Last used: 2024-07-30T14:20:00.000Z
5.2 Programmatic Access
node check_stats_direct.cjs
Returns pure JSON—perfect for Grafana or custom dashboards.
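One way to feed a dashboard is to convert the per-key stats into the Prometheus text exposition format, which Grafana can chart through a Prometheus data source. The sketch below is illustrative only: the KeyStat shape and the metric names are assumptions, and the real JSON emitted by check_stats_direct.cjs may use different fields.

```typescript
// Hypothetical stats shape; adapt the field names to the actual JSON output.
interface KeyStat {
  key: string;
  active: boolean;
  errorCount: number;
}

// Show only a short prefix so full API keys never reach a dashboard.
function maskKey(key: string, visible: number = 10): string {
  return key.length <= visible ? key : key.slice(0, visible) + "...";
}

// Render two gauges per key. Label values are not escaped here, which is
// acceptable only because the keys contain no quotes or backslashes.
function toPrometheus(stats: KeyStat[]): string {
  const lines: string[] = [
    "# TYPE tavily_key_active gauge",
    ...stats.map((s) => `tavily_key_active{key="${maskKey(s.key)}"} ${s.active ? 1 : 0}`),
    "# TYPE tavily_key_errors gauge",
    ...stats.map((s) => `tavily_key_errors{key="${maskKey(s.key)}"} ${s.errorCount}`),
  ];
  return lines.join("\n") + "\n";
}
```

Masking to the first ten characters mirrors the truncated keys in the CLI report, so the labels stay recognizable without exposing a usable secret.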
6. Frequently Asked Questions (FAQ)
Question | Short Answer |
---|---|
Can I run with only one key? | Yes, but you lose load-balancing benefits. |
A key shows “Disabled.” Now what? | Fix the key (e.g., billing), then restart or wait for the automatic retry. |
How do I add more keys later? | Edit .env, then restart the service so the new keys are loaded. |
Port 60002 is already in use. | Change SUPERGATEWAY_PORT in .env and relaunch. |
How can I test all tools quickly? | ./manage.sh test runs a suite against every endpoint. |
7. Architecture at 1,000 Feet: Round-Robin, Fail-Over, Health Checks
7.1 Round-Robin Scheduling
Keys are stored in an array. Each incoming request uses the next index:
index = (index + 1) % keys.length
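The formula above can be sketched as a small picker. Skipping disabled keys is an assumption based on the fail-over description in this guide, and the KeySlot and RoundRobin names are hypothetical.

```typescript
// Hypothetical slot type: one entry per configured API key.
interface KeySlot {
  key: string;
  disabled: boolean;
}

class RoundRobin {
  private slots: KeySlot[];
  private index: number = -1;

  constructor(slots: KeySlot[]) {
    this.slots = slots;
  }

  // Returns the next enabled key, or undefined when no key is available.
  next(): string | undefined {
    // Visit each slot at most once so a fully disabled pool cannot loop forever.
    for (let step = 0; step < this.slots.length; step++) {
      this.index = (this.index + 1) % this.slots.length;
      const slot = this.slots[this.index];
      if (slot !== undefined && !slot.disabled) {
        return slot.key;
      }
    }
    return undefined;
  }
}
```

Starting the index at -1 means the first request uses the first key. Returning undefined instead of throwing lets the caller decide how to report an exhausted pool.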
7.2 Fail-Over Logic
- Any non-2xx response increments the key's error counter
- After five consecutive errors, the key is disabled
- Every 60 seconds a background job probes each disabled key with a lightweight health check
- If the check passes, the key rejoins the rotation
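The steps above can be sketched as two small state transitions. This is a sketch under stated assumptions rather than the project's implementation: the names are hypothetical, and resetting the counter on success is inferred from the word "consecutive".

```typescript
const MAX_ERRORS = 5; // from the text: five consecutive errors disable a key

// Hypothetical per-key state.
interface KeyState {
  key: string;
  errorCount: number;
  disabled: boolean;
}

// Record the outcome of one upstream call made with this key.
function recordResult(state: KeyState, httpStatus: number): void {
  const ok = httpStatus >= 200 && httpStatus < 300;
  if (ok) {
    // Assumption: a success breaks the streak, so the counter starts over.
    state.errorCount = 0;
    return;
  }
  state.errorCount += 1;
  if (state.errorCount >= MAX_ERRORS) {
    state.disabled = true;
  }
}

// Apply the result of the periodic health check to a disabled key.
function applyHealthCheck(state: KeyState, healthy: boolean): void {
  if (state.disabled && healthy) {
    state.disabled = false;
    state.errorCount = 0;
  }
}
```

A scheduler such as setInterval with a 60000 ms period would call applyHealthCheck for every disabled key. Keeping the transitions free of timers makes them easy to unit test.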
7.3 Health Check Details
- Uses a HEAD request to minimize bandwidth
- Timeout set to 2 seconds
- Records last-used timestamp for latency troubleshooting
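The 2-second limit can be illustrated with a timeout wrapper around an injected probe. The Probe type and checkWithTimeout are hypothetical names. In a real deployment the probe would issue the HEAD request, whereas here it is passed in so the timing logic can be shown without network access.

```typescript
// Hypothetical probe: resolves true when the upstream answers for this key.
type Probe = (apiKey: string) => Promise<boolean>;

const HEALTH_TIMEOUT_MS = 2000; // from the text: timeout set to 2 seconds

// Resolves to the probe result, or to false if the probe rejects or
// does not settle within the timeout.
function checkWithTimeout(
  probe: Probe,
  apiKey: string,
  timeoutMs: number = HEALTH_TIMEOUT_MS,
): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    probe(apiKey).then(
      (healthy) => {
        clearTimeout(timer);
        resolve(healthy);
      },
      () => {
        clearTimeout(timer);
        resolve(false);
      },
    );
  });
}
```

Treating a rejected or slow probe as unhealthy keeps the caller simple: it only ever sees true or false. The wrapper stops waiting but does not cancel the underlying request; a real probe could use an abort signal for that.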
8. Developer Tips: Hot Reload, Debugging, Testing
8.1 Hot Reload Development
npm run dev
The service restarts automatically when you save a file; SSE clients reconnect transparently.
8.2 Debug Logging
DEBUG=tavily:* npm run start-gateway
You’ll see granular logs such as:
tavily:balancer picked key tvly-dev-key2 +2s
tavily:health revived key tvly-dev-key3 +45s
8.3 Unit & Integration Tests
npm test
Twelve test cases cover search, extraction, fail-over, and revival.
9. Next Steps & Further Reading
You now own a self-healing, horizontally scalable gateway to the Tavily API. Common next moves:
- Front-end integration: Point your React, Vue, or Svelte app to http://localhost:60002/sse for real-time search suggestions
- Containerization: Wrap the service in Docker and deploy to any cloud; keys can be injected via secrets managers
- Observability: Pipe the JSON stats endpoint into Prometheus and wire Alertmanager to your Slack or WeChat workspace
Happy coding, and may your logs stay free of 429 errors.