stonegate Documentation
One endpoint for all your MCP servers. Tool discovery, automatic routing, connection pooling, circuit breakers, crash recovery, and async dispatch. Your agents share tools — no duplicate processes, no fragile connections.
1. Overview
stonegate is a compiled Rust binary that acts as a routing multiplexer for MCP (Model Context Protocol) servers. Instead of each agent connecting directly to every MCP server it needs, agents connect to stonegate once — and stonegate routes tool calls to the correct backend, manages connections, handles failures, and provides a unified tool catalog.
Key Capabilities
- Unified tool catalog — All tools from all backends appear in a single catalog. Agents see one flat namespace of available tools without knowing which server hosts which tool.
- Automatic routing — When an agent calls a tool, stonegate routes the call to the correct backend server transparently. No routing configuration needed — the tool name is the routing key.
- Connection pooling — Multiple agents share backend connections. One stonegate process manages all MCP server connections, eliminating duplicate processes and fragile per-agent connections.
- Circuit breaker — If a backend server starts failing, stonegate opens the circuit — subsequent calls fail fast instead of timing out. The circuit automatically closes when the backend recovers.
- Crash recovery — If a backend MCP server crashes, stonegate detects the failure and restarts it automatically. Agents see a brief error, then service resumes.
- Async dispatch — Long-running tool calls can be dispatched asynchronously. The agent gets a call ID immediately and polls for the result later.
- Tool schema caching — Backend tool schemas are cached to avoid redundant discovery calls. Cache invalidation happens on backend restart.
- Runtime server management — Add, remove, start, stop, and restart backend servers without restarting stonegate.
Why Use stonegate?
Without stonegate, every agent in your system needs its own connection to every MCP server it uses. With 10 agents and 5 MCP servers, that's 50 connections — 50 server processes, 50 points of failure, 50 configuration entries. With stonegate, it's 10 connections to the gateway and 5 managed backends. One place to configure, monitor, and debug.
Works with any MCP server. stonegate doesn't care what language your backends are written in — Rust, Python, TypeScript, Go. If it speaks MCP (stdio or HTTP), stonegate can manage it.
2. Installation
macOS (Homebrew)
brew install keystoneproject/tap/stonegate
macOS (Direct Download)
# Apple Silicon
curl -L https://keystoneproject.dev/releases/stonegate/darwin-aarch64/stonegate-v1.1.0-darwin-aarch64.tar.gz | tar xz
sudo mv stonegate /usr/local/bin/
# Intel
curl -L https://keystoneproject.dev/releases/stonegate/darwin-x86_64/stonegate-v1.1.0-darwin-x86_64.tar.gz | tar xz
sudo mv stonegate /usr/local/bin/
Linux
# x86_64
curl -L https://keystoneproject.dev/releases/stonegate/linux-x86_64/stonegate-v1.1.0-linux-x86_64.tar.gz | tar xz
sudo mv stonegate /usr/local/bin/
# ARM64 (aarch64)
curl -L https://keystoneproject.dev/releases/stonegate/linux-aarch64/stonegate-v1.1.0-linux-aarch64.tar.gz | tar xz
sudo mv stonegate /usr/local/bin/
3. Quick Start
Step 1: Activate
stonegate activate --key SG-XXXX-XXXX-XXXX-XXXX
stonegate status
Step 2: Configure backends
Create a config file at ~/.stonegate/config.toml:
[server]
host = "127.0.0.1"
port = 3393
[[backends]]
name = "stonemem"
command = "stonemem"
args = ["serve"]
transport = "stdio"
[[backends]]
name = "filesystem"
command = "npx"
args = ["-y", "@anthropic/mcp-filesystem"]
transport = "stdio"
[[backends]]
name = "web-search"
command = "python"
args = ["-m", "mcp_web_search"]
transport = "stdio"
Step 3: Start
stonegate serve
stonegate starts, launches all configured backends, discovers their tools, and begins routing.
Step 4: Connect your agent
{
  "mcpServers": {
    "stonegate": {
      "url": "http://localhost:3393",
      "transport": "http"
    }
  }
}
Your agent now has access to every tool from every backend — through one connection.
Step 5: Call a tool
# Discover all available tools
curl http://localhost:3393/tools
# Call a tool (stonegate routes to the correct backend)
curl -X POST http://localhost:3393/call \
-H "Content-Type: application/json" \
-d '{
"server": "stonemem",
"tool": "mem_search",
"arguments": {"query": "deployment schedule"}
}'
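The same call can be made from code. The following is a minimal Python sketch using only the standard library; it assumes the gateway listens on the host and port from the config above and that `/call` accepts the JSON body shown in the curl example. The helper names `build_call_request` and `call_tool` are illustrative and not part of stonegate.

```python
import json
import urllib.request

# Assumption: default host/port from the [server] section of the config above.
BASE_URL = "http://localhost:3393"


def build_call_request(server, tool, arguments, base_url=BASE_URL):
    """Build (but do not send) a POST /call request."""
    body = json.dumps({"server": server, "tool": tool, "arguments": arguments})
    return urllib.request.Request(
        f"{base_url}/call",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(server, tool, arguments, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_call_request(server, tool, arguments, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a gateway running, `call_tool("stonemem", "mem_search", {"query": "deployment schedule"})` would send the same request as the curl command. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx/5xx responses, so callers may want to catch it.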
4. Configuration
[server]
host = "127.0.0.1"
port = 3393
data_dir = "~/.stonegate"
[license]
key_file = "~/.stonegate/license.key"
license_server = "https://license.keystoneproject.dev"
[gateway]
max_connections = 100 # Max concurrent agent connections
request_timeout = 30000 # Per-call timeout (ms)
schema_cache_ttl = 300 # Tool schema cache TTL (seconds)
[circuit_breaker]
failure_threshold = 5 # Failures before circuit opens
reset_timeout = 30 # Seconds before half-open test
success_threshold = 2 # Successes to close circuit
[pool]
max_idle = 10 # Max idle backend connections
idle_timeout = 300 # Idle connection timeout (seconds)
[[backends]]
name = "stonemem" # Unique backend name
command = "stonemem" # Binary to launch
args = ["serve"] # Command arguments
transport = "stdio" # "stdio" or "http"
# url = "http://localhost:3391" # For HTTP transport
env = { STONEMEM_PORT = "3391" } # Optional environment variables
auto_restart = true # Restart on crash (default: true)
health_check = true # Enable health checks (default: true)
[logging]
level = "info"
format = "json"
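The three `[circuit_breaker]` settings above can be read as a closed / open / half-open state machine. The sketch below illustrates that pattern in Python; it is not stonegate's implementation. It assumes failures are counted consecutively and that a single failure in the half-open state reopens the circuit, neither of which the configuration reference states.

```python
import time


class CircuitBreaker:
    """Illustrative three-state breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, reset_timeout=30.0,
                 success_threshold=2, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout      # seconds before a half-open test
        self.success_threshold = success_threshold
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow_request(self):
        """Return True if a call may be forwarded to the backend."""
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"
                self.successes = 0
                return True
            return False  # fail fast while the circuit is open
        return True

    def record_success(self):
        if self.state == "half-open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.failures = 0
        else:
            self.failures = 0  # assumption: only consecutive failures count

    def record_failure(self):
        if self.state == "half-open":
            self._open()  # assumption: one failed test call reopens the circuit
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open()

    def _open(self):
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
```

Passing the clock as a parameter keeps the sketch testable without real waiting.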
Adding backends at runtime
# Add a new backend without restarting stonegate
curl -X POST http://localhost:3393/server/add \
-H "Content-Type: application/json" \
-d '{
"name": "custom-tools",
"command": "/path/to/custom-mcp-server",
"args": [],
"transport": "stdio"
}'
5. API Reference
Tool Operations
POST /call
Call a tool on a backend server. stonegate routes the call, manages the connection, and returns the result.
Request Body
{
"server": "string", // Required. Backend server name.
"tool": "string", // Required. Tool name.
"arguments": {} // Required. Tool arguments (JSON object).
}
Response
{
"result": { ... }, // Tool output (varies by tool)
"server": "stonemem",
"tool": "mem_search",
"duration_ms": 12
}
Error Response
{
"error": "Circuit open for server 'web-search' — backend is failing",
"code": "CIRCUIT_OPEN"
}
GET /tools
List all tools across all backends. Returns a unified catalog.
Response
{
"tools": [
{
"name": "mem_search",
"server": "stonemem",
"description": "Full-text search across stored entries",
"parameters": { ... }
},
{
"name": "read_file",
"server": "filesystem",
"description": "Read a file from the filesystem",
"parameters": { ... }
}
],
"total": 42
}
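Because each catalog entry carries both a tool name and its backend, a client can build its own lookup table from this response. The Python sketch below assumes the response shape shown above; `index_tools` and `ambiguous_tools` are hypothetical helper names.

```python
from collections import defaultdict


def index_tools(catalog):
    """Map each tool name to the list of backends that expose it."""
    index = defaultdict(list)
    for tool in catalog.get("tools", []):
        index[tool["name"]].append(tool["server"])
    return dict(index)


def ambiguous_tools(index):
    """Return tool names exposed by more than one backend, sorted."""
    return sorted(name for name, servers in index.items() if len(servers) > 1)
```

A client could use the index to fill in the `server` field of a call, and check `ambiguous_tools` first to see whether a bare tool name is enough to pick a backend.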
List tools for a specific backend server.
Discover tools matching a keyword or pattern. Useful for agents that need to find the right tool.
Query Parameters
?query=search // Keyword to match against tool names and descriptions
Get the full JSON Schema for a specific tool's parameters.
Server Management
GET /server/list
List all backend servers with their status, tool count, and health.
Response
{
  "servers": [
    {
      "name": "stonemem",
      "status": "running",
      "transport": "stdio",
      "tool_count": 8,
      "uptime_seconds": 3600,
      "circuit": "closed",
      "total_calls": 1247,
      "error_rate": 0.001
    }
  ]
}
Start a stopped backend server.
{ "name": "string" }
POST /server/stop
Stop a running backend server gracefully.
{ "name": "string" }
POST /server/restart
Restart a backend server. Useful after configuration changes.
{ "name": "string" }
POST /server/add
Add a new backend server at runtime without restarting stonegate.
{
"name": "string", // Required. Unique backend name.
"command": "string", // Required. Binary path.
"args": ["string"], // Optional. Arguments.
"transport": "stdio", // "stdio" or "http".
"url": "string", // Required for HTTP transport.
"env": {} // Optional. Environment variables.
}
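The field rules in this request body can be checked on the client before sending. The following Python sketch encodes only the constraints listed above (unique name, required command, transport of "stdio" or "http", URL required for HTTP). Treating a missing `transport` as "stdio" is an assumption, and `validate_backend` is a hypothetical helper, not a stonegate API.

```python
def validate_backend(spec, existing_names=()):
    """Return a list of problems; an empty list means the spec looks valid."""
    problems = []

    name = spec.get("name")
    if not name:
        problems.append("name is required")
    elif name in existing_names:
        problems.append(f"name '{name}' is already in use")

    if not spec.get("command"):
        problems.append("command is required")

    # Assumption: an omitted transport is treated as "stdio".
    transport = spec.get("transport", "stdio")
    if transport not in ("stdio", "http"):
        problems.append(f"transport must be 'stdio' or 'http', got '{transport}'")
    elif transport == "http" and not spec.get("url"):
        problems.append("url is required for HTTP transport")

    return problems
```

For example, a spec with `"transport": "http"` and no `url` yields one problem string, which can be shown to the operator instead of sending a request the gateway would likely reject.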
Monitoring
Get current gateway configuration (sanitized — no secrets).
GET /log
Get recent call log entries with timing and status.
Query Parameters
?limit=50 // Max entries (default: 50)
&server=stonemem // Filter by backend
&status=error // Filter: success, error, timeout
Enterprise only. Audit logging and log export require an Enterprise license.
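As a small illustration of combining these filters, the Python sketch below builds a `/log` URL from optional arguments. The base URL is the default from the configuration section, and `build_log_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

VALID_STATUSES = ("success", "error", "timeout")


def build_log_url(base_url="http://localhost:3393", limit=50, server=None, status=None):
    """Build a /log URL, omitting filters that are not set."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status filter: {status}")
    params = {"limit": limit}
    if server is not None:
        params["server"] = server
    if status is not None:
        params["status"] = status
    return f"{base_url}/log?{urlencode(params)}"
```

Using `urlencode` keeps backend names with special characters correctly escaped in the query string.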
GET /stats
Gateway statistics: total calls, error rates, latency percentiles, and per-backend breakdowns.
Response
{
"total_calls": 15234,
"total_errors": 23,
"error_rate": 0.0015,
"avg_latency_ms": 45,
"p95_latency_ms": 120,
"p99_latency_ms": 340,
"backends": {
"stonemem": { "calls": 8000, "errors": 2, "avg_ms": 12 },
"filesystem": { "calls": 5000, "errors": 5, "avg_ms": 35 },
"web-search": { "calls": 2234, "errors": 16, "avg_ms": 890 }
}
}
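The per-backend counters make it straightforward to rank backends by error rate. The Python sketch below assumes the response shape shown above; `backend_error_rates` is a hypothetical helper.

```python
def backend_error_rates(stats):
    """Return (backend name, error rate) pairs, highest error rate first."""
    rates = []
    for name, entry in stats.get("backends", {}).items():
        calls = entry.get("calls", 0)
        # Guard against division by zero for backends with no calls yet.
        rate = entry.get("errors", 0) / calls if calls else 0.0
        rates.append((name, rate))
    return sorted(rates, key=lambda pair: pair[1], reverse=True)
```

Applied to the sample response, web-search ranks first (16 errors in 2234 calls, roughly 0.7%), followed by filesystem (0.1%) and stonemem (0.025%).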
Gateway health — version, tier, uptime, backend count, overall status.
{
"status": "ok",
"version": "1.1.0",
"tier": "pro",
"uptime_seconds": 86400,
"backends_total": 5,
"backends_healthy": 5,
"circuits_open": 0
}
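A deployment script might turn this response into a single ready/not-ready decision. The Python sketch below is one possible interpretation: it treats the gateway as ready only when the status is "ok", every backend is healthy, and no circuit is open. That policy is an assumption, and `is_ready` is a hypothetical helper.

```python
def is_ready(health):
    """True when status is ok, all backends are healthy, and no circuit is open."""
    total = health.get("backends_total")
    healthy = health.get("backends_healthy")
    return (
        health.get("status") == "ok"
        and total is not None
        and healthy == total
        and health.get("circuits_open") == 0
    )
```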
6. MCP Tool Reference
When an agent connects to stonegate as an MCP server, stonegate exposes its own gateway management tools alongside all proxied backend tools:
gate_call
Tool: gate_call
Parameters:
- server (string, required): Backend server name
- tool (string, required): Tool name
- arguments (object, required): Tool arguments
Example:
gate_call({
server: "stonemem",
tool: "mem_search",
arguments: { query: "deployment schedule", limit: 5 }
})
gate_discover
Tool: gate_discover
Parameters:
- query (string, optional): Keyword search across tool names/descriptions
Example:
gate_discover({ query: "file read" })
// Returns tools matching "file" or "read" from any backend
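The "any keyword" behavior in the example can be approximated on the client against a cached catalog. The Python sketch below assumes case-insensitive substring matching on tool names and descriptions; the gateway's actual matching rules are not specified here, and `discover` is a hypothetical helper.

```python
def discover(tools, query):
    """Return tools whose name or description contains any query keyword."""
    keywords = [word.lower() for word in query.split()]
    matches = []
    for tool in tools:
        haystack = f"{tool['name']} {tool.get('description', '')}".lower()
        if any(keyword in haystack for keyword in keywords):
            matches.append(tool)
    return matches
```

An empty query matches nothing in this sketch, since there are no keywords to test.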
gate_status
Tool: gate_status
Parameters:
- server (string, optional): Filter to a specific backend
Example:
gate_status({ server: "web-search" })
// Returns health, circuit state, call stats for web-search backend
gate_call_async
Tool: gate_call_async (Pro/Enterprise)
Parameters:
- server (string, required): Backend server
- tool (string, required): Tool name
- arguments (object, required): Tool arguments
Returns immediately with a call_id.
Agent polls with gate_poll(call_id) or gate_await(call_id, timeout_ms).
7. Use Cases by Platform
Unified Tool Access for Agent Estates
Multiple Claude Code instances share a single stonegate gateway. Instead of configuring every MCP server separately in each agent, each agent connects to stonegate once.
# Before stonegate (per-agent config, 5 servers x 10 agents = 50 processes):
{
"mcpServers": {
"filesystem": { "command": "npx", "args": ["-y", "@anthropic/mcp-filesystem"] },
"database": { "command": "python", "args": ["-m", "mcp_postgres"] },
"stonemem": { "command": "stonemem", "args": ["serve"] },
"web-search": { "command": "python", "args": ["-m", "mcp_web_search"] },
"custom": { "command": "./custom-tools" }
}
}
# After stonegate (per-agent config, 1 connection):
{
"mcpServers": {
"stonegate": { "url": "http://localhost:3393", "transport": "http" }
}
}
# Agent calls any tool through the gateway:
gate_call({ server: "database", tool: "query", arguments: { sql: "SELECT ..." } })
gate_call({ server: "stonemem", tool: "mem_search", arguments: { query: "..." } })
Result: 10 agents share 5 backend connections managed by one stonegate process. Configuration is in one place. Backend crashes are auto-recovered.
Dynamic Tool Discovery for Crew Members
CrewAI crew members discover available tools at runtime instead of having them hardcoded.
# Research agent discovers what tools are available
gate_discover({ query: "search" })
// Returns:
// - web-search/search_web (web-search backend)
// - stonemem/mem_search (stonemem backend)
// - database/search_records (database backend)
# Agent picks the right tool dynamically based on the task
# For web research:
gate_call({ server: "web-search", tool: "search_web", arguments: { query: "AI market 2026" } })
# For internal knowledge:
gate_call({ server: "stonemem", tool: "mem_search", arguments: { query: "AI market analysis" } })
Resilient Tool Calls in Multi-Step Workflows
LangGraph workflows that call external tools through stonegate get automatic circuit breaker protection and crash recovery.
# Workflow step calls an unreliable external API via custom MCP server
# Without stonegate: timeout hangs the entire graph
# With stonegate: circuit breaker returns error instantly after 5 failures
gate_call({
server: "external-api",
tool: "fetch_data",
arguments: { endpoint: "/api/v2/reports" }
})
# If external-api is failing, stonegate responds instantly:
// { "error": "Circuit open for 'external-api'", "code": "CIRCUIT_OPEN" }
// Workflow catches error and takes fallback path
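The fallback step can be sketched as a small wrapper around the tool call. The Python example below assumes the gateway's error arrives as a decoded JSON object with a `code` field, as in the error response shown above; `call_with_fallback` and the callables passed to it are hypothetical.

```python
def call_with_fallback(call, fallback):
    """Run call(); if the gateway reports an open circuit, run fallback() instead."""
    response = call()
    if isinstance(response, dict) and response.get("code") == "CIRCUIT_OPEN":
        return fallback()
    return response
```

Other error codes are returned to the caller unchanged, so the workflow can decide separately how to handle them.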
Hot-Swapping Tool Backends
OpenHands agents can have their tool backends updated at runtime — no agent restart required.
# DevOps adds a new monitoring backend at runtime
curl -X POST http://localhost:3393/server/add \
-H "Content-Type: application/json" \
-d '{"name": "monitoring", "command": "python", "args": ["-m", "mcp_grafana"], "transport": "stdio"}'
# Agents immediately discover the new tools
gate_discover({ query: "grafana dashboard" })
// Returns new tools from the monitoring backend
# Stop a deprecated backend
curl -X POST http://localhost:3393/server/stop \
-H "Content-Type: application/json" \
-d '{"name": "old-monitoring"}'
Multi-Tenant Tool Isolation
Multiple Google ADK deployments share a stonegate instance with per-tenant tool visibility.
# Tenant A's agent sees only their permitted backends
gate_call({
server: "tenant-a-db",
tool: "query",
arguments: { sql: "SELECT * FROM orders" }
})
# Tenant B's agent sees different backends
gate_call({
server: "tenant-b-db",
tool: "query",
arguments: { sql: "SELECT * FROM inventory" }
})
Pipeline Tool Orchestration
Haystack pipeline components access diverse tools through one gateway — LLMs, databases, file systems, web search — without per-component MCP configuration.
# Retriever component uses web search
gate_call({ server: "web-search", tool: "search_web", arguments: { query: "..." } })
# Reader component uses LLM
gate_call({ server: "llm", tool: "generate", arguments: { prompt: "..." } })
# Writer component uses filesystem
gate_call({ server: "filesystem", tool: "write_file", arguments: { path: "output.md", content: "..." } })
Enterprise Tool Governance
In enterprise deployments, stonegate provides a single point of governance for all agent tool access. All calls flow through the gateway — audit logging captures who called what, when, and with what arguments.
# All 200+ agents connect to stonegate
# Enterprise tier enables audit logging:
GET /log?limit=100
// Returns:
// [
// { "agent": "hr-1", "server": "database", "tool": "update_record",
// "timestamp": "...", "duration_ms": 45, "status": "success" },
// { "agent": "finance-3", "server": "payment-api", "tool": "refund",
// "timestamp": "...", "duration_ms": 230, "status": "success" },
// ...
// ]
# Export audit log for compliance review
GET /log?status=error&limit=1000
# Identify which agents are hitting errors and on which backends
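Once the entries are fetched, grouping them by agent and backend is a few lines of code. The Python sketch below assumes each entry has the `agent`, `server`, and `status` fields shown in the sample output above; `errors_by_agent_and_server` is a hypothetical helper.

```python
from collections import Counter


def errors_by_agent_and_server(entries):
    """Count non-success log entries per (agent, server) pair, most frequent first."""
    counts = Counter()
    for entry in entries:
        if entry.get("status") != "success":
            counts[(entry.get("agent"), entry.get("server"))] += 1
    return counts.most_common()
```

Both "error" and "timeout" entries are counted here, since either indicates a failed call.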
Async Dispatch for Long-Running Operations
Some tool calls take minutes — data exports, complex queries, file processing. Async dispatch lets the agent continue working while waiting.
# Dispatch a long-running export asynchronously
gate_call_async({
server: "database",
tool: "export_table",
arguments: { table: "transactions", format: "csv", since: "2026-01-01" }
})
// Returns immediately: { "call_id": "abc-123", "submitted_at": "..." }
# Agent does other work...
# Check if it's done
gate_poll({ call_id: "abc-123" })
// { "status": "running" }
# Or block briefly
gate_await({ call_id: "abc-123", timeout_ms: 5000 })
// { "status": "complete", "result": { ... } }
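The poll-until-done pattern above can be wrapped in a helper with a deadline. In the Python sketch below, `poll` is a hypothetical callable that wraps `gate_poll` and returns a dict with a `status` field; the sketch assumes "running" is the only in-progress status, which this document does not confirm.

```python
import time


def wait_for_result(poll, call_id, timeout_s=60.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the call leaves the "running" state or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = poll(call_id)
        if status.get("status") != "running":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"call {call_id} still running after {timeout_s}s")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real delays.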
8. Tier Comparison
| Feature | Free | Pro ($19/mo) | Enterprise ($99/mo) |
|---|---|---|---|
| Backend servers | 3 | Unlimited | Unlimited |
| Agent connections | 1 | Unlimited | Unlimited |
| Tool discovery + routing | Yes | Yes | Yes |
| Schema caching | Yes | Yes | Yes |
| Crash recovery | Yes | Yes | Yes |
| Async dispatch | No | Yes | Yes |
| Connection pooling | No | Yes | Yes |
| Circuit breaker | No | Yes | Yes |
| Runtime server add/remove | No | Yes | Yes |
| HTTP transport (multi-agent) | No | No | Yes |
| Audit logging | No | No | Yes |
| Backend metrics export | No | No | Yes |
9. License Management
Activation
stonegate activate --key SG-XXXX-XXXX-XXXX-XXXX
stonegate status
Heartbeat & SIGIL
stonegate uses the same licensing model as stonemem and stonemux: a daily heartbeat (telemetry only) and the SIGIL die command for revocation.
License key prefixes and ports
| Product | Prefix | Port |
|---|---|---|
| stonemem | SM- | 3391 |
| stonemux | SX- | 3392 |
| stonegate | SG- | 3393 |
10. Troubleshooting
Backend won't start
# Check backend binary exists
which stonemem
# Check stonegate logs
stonegate serve --log-level debug
# Verify backend starts independently
stonemem serve --port 3391
Circuit breaker open
# Check which backends have open circuits
curl http://localhost:3393/server/list | jq '.servers[] | select(.circuit == "open")'
# Manually restart the failing backend
curl -X POST http://localhost:3393/server/restart \
-H "Content-Type: application/json" \
-d '{"name": "web-search"}'
# After reset_timeout (default 30s) the circuit goes half-open; it closes once success_threshold test calls succeed
Tool not found
# List all known tools
curl http://localhost:3393/tools | jq '.tools[].name'
# Check if the backend is running
curl http://localhost:3393/server/list | jq '.servers[] | select(.name == "myserver")'
# If the backend was just added, restart to refresh tool catalog
curl -X POST http://localhost:3393/server/restart \
-H "Content-Type: application/json" \
-d '{"name": "myserver"}'
High latency
# Check per-backend latency
curl http://localhost:3393/stats | jq '.backends'
# If a specific backend is slow, check its health directly
curl http://localhost:3391/health # stonemem
curl http://localhost:3392/health # stonemux
# Enable connection pooling (Pro) to reduce connection overhead
Free tier limit reached (3 backends)
# Check current backend count
curl http://localhost:3393/server/list | jq '.servers | length'
# Remove an unused backend or upgrade to Pro for unlimited backends
Getting Help
- Email: [email protected]
- GitHub: github.com/keystoneproject/stonegate/issues