Incident Console

Same alert, two agents — naive vs Backstop, on a real cluster.

Inject failure

A live incident drill — fail-safe, not just fail-over

The same poisoned alert hits both agents. The naive one acts on a hallucinated diagnosis and takes the production database to zero. Backstop catches the bad output, re-routes to a stronger model, and rolls the real deploy back — on a live Kubernetes cluster, through the TrueFoundry gateway.

Real KubernetesAWS BedrockGateway + fallbackMCP GatewayGuardrails

Catastrophes averted

destructive actions blocked by guardrails

Backstop

Idle

the fail-safe agent's outcome

Naive agent

Idle

what the unguarded agent did

Cluster integrity

—

deployments ready, both namespaces

Naive agent

one model · all tools · no guardrails

Armed

2 steps · no guardrails

Diagnose

one model — trusts the first output

Execute

every tool in hand, nothing checked

Backstop

scoped tools · quality + action gates

Armed

7 steps · 2 guardrails

Gather signals

read-only snapshot of the cluster

Redact secrets

PII masked before the model sees it

Diagnose

structured output via the AI Gateway

Quality gate + LLM-as-judge

is the diagnosis grounded in evidence?

Action gate

blast radius · protected resources · evidence match

Execute (scoped)

only a validated write reaches the cluster

Notify

page on-call + file a ticket via MCP Gateway

TrueFoundry receipts

Capabilities engaged this run.

PII Guardrail (Mutate)

Custom Guardrail

AI Gateway

MCP Gateway

Live cluster

Real Kubernetes, polled every 2s.

Connecting to cluster…