Agent-native workflow

How a coding agent discovers walwarden's machine-readable surface, runs a backup → restore-drill → evidence-bundle loop against test data, and reports it under scoped credentials with an auditable action history.

Walwarden is built to be driven by a coding agent — Claude Code, Codex, Cursor, or anything that can run a CLI and read JSON — without handing it credentials it should not hold or letting it claim more than the evidence proves. This page explains how the agent-native path fits together. For the copy-paste commands, jump to the agent-assisted quickstart.

How an agent discovers the surface

Everything an agent needs to use walwarden safely is machine-readable and published. An agent does not need to scrape the dashboard.

Surface	What it is	Where
Skill artifact	The `@walwarden/agent-skills` package — a `SKILL.md` plus a `surface.json` manifest describing supported commands, scopes, capability status, and forbidden claims. Generated from source, so it never drifts from the shipping CLI/SDK.	`@walwarden/agent-skills`
`llms.txt`	The agent entrypoint at the site root. Links every doc page and the machine-readable artifacts below.	`/llms.txt`, `/llms-full.txt`
OpenAPI contract	The REST API v1 alpha contract — operations, scopes, and request/response schemas.	`/openapi/walwarden.v1.json`
SDK and CLI	The dependency-free TypeScript client and the `walwarden` CLI, both generated against the same contract.	SDK, CLI

The skill artifact is the load-bearing piece: it is regenerated from packages/core on every release, so the commands, scopes, and "do not claim" rules an agent reads are the ones the API actually enforces.

No MCP server today

A walwarden MCP server is on the roadmap and not yet available. The supported agent surfaces today are the skill artifact, the SDK, and the CLI, all driven by a scoped API key. Do not configure an agent against an MCP endpoint that does not exist yet.

Scoped credentials, not a raw credential handoff

The agent never needs — and must never be given — your AWS keys, your database superuser password, or your dashboard session. It operates entirely through a scoped API key.

- You mint an API key in the dashboard with the minimum scopes for the loop you want. The backup-and-evidence loop needs only `databases:read`, `destinations:read`, `backups:trigger`, and `evidence:read`. Add `restores:write` and `restores:read` only when you want the agent to drive a restore drill.
- A request with a key that lacks a scope returns 403 with the exact `requiredScope` in the error body, so the agent can ask for the narrow missing scope rather than escalating to a broad key. See API auth and scopes.
- The key authorizes inspection and backup-trigger work. It does not carry your target-database write credentials. A restore still runs on a machine you control, with a target DSN that stays on that machine and is never transmitted to walwarden. See the trust boundary.
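
As a rough illustration of the scope rules above, an agent could keep the two scope sets as data and extract the missing scope from a 403 response. The scope names come from this page. The `ScopeErrorBody` shape and both helper functions are assumptions for this sketch, not part of the walwarden SDK; check the OpenAPI contract for the real error schema.

```typescript
// Scope names are taken from this page.
const BACKUP_LOOP_SCOPES = [
  "databases:read",
  "destinations:read",
  "backups:trigger",
  "evidence:read",
];
const RESTORE_DRILL_SCOPES = ["restores:write", "restores:read"];

// ASSUMPTION: the 403 body is a JSON object with a top-level `requiredScope`
// string. Only the field name is stated in the docs; its location is a guess.
interface ScopeErrorBody {
  requiredScope?: unknown;
}

/** Scopes to request when minting a key for the loop (hypothetical helper). */
function scopesForLoop(includeRestoreDrill: boolean): string[] {
  return includeRestoreDrill
    ? [...BACKUP_LOOP_SCOPES, ...RESTORE_DRILL_SCOPES]
    : [...BACKUP_LOOP_SCOPES];
}

/** Returns the one missing scope named by a 403 response, or null otherwise. */
function missingScope(status: number, body: unknown): string | null {
  if (status !== 403 || typeof body !== "object" || body === null) return null;
  const scope = (body as ScopeErrorBody).requiredScope;
  return typeof scope === "string" && scope.length > 0 ? scope : null;
}
```

With this shape, the agent's next step after a 403 is to report the single scope returned by `missingScope` to the operator, rather than retrying with a broader key.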

The agent treats the API key like any other secret: read it from the environment or your approved secret store, never print it, never commit it.
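
A minimal sketch of that handling, assuming the key is exposed to the agent through an environment variable. The variable name `WALWARDEN_API_KEY` and both helpers are hypothetical; use whatever name the CLI or your secret store actually expects.

```typescript
type Env = Record<string, string | undefined>;

/**
 * Reads the API key from an environment map.
 * ASSUMPTION: `WALWARDEN_API_KEY` is an illustrative variable name.
 * The error message never contains the key itself.
 */
function loadApiKey(env: Env, name = "WALWARDEN_API_KEY"): string {
  const key = env[name]?.trim();
  if (!key) {
    throw new Error(`${name} is not set; provide it via the environment or a secret store`);
  }
  return key;
}

/** Replaces every occurrence of the secret before text is logged or reported. */
function redact(text: string, secret: string): string {
  return secret.length === 0 ? text : text.split(secret).join("[REDACTED]");
}
```

In a Node-based agent, `loadApiKey(process.env)` would be the typical call, and any command output the agent echoes back would pass through `redact` first.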

Observable action history maps to evidence and audit

Because the agent works through the scoped API, every action it takes leaves the same auditable trail a human operator would.

- A triggered backup produces a signed manifest (Ed25519 over the artifact checksum and metadata) recorded in the append-only audit chain. The agent reads it back with `evidence list`.
- A restore drill records every state transition (token issued, claimed, downloading, verifying, restoring, then completed or failed) in the same chain.
- The agent exports an evidence bundle: the signed manifest plus the full audit event chain, the artifact a human or auditor reviews after the fact.

This is the difference between an agent asserting it did something and the system proving it: the agent's tool calls converge on evidence outputs that exist independent of the agent's own report.
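
To make the restore-drill trail concrete, here is a small sketch that classifies a recorded sequence of drill states. The state order follows the list on this page. The identifier spellings (`token_issued` and so on) and the `drillOutcome` helper are assumptions, as is the rule that a failure can follow any non-terminal step.

```typescript
// ASSUMPTION: identifier spellings are illustrative; only the order is from the docs.
type DrillState =
  | "token_issued"
  | "claimed"
  | "downloading"
  | "verifying"
  | "restoring"
  | "completed"
  | "failed";

const HAPPY_PATH: DrillState[] = [
  "token_issued",
  "claimed",
  "downloading",
  "verifying",
  "restoring",
  "completed",
];

type DrillOutcome = "completed" | "failed" | "incomplete" | "invalid";

/** Classifies an ordered list of recorded drill states. */
function drillOutcome(states: DrillState[]): DrillOutcome {
  for (let i = 0; i < states.length; i++) {
    const s = states[i];
    if (s === "failed") {
      // A failure is terminal: it must be the last event, must follow at
      // least one recorded step, and cannot come after "completed".
      const isLast = i === states.length - 1;
      return isLast && i > 0 && i < HAPPY_PATH.length ? "failed" : "invalid";
    }
    // Every other event must match the happy path at the same position.
    if (s !== HAPPY_PATH[i]) return "invalid";
  }
  return states.length === HAPPY_PATH.length ? "completed" : "incomplete";
}
```

An agent reporting on a drill would treat only `"completed"` as a finished drill; `"incomplete"` means the chain has no terminal event yet, and `"invalid"` means the trail should be escalated to a human.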

Evidence before success

An agent must not mark a backup or restore drill successful until it has read the command result and the evidence metadata. A completed job proves a job completed; it is not the same as proven recoverability. This rule is restated, with the exact forbidden claims, in the agent integration recipes.
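
One way to encode this rule is a gate that returns the strongest claim the inputs support. This is a sketch under assumptions: the `JobResult` and `EvidenceMetadata` shapes and the claim names are hypothetical, and the authoritative list of forbidden claims lives in the skill artifact, not here.

```typescript
// ASSUMPTION: both shapes are illustrative, not the walwarden response schema.
interface JobResult {
  state: "running" | "completed" | "failed";
}

interface EvidenceMetadata {
  manifestSignatureValid: boolean;
  checksumVerified: boolean;
}

type Claim = "none" | "job_completed" | "backup_evidence_verified";

/**
 * Returns the strongest claim supported by what the agent has actually read.
 * A completed job without evidence supports only "job_completed".
 */
function strongestClaim(job: JobResult | null, evidence: EvidenceMetadata | null): Claim {
  if (job === null || job.state !== "completed") return "none";
  if (evidence === null) return "job_completed";
  return evidence.manifestSignatureValid && evidence.checksumVerified
    ? "backup_evidence_verified"
    : "job_completed";
}
```

Note that none of these claims says "recoverable": in this sketch that claim is deliberately absent, because it requires a restore drill as well as a verified backup.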

Start in a sandbox, against disposable data

Point the agent at test data first. The whole loop — connect, back up, restore-drill, verify — works against a throwaway database, and that is where an agent should prove it before it touches anything you care about.

- Stand up a disposable Postgres database with non-sensitive seed data: a fresh Neon or Supabase project, or a local cluster the agent can reach. This is the source.
- Use a separate, empty target for the restore drill. `new_database` mode creates a fresh database on the target cluster, so a drill against test data never overwrites anything. See restore modes.
- Mint a scoped API key bound to only that test database, so even a misbehaving agent cannot touch production.

Running the loop against disposable data is the safe way to confirm the agent reports honestly — that it checks evidence, respects scopes, and refuses claims the surface does not support — before you trust it with a real database.

Hello-world: backup → restore drill → evidence bundle

The minimal end-to-end an agent should run once, against test data, to prove the loop works:

1. Back up. Trigger an ad-hoc backup and wait for a terminal state, then read the signed manifest it produced. Commands: agent-assisted quickstart.
2. Restore-drill. Restore that backup into the empty test target in `new_database` mode and confirm it lands. Walkthrough: run a restore drill.
3. Evidence bundle. Export the bundle (signed manifest plus audit chain) and confirm the integrity evidence passed. Guide: produce an evidence bundle.

A backup you have never restored from is a backup you have not yet proven recoverable. The drill is what turns step 1 into a claim you can stand behind — and the evidence bundle is the artifact that records it.
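
The loop above can be sketched as a small orchestrator over an injected client. Everything about `LoopClient` is hypothetical: the method names, arguments, and result shapes are placeholders for whatever the SDK or CLI actually exposes, so treat this as the control flow only.

```typescript
// ASSUMPTION: this interface is illustrative and is not the walwarden SDK.
interface LoopClient {
  triggerBackup(databaseId: string): Promise<{ backupId: string; state: "completed" | "failed" }>;
  runRestoreDrill(backupId: string, mode: "new_database"): Promise<{ state: "completed" | "failed" }>;
  exportEvidenceBundle(backupId: string): Promise<{ integrityPassed: boolean }>;
}

interface LoopReport {
  backupCompleted: boolean;
  drillCompleted: boolean;
  integrityPassed: boolean;
  recoverabilityProven: boolean;
}

/** Runs backup, restore drill, and evidence export once against a test database. */
async function runHelloWorld(client: LoopClient, databaseId: string): Promise<LoopReport> {
  const report: LoopReport = {
    backupCompleted: false,
    drillCompleted: false,
    integrityPassed: false,
    recoverabilityProven: false,
  };

  // Step 1: back up and stop early if the job did not complete.
  const backup = await client.triggerBackup(databaseId);
  report.backupCompleted = backup.state === "completed";
  if (!report.backupCompleted) return report;

  // Step 2: restore into a fresh database so nothing is overwritten.
  const drill = await client.runRestoreDrill(backup.backupId, "new_database");
  report.drillCompleted = drill.state === "completed";

  // Step 3: export the bundle even if the drill failed, so the failure is on record.
  const bundle = await client.exportEvidenceBundle(backup.backupId);
  report.integrityPassed = bundle.integrityPassed;

  // Recoverability is claimed only when the drill and the integrity evidence both pass.
  report.recoverabilityProven = report.drillCompleted && report.integrityPassed;
  return report;
}
```

The design choice worth noting is that `recoverabilityProven` is derived from the drill and the bundle together, never from the backup job state alone.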
