Recover from data loss
You lost production data and need it back now. Pick the backup, choose a mode without overwriting the wrong thing, run the restore, and confirm it landed.
This is the page for the real event, not a practice run. It walks the exact path from "which backup do I trust" to "the data is back and I have checked it," and it flags the one step where a panicked operator can make things worse.
If you are practising recovery before an incident, follow Run a restore drill instead. The mechanics are identical, but there is no urgency and no production target. This page is framed for the moment they actually matter.
The 60-second version
- Open the source database in the dashboard and pick the most recent completed backup. Note its manifest hash.
- Decide the target. Restoring into a fresh database is the safe default. Overwriting the live database is in_place and is destructive; only choose it when you are certain.
- Click Restore from this backup, choose the mode, and issue the token.
- Run the one-liner on a machine that can reach the target, with your target DSN.
- Confirm the tables and row counts are back, then cut traffic over.
The rest of this page is the same five steps, slower, with the safety calls spelled out.
Step 1: Pick the backup to restore from
Open the dashboard, go to the affected source database, and look at Backup history. Restore from the most recent backup whose state is completed — that is the lowest-data-loss point you have. The elapsed loss window shown on the dashboard tells you how much data sits between that backup and now; see Recoverability and RPO for how to read it.
If the most recent completed backup is from before the data loss event, prefer it. If the loss happened slowly (a bad migration that ran for hours, a creeping corruption), you may need to step back to an earlier backup that predates the damage. Backup history lists every retained artifact with its completion time and manifest hash so you can choose the right point.
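If you have the Backup history rows in text form, stepping back to a point before the damage is a simple filter. The following is a minimal sketch, assuming a hypothetical tab-separated listing (completion time in ISO 8601 UTC, state, manifest hash). That listing format and the `pick_backup` name are illustrative assumptions, not something walwarden is documented to export.

```shell
# Hypothetical input on stdin, one backup per line, tab-separated:
#   <completed_at, ISO 8601 UTC>  <state>  <manifest sha256>
# pick_backup CUTOFF prints the newest completed backup that finished
# before CUTOFF, or returns non-zero when no backup qualifies.
pick_backup() {
  cutoff="$1"   # e.g. 2026-06-30T09:00:00Z, when the damage began
  awk -F '\t' -v cutoff="$cutoff" '
    # Timestamps in one fixed ISO 8601 UTC format sort correctly as strings.
    $2 == "completed" && $1 < cutoff && $1 > best { best = $1; line = $0 }
    END {
      if (line == "") exit 1
      print line
    }
  '
}
```

Pipe the listing into `pick_backup '2026-06-30T09:00:00Z'` and read the manifest hash from the third column of the printed row. The cutoff is your own estimate of when the damage started, so err on the early side.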
The manifest hash identifies the artifact. The CLI verifies the artifact's SHA-256 checksum against the signed manifest before it writes anything, so a corrupt or wrong artifact fails closed rather than restoring bad bytes.
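To make "fails closed" concrete, here is a minimal sketch of the same idea outside the CLI. It is not the CLI's implementation. It assumes you have a local artifact file, an expected hex SHA-256 digest for it, and GNU `sha256sum` on the PATH. `verify_artifact` is a hypothetical helper name.

```shell
# verify_artifact FILE EXPECTED_HEX
# Returns 0 only when FILE's SHA-256 digest equals EXPECTED_HEX.
# A mismatch, an unreadable file, or an empty digest all return non-zero,
# so the caller stops before anything is restored.
verify_artifact() {
  file="$1"
  # Normalise the expected digest to lowercase, as sha256sum prints lowercase hex.
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A caller would gate the restore on it, for example `verify_artifact backup.dump "$EXPECTED" || exit 1`, where both the file name and the variable are placeholders.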
Step 2: Choose the mode — and do not overwrite by accident
This is the step where haste does damage. There are two modes, and they differ in exactly one way that matters during an incident: whether they destroy what is currently in the target.
- new_database (safe default). Restores into a brand-new database and touches nothing that already exists on the target cluster. Nothing is overwritten. If you are not certain, choose this. You can restore here, verify the data, and then promote it.
- in_place (destructive). Drops every object in the target database (pg_restore --clean --if-exists) before loading the backup. This is how you recover the live database in place, but if you point it at the wrong database, you have just destroyed a second one.
in_place carries a deliberate guard so a panicked operator cannot trigger it by reflex:
- The CLI requires the --confirm-destructive flag. Without it, the CLI exits with code 2 before making any API call or touching the target, so nothing happens. The dashboard adds this flag to the one-liner automatically when, and only when, you select in-place mode.
- In the dashboard modal, in-place mode shows a destructive-intent warning and disables Issue restore token until you tick "I understand this will overwrite the target database."
A calm rule for the incident: restore into a fresh database first, verify it, then decide whether to promote or overwrite. Reach for in_place only when the target must keep its existing identity (connection strings, replication, provider constraints) and you have confirmed the target DSN names the exact database you mean to overwrite. The full mode comparison, including the Neon repeat-restore case, is in Restore modes.
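Confirming that the target DSN names the exact database is easy to do by eye and easy to get wrong under pressure. The following is a minimal sketch of a pre-flight check you could run before an in_place restore. The helper names `dsn_dbname` and `confirm_target` are hypothetical, and the parsing assumes a postgresql:// URI with the database in the path. It does not decode percent-encoding or handle keyword/value connection strings.

```shell
# dsn_dbname DSN
# Prints the database name from a postgresql:// URI: the path segment after
# host[:port], with any ?query parameters removed. Prints an empty line when
# the URI names no database.
dsn_dbname() {
  rest="${1#*://}"              # drop the scheme
  case "$rest" in
    */*) path="${rest#*/}" ;;   # text after the first slash
    *)   path="" ;;             # no path, so no database is named
  esac
  printf '%s\n' "${path%%\?*}"
}

# confirm_target DSN EXPECTED_DB
# Succeeds only when the DSN names exactly the database you mean to overwrite.
confirm_target() {
  [ -n "$2" ] && [ "$(dsn_dbname "$1")" = "$2" ]
}
```

Typing the expected name by hand, as in `confirm_target "$TARGET_DSN" the-live-db-name || exit 1`, forces a second, independent statement of intent before the destructive command runs.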
Step 3: Issue the token and run the restore
Click Restore from this backup on the chosen backup row, select your mode, and click Issue restore token. The dashboard returns a one-liner. Run it on a machine that can reach the target Postgres, substituting your target DSN.
Safe-default (new_database) recovery into a fresh database:
WALWARDEN_TOKEN=<token> npx --yes walwarden-cli restore \
--manifest <sha256> \
--target 'postgresql://user@host:5432/postgres' \
--created-database recovery_2026_06_30 \
--mode new_database
Setting PGPASSWORD instead of inlining the password keeps the secret out of your shell history:
export PGPASSWORD='your-password'
Destructive in-place recovery of the live database (note the explicit guard flag, and that the target path names the database being overwritten):
WALWARDEN_TOKEN=<token> npx --yes walwarden-cli restore \
--manifest <sha256> \
--target 'postgresql://user@host:5432/the-live-db-name' \
--mode in_place \
--confirm-destructive
The CLI downloads the artifact from S3 via a presigned URL, verifies the manifest checksum, and pipes the bytes to pg_restore on your machine. Your target DSN never leaves the machine and walwarden never sees the dump bytes; the trust boundary is the same in an incident as in a drill (see Restore overview). Progress streams to your terminal and, in real time, to the dashboard's Restore history. The CLI exits 0 on success and the token is invalidated.
Restore tokens are valid for one hour. If yours expires mid-incident, issue a new one from the same backup row.
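If you want to know how much of the hour is left before starting a long restore, the arithmetic is simple. This is a minimal sketch that assumes you recorded the issue time yourself as a Unix epoch. `token_seconds_left` is a hypothetical helper, not part of the CLI.

```shell
# token_seconds_left ISSUED_EPOCH [NOW_EPOCH]
# Prints the seconds remaining in the one-hour validity window,
# or 0 once the window has passed. NOW_EPOCH defaults to the current time.
token_seconds_left() {
  issued="$1"
  now="${2:-$(date +%s)}"
  left=$(( issued + 3600 - now ))
  [ "$left" -gt 0 ] || left=0
  printf '%s\n' "$left"
}
```

Record `ISSUED=$(date +%s)` when you click Issue restore token, then call `token_seconds_left "$ISSUED"` before you run the one-liner.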
Step 4: Confirm the data is actually back
A restore that exits 0 is not yet a recovery you have proven. Connect to the restored database and check that the data you lost is present:
psql 'postgresql://user@host:5432/recovery_2026_06_30' -c "\dt"
psql 'postgresql://user@host:5432/recovery_2026_06_30' -c "SELECT count(*) FROM your_critical_table;"
For a new_database restore, connect to the name you created (or the source database name if you did not pass --created-database). For in_place, connect to the database you overwrote. Spot-check the tables and row counts that matter for this incident before you cut production traffic over. Even one verified table count beats trusting an unverified restore.
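A row-count spot check can be turned into a pass/fail gate so that a tired operator does not have to eyeball numbers. The following is a minimal sketch: `count_rows` and `meets_minimum` are hypothetical helpers, and the minimum you pass is your own estimate of how many rows the table should hold, not a value walwarden provides.

```shell
# count_rows DSN TABLE
# Prints the row count using psql's unaligned, tuples-only output (-At).
# TABLE is interpolated into the SQL text, so pass only trusted identifiers.
count_rows() {
  psql "$1" -At -c "SELECT count(*) FROM $2;"
}

# meets_minimum ACTUAL EXPECTED_MIN
# Succeeds when ACTUAL is a non-negative integer of at least EXPECTED_MIN.
# An empty or non-numeric ACTUAL (for example, psql error text) fails.
meets_minimum() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$2" ]
}
```

For example, `meets_minimum "$(count_rows "$DSN" your_critical_table)" 1000 || echo "row count too low" >&2` flags a restore that exited 0 but left the table short or empty.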
If you restored into a fresh database as the safe default, this is the point where you decide how to promote it — repoint the application, rename, or run a second in_place restore once you have confirmed the data is good.
If the restore fails
Every failure mode seen in testing and production — version mismatches, expired or already-used tokens, database already exists on Neon, jobs stuck in downloading, and non-zero pg_restore exit codes — has a diagnosis and remedy in Restore troubleshooting. The dashboard Restore history shows the failure state and classification for the job; start there, match it to the troubleshooting entry, and retry with a fresh token.
Related
- Run a restore drill: the same flow as a planned practice run, with no production target
- Restore modes: new_database vs in_place in full, including the --confirm-destructive guard
- Restore troubleshooting: failure modes and their fixes
- Restore overview: why restore runs on your machine via the CLI