Restore troubleshooting

Diagnoses and remedies for the failure modes observed in live E2E testing and production.

pg_restore: error: aborting because of server version mismatch

What happened: The pg_restore binary on your restore machine is a different major version than the Postgres server that produced the dump.

How to fix: Install pg_restore matching the major version of the source database.

The source database's Postgres version is shown on the walwarden dashboard database detail page.
On macOS: brew install postgresql@16 (replace 16 with the required version), then add /opt/homebrew/opt/postgresql@16/bin to your PATH.
On Debian/Ubuntu: apt-get install postgresql-client-16.
On other Linux: use the PGDG repositories at postgresql.org/download.

After installing, confirm the version with pg_restore --version and retry.

User is not authorized to perform: s3:PutObjectRetention

What happened: The IAM role walwarden assumed does not have s3:PutObjectRetention in its policy. This is required because the bucket has Object Lock enabled and walwarden applies retention holds to every artifact.

How to fix: Add s3:PutObjectRetention (and s3:PutObjectLegalHold, s3:GetObjectRetention, s3:GetObjectLegalHold) to the IAM policy attached to the role. The full required policy is in BYO AWS S3 step 3.

After updating the policy, re-run preflight from the walwarden Destinations page.

Could not load credentials from any providers

What happened: The CLI is attempting to make AWS API calls directly but cannot find AWS credentials in the environment. This should not happen in the normal CLI restore flow (which uses a presigned URL, not AWS credentials), but can occur if the restore job was configured for a managed mode that requires direct S3 access.

How to fix: In the standard npx walwarden-cli restore flow, AWS credentials on the restore machine are not required. The presigned URL in the restore token handles S3 authentication. If you see this error:

Confirm you are using the one-liner from the walwarden dashboard, not a manually assembled command.
Confirm WALWARDEN_TOKEN is set and is the token from the dashboard, not an older or different token.
If you are running a custom invocation, verify your CLI version with walwarden --version and update to the latest if needed.

token already used

What happened: Restore tokens are single-use. Once a triggerRestore API call succeeds, the token's intent ID is marked as consumed. Any subsequent attempt to use the same token is rejected.

How to fix: Issue a new restore token from the dashboard. Navigate to the same backup row and click "Restore from this backup" again.

creating DATABASE neondb ... database already exists

What happened: You ran a new_database restore against a Neon target where the source database name already exists. This is common on the second and subsequent restores to the same Neon target — the first restore (in new_database mode) created the database; it is still there.

How to fix: Use in_place mode for subsequent restores to the same target database:

WALWARDEN_TOKEN=<new-token> npx --yes walwarden-cli restore \
  --manifest <sha256> \
  --target 'postgresql://user:password@host:5432/the-existing-db-name' \
  --mode in_place \
  --confirm-destructive

The dashboard one-liner generator automatically adds --confirm-destructive when you select in-place mode in the modal.

Alternatively, connect to the Neon target and drop the database manually:

psql 'postgresql://user:password@host:5432/postgres' \
  -c "DROP DATABASE \"the-existing-db-name\";"

Then re-run the original new_database restore.

Token expired

What happened: Restore tokens are valid for 1 hour from issuance. The token was not used within that window.

How to fix: Issue a new token from the dashboard. The token expiration countdown is shown in the one-liner panel; if it reaches zero before you run the command, close the panel and click "Restore from this backup" again.

restore_job stuck in downloading / verifying / restoring

What happened: The CLI process was killed (SIGTERM, kill -9, terminal closed) while the restore was in progress. The server has a watchdog that detects inactive restore jobs and transitions them to timed_out after a configurable interval (typically a few minutes).

What to do: Wait for the dashboard to show timed_out. Then issue a new token and start the restore again. The previous partial restore did not write any data to the target database unless the process reached the restoring state and pg_restore completed partially. In that case, use --mode in_place --confirm-destructive to overwrite the partial state on the next run.

pg_restore exit code non-zero (generic)

What happened: pg_restore failed. The CLI captures the exit code and classifies the failure as failed_terminal. The dashboard shows the error classification.

Common causes:

The target database is unreachable (wrong host, port, or credentials in --target). Verify with psql '<target-dsn>' before running the restore.
The dump file is corrupt. This should not happen if the manifest checksum verified successfully; if the checksum passed and pg_restore still fails, contact support with the restore job ID from the dashboard.
Insufficient privileges on the target — the target database user must have CREATE privileges (for new_database mode) or DROP and CREATE on the target database (for in_place mode).