Recovery Testing

1. Purpose

This document defines the procedure for testing the backup and recovery process of an OpenMRS deployment that is backed up with Restic. The goal is to confirm that the system can be restored from a backup snapshot with no data loss and with all services operational.


2. Scope

  • Applies to OpenMRS deployments configured with mekomsolutions/restic-compose-backup and mekomsolutions/restic-compose-backup-restore.

  • Covers:

    • Verification of backup snapshots.

    • Restoration of specific snapshots.

    • Validation of application and database integrity after recovery.

  • Excludes:

    • Cross-provider migration testing (e.g., restoring S3 backups into Azure).

    • Disaster recovery network/infra simulation (e.g., complete datacenter failover).


3. Prerequisites

  • A functioning OpenMRS deployment with:

    • backend and db services.

    • Configured .env file with correct Restic parameters.

  • Valid Restic repository (local, S3, Azure, GCS, etc.).

  • Sufficient storage capacity for backup and restore operations.

  • Administrative access to Docker host and docker compose.


4. Recovery Testing Procedure

Step 1. Trigger a Backup

  1. Ensure backup service is running:

    docker compose -f docker-compose.yml -f docker-compose-backup.yml up -d
  2. Verify snapshot creation:

    docker compose -f docker-compose.yml -f docker-compose-backup.yml exec backup restic snapshots
    • ✅ Expected: New snapshot entry with current timestamp.
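
The check in item 2 can be scripted by comparing the snapshot list captured before and after the backup runs. The following is a minimal sketch, not part of restic-compose-backup: the helper names snapshot_ids and new_snapshot_created are hypothetical, and it assumes that `restic snapshots --json` prints compact JSON in which every snapshot carries a "short_id" field. Verify both assumptions against your restic version.

```shell
# Hypothetical helper: read `restic snapshots --json` output on stdin and
# print one short snapshot ID per line (assumes compact JSON with "short_id").
snapshot_ids() {
    grep -o '"short_id":"[^"]*"' | cut -d'"' -f4
}

# Succeeds only if the second capture lists more snapshots than the first.
# $1 = JSON captured before the backup, $2 = JSON captured after it.
new_snapshot_created() {
    before_count=$(printf '%s' "$1" | snapshot_ids | wc -l | tr -d ' ')
    after_count=$(printf '%s' "$2" | snapshot_ids | wc -l | tr -d ' ')
    [ "$after_count" -gt "$before_count" ]
}
```

For example, capture `before=$(docker compose -f docker-compose.yml -f docker-compose-backup.yml exec -T backup restic snapshots --json)` before the backup runs, capture `after` the same way once it has finished, and call `new_snapshot_created "$before" "$after"`.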


Step 2. Identify Snapshot for Recovery

  1. Choose a snapshot ID to test restore:

    docker compose -f docker-compose.yml -f docker-compose-backup.yml exec backup restic snapshots
  2. Note the snapshot ID (e.g., abc123ef).
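
A mistyped ID only surfaces when the restore runs, so it can help to sanity-check the noted value first. The function below is a hypothetical helper, and its 8-to-64 character bound is an assumption based on restic's 8-character short IDs and 64-character full IDs.

```shell
# Hypothetical helper: succeed if $1 looks like a restic snapshot ID,
# i.e. 8 to 64 lowercase hexadecimal characters.
is_snapshot_id() {
    case "$1" in
        ''|*[!0123456789abcdef]*) return 1 ;;   # empty or contains a non-hex character
    esac
    [ "${#1}" -ge 8 ] && [ "${#1}" -le 64 ]
}
```

Usage: `is_snapshot_id abc123ef || echo "not a valid snapshot ID"`.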


Step 3. Simulate Failure (Optional but Recommended)

  • Stop the application stack:

    docker compose down
  • Remove the volumes (non-production test environments only; this permanently deletes the data they hold):

    docker volume rm <your_backend_volume> <your_db_volume>
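
Because removing volumes is destructive, a small guard in front of the command can reduce the risk of running it on the wrong host. This is only a sketch: DEPLOY_ENV is an assumed variable name that you would set yourself on each host, and the accepted values are illustrative.

```shell
# Hypothetical guard: DEPLOY_ENV is an assumed variable, not something
# restic-compose-backup defines. Succeeds only for non-production values.
safe_to_wipe_volumes() {
    case "${DEPLOY_ENV:-unset}" in
        test|staging|dev) return 0 ;;
        *)
            echo "Refusing to remove volumes: DEPLOY_ENV=${DEPLOY_ENV:-unset}" >&2
            return 1
            ;;
    esac
}
```

Usage: `safe_to_wipe_volumes && docker volume rm <your_backend_volume> <your_db_volume>`.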

Step 4. Configure Restore Environment

  1. Set required environment variables in .env:

    RESTIC_RESTORE_SNAPSHOT=abc123ef
    RESTIC_PASSWORD=<your_password>
    BACKUP_PATH=./backup
  2. Confirm docker-compose-restore.yml is present.
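
Before starting the restore, the .env file can be checked for the three variables listed in item 1. The helper below is a hypothetical sketch; it only checks that each variable is present and non-empty, not that its value is correct.

```shell
# Hypothetical helper: report which restore variables are missing or empty
# in the given env file. Returns non-zero if any of them is missing.
check_restore_env() {
    env_file=$1
    missing=0
    for name in RESTIC_RESTORE_SNAPSHOT RESTIC_PASSWORD BACKUP_PATH; do
        # Require "NAME=" at the start of a line followed by at least one character.
        if ! grep -q "^${name}=." "$env_file"; then
            echo "missing or empty: ${name}"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: `check_restore_env .env && test -f docker-compose-restore.yml`.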


Step 5. Execute Restore

  1. Start restore service:

    docker compose -f docker-compose.yml -f docker-compose-restore.yml up -d
    • ✅ Expected: Restore service initializes and restores snapshot into volumes.

  2. Wait for restore to complete:

    • Check logs:

      docker compose -f docker-compose.yml -f docker-compose-restore.yml logs -f restore
    • ✅ Expected: "Restore completed successfully".

  3. Restart application stack:

    docker compose up -d
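
The wait in item 2 can be automated by polling the logs instead of following them by hand. The sketch below assumes the success message is exactly the "Restore completed successfully" string quoted above; confirm that against the logs of your restore image. The function name and its arguments are hypothetical.

```shell
# Hypothetical helper: poll a log-printing command until the success message
# appears. $1 = maximum attempts, $2 = seconds between attempts,
# remaining arguments = the command that prints the restore logs.
wait_for_restore() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" 2>&1 | grep -q "Restore completed successfully"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1   # message never appeared within the allowed attempts
}
```

Usage: `wait_for_restore 60 10 docker compose -f docker-compose.yml -f docker-compose-restore.yml logs restore` polls every 10 seconds for up to 10 minutes. Note that `logs` is called without `-f` here so that each poll returns.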

Step 6. Validate Restoration

  1. Verify that the expected containers are running:

    docker ps --filter "status=running"
  2. Access OpenMRS UI at http://localhost/openmrs.

    • ✅ Expected: Application loads without errors.

  3. Validate database:

    • Confirm patient records, encounters, observations exist as before backup.

    • Run checksum tests if applicable.

  4. Confirm supporting files:

    • Attachments (images, complex obs).

    • Configurations (modules, checksums).
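
One way to make the database check in item 3 repeatable is to record row counts before the backup and compare them after the restore. The sketch below assumes you have produced two files of "<table> <row_count>" lines yourself, for example from `SELECT COUNT(*)` queries against the OpenMRS patient, encounter and obs tables; how the counts are captured is left open. Matching counts show that no rows are missing, not that row contents are unchanged.

```shell
# Hypothetical helper: compare two files of "<table> <row_count>" lines,
# one captured before the backup ($1) and one after the restore ($2).
# Prints every table whose count differs; returns non-zero on any mismatch.
# Assumes the "before" file is not empty.
compare_counts() {
    awk '
        NR == FNR { before[$1] = $2; next }   # first file: remember counts
        {
            seen[$1] = 1
            if (before[$1] != $2) {
                printf "%s: before=%s after=%s\n", $1, before[$1], $2
                bad = 1
            }
        }
        END {
            for (t in before) {
                if (!(t in seen)) {
                    printf "%s: missing after restore\n", t
                    bad = 1
                }
            }
            exit (bad ? 1 : 0)
        }
    ' "$1" "$2"
}
```

Usage: `compare_counts counts-before.txt counts-after.txt`, where both file names are placeholders.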


Step 7. Cleanup

After successful restore validation:

    docker compose -f docker-compose.yml -f docker-compose-restore.yml rm restore
    docker compose -f docker-compose.yml -f docker-compose-restore.yml exec backup restic unlock -v

5. Acceptance Criteria

  • Backup snapshot can be listed and restored.

  • Application (backend, db) starts successfully post-restore.

  • No data loss: patient records, encounters, and attachments restored.

  • No orphaned or corrupted volumes.

  • Monitoring and logs show that backup and restore completed without errors.


6. Reporting

Document the following after each recovery test:

  • Snapshot ID used.

  • Restore start and end time.

  • Verification results (application load, data integrity).

  • Issues encountered.

  • Cleanup actions taken.
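
To keep reports consistent between test runs, the items above can be written out by a small script. This is a hypothetical sketch; the function name, field labels and argument order are illustrative only.

```shell
# Hypothetical helper: print a plain-text recovery test record.
# Arguments: snapshot ID, restore start time, restore end time,
# verification result, issues encountered, cleanup actions.
write_recovery_report() {
    printf 'Snapshot ID: %s\n' "$1"
    printf 'Restore start: %s\n' "$2"
    printf 'Restore end: %s\n' "$3"
    printf 'Verification: %s\n' "$4"
    printf 'Issues: %s\n' "${5:-none}"      # defaults to "none" when omitted
    printf 'Cleanup: %s\n' "${6:-none}"
}
```

Usage: record `start=$(date -u +%Y-%m-%dT%H:%M:%SZ)` before the restore and `end` the same way afterwards, then run `write_recovery_report abc123ef "$start" "$end" "UI loads, counts match" >> recovery-tests.log`, where recovery-tests.log is a placeholder file name.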