====== Disaster Recovery ======

**Critical:** Test these runbooks regularly! \\
**Target audience:** PKI Administrators, Security Team

Emergency procedures for CA failures, compromises, and recovery.

----

===== Overview =====

<mermaid>
flowchart TB
    subgraph PREVENT["PREVENTION"]
        P1[Backup Strategy]
        P2[Key Ceremony]
        P3[HSM Redundancy]
    end
    subgraph DETECT["DETECTION"]
        D1[Compromise detected]
        D2[Hardware failure]
        D3[Data loss]
    end
    subgraph RESPOND["RESPONSE"]
        R1[Emergency Revocation]
        R2[CA Recovery]
        R3[Communication]
    end
    subgraph RECOVER["RECOVERY"]
        C1[New CA]
        C2[Re-issue Certificates]
        C3[Trust Stores]
    end
    P1 & P2 & P3 --> D1 & D2 & D3
    D1 --> R1
    D2 --> R2
    D3 --> R2
    R1 & R2 --> C1 --> C2 --> C3
    style D1 fill:#ffebee
    style R1 fill:#fff3e0
</mermaid>

----

===== Scenarios =====

^ Scenario ^ Description ^ RTO ^ RPO ^
| [[.:ca-backup-restore|CA Backup/Restore]] | Back up and restore CA keys | 4h | 24h |
| [[.:key-ceremony|Key Ceremony]] | Secure key generation with controls | N/A | N/A |
| [[.:notfall-revocation|Emergency Revocation]] | Mass revocation on compromise | 1h | 0 |

----

===== Escalation Matrix =====

^ Severity ^ Example ^ First Response ^ Escalation ^
| **SEV-1** | CA key compromised | Emergency revocation | CISO, Management |
| **SEV-2** | CA server failed | Restore from backup | IT-Ops Lead |
| **SEV-3** | Intermediate CA compromised | Revoke sub-CA | PKI Admin |
| **SEV-4** | End-entity certificate compromised | Revoke the single certificate | PKI Operator |

----

===== Contacts =====

**Keep emergency contacts current!**

^ Role ^ Name ^ Reachability ^
| PKI Admin (Primary) | '''' | Phone, email |
| PKI Admin (Backup) | '''' | Phone, email |
| Security Team | security@example.com | 24/7 |
| HSM Vendor Support | '''' | Support hotline |

----

===== RTO/RPO Definitions =====

^ Metric ^ Definition ^ Target ^
| **RTO** | Recovery Time Objective - maximum acceptable time until service is restored | 4h |
| **RPO** | Recovery Point Objective - maximum acceptable data loss, measured as time since the last usable backup | 24h |
| **MTTR** | Mean Time To Repair | < 2h |

----

===== Related Documentation =====

  * [[..:migration:rollback-strategie|Rollback Strategy]] - Migration rollback
  * [[..:tagesgeschaeft:zertifikat-widerrufen|Revoke Certificate]] - Single revocation
  * [[en:int:pqcrypt:administrator:betrieb|Operations]] - Daily maintenance

----

<< [[..:start|<- Operator Scenarios]] | [[.:ca-backup-restore|-> CA Backup/Restore]] >>

----

//Wolfgang van der Stille @ EMSR DATA d.o.o. - Post-Quantum Cryptography Professional//

{{tag>disaster-recovery backup emergency operator}}