Disaster Recovery

Critical: These runbooks should be tested regularly!
Target audience: PKI Administrators, Security Team

Emergency procedures for CA failures, compromises, and recovery.


Overview

flowchart TB
    subgraph PREVENT["PREVENTION"]
        P1[Backup Strategy]
        P2[Key Ceremony]
        P3[HSM Redundancy]
    end
    subgraph DETECT["DETECTION"]
        D1[Compromise detected]
        D2[Hardware failure]
        D3[Data loss]
    end
    subgraph RESPOND["RESPONSE"]
        R1[Emergency Revocation]
        R2[CA Recovery]
        R3[Communication]
    end
    subgraph RECOVER["RECOVERY"]
        C1[New CA]
        C2[Re-issue Certificates]
        C3[Trust Stores]
    end
    P1 & P2 & P3 --> D1 & D2 & D3
    D1 --> R1
    D2 --> R2
    D3 --> R2
    R1 & R2 --> C1 --> C2 --> C3
    style D1 fill:#ffebee
    style R1 fill:#fff3e0


Scenarios

Scenario              Description                          RTO   RPO
--------------------  -----------------------------------  ----  ----
CA Backup/Restore     Back up and restore CA keys          4h    24h
Key Ceremony          Secure key generation with controls  N/A   N/A
Emergency Revocation  Mass revocation on compromise        1h    0

Escalation Matrix

Severity  Example                   First Response             Escalation
--------  ------------------------  -------------------------  ------------
SEV-1     CA key compromised        Emergency revocation       CISO, Mgmt
SEV-2     CA server failed          Restore from backup        IT-Ops Lead
SEV-3     Intermediate compromised  Revoke sub-CA              PKI Admin
SEV-4     End-entity compromised    Revoke single certificate  PKI Operator
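
For on-call tooling, the matrix can be kept as a small lookup table. The following is a minimal sketch, not part of any existing tool: the names `Escalation`, `ESCALATION_MATRIX`, and `escalation_for` are hypothetical, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    """One row of the escalation matrix (hypothetical helper type)."""
    example: str
    first_response: str
    escalate_to: str


# Values mirror the escalation matrix in this document.
ESCALATION_MATRIX = {
    "SEV-1": Escalation("CA key compromised", "Emergency revocation", "CISO, Mgmt"),
    "SEV-2": Escalation("CA server failed", "Restore from backup", "IT-Ops Lead"),
    "SEV-3": Escalation("Intermediate compromised", "Revoke sub-CA", "PKI Admin"),
    # SEV-4: only the single affected end-entity certificate is revoked.
    "SEV-4": Escalation("End-entity compromised", "Revoke single certificate", "PKI Operator"),
}


def escalation_for(severity: str) -> Escalation:
    """Return the matrix row for a severity label such as 'SEV-1'."""
    key = severity.strip().upper()
    if key not in ESCALATION_MATRIX:
        raise ValueError(f"unknown severity: {severity!r}")
    return ESCALATION_MATRIX[key]
```

A caller would use `escalation_for("SEV-2").first_response` to get the first action for a failed CA server. Unknown labels raise an error rather than defaulting silently, so a mistyped severity cannot be mistaken for a low-priority incident.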

Contacts

Keep emergency contacts current!

Role                 Name                  Reachability
-------------------  --------------------  ---------------
PKI Admin (Primary)  <Name>                Tel., Email
PKI Admin (Backup)   <Name>                Tel., Email
Security Team        security@example.com  24/7
HSM Vendor Support   <Vendor>              Support Hotline

RTO/RPO Definitions

Metric  Definition                                           Target
------  ---------------------------------------------------  ------
RTO     Recovery Time Objective: max. time to recovery       4h
RPO     Recovery Point Objective: max. acceptable data loss  24h
MTTR    Mean Time To Repair                                  < 2h
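
The RTO and RPO targets can be checked mechanically, for example during a recovery drill. The sketch below assumes the 4h and 24h targets from the table above. The function names `rpo_violated` and `rto_violated` are hypothetical, and how the timestamps are obtained (backup logs, incident tickets) is left open.

```python
from datetime import datetime, timedelta, timezone

# Targets taken from the RTO/RPO table in this document.
RTO = timedelta(hours=4)
RPO = timedelta(hours=24)


def rpo_violated(last_backup: datetime, failure_time: datetime) -> bool:
    """True if the data lost since the last good backup exceeds the RPO."""
    return failure_time - last_backup > RPO


def rto_violated(outage_start: datetime, recovered_at: datetime) -> bool:
    """True if the recovery took longer than the RTO."""
    return recovered_at - outage_start > RTO


if __name__ == "__main__":
    # Illustrative timestamps only, not real incident data.
    backup = datetime(2026, 1, 29, 0, 0, tzinfo=timezone.utc)
    failure = datetime(2026, 1, 30, 1, 0, tzinfo=timezone.utc)
    print("RPO violated:", rpo_violated(backup, failure))
```

Using timezone-aware timestamps throughout avoids off-by-hours errors when backup hosts and operators are in different time zones.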




Wolfgang van der Stille @ EMSR DATA d.o.o. - Post-Quantum Cryptography Professional

Last modified: 2026/01/30 at 01:28 AM