====== Expiry Monitoring ======

**Complexity:** Low-Medium \\
**Duration:** 1-2 hours setup \\
**Goal:** Never let certificates expire unnoticed

This page shows how to monitor certificate expiry dates with several tools and alerting strategies.

----

===== Architecture =====

<mermaid>
flowchart LR
    subgraph SOURCES["SOURCES"]
        F[File system]
        K[Kubernetes Secrets]
        R[Remote TLS]
        S[Cert Stores]
    end
    subgraph COLLECTOR["COLLECTOR"]
        E[cert-exporter]
        N[node_exporter]
        B[blackbox_exporter]
    end
    subgraph STORAGE["PROMETHEUS"]
        P[(Prometheus)]
    end
    subgraph ALERT["ALERT"]
        A[Alertmanager]
        M[E-Mail/Slack/Teams]
    end
    F --> E --> P --> A --> M
    K --> E
    R --> B --> P
    S --> N --> P
    style A fill:#ffebee
    style P fill:#e3f2fd
</mermaid>

----

===== Option 1: Prometheus + cert-exporter =====

==== Installation ====

<code bash>
# Download cert-exporter
wget https://github.com/enix/x509-certificate-exporter/releases/download/v3.12.0/x509-certificate-exporter_3.12.0_linux_amd64.tar.gz
tar xzf x509-certificate-exporter_*.tar.gz
sudo mv x509-certificate-exporter /usr/local/bin/

# Create systemd service
cat << 'EOF' | sudo tee /etc/systemd/system/cert-exporter.service
[Unit]
Description=X509 Certificate Exporter
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/x509-certificate-exporter \
    --watch-file=/etc/ssl/certs/*.pem \
    --watch-file=/etc/nginx/ssl/*.crt
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now cert-exporter
</code>

==== Prometheus Configuration ====

<code yaml>
# /etc/prometheus/prometheus.yml
scrape_configs:
  - job_name: 'cert-exporter'
    static_configs:
      - targets: ['localhost:9793']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'pki-server'

  # Check remote TLS endpoints
  - job_name: 'blackbox-tls'
    metrics_path: /probe
    params:
      module: [tls_connect]
    static_configs:
      # host:port targets, assuming tls_connect is a tcp prober with TLS enabled
      - targets:
          - api.example.com:443
          - web.example.com:443
          - mail.example.com:465
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115
</code>

==== Alert Rules ====

<code yaml>
# /etc/prometheus/rules/cert-alerts.yml
groups:
  - name: certificate-alerts
    rules:
      - alert: CertificateExpiringSoon
        expr: x509_cert_not_after - time() < 30 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.filepath }} expires in < 30 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"

      - alert: CertificateExpireCritical
        expr: x509_cert_not_after - time() < 7 * 24 * 3600
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "CRITICAL: Certificate {{ $labels.filepath }} expires in < 7 days"
          # $value is the remaining time in seconds, not a timestamp
          description: "Time remaining: {{ $value | humanizeDuration }}"

      - alert: CertificateExpired
        expr: x509_cert_not_after - time() < 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Certificate {{ $labels.filepath }} has EXPIRED"

      # Assumes an is_ca label is attached to CA certificates (for example via
      # relabeling); the exporter's default labels may not include it.
      - alert: CAExpiringSoon
        expr: x509_cert_not_after{is_ca="true"} - time() < 365 * 24 * 3600
        for: 1d
        labels:
          severity: warning
        annotations:
          summary: "CA certificate {{ $labels.subject_CN }} expires in < 1 year"
</code>

----

===== Option 2: Grafana Dashboard =====

<code json>
{
  "dashboard": {
    "title": "Certificate Expiry Dashboard",
    "panels": [
      {
        "title": "Expiring Certificates (< 30 days)",
        "type": "table",
        "targets": [
          {
            "expr": "sort((x509_cert_not_after - time()) / 86400 < 30)",
            "format": "table",
            "instant": true
          }
        ],
        "fieldConfig": {
          "overrides": [
            {
              "matcher": { "id": "byName", "options": "Value" },
              "properties": [
                { "id": "custom.displayMode", "value": "color-background" },
                {
                  "id": "thresholds",
                  "value": {
                    "steps": [
                      { "color": "red", "value": 0 },
                      { "color": "orange", "value": 7 },
                      { "color": "yellow", "value": 30 },
                      { "color": "green", "value": 90 }
                    ]
                  }
                }
              ]
            }
          ]
        }
      },
      {
        "title": "Certificates by Expiry Time",
        "type": "stat",
        "targets": [
          {
            "expr": "count(x509_cert_not_after - time() < 7 * 86400)",
            "legendFormat": "< 7 days"
          },
          {
            "expr": "count(x509_cert_not_after - time() < 30 * 86400)",
            "legendFormat": "< 30 days"
          }
        ]
      }
    ]
  }
}
</code>

----

===== Option 3: Simple Script (without Prometheus) =====

<code bash>
#!/bin/bash
# /usr/local/bin/cert-check-notify.sh

WARN_DAYS=30
CRIT_DAYS=7
MAIL_TO="pki-team@example.com"
WEBHOOK_URL="https://teams.example.com/webhook/..."

check_cert() {
    local cert="$1"
    local end_date days_left subject

    # Skip files that openssl cannot parse as a certificate
    end_date=$(openssl x509 -enddate -noout -in "$cert" 2>/dev/null | cut -d= -f2)
    [ -n "$end_date" ] || return 0

    # Requires GNU date for -d
    days_left=$(( ($(date -d "$end_date" +%s) - $(date +%s)) / 86400 ))
    subject=$(openssl x509 -subject -noout -in "$cert" 2>/dev/null | sed 's/subject=//')

    if [ "$days_left" -lt 0 ]; then
        echo "EXPIRED|$cert|$subject|$days_left"
    elif [ "$days_left" -lt "$CRIT_DAYS" ]; then
        echo "CRITICAL|$cert|$subject|$days_left"
    elif [ "$days_left" -lt "$WARN_DAYS" ]; then
        echo "WARNING|$cert|$subject|$days_left"
    fi
}

# Check all certificates
results=""
for cert in /etc/ssl/certs/*.pem /etc/nginx/ssl/*.crt; do
    [ -f "$cert" ] || continue
    result=$(check_cert "$cert")
    [ -n "$result" ] && results+="$result\n"
done

# Send notification if needed
if [ -n "$results" ]; then
    # E-Mail
    echo -e "Certificate issues found:\n\n$results" | \
        mail -s "PKI Alert: Certificates" "$MAIL_TO"

    # Teams/Slack webhook: $results keeps its literal \n sequences,
    # which are valid JSON newline escapes
    curl -s -X POST "$WEBHOOK_URL" \
        -H "Content-Type: application/json" \
        -d "{\"text\": \"PKI Alert: Certificates\n\n$(printf '%s' "$results" | sed 's/|/ | /g')\"}"
fi
</code>

Install the cron job once, as root:

<code bash>
# Cron: Daily at 08:00
echo "0 8 * * * root /usr/local/bin/cert-check-notify.sh" > /etc/cron.d/cert-check
</code>

----

===== Option 4: PowerShell (Windows) =====

<code powershell>
# Check-CertificateExpiry.ps1
param(
    [int]$WarnDays = 30,
    [int]$CritDays = 7,
    [string]$SmtpServer = "smtp.example.com",
    [string]$MailTo = "pki-team@example.com"
)

$results = @()

# Local certificates
Get-ChildItem Cert:\LocalMachine\My | ForEach-Object {
    # Floor so that a certificate expired for less than a day counts as -1, not 0
    $daysLeft = [math]::Floor(($_.NotAfter - (Get-Date)).TotalDays)
    $severity = if ($daysLeft -lt 0) {
        "EXPIRED"
    } elseif ($daysLeft -lt $CritDays) {
        "CRITICAL"
    } elseif ($daysLeft -lt $WarnDays) {
        "WARNING"
    } else {
        $null
    }

    if ($severity) {
        $results += [PSCustomObject]@{
            Severity   = $severity
            Subject    = $_.Subject
            Thumbprint = $_.Thumbprint
            DaysLeft   = $daysLeft
            NotAfter   = $_.NotAfter
        }
    }
}

# Remote TLS Endpoints
$endpoints = @(
    "api.example.com:443",
    "web.example.com:443"
)

foreach ($endpoint in $endpoints) {
    try {
        # $host is a reserved automatic variable, so use a different name
        $hostName, $port = $endpoint -split ':'
        $tcpClient = New-Object System.Net.Sockets.TcpClient($hostName, [int]$port)
        try {
            # Accept any certificate so that expired ones can still be inspected
            $acceptAll = { $true } -as [System.Net.Security.RemoteCertificateValidationCallback]
            $sslStream = New-Object System.Net.Security.SslStream($tcpClient.GetStream(), $false, $acceptAll)
            $sslStream.AuthenticateAsClient($hostName)
            $cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($sslStream.RemoteCertificate)
        } finally {
            $tcpClient.Dispose()
        }
        $daysLeft = [math]::Floor(($cert.NotAfter - (Get-Date)).TotalDays)
        # ... analogous to above
    } catch {
        $results += [PSCustomObject]@{
            Severity   = "ERROR"
            Subject    = $endpoint
            Thumbprint = "N/A"
            DaysLeft   = -1
            NotAfter   = "Connection failed: $_"
        }
    }
}

# Send report
if ($results.Count -gt 0) {
    $body = $results | Format-Table -AutoSize | Out-String
    Send-MailMessage -To $MailTo -From "pki@example.com" `
        -Subject "PKI Alert: $($results.Count) certificate(s) require attention" `
        -Body $body `
        -SmtpServer $SmtpServer
}

# Output for logging
$results | Format-Table
</code>

----

===== Kubernetes: cert-manager Metrics =====

<code yaml>
# ServiceMonitor for cert-manager
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cert-manager
  namespaceSelector:
    matchNames:
      - cert-manager
  endpoints:
    - port: tcp-prometheus-servicemonitor
      interval: 60s
</code>

**Important metrics:**

| Metric | Description |
|--------|-------------|
| ''certmanager_certificate_expiration_timestamp_seconds'' | Expiry timestamp |
| ''certmanager_certificate_ready_status'' | Ready status (the ''condition="True"'' series is 1 when the certificate is ready) |
| ''certmanager_certificate_renewal_timestamp_seconds'' | Scheduled renewal time (timestamp after which renewal is due) |

----

===== Best Practices =====

| Recommendation | Rationale |
|----------------|-----------|
| Alerts at **30/14/7/1 days** | Notifications escalate as expiry approaches |
| **CA certificates separately** | Longer advance warning (1 year) |
| **Remote + local** checking | Each catches different failure sources |
| **Deduplication** | Avoids 100 alerts for one certificate |
| **Runbook link** in alert | Direct action instructions |

----

===== Checklist =====

| # | Checkpoint | Done |
|---|------------|------|
| 1 | Exporter installed | |
| 2 | Prometheus scrape configured | |
| 3 | Alert rules defined | |
| 4 | Alertmanager routing configured | |
| 5 | Grafana dashboard imported | |
| 6 | Test alert sent | |

----

===== Related Documentation =====

  * [[.:alerting-setup|Alerting Setup]] - Configure notifications
  * [[..:tagesgeschaeft:health-check|Health Check]] - Manual verification
  * [[..:automatisierung:scheduled-renewal|Scheduled Renewal]] - Auto-renewal

----

<< [[.:start|<- Monitoring]] | [[.:revocation-check|-> Revocation Check]] >>

----

//Wolfgang van der Stille @ EMSR DATA d.o.o. - Post-Quantum Cryptography Professional//

{{tag>monitoring prometheus grafana expiry operator}}
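
----

===== Addendum: Alerting on Remote TLS Probes =====

The ''blackbox-tls'' scrape job in Option 1 collects remote endpoints, but the alert rules listed there only evaluate ''x509_cert_not_after'' from the file-based exporter. The following is a minimal sketch of a matching rule for probed endpoints. It relies on the ''probe_ssl_earliest_cert_expiry'' metric exposed by blackbox_exporter; the file path, group name, and alert name are hypothetical, and the 30-day threshold simply mirrors the warning tier used above.

```yaml
# /etc/prometheus/rules/remote-tls-alerts.yml (hypothetical file name)
groups:
  - name: remote-tls-alerts            # hypothetical group name
    rules:
      - alert: RemoteCertificateExpiringSoon   # hypothetical alert name
        # probe_ssl_earliest_cert_expiry is the Unix timestamp of the
        # earliest-expiring certificate in the chain presented by the target
        expr: probe_ssl_earliest_cert_expiry{job="blackbox-tls"} - time() < 30 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "TLS certificate on {{ $labels.instance }} expires in < 30 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"
```

Because the metric reports the earliest expiry across the whole presented chain, an intermediate certificate that is close to expiry would also trigger this rule, which is usually the desired behavior for endpoint monitoring.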