Complexity: Low to medium
Duration: 30-60 minutes for the setup
Objective: Proactive notification of PKI problems
This section configures alerting for PKI monitoring across several notification channels.
| Category | Examples | Priority | Response |
| --- | --- | --- | --- |
| Critical | Certificate expired, CA down | P1 | Immediate |
| Warning | Certificate < 7 days, CRL < 24 h | P2 | Within 4 h |
| Info | Certificate < 30 days | P3 | Next business day |
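The triage matrix above can be sketched as a small helper function. This is an illustration only; the name `classify_certificate` and the `Triage` type are hypothetical, not part of any tool used in this guide:

```python
from dataclasses import dataclass
from typing import Optional

DAY = 86400  # seconds per day

@dataclass
class Triage:
    severity: str  # value of the Alertmanager "severity" label
    priority: str  # P1/P2/P3 from the table above
    response: str

def classify_certificate(seconds_until_expiry: int) -> Optional[Triage]:
    """Apply the severity matrix: expired -> P1, < 7 days -> P2, < 30 days -> P3."""
    if seconds_until_expiry < 0:
        return Triage("critical", "P1", "immediate")
    if seconds_until_expiry < 7 * DAY:
        return Triage("warning", "P2", "within 4 h")
    if seconds_until_expiry < 30 * DAY:
        return Triage("info", "P3", "next business day")
    return None  # healthy certificate, no alert

print(classify_certificate(3 * DAY).priority)  # P2
```

A certificate expiring in 3 days lands in the P2 (warning) band, which matches the 4-hour response target in the table.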
```bash
# Download and install Alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
tar xzf alertmanager-*.tar.gz
sudo mv alertmanager-*/alertmanager /usr/local/bin/
sudo mv alertmanager-*/amtool /usr/local/bin/
```
```yaml
# /etc/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'secret'

route:
  receiver: 'default'
  group_by: ['alertname', 'severity']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    # Critical PKI alerts -> PagerDuty + e-mail
    - match:
        severity: critical
        team: pki
      receiver: 'pki-critical'
      repeat_interval: 15m
    # Warnings -> e-mail + Slack
    - match:
        severity: warning
        team: pki
      receiver: 'pki-warning'
      repeat_interval: 4h
    # Info -> Slack only
    - match:
        severity: info
        team: pki
      receiver: 'pki-info'
      repeat_interval: 24h

receivers:
  - name: 'default'
    email_configs:
      - to: 'ops@example.com'
  - name: 'pki-critical'
    email_configs:
      - to: 'pki-team@example.com'
        send_resolved: true
    pagerduty_configs:
      - service_key: '<PAGERDUTY_SERVICE_KEY>'
        severity: critical
    slack_configs:
      - api_url: '<SLACK_WEBHOOK_URL>'
        channel: '#pki-alerts'
        title: '🚨 PKI CRITICAL: {{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
  - name: 'pki-warning'
    email_configs:
      - to: 'pki-team@example.com'
    slack_configs:
      - api_url: '<SLACK_WEBHOOK_URL>'
        channel: '#pki-alerts'
        title: '⚠️ PKI Warning: {{ .GroupLabels.alertname }}'
  - name: 'pki-info'
    slack_configs:
      - api_url: '<SLACK_WEBHOOK_URL>'
        channel: '#pki-info'
        title: 'ℹ️ PKI Info: {{ .GroupLabels.alertname }}'

inhibit_rules:
  # Suppress warnings while a matching critical alert is firing
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname']
```
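Alertmanager walks the routing tree top-down and delivers an alert to the first child route whose matchers all equal the alert's labels; alerts that match no child route fall back to the top-level `default` receiver. A minimal sketch of that first-match logic (illustrative only, not Alertmanager's real implementation; the matchers are simplified to the severity label, while the real config also matches a PKI label):

```python
# First-match routing, simplified from the three child routes above.
ROUTES = [
    ({"severity": "critical"}, "pki-critical"),
    ({"severity": "warning"},  "pki-warning"),
    ({"severity": "info"},     "pki-info"),
]

def pick_receiver(labels: dict) -> str:
    """Return the receiver of the first route whose matchers all match."""
    for matchers, receiver in ROUTES:
        if all(labels.get(k) == v for k, v in matchers.items()):
            return receiver
    return "default"  # top-level fallback receiver

print(pick_receiver({"severity": "critical"}))  # pki-critical
print(pick_receiver({"severity": "none"}))      # default
```

Because the config does not set `continue: true` on any route, evaluation stops at the first match, exactly as in this sketch.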
```ini
# /etc/systemd/system/alertmanager.service
[Unit]
Description=Prometheus Alertmanager
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager
Restart=always

[Install]
WantedBy=multi-user.target
```
```yaml
# Alertmanager Teams webhook
receivers:
  - name: 'pki-teams'
    webhook_configs:
      - url: 'https://outlook.office.com/webhook/...'
        send_resolved: true
        http_config:
          bearer_token: ''
```
Teams MessageCard template:
```
{
  "@type": "MessageCard",
  "@context": "http://schema.org/extensions",
  "themeColor": "{{ if eq .Status \"firing\" }}FF0000{{ else }}00FF00{{ end }}",
  "summary": "PKI Alert: {{ .GroupLabels.alertname }}",
  "sections": [{
    "activityTitle": "{{ .GroupLabels.alertname }}",
    "activitySubtitle": "{{ .Status | toUpper }}",
    "facts": [
      {{ range .Alerts }}
      {
        "name": "{{ .Labels.instance }}",
        "value": "{{ .Annotations.summary }}"
      },
      {{ end }}
    ],
    "markdown": true
  }],
  "potentialAction": [{
    "@type": "OpenUri",
    "name": "Open Runbook",
    "targets": [{
      "os": "default",
      "uri": "{{ (index .Alerts 0).Annotations.runbook_url }}"
    }]
  }]
}
```
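To see what a rendered card looks like before wiring up the webhook, an equivalent payload can be assembled by hand. The field values below are sample data standing in for Alertmanager's template context, not real alert output:

```python
import json

# Sample data in place of Alertmanager's .Status / .GroupLabels / .Alerts
status = "firing"
alertname = "CertificateExpiringSoon"
alerts = [{"instance": "ca01.example.com",
           "summary": "Certificate expires in < 7 days"}]

card = {
    "@type": "MessageCard",
    "@context": "http://schema.org/extensions",
    "themeColor": "FF0000" if status == "firing" else "00FF00",
    "summary": f"PKI Alert: {alertname}",
    "sections": [{
        "activityTitle": alertname,
        "activitySubtitle": status.upper(),
        "facts": [{"name": a["instance"], "value": a["summary"]} for a in alerts],
        "markdown": True,
    }],
}

print(json.dumps(card, indent=2))
```

Note that the Go-template `range` loop above emits a comma after every fact, including the last one, which is technically invalid JSON; building the payload programmatically, as here, sidesteps that.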
```yaml
# Alertmanager Slack configuration
receivers:
  - name: 'pki-slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/xxx/yyy/zzz'
        channel: '#pki-alerts'
        username: 'PKI-Alertmanager'
        icon_emoji: ':lock:'
        send_resolved: true
        title: '{{ template "slack.title" . }}'
        text: '{{ template "slack.text" . }}'
        actions:
          - type: button
            text: 'Runbook'
            url: '{{ (index .Alerts 0).Annotations.runbook_url }}'
          - type: button
            text: 'Dashboard'
            url: 'https://grafana.example.com/d/pki'
```
```yaml
# Alertmanager PagerDuty integration
receivers:
  - name: 'pki-pagerduty'
    pagerduty_configs:
      - service_key: '<INTEGRATION_KEY>'
        severity: '{{ if eq .GroupLabels.severity "critical" }}critical{{ else }}warning{{ end }}'
        description: '{{ .GroupLabels.alertname }}: {{ .CommonAnnotations.summary }}'
        details:
          firing: '{{ template "pagerduty.firing" . }}'
          num_firing: '{{ .Alerts.Firing | len }}'
          num_resolved: '{{ .Alerts.Resolved | len }}'
```
```html
# /etc/alertmanager/templates/email.tmpl
{{ define "email.subject" }}
[{{ .Status | toUpper }}] PKI Alert: {{ .GroupLabels.alertname }}
{{ end }}

{{ define "email.html" }}
<!DOCTYPE html>
<html>
<head>
<style>
  .critical { background-color: #ffebee; border-left: 4px solid #f44336; }
  .warning  { background-color: #fff3e0; border-left: 4px solid #ff9800; }
  .resolved { background-color: #e8f5e9; border-left: 4px solid #4caf50; }
</style>
</head>
<body>
  <h2>PKI Alert: {{ .GroupLabels.alertname }}</h2>
  <p>Status: <strong>{{ .Status | toUpper }}</strong></p>
  {{ range .Alerts }}
  <div class="{{ .Labels.severity }}">
    <h3>{{ .Labels.instance }}</h3>
    <p><strong>Summary:</strong> {{ .Annotations.summary }}</p>
    <p><strong>Description:</strong> {{ .Annotations.description }}</p>
    {{ if .Annotations.runbook_url }}
    <p><a href="{{ .Annotations.runbook_url }}">Open runbook</a></p>
    {{ end }}
  </div>
  {{ end }}
  <hr>
  <p>
    <a href="https://grafana.example.com/d/pki">Dashboard</a> |
    <a href="https://alertmanager.example.com">Alertmanager</a>
  </p>
</body>
</html>
{{ end }}
```
```yaml
# /etc/prometheus/rules/pki-alerts.yml
groups:
  - name: pki-alerts
    rules:
      - alert: CertificateExpiringSoon
        expr: x509_cert_not_after - time() < 7 * 86400
        for: 1h
        labels:
          severity: warning
          team: pki
        annotations:
          summary: "Certificate {{ $labels.filepath }} expires in < 7 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"
          runbook_url: "https://wiki.example.com/pki/runbook/rinnovo-certificato"
      - alert: CertificateExpired
        expr: x509_cert_not_after - time() < 0
        labels:
          severity: critical
          team: pki
        annotations:
          summary: "CRITICAL: Certificate {{ $labels.filepath }} has EXPIRED"
          runbook_url: "https://wiki.example.com/pki/runbook/emissione-certificato"
      - alert: CANotReachable
        expr: up{job="ca"} == 0
        for: 2m
        labels:
          severity: critical
          team: pki
        annotations:
          summary: "CA server not reachable"
          runbook_url: "https://wiki.example.com/pki/runbook/ca-troubleshooting"
```
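The two expiry rules reduce to simple arithmetic on the `x509_cert_not_after` timestamp. A quick sanity check of which rule would fire for a given certificate (the timestamps are hypothetical, not real metrics; `firing_rules` is an illustration, not Prometheus code):

```python
import time

DAY = 86400
now = time.time()

def firing_rules(not_after: float, now: float) -> list:
    """Evaluate the expiry expressions from pki-alerts.yml for one certificate."""
    fired = []
    if not_after - now < 0:
        fired.append("CertificateExpired")       # severity: critical
    if not_after - now < 7 * DAY:
        fired.append("CertificateExpiringSoon")  # severity: warning
    return fired

print(firing_rules(now + 3 * DAY, now))  # ['CertificateExpiringSoon']
print(firing_rules(now - 1, now))        # an expired cert matches both rules
```

An expired certificate matches both expressions, so both alerts fire; the critical one routes with a 15-minute repeat interval while the warning follows the 4-hour default.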
```yaml
# Grafana alert rule (via UI or provisioning)
apiVersion: 1
groups:
  - orgId: 1
    name: PKI Alerts
    folder: PKI
    interval: 1m
    rules:
      - uid: cert-expiry-warning
        title: Certificate Expiring Soon
        condition: B
        data:
          - refId: A
            relativeTimeRange:
              from: 600
              to: 0
            datasourceUid: prometheus
            model:
              expr: x509_cert_not_after - time() < 7 * 86400
          - refId: B
            datasourceUid: '-100'
            model:
              conditions:
                - evaluator:
                    params: [0]
                    type: gt
                  operator:
                    type: and
                  query:
                    params: [A]
                  reducer:
                    type: count
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: Certificate expiring soon
```
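Condition `B` reduces the results of query `A` with `count` and applies the evaluator `gt 0`, i.e. the rule fires as soon as at least one series matches the expiry expression. Sketched below for clarity (illustrative logic, not Grafana's evaluator code):

```python
def reduce_count(values: list) -> int:
    """reducer 'count': number of series/values returned by query A."""
    return len(values)

def evaluate_b(values: list) -> bool:
    """evaluator 'gt 0': the rule fires when the count exceeds 0."""
    return reduce_count(values) > 0

print(evaluate_b([3 * 86400]))  # True: one certificate under the 7-day threshold
print(evaluate_b([]))           # False: no series matched, nothing fires
```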
```bash
# Validate the Alertmanager configuration
amtool check-config /etc/alertmanager/alertmanager.yml

# Send a test alert
amtool alert add alertname=TestAlert severity=warning instance=test \
  --alertmanager.url=http://localhost:9093

# List active alerts
amtool alert --alertmanager.url=http://localhost:9093

# Create a silence (e.g. for maintenance)
amtool silence add alertname=CertificateExpiringSoon \
  --alertmanager.url=http://localhost:9093 \
  --comment="Scheduled maintenance" \
  --duration=2h
```
| # | Checkpoint | ✓ |
| --- | --- | --- |
| 1 | Alertmanager installed | ☐ |
| 2 | Routing configured | ☐ |
| 3 | E-mail receiver | ☐ |
| 4 | Slack/Teams webhook | ☐ |
| 5 | PagerDuty integration | ☐ |
| 6 | Alert rules defined | ☐ |
| 7 | Runbook links added | ☐ |
| 8 | Test alert sent | ☐ |
Wolfgang van der Stille @ EMSR DATA d.o.o. - Post-Quantum Cryptography Professional