====== Expiry Monitoring ======
**Complexity:** Low-Medium \\
**Duration:** 1-2 hours setup \\
**Goal:** Never let certificates expire unnoticed

This guide covers monitoring certificate expiry dates with Prometheus exporters, a Grafana dashboard, and standalone scripts, together with the alerting strategy around them.
----
===== Architecture =====
<mermaid>
flowchart LR
    subgraph SOURCES["SOURCES"]
        F[File system]
        K[Kubernetes Secrets]
        R[Remote TLS]
        S[Cert Stores]
    end

    subgraph COLLECTOR["COLLECTOR"]
        E[cert-exporter]
        N[node_exporter]
        B[blackbox_exporter]
    end

    subgraph STORAGE["PROMETHEUS"]
        P[(Prometheus)]
    end

    subgraph ALERT["ALERT"]
        A[Alertmanager]
        M[E-Mail/Slack/Teams]
    end

    F --> E --> P --> A --> M
    K --> E
    R --> B --> P
    S --> N --> P

    style A fill:#ffebee
    style P fill:#e3f2fd
</mermaid>
----
===== Option 1: Prometheus + cert-exporter =====
==== Installation ====
<code bash>
# Download cert-exporter
wget https://github.com/enix/x509-certificate-exporter/releases/download/v3.12.0/x509-certificate-exporter_3.12.0_linux_amd64.tar.gz
tar xzf x509-certificate-exporter_*.tar.gz
sudo mv x509-certificate-exporter /usr/local/bin/

# Create systemd service
cat << 'EOF' | sudo tee /etc/systemd/system/cert-exporter.service
[Unit]
Description=X509 Certificate Exporter
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/x509-certificate-exporter \
    --watch-file=/etc/ssl/certs/*.pem \
    --watch-file=/etc/nginx/ssl/*.crt
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now cert-exporter
</code>
==== Prometheus Configuration ====
<code yaml>
# /etc/prometheus/prometheus.yml
scrape_configs:
  - job_name: 'cert-exporter'
    static_configs:
      - targets: ['localhost:9793']
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        replacement: 'pki-server'

  # Check remote TLS endpoints
  # tls_connect is a TCP-based module, so targets are host:port, not URLs
  - job_name: 'blackbox-tls'
    metrics_path: /probe
    params:
      module: [tls_connect]
    static_configs:
      - targets:
          - api.example.com:443
          - web.example.com:443
          - mail.example.com:465
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115
</code>
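
The ''blackbox-tls'' job refers to a module named ''tls_connect'', which has to be defined in the blackbox_exporter configuration. A minimal sketch, assuming the module is meant as a plain TCP probe with TLS enabled; the file path and the timeout are assumptions and depend on how the exporter is installed:

<code yaml>
# /etc/blackbox_exporter/blackbox.yml (path is an assumption)
modules:
  tls_connect:
    prober: tcp        # TCP probe: targets must be given as host:port
    timeout: 5s        # assumed value, adjust to your network
    tcp:
      tls: true        # perform a TLS handshake and record certificate metrics
</code>

With a module like this, each successful probe exports ''probe_ssl_earliest_cert_expiry'', the Unix timestamp of the earliest-expiring certificate in the chain presented by the endpoint.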
==== Alert Rules ====
<code yaml>
# /etc/prometheus/rules/cert-alerts.yml
groups:
  - name: certificate-alerts
    rules:
      - alert: CertificateExpiringSoon
        expr: x509_cert_not_after - time() < 30 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.filepath }} expires in < 30 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"

      - alert: CertificateExpireCritical
        expr: x509_cert_not_after - time() < 7 * 24 * 3600
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "CRITICAL: Certificate {{ $labels.filepath }} expires in < 7 days"
          # $value is the remaining time in seconds, not a timestamp
          description: "Time remaining: {{ $value | humanizeDuration }}"

      - alert: CertificateExpired
        expr: x509_cert_not_after - time() < 0
        for: 0m
        labels:
          severity: critical
        annotations:
          summary: "Certificate {{ $labels.filepath }} has EXPIRED"

      # Verify that your exporter exposes an is_ca label; if it does not,
      # this selector matches nothing and the alert never fires.
      - alert: CAExpiringSoon
        expr: x509_cert_not_after{is_ca="true"} - time() < 365 * 24 * 3600
        for: 1d
        labels:
          severity: warning
        annotations:
          summary: "CA certificate {{ $labels.subject_CN }} expires in < 1 year"
</code>
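
The rules above only evaluate ''x509_cert_not_after'', that is, certificates read by cert-exporter. Endpoints probed through blackbox_exporter report ''probe_ssl_earliest_cert_expiry'' instead and need their own rules. A sketch of matching rules, assuming the job name ''blackbox-tls'' from the scrape configuration; the alert names, thresholds, and file path are illustrative:

<code yaml>
# /etc/prometheus/rules/remote-tls-alerts.yml (illustrative path)
groups:
  - name: remote-tls-alerts
    rules:
      - alert: RemoteCertificateExpiringSoon
        expr: probe_ssl_earliest_cert_expiry{job="blackbox-tls"} - time() < 30 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "TLS certificate of {{ $labels.instance }} expires in < 30 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"

      # Alert on the probe itself, so an unreachable endpoint is not silent
      - alert: RemoteTLSProbeFailed
        expr: probe_success{job="blackbox-tls"} == 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "TLS probe of {{ $labels.instance }} is failing"
</code>

The second rule is there because a failing probe may not report a usable expiry value, in which case the first rule alone would never fire for that endpoint.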
----
===== Option 2: Grafana Dashboard =====
<code json>
{
  "dashboard": {
    "title": "Certificate Expiry Dashboard",
    "panels": [
      {
        "title": "Expiring Certificates (< 30 days)",
        "type": "table",
        "targets": [
          {
            "expr": "sort((x509_cert_not_after - time()) / 86400 < 30)",
            "format": "table"
          }
        ],
        "fieldConfig": {
          "overrides": [
            {
              "matcher": { "id": "byName", "options": "Value" },
              "properties": [
                {
                  "id": "custom.displayMode",
                  "value": "color-background"
                },
                {
                  "id": "thresholds",
                  "value": {
                    "steps": [
                      { "color": "red", "value": null },
                      { "color": "orange", "value": 7 },
                      { "color": "yellow", "value": 30 },
                      { "color": "green", "value": 90 }
                    ]
                  }
                }
              ]
            }
          ]
        }
      },
      {
        "title": "Certificates by Expiry Time",
        "type": "stat",
        "targets": [
          {
            "expr": "count(x509_cert_not_after - time() < 7 * 86400) or vector(0)",
            "legendFormat": "< 7 days"
          },
          {
            "expr": "count(x509_cert_not_after - time() < 30 * 86400) or vector(0)",
            "legendFormat": "< 30 days"
          }
        ]
      }
    ]
  }
}
</code>
----
===== Option 3: Simple Script (without Prometheus) =====
<code bash>
#!/bin/bash
# /usr/local/bin/cert-check-notify.sh

WARN_DAYS=30
CRIT_DAYS=7
MAIL_TO="pki-team@example.com"
WEBHOOK_URL="https://teams.example.com/webhook/..."

check_cert() {
    local cert="$1"
    local end_date days_left subject

    # Skip files that openssl cannot parse, instead of reporting them as expired
    end_date=$(openssl x509 -enddate -noout -in "$cert" 2>/dev/null | cut -d= -f2)
    [ -n "$end_date" ] || return 0

    # Requires GNU date for parsing the notAfter string
    days_left=$(( ($(date -d "$end_date" +%s) - $(date +%s)) / 86400 ))
    subject=$(openssl x509 -subject -noout -in "$cert" 2>/dev/null | sed 's/subject=//')

    if [ "$days_left" -lt 0 ]; then
        echo "EXPIRED|$cert|$subject|$days_left"
    elif [ "$days_left" -lt "$CRIT_DAYS" ]; then
        echo "CRITICAL|$cert|$subject|$days_left"
    elif [ "$days_left" -lt "$WARN_DAYS" ]; then
        echo "WARNING|$cert|$subject|$days_left"
    fi
}

# Check all certificates
results=""
for cert in /etc/ssl/certs/*.pem /etc/nginx/ssl/*.crt; do
    [ -f "$cert" ] || continue
    result=$(check_cert "$cert")
    [ -n "$result" ] && results+="$result\n"
done

# Send notification if needed
if [ -n "$results" ]; then
    # E-Mail
    echo -e "Certificate issues found:\n\n$results" | \
        mail -s "PKI Alert: Certificates" "$MAIL_TO"

    # Teams/Slack Webhook
    # Entries in $results are separated by a literal \n, which is a valid JSON
    # escape; expanding it to real newlines would make the payload invalid.
    curl -s -X POST "$WEBHOOK_URL" \
        -H "Content-Type: application/json" \
        -d "{\"text\": \"PKI Alert: Certificates\n\n$(printf '%s' "$results" | sed 's/|/ | /g')\"}"
fi
</code>
<code bash>
# Cron: Daily at 08:00 (writing to /etc/cron.d requires root)
echo "0 8 * * * root /usr/local/bin/cert-check-notify.sh" | sudo tee /etc/cron.d/cert-check
</code>
----
===== Option 4: PowerShell (Windows) =====
<code powershell>
# Check-CertificateExpiry.ps1
param(
    [int]$WarnDays = 30,
    [int]$CritDays = 7,
    [string]$SmtpServer = "smtp.example.com",
    [string]$MailTo = "pki-team@example.com"
)

$results = @()

# Local certificates
Get-ChildItem Cert:\LocalMachine\My | ForEach-Object {
    $daysLeft = ($_.NotAfter - (Get-Date)).Days
    $severity = if ($daysLeft -lt 0) { "EXPIRED" }
                elseif ($daysLeft -lt $CritDays) { "CRITICAL" }
                elseif ($daysLeft -lt $WarnDays) { "WARNING" }
                else { $null }

    if ($severity) {
        $results += [PSCustomObject]@{
            Severity   = $severity
            Subject    = $_.Subject
            Thumbprint = $_.Thumbprint
            DaysLeft   = $daysLeft
            NotAfter   = $_.NotAfter
        }
    }
}

# Remote TLS Endpoints
$endpoints = @(
    "api.example.com:443",
    "web.example.com:443"
)

foreach ($endpoint in $endpoints) {
    try {
        # $host is a read-only automatic variable, so use a different name
        $hostName, $port = $endpoint -split ':'
        $tcp = New-Object System.Net.Sockets.TcpClient($hostName, [int]$port)
        $ssl = New-Object System.Net.Security.SslStream($tcp.GetStream())
        # Throws if certificate validation fails (including an already expired
        # certificate); that case is reported as ERROR by the catch block
        $ssl.AuthenticateAsClient($hostName)
        # Wrap in X509Certificate2 to get the NotAfter property
        $cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($ssl.RemoteCertificate)
        $ssl.Dispose(); $tcp.Dispose()

        $daysLeft = ($cert.NotAfter - (Get-Date)).Days
        # ... analogous to above
    }
    catch {
        $results += [PSCustomObject]@{
            Severity   = "ERROR"
            Subject    = $endpoint
            Thumbprint = "N/A"
            DaysLeft   = -1
            NotAfter   = "Connection failed: $_"
        }
    }
}

# Send report
if ($results.Count -gt 0) {
    $body = $results | Format-Table -AutoSize | Out-String

    Send-MailMessage -To $MailTo -From "pki@example.com" `
        -Subject "PKI Alert: $($results.Count) certificate(s) require attention" `
        -Body $body `
        -SmtpServer $SmtpServer
}

# Output for logging
$results | Format-Table
</code>
----
===== Kubernetes: cert-manager Metrics =====
<code yaml>
# ServiceMonitor for cert-manager
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cert-manager
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: cert-manager
  namespaceSelector:
    matchNames:
      - cert-manager
  endpoints:
    - port: tcp-prometheus-servicemonitor
      interval: 60s
</code>
**Important metrics:**
| Metric | Description |
|--------|-------------|
| ''certmanager_certificate_expiration_timestamp_seconds'' | Expiry timestamp |
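| ''certmanager_certificate_ready_status'' | Ready condition; the series labelled ''condition="True"'' is 1 when the certificate is ready |
| ''certmanager_certificate_renewal_timestamp_seconds'' | Time at which cert-manager will begin renewal |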
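
These metrics can drive alerts in the same way as the exporter metrics. A sketch in plain Prometheus rule-file format, with assumed alert names and thresholds; the 14-day threshold in particular is an assumption and should sit below your renew-before window, so that the alert only fires when automatic renewal has not happened. With the Prometheus Operator, the same group would go under ''spec.groups'' of a ''PrometheusRule'' resource.

<code yaml>
groups:
  - name: cert-manager-alerts
    rules:
      - alert: CertManagerCertificateExpiringSoon
        # Assumed threshold: 14 days, i.e. renewal should already have succeeded
        expr: certmanager_certificate_expiration_timestamp_seconds - time() < 14 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          # Depending on honorLabels, the certificate's namespace may be
          # exposed as exported_namespace rather than namespace
          summary: "cert-manager certificate {{ $labels.name }} expires in < 14 days"
          description: "Time remaining: {{ $value | humanizeDuration }}"

      - alert: CertManagerCertificateNotReady
        # The condition="True" series drops to 0 when the certificate is not ready
        expr: certmanager_certificate_ready_status{condition="True"} == 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "cert-manager certificate {{ $labels.name }} is not ready"
</code>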
----
===== Best Practices =====
| Recommendation | Rationale |
|----------------|-----------|
| Alerts at **30/14/7/1 days** | Notifications escalate as expiry approaches |
| **CA certificates separately** | CAs need longer advance warning (1 year) |
| **Remote + local** checking | Each catches different failures, e.g. a renewed file that the server has not reloaded |
| **Deduplication** | One certificate should not trigger 100 alerts |
| **Runbook link** in alert | Gives the responder direct instructions |
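
Escalation and deduplication are mostly a matter of Alertmanager routing. A minimal sketch, assuming the ''severity'' labels used in the alert rules of this guide and the example addresses from the scripts; the receiver name, SMTP port, and all intervals are assumptions to adapt:

<code yaml>
# /etc/alertmanager/alertmanager.yml (illustrative)
global:
  smtp_smarthost: 'smtp.example.com:25'   # port is an assumption
  smtp_from: 'pki@example.com'

route:
  receiver: pki-team
  group_by: ['alertname', 'severity']     # one notification per alert type
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 24h                    # warnings: remind once a day
  routes:
    - matchers:
        - severity="critical"
      receiver: pki-team
      repeat_interval: 4h                 # critical: remind more often

receivers:
  - name: pki-team
    email_configs:
      - to: 'pki-team@example.com'

# A certificate below 7 days also matches the 30-day rule; suppress the
# warning while the critical alert for the same certificate is firing.
inhibit_rules:
  - source_matchers:
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal: ['filepath', 'instance']
</code>

The runbook link itself belongs in the alert rule as an annotation, commonly named ''runbook_url'', so that it is carried into every notification.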
----
===== Checklist =====
| # | Checkpoint | Done |
|---|------------|------|
| 1 | Exporter installed | |
| 2 | Prometheus scrape configured | |
| 3 | Alert rules defined | |
| 4 | Alertmanager routing | |
| 5 | Grafana dashboard | |
| 6 | Test alert sent | |
----
===== Related Documentation =====
* [[.:alerting-setup|Alerting Setup]] - Configure notifications
* [[..:tagesgeschaeft:health-check|Health Check]] - Manual verification
* [[..:automatisierung:scheduled-renewal|Scheduled Renewal]] - Auto-renewal
----
//Wolfgang van der Stille @ EMSR DATA d.o.o. - Post-Quantum Cryptography Professional//
{{tag>monitoring prometheus grafana expiry operator}}