Monitoring

Audience: DevOps, SRE
Contents: metrics, dashboards, alerting
Tools: Prometheus, Grafana, Alertmanager

Monitor the Data Gateway to detect failures proactively, before users notice them.

Workflow

flowchart LR
    subgraph GATEWAY["🌐 DATA GATEWAY"]
        G1["/metrics endpoint"]
        G2["/health endpoint"]
    end
    subgraph COLLECT["📊 COLLECTION"]
        P[Prometheus]
    end
    subgraph VISUAL["📈 VISUALIZATION"]
        GR[Grafana]
    end
    subgraph ALERT["🚨 ALERTING"]
        AM[Alertmanager]
        E[Email/Slack]
    end
    G1 --> P
    G2 --> P
    P --> GR
    P --> AM
    AM --> E
    style G1 fill:#e3f2fd
    style P fill:#fff3e0
    style GR fill:#e8f5e9
    style AM fill:#ffebee

Runbooks

Runbook	Description	Duration
Prometheus	Collect metrics, scrape config	~15 min
Grafana Dashboard	Visualization, prebuilt dashboards	~20 min
Alerting	Thresholds, notifications	~15 min

Key metrics

Metric	Description	Threshold
`http_requests_total`	Number of HTTP requests	-
`http_request_duration_seconds`	Response time	< 1 s
`http_requests_in_progress`	Active requests	< 100
`dotnet_gc_memory_total_available_bytes`	Available memory	> 100 MB
`process_cpu_seconds_total`	CPU time consumed (usage is the rate of this counter over time)	< 80%
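
The thresholds above can be checked by hand against a raw scrape. The following is a minimal Python sketch, not part of the Data Gateway or of Prometheus: `parse_samples`, `violations`, and `cpu_usage` are hypothetical helper names. It assumes the scrape text has no timestamps, sums series that differ only in labels, reads "100 MB" as 100 MiB, and reads "80%" as a fraction of one CPU core.

```python
import operator

# Thresholds from the table; a sample is healthy when check(value, limit) is true.
# Assumption: "100 MB" is interpreted as 100 MiB.
CHECKS = {
    "http_requests_in_progress": (operator.lt, 100),
    "dotnet_gc_memory_total_available_bytes": (operator.gt, 100 * 1024 ** 2),
}


def parse_samples(text):
    """Parse Prometheus text exposition lines into {metric_name: value}.

    Assumption: lines carry no trailing timestamp. Series that differ only
    in their labels are summed under the bare metric name.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        name_part, _, value_part = line.rpartition(" ")
        name = name_part.split("{", 1)[0]  # drop the label set, if any
        samples[name] = samples.get(name, 0.0) + float(value_part)
    return samples


def violations(samples):
    """Return the sorted names of metrics that break their threshold."""
    return sorted(
        name
        for name, (check, limit) in CHECKS.items()
        if name in samples and not check(samples[name], limit)
    )


def cpu_usage(prev_seconds, curr_seconds, interval_seconds):
    """CPU usage as a fraction of one core, from two counter readings.

    process_cpu_seconds_total is a counter, so usage is its increase
    divided by the elapsed wall-clock time between the two scrapes.
    """
    return (curr_seconds - prev_seconds) / interval_seconds
```

Feeding the output of the `/metrics` endpoint into `parse_samples` and then `violations` gives a quick pass/fail view. In production, the same checks would normally live in Prometheus alerting rules rather than in a script.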

Quick test

# Health check
curl http://localhost:5000/health

# Metrics (if enabled)
curl http://localhost:5000/metrics
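
For scripted checks, the two curl calls can be approximated in Python using only the standard library. This is a sketch under assumptions: `probe` and `quick_test` are hypothetical names, the base URL is the one used in the curl commands, and the meaning of each status code depends on how the gateway is configured (for example, a 404 on `/metrics` may simply mean metrics are not enabled).

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local service.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Return the HTTP status code, or None if the endpoint is unreachable."""
    try:
        with _OPENER.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout (URLError is an OSError).
        return None


def quick_test(base_url="http://localhost:5000"):
    """Probe both endpoints and return {path: status_or_None}."""
    return {path: probe(base_url + path) for path in ("/health", "/metrics")}


if __name__ == "__main__":
    for path, status in quick_test().items():
        print(f"{path}: {status if status is not None else 'unreachable'}")
```

A `None` result means the gateway did not answer at all, which is worth distinguishing from an error status returned by a running service.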
