Target audience: DevOps, SRE
Content: Metrics, dashboards, alerting
Tools: Prometheus, Grafana, Alertmanager
Monitoring the Data Gateway for proactive error detection.
| Runbook | Description | Duration |
|---|---|---|
| Prometheus | Collect metrics, scrape config | ~15 min |
| Grafana Dashboard | Visualization, pre-built dashboards | ~20 min |
| Alerting | Thresholds, notifications | ~15 min |

| Metric | Description | Threshold |
|---|---|---|
| http_requests_total | Number of HTTP requests | - |
| http_request_duration_seconds | Response time | < 1s |
| http_requests_in_progress | Active requests | < 100 |
| dotnet_gc_memory_total_available_bytes | Available memory | > 100MB |
| process_cpu_seconds_total | CPU usage (derived from the rate of this counter) | < 80% |
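
The two thresholds that apply to instantaneous values can be spot-checked from a single scrape. The following is a minimal sketch, not part of the Data Gateway itself: the function name `check_gateway_thresholds` is hypothetical, 100MB is assumed to mean 100 * 1024 * 1024 bytes, both metrics are assumed to be exposed as gauges, and sample lines are assumed to carry no trailing timestamp. Counters and duration metrics need a rate over time, so they are left to Prometheus.

```shell
#!/bin/sh
# Reads Prometheus text-format metrics on stdin and prints one line per
# threshold violation. Exits non-zero if any violation was found.
# Assumption: sample lines have the form "name{labels} value" (no timestamp).
check_gateway_thresholds() {
  awk '
    # Skip HELP/TYPE comment lines and lines without a value.
    /^#/ || NF < 2 { next }
    {
      # Strip any {label="..."} part so the bare metric name remains.
      name = $1
      sub(/[{].*/, "", name)
      value = $NF + 0
    }
    # Threshold from the table: active requests must stay below 100.
    name == "http_requests_in_progress" && value >= 100 {
      printf "ALERT %s=%s (expected < 100)\n", name, $NF
      bad = 1
    }
    # Threshold from the table: available memory must stay above 100MB.
    name == "dotnet_gc_memory_total_available_bytes" && value <= 100 * 1024 * 1024 {
      printf "ALERT %s=%s (expected > 100MB)\n", name, $NF
      bad = 1
    }
    END { exit bad + 0 }
  '
}
```

With metrics enabled, a manual check could look like `curl -fsS http://localhost:5000/metrics | check_gateway_thresholds`. This is only a quick diagnostic; continuous evaluation belongs in the Alerting runbook.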

    # Health Check
    curl http://localhost:5000/health

    # Metrics (when enabled)
    curl http://localhost:5000/metrics
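
For scripted use, for example after a restart, the health check can be wrapped in a retry loop. This is a sketch under assumptions: the function name `wait_for_gateway`, the retry count, and the delay are illustrative, and the `/health` endpoint is assumed to answer with a successful HTTP status when the gateway is healthy.

```shell
#!/bin/sh
# Polls <base_url>/health until curl succeeds or the attempts run out.
# Arguments (all optional): base URL, number of attempts, delay in seconds.
wait_for_gateway() {
  base_url=${1:-http://localhost:5000}
  attempts=${2:-5}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP error responses (4xx/5xx).
    if curl -fsS --max-time 5 -o /dev/null "$base_url/health"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Wait before the next attempt, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "unhealthy after $attempts attempt(s)" >&2
  return 1
}
```

Calling `wait_for_gateway` without arguments probes `http://localhost:5000/health` up to five times; the exit status can gate later steps in a deployment script.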
Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional