Runbook: Prometheus

Duration: ~15 minutes
Role: DevOps, SRE
Prerequisites: Prometheus server, Gateway running

Collect metrics from the Data Gateway with Prometheus.


Workflow

flowchart TD
    A[Start] --> B[Enable metrics]
    B --> C[Prometheus config]
    C --> D[Add scrape job]
    D --> E[Reload Prometheus]
    E --> F[Check targets]
    F --> G{Up?}
    G -->|Yes| H[Done]
    G -->|No| I[Check firewall/endpoint]
    style H fill:#e8f5e9
    style I fill:#ffebee


1. Enable metrics in the Gateway

appsettings.json:

{
  "Metrics": {
    "Enabled": true,
    "Endpoint": "/metrics"
  }
}

Or via NuGet (if metrics are not built in):

# prometheus-net.AspNetCore
dotnet add package prometheus-net.AspNetCore

Program.cs:

using Prometheus; // at the top of Program.cs

// Metrics middleware
app.UseHttpMetrics();
app.MapMetrics(); // exposes the /metrics endpoint

2. Test the metrics endpoint

curl http://localhost:5000/metrics
 
# Expected output (Prometheus text format):
# HELP http_requests_total Total HTTP requests
# TYPE http_requests_total counter
# http_requests_total{method="GET",endpoint="/api/v1/dsn/demo/tables",status="200"} 42
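
The same check can be scripted. The following is a minimal Python sketch, not part of the Gateway or of Prometheus: the names parse_metrics and fetch_metrics are hypothetical, and the parser is deliberately simplified (it does not unescape label values and ignores timestamps). It assumes the endpoint URL used above.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return (name, labels, value) tuples from Prometheus exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:5000/metrics", timeout=5):
    """Download the endpoint (URL is an assumption) and parse its samples."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling fetch_metrics() against a running Gateway should return a non-empty list; an empty list suggests the endpoint responds but exposes no samples.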

3. Prometheus configuration

/etc/prometheus/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  # Data Gateway
  - job_name: 'data-gateway'
    static_configs:
      - targets: ['gateway.example.com:5000']
    metrics_path: /metrics
    scheme: http  # or https
 
  # Multiple instances
  - job_name: 'data-gateway-cluster'
    static_configs:
      - targets:
          - 'gateway-1.example.com:5000'
          - 'gateway-2.example.com:5000'
          - 'gateway-3.example.com:5000'
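
A target entry is an address (host and port); the scheme and path are set separately through scheme and metrics_path. A frequent mistake is pasting a full URL into targets. The Python sketch below flags that before the config is reloaded. The function name target_problems is hypothetical, and treating a missing port as a problem is a convention of this runbook (the Gateway listens on 5000), not a Prometheus rule.

```python
def target_problems(target):
    """Return a list of problems found in one static_configs target string."""
    problems = []
    rest = target
    if "://" in rest:
        problems.append("scheme belongs in 'scheme:', not in the target")
        rest = rest.split("://", 1)[1]
    if "/" in rest:
        problems.append("path belongs in 'metrics_path:', not in the target")
        rest = rest.split("/", 1)[0]
    host, sep, port = rest.rpartition(":")
    if not sep or not host or not port.isdigit():
        # Runbook convention: always spell out the port explicitly.
        problems.append("expected the form host:port")
    return problems


def check_targets(targets):
    """Map each problematic target to its list of problems."""
    return {t: p for t in targets if (p := target_problems(t))}
```

For the targets in the config above, check_targets returns an empty dictionary.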

4. Reload Prometheus

# Reload the config without a restart (requires Prometheus to run with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload
 
# Or restart the service
sudo systemctl restart prometheus
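
For automation, the reload call can be issued from Python as well. This is a small sketch using only the standard library; the function names and the default base URL are assumptions taken from the curl example above.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for the Prometheus reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_prometheus(base_url="http://localhost:9090", timeout=10):
    """Trigger a config reload; urlopen raises HTTPError on a non-2xx reply."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

An HTTPError from reload_prometheus() usually means either the lifecycle API is disabled or the new configuration failed to load; the Prometheus log shows which.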

5. Check targets

Web UI: http://prometheus:9090/targets

Or via the API:

curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'

Expected output:

{
  "job": "data-gateway",
  "health": "up"
}
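
When several instances are scraped, it helps to list only the targets that are not up, together with the scrape error. The sketch below filters a decoded /api/v1/targets response in Python; the function name unhealthy_targets is hypothetical, while the fields it reads (activeTargets, labels, health, lastError) are those returned by the targets API.

```python
import json
import urllib.request


def unhealthy_targets(payload, job="data-gateway"):
    """Return instance, health and last error for targets of `job` not 'up'."""
    return [
        {
            "instance": target["labels"].get("instance"),
            "health": target["health"],
            "lastError": target.get("lastError", ""),
        }
        for target in payload["data"]["activeTargets"]
        if target["labels"].get("job") == job and target["health"] != "up"
    ]


def fetch_targets(base_url="http://localhost:9090", timeout=10):
    """Fetch and decode the targets API response (base URL is an assumption)."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

unhealthy_targets(fetch_targets()) returns an empty list when every Data Gateway target is healthy.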

6. Useful queries

PromQL examples:

# Request rate (per second)
rate(http_requests_total{job="data-gateway"}[5m])

# Average response time
rate(http_request_duration_seconds_sum{job="data-gateway"}[5m])
/
rate(http_request_duration_seconds_count{job="data-gateway"}[5m])

# Error rate (share of 5xx responses)
sum(rate(http_requests_total{job="data-gateway",status=~"5.."}[5m]))
/
sum(rate(http_requests_total{job="data-gateway"}[5m]))

# Memory usage
process_resident_memory_bytes{job="data-gateway"}

# Requests currently in progress
http_requests_in_progress{job="data-gateway"}
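
To make the arithmetic behind these queries concrete, the Python sketch below recomputes the request rate, average response time and error ratio from two counter readings taken 300 seconds apart. The numbers are invented for illustration, and the helper is a simplification: the real rate() also handles counter resets and extrapolates to the window boundaries.

```python
def per_second_rate(first, last):
    """Average per-second increase between two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


# Illustrative counter readings at t=0s and t=300s (a 5m window).
requests_total = ((0, 1000), (300, 1600))      # all requests
requests_5xx = ((0, 10), (300, 16))            # requests with status 5xx
duration_sum = ((0, 50.0), (300, 80.0))        # seconds spent in requests

request_rate = per_second_rate(*requests_total)                 # requests per second
avg_response_time = per_second_rate(*duration_sum) / request_rate  # seconds per request
error_ratio = per_second_rate(*requests_5xx) / request_rate     # fraction of 5xx

print(f"rate={request_rate:.2f}/s avg={avg_response_time:.3f}s errors={error_ratio:.2%}")
```

With these readings, 600 requests in 300 seconds give 2 requests per second, 30 seconds of total duration give 0.05 seconds per request, and 6 errors out of 600 requests give an error ratio of 1%.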

7. Checklist

#  Check                        Yes/No
-  ---------------------------  ------
1  Metrics endpoint enabled     -
2  /metrics reachable           -
3  Prometheus config updated    -
4  Prometheus reloaded          -
5  Target "up" in Prometheus    -
6  Metrics visible in Grafana   -

Troubleshooting

Problem             Cause                   Solution
------------------  ----------------------  ----------------------
Target "down"       Endpoint not reachable  Check firewall and URL
connection refused  Gateway not running     Start the Gateway
404 Not Found       Metrics not enabled     Check appsettings.json
No metrics          Wrong path              Check metrics_path
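
The table can be turned into a quick probe. The Python sketch below requests the metrics endpoint and maps the outcome to the rows above. The function names and the default URL are assumptions, and the mapping is a rough first guess rather than a definitive diagnosis.

```python
import urllib.error
import urllib.request


def has_samples(body):
    """True if the body contains at least one non-comment line."""
    return any(line.strip() and not line.startswith("#") for line in body.splitlines())


def classify(status=None, body="", error=None):
    """Map a probe outcome to the matching troubleshooting hint."""
    if error is not None:
        if isinstance(error, ConnectionRefusedError):
            return "connection refused: start the Gateway"
        return "endpoint not reachable: check firewall and URL"
    if status == 404:
        return "404 Not Found: check the Metrics section in appsettings.json"
    if status == 200 and not has_samples(body):
        return "no metrics: check metrics_path"
    if status == 200:
        return "ok"
    return f"unexpected HTTP status {status}"


def diagnose(url="http://localhost:5000/metrics", timeout=5):
    """Probe the endpoint once and return a hint string."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return classify(status=response.status, body=body)
    except urllib.error.HTTPError as error:   # non-2xx HTTP reply
        return classify(status=error.code)
    except OSError as error:                  # URLError, timeouts, refused
        return classify(error=getattr(error, "reason", error))
```

Run diagnose() from the Prometheus host rather than from the Gateway host, so that firewall rules between the two are part of the test.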

Kubernetes ServiceMonitor

For the Prometheus Operator:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: data-gateway
  namespace: monitoring
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: data-gateway
  namespaceSelector:
    matchNames:
      - data-gateway
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
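
A ServiceMonitor only produces targets when three things line up: the Service carries the labels in selector.matchLabels, it lives in one of the namespaces listed in namespaceSelector.matchNames, and it has a port whose name equals endpoints[].port. The Python sketch below checks these conditions on plain dictionaries. The function name and the example Service are hypothetical, and only the matchLabels and matchNames forms used above are handled.

```python
def mismatch_reasons(monitor_spec, service):
    """Return reasons why `service` would not be scraped by the ServiceMonitor."""
    reasons = []
    labels = service["metadata"].get("labels", {})
    for key, value in monitor_spec["selector"]["matchLabels"].items():
        if labels.get(key) != value:
            reasons.append(f"label {key}={value} missing on Service")
    namespaces = monitor_spec["namespaceSelector"]["matchNames"]
    if service["metadata"]["namespace"] not in namespaces:
        reasons.append(f"namespace not in {namespaces}")
    port_names = {port.get("name") for port in service["spec"]["ports"]}
    for endpoint in monitor_spec["endpoints"]:
        if endpoint["port"] not in port_names:
            reasons.append(f"no Service port named {endpoint['port']!r}")
    return reasons


# spec section of the ServiceMonitor above, as a dictionary
monitor_spec = {
    "selector": {"matchLabels": {"app": "data-gateway"}},
    "namespaceSelector": {"matchNames": ["data-gateway"]},
    "endpoints": [{"port": "http", "path": "/metrics", "interval": "15s"}],
}

# Hypothetical Service for the Gateway
service = {
    "metadata": {"namespace": "data-gateway", "labels": {"app": "data-gateway"}},
    "spec": {"ports": [{"name": "http", "port": 5000}]},
}

print(mismatch_reasons(monitor_spec, service) or "Service matches")
```

Note that endpoints[].port refers to the port name on the Service, not its number, so an unnamed Service port is a common reason for a ServiceMonitor that selects nothing.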

Related runbooks

Previous: Monitoring | Next: Grafana Dashboard


Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional

Last modified: 29.01.2026 at 23:39