====== Runbook: Grafana Dashboard ======

**Duration:** ~20 minutes \\
**Role:** DevOps, SRE \\
**Prerequisites:** Grafana, Prometheus as data source

Visualize Gateway metrics in Grafana.

----

===== Workflow =====

<mermaid>
flowchart TD
    A[Start] --> B[Add data source]
    B --> C[Import dashboard]
    C --> D[Adjust panels]
    D --> E[Configure variables]
    E --> F[Save dashboard]
    F --> G[Done]
    style G fill:#e8f5e9
</mermaid>

----

===== 1. Prometheus Data Source =====

**Grafana UI:** Configuration -> Data Sources -> Add data source

<code>
Name: Prometheus
Type: Prometheus
URL: http://prometheus:9090
Access: Server (default)
</code>

Or via provisioning:

<code yaml>
# /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
</code>

----

===== 2. Dashboard JSON =====

**Import the dashboard:** Create -> Import -> Paste JSON

<code json>
{
  "title": "Data Gateway",
  "uid": "data-gateway",
  "timezone": "browser",
  "panels": [
    {
      "title": "Request rate",
      "type": "timeseries",
      "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
      "targets": [{
        "expr": "sum(rate(http_requests_total{job=\"data-gateway\"}[5m])) by (endpoint)",
        "legendFormat": "{{endpoint}}"
      }]
    },
    {
      "title": "Response time (p95)",
      "type": "timeseries",
      "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
      "targets": [{
        "expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{job=\"data-gateway\"}[5m])) by (le))",
        "legendFormat": "p95"
      }]
    },
    {
      "title": "Error rate",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 0, "y": 8},
      "targets": [{
        "expr": "sum(rate(http_requests_total{job=\"data-gateway\",status=~\"5..\"}[5m])) / sum(rate(http_requests_total{job=\"data-gateway\"}[5m])) * 100",
        "legendFormat": "Errors %"
      }],
      "fieldConfig": {
        "defaults": {
          "unit": "percent",
          "thresholds": {
            "steps": [
              {"color": "green", "value": null},
              {"color": "yellow", "value": 1},
              {"color": "red", "value": 5}
            ]
          }
        }
      }
    },
    {
      "title": "Memory usage",
      "type": "gauge",
      "gridPos": {"h": 4, "w": 6, "x": 6, "y": 8},
      "targets": [{
        "expr": "process_resident_memory_bytes{job=\"data-gateway\"} / 1024 / 1024",
        "legendFormat": "Memory MB"
      }],
      "fieldConfig": {
        "defaults": {
          "unit": "decmbytes",
          "max": 512,
          "thresholds": {
            "steps": [
              {"color": "green", "value": null},
              {"color": "yellow", "value": 300},
              {"color": "red", "value": 450}
            ]
          }
        }
      }
    },
    {
      "title": "Active requests",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 12, "y": 8},
      "targets": [{
        "expr": "http_requests_in_progress{job=\"data-gateway\"}",
        "legendFormat": "Active"
      }]
    },
    {
      "title": "Uptime",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 18, "y": 8},
      "targets": [{
        "expr": "time() - process_start_time_seconds{job=\"data-gateway\"}",
        "legendFormat": "Uptime"
      }],
      "fieldConfig": {
        "defaults": {"unit": "s"}
      }
    }
  ],
  "templating": {
    "list": [{
      "name": "instance",
      "type": "query",
      "query": "label_values(http_requests_total{job=\"data-gateway\"}, instance)",
      "multi": true,
      "includeAll": true
    }]
  },
  "refresh": "10s"
}
</code>

----

===== 3. Key Panels =====

| Panel | Query | Purpose |
|-------|-------|-------|
| Request rate | ''sum(rate(http_requests_total[5m]))'' | Throughput |
| Response time | ''histogram_quantile(0.95, ...)'' | Latency |
| Error rate | ''...status=~"5.."... * 100'' | Share of failed requests |
| Memory | ''process_resident_memory_bytes'' | RAM usage |
| CPU | ''rate(process_cpu_seconds_total[5m])'' | CPU load |
| Active requests | ''http_requests_in_progress'' | Concurrency |

----

===== 4. Dashboard Variables =====

For multi-instance setups:

<code>
Name: instance
Type: Query
Query: label_values(http_requests_total{job="data-gateway"}, instance)
Multi-value: enabled
Include All: enabled
</code>

Then add the matcher to each panel query so the selection takes effect: ''http_requests_total{instance=~"$instance"}''

----

===== 5. Checklist =====

| # | Check | Done |
|---|-----------|---|
| 1 | Prometheus data source configured | |
| 2 | Dashboard imported | |
| 3 | Metrics are displayed | |
| 4 | Variables work | |
| 5 | Dashboard saved | |

----

===== Troubleshooting =====

| Problem | Cause | Solution |
|---------|---------|--------|
| ''No data'' | Wrong job name | Check ''job="data-gateway"'' |
| ''Datasource error'' | Prometheus is unreachable | Check the URL |
| Empty graphs | No traffic | Send requests through the Gateway |
| Wrong values | Incorrect query | Check the PromQL syntax |

----

===== Dashboard Export =====

<code bash>
# Export the dashboard as JSON
curl -s -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "http://grafana:3000/api/dashboards/uid/data-gateway" | jq '.dashboard' > dashboard.json

# Import the dashboard. The API expects the model under a "dashboard" key;
# id is set to null so Grafana matches on uid, and overwrite replaces an existing copy.
jq '{dashboard: (. + {id: null}), overwrite: true}' dashboard.json |
  curl -X POST -H "Content-Type: application/json" \
    -H "Authorization: Bearer $GRAFANA_TOKEN" \
    -d @- \
    "http://grafana:3000/api/dashboards/db"
</code>

----

===== Related Runbooks =====

  * [[.:prometheus|Prometheus]] - Data source
  * [[.:alerting|Alerting]] - Notifications
  * [[..:tagesgeschaeft:health-check|Health Check]] - Manual check

----

<< [[.:prometheus|<- Prometheus]] | [[.:alerting|-> Alerting]] >>

----

//Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional//

{{tag>operator runbook grafana dashboard vizualizacija}}
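
**Note on re-importing an exported dashboard:** the export command in this runbook saves only the bare dashboard model, while ''POST /api/dashboards/db'' expects that model wrapped in a ''dashboard'' key. The following is a minimal sketch of the wrapping step, not a definitive implementation. The function name ''build_import_payload'' and the sample values are hypothetical and chosen for illustration only.

```python
import copy
import json


def build_import_payload(dashboard: dict, overwrite: bool = True) -> dict:
    """Wrap an exported dashboard model for POST /api/dashboards/db.

    Hypothetical helper for illustration; not part of Grafana or this runbook.
    """
    model = copy.deepcopy(dashboard)  # leave the caller's dict untouched
    # The numeric id is specific to the source instance. Setting it to None
    # (JSON null) lets the target Grafana match the dashboard by uid instead.
    model["id"] = None
    return {"dashboard": model, "overwrite": overwrite}


if __name__ == "__main__":
    # Sample values only; a real model would be loaded from dashboard.json.
    exported = {"id": 42, "uid": "data-gateway", "title": "Data Gateway", "panels": []}
    print(json.dumps(build_import_payload(exported), indent=2))
```

The printed JSON can be saved to a file and sent with ''curl -d @file''. Setting ''overwrite'' to true replaces an existing dashboard with the same uid, so it should be disabled where an accidental overwrite would be a problem.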