====== Runbook: Grafana Dashboard ======
**Duration:** ~20 minutes \\
**Role:** DevOps, SRE \\
**Prerequisite:** Grafana, with Prometheus configured as a data source
Visualize Gateway metrics in Grafana.
----
===== Workflow =====
flowchart TD
A[Start] --> B[Add data source]
B --> C[Import dashboard]
C --> D[Adjust panels]
D --> E[Configure variables]
E --> F[Save dashboard]
F --> G[Done]
style G fill:#e8f5e9
----
===== 1. Prometheus Datasource =====
**Grafana UI:** Configuration -> Data Sources -> Add data source
Name: Prometheus
Type: Prometheus
URL: http://prometheus:9090
Access: Server (default)
Or via provisioning:
# /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
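Where neither the UI nor file provisioning is convenient, the same data source can also be registered through Grafana's HTTP API (''POST /api/datasources''). The following is a minimal sketch, not a definitive implementation: the base URL ''http://grafana:3000'', the token value, and the helper name ''build_datasource_request'' are assumptions. It only builds the request, so nothing is sent until the caller passes it to ''urllib.request.urlopen''.

```python
import json
import urllib.request


def build_datasource_request(grafana_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that registers Prometheus as a data source."""
    payload = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": "http://prometheus:9090",
        "access": "proxy",  # shown as "Server" in the UI
        "isDefault": True,
    }
    return urllib.request.Request(
        url=f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Assumed base URL and placeholder token; replace with real values.
    req = build_datasource_request("http://grafana:3000", "REPLACE_ME")
    print(req.get_method(), req.full_url)
```

File provisioning remains the better fit when the configuration should live in version control; the API route is mainly useful for ad-hoc setups.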
----
===== 2. Dashboard JSON =====
**Import the dashboard:** Create -> Import -> Paste JSON
{
  "title": "Data Gateway",
  "uid": "data-gateway",
  "timezone": "browser",
  "panels": [
    {
      "title": "Request Rate",
      "type": "timeseries",
      "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
      "targets": [{
        "expr": "sum(rate(http_requests_total{job=\"data-gateway\"}[5m])) by (endpoint)",
        "legendFormat": "{{endpoint}}"
      }]
    },
    {
      "title": "Response Time (p95)",
      "type": "timeseries",
      "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
      "targets": [{
        "expr": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{job=\"data-gateway\"}[5m])) by (le))",
        "legendFormat": "p95"
      }]
    },
    {
      "title": "Error Rate",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 0, "y": 8},
      "targets": [{
        "expr": "sum(rate(http_requests_total{job=\"data-gateway\",status=~\"5..\"}[5m])) / sum(rate(http_requests_total{job=\"data-gateway\"}[5m])) * 100",
        "legendFormat": "Error %"
      }],
      "fieldConfig": {
        "defaults": {
          "unit": "percent",
          "thresholds": {
            "steps": [
              {"color": "green", "value": null},
              {"color": "yellow", "value": 1},
              {"color": "red", "value": 5}
            ]
          }
        }
      }
    },
    {
      "title": "Memory Usage",
      "type": "gauge",
      "gridPos": {"h": 4, "w": 6, "x": 6, "y": 8},
      "targets": [{
        "expr": "process_resident_memory_bytes{job=\"data-gateway\"} / 1024 / 1024",
        "legendFormat": "Memory MB"
      }],
      "fieldConfig": {
        "defaults": {
          "unit": "mbytes",
          "max": 512,
          "thresholds": {
            "steps": [
              {"color": "green", "value": null},
              {"color": "yellow", "value": 300},
              {"color": "red", "value": 450}
            ]
          }
        }
      }
    },
    {
      "title": "Active Requests",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 12, "y": 8},
      "targets": [{
        "expr": "http_requests_in_progress{job=\"data-gateway\"}",
        "legendFormat": "Active"
      }]
    },
    {
      "title": "Uptime",
      "type": "stat",
      "gridPos": {"h": 4, "w": 6, "x": 18, "y": 8},
      "targets": [{
        "expr": "time() - process_start_time_seconds{job=\"data-gateway\"}",
        "legendFormat": "Uptime"
      }],
      "fieldConfig": {
        "defaults": {"unit": "s"}
      }
    }
  ],
  "templating": {
    "list": [{
      "name": "instance",
      "type": "query",
      "query": "label_values(http_requests_total{job=\"data-gateway\"}, instance)",
      "multi": true,
      "includeAll": true
    }]
  },
  "refresh": "10s"
}
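Before pasting an edited dashboard JSON into Grafana, it can help to sanity-check the panel layout. The sketch below is one possible check, assuming the 24-column grid Grafana uses for ''gridPos''; the function name ''check_panels'' and the wording of the messages are hypothetical. It reports panels that run past the right edge, panels without a query expression, and panels whose rectangles overlap.

```python
GRID_COLUMNS = 24  # Grafana lays panels out on a 24-column grid


def _overlaps(a: dict, b: dict) -> bool:
    """Two gridPos rectangles overlap when they intersect on both axes."""
    x_overlap = a["x"] < b["x"] + b["w"] and b["x"] < a["x"] + a["w"]
    y_overlap = a["y"] < b["y"] + b["h"] and b["y"] < a["y"] + a["h"]
    return x_overlap and y_overlap


def check_panels(dashboard: dict) -> list:
    """Return a list of human-readable layout problems (empty if none found)."""
    problems = []
    panels = dashboard.get("panels", [])
    for panel in panels:
        pos = panel["gridPos"]
        if pos["x"] + pos["w"] > GRID_COLUMNS:
            problems.append(f"{panel['title']} exceeds the {GRID_COLUMNS}-column grid")
        targets = panel.get("targets", [])
        if not targets or not all(t.get("expr") for t in targets):
            problems.append(f"{panel['title']} has no query expression")
    for i, first in enumerate(panels):
        for second in panels[i + 1:]:
            if _overlaps(first["gridPos"], second["gridPos"]):
                problems.append(f"{first['title']} overlaps {second['title']}")
    return problems


if __name__ == "__main__":
    # Two panels taken from the top row of the dashboard above.
    sample = {
        "panels": [
            {"title": "Request Rate", "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
             "targets": [{"expr": "sum(rate(http_requests_total[5m]))"}]},
            {"title": "Response Time (p95)", "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
             "targets": [{"expr": "histogram_quantile(0.95, ...)"}]},
        ]
    }
    print(check_panels(sample) or "layout OK")
```

To check a file instead, load it with ''json.load'' and pass the resulting dictionary to ''check_panels''.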
----
===== 3. Key Panels =====
| Panel | Query | Purpose |
|-------|-------|-------|
| Request Rate | ''sum(rate(http_requests_total[5m]))'' | Throughput |
| Response Time | ''histogram_quantile(0.95, ...)'' | Latency |
| Error Rate | ''...status=~"5.."... * 100'' | Error rate |
| Memory | ''process_resident_memory_bytes'' | RAM usage |
| CPU | ''rate(process_cpu_seconds_total[5m])'' | CPU load |
| Active Requests | ''http_requests_in_progress'' | Concurrency |
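To make the Response Time panel easier to interpret, the following sketch illustrates the linear interpolation behind ''histogram_quantile''. It is a simplified approximation, not Prometheus's actual implementation: the function name, the input format (a sorted list of cumulative ''(upper_bound, count)'' pairs ending with ''+Inf''), and the sample bucket values are assumptions made for illustration.

```python
import math


def histogram_quantile(q: float, buckets: list) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the range (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # position of the requested observation
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Quantile falls into the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical request durations: 50 under 0.1 s, 90 under 0.5 s, all 100 under 1 s.
    sample = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
    print(histogram_quantile(0.95, sample))
```

With the sample buckets, the 95th observation lies halfway through the 0.5 s to 1.0 s bucket, so the estimate is about 0.75 s. The result is only as precise as the bucket boundaries, which is why p95 values often appear to jump between a few fixed levels.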
----
===== 4. Dashboard Variables =====
For multi-instance setups:
Name: instance
Type: Query
Query: label_values(http_requests_total{job="data-gateway"}, instance)
Multi-value: enabled
Include All: enabled
Then reference the variable in panel queries: ''http_requests_total{instance=~"$instance"}''
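The regex matcher ''=~'' is needed because a multi-value variable expands to an alternation of the selected values. The sketch below approximates that expansion for illustration only; ''interpolate_instance'' is a hypothetical helper, the instance names are made up, and the regex escaping that Grafana applies to special characters is deliberately left out.

```python
def interpolate_instance(query: str, selected: list) -> str:
    """Replace $instance with the selected values joined as a regex alternation."""
    if not selected:
        raise ValueError("at least one instance must be selected")
    if len(selected) == 1:
        pattern = selected[0]
    else:
        pattern = "(" + "|".join(selected) + ")"
    return query.replace("$instance", pattern)


if __name__ == "__main__":
    template = 'http_requests_total{instance=~"$instance"}'
    # Hypothetical instance labels.
    print(interpolate_instance(template, ["gw-1:8080", "gw-2:8080"]))
```

With an equality matcher (''instance="$instance"'') the expanded alternation would be compared literally and match nothing as soon as more than one instance is selected.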
----
===== 5. Checklist =====
| # | Check | Yes/No |
|---|-----------|---|
| 1 | Prometheus data source configured | - |
| 2 | Dashboard imported | - |
| 3 | Metrics are displayed | - |
| 4 | Variables work | - |
| 5 | Dashboard saved | - |
----
===== Troubleshooting =====
| Problem | Cause | Solution |
|---------|---------|--------|
| ''No data'' | Wrong job name | Check ''job="data-gateway"'' |
| ''Datasource error'' | Prometheus is unreachable | Check the data source URL |
| Empty graphs | No traffic | Send requests to the Gateway |
| Wrong values | Incorrect query | Check the PromQL syntax |
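For the ''No data'' case, it is often quicker to ask Prometheus directly whether any target carries the expected job label. The sketch below builds an instant-query URL for the Prometheus HTTP API (''/api/v1/query'') and interprets the response of an ''up'' query. The helper names and the sample instance labels are assumptions; fetching the URL (for example with ''urllib.request.urlopen'') is left to the caller.

```python
import urllib.parse


def build_query_url(prometheus_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{query_string}"


def targets_up(response: dict) -> dict:
    """Map each instance label to True/False from an `up` instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return {
        sample["metric"].get("instance", "<unknown>"): sample["value"][1] == "1"
        for sample in response["data"]["result"]
    }


if __name__ == "__main__":
    print(build_query_url("http://prometheus:9090", 'up{job="data-gateway"}'))
    # Hypothetical response body, shaped like the Prometheus instant-query format.
    sample_response = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {"metric": {"job": "data-gateway", "instance": "gw-1:8080"},
                 "value": [1700000000, "1"]},
                {"metric": {"job": "data-gateway", "instance": "gw-2:8080"},
                 "value": [1700000000, "0"]},
            ],
        },
    }
    print(targets_up(sample_response))
```

An empty result means no scrape target uses ''job="data-gateway"'', which points to the job name rather than to the dashboard. A ''False'' entry means the target exists but the last scrape failed.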
----
===== Dashboard Export =====
# Export the dashboard as JSON
curl -s -H "Authorization: Bearer $GRAFANA_TOKEN" \
  "http://grafana:3000/api/dashboards/uid/data-gateway" | jq '.dashboard' > dashboard.json
# Import the dashboard (the API expects the model wrapped in a "dashboard" field)
jq '{dashboard: (.id = null), overwrite: true}' dashboard.json | \
  curl -s -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GRAFANA_TOKEN" \
  -d @- \
  "http://grafana:3000/api/dashboards/db"
----
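When a dashboard exported under Dashboard Export is moved between Grafana instances from a script, the import body has to be assembled first: ''POST /api/dashboards/db'' expects the dashboard model inside a ''dashboard'' field rather than as the top-level object. A minimal sketch, assuming the exported model is already loaded as a dictionary; ''build_import_payload'' is a hypothetical helper name and the sample model is made up.

```python
import copy
import json


def build_import_payload(dashboard: dict, overwrite: bool = True) -> dict:
    """Wrap an exported dashboard model in the body expected by the import API."""
    model = copy.deepcopy(dashboard)  # leave the caller's dictionary untouched
    model["id"] = None  # let the target Grafana assign its own numeric id
    return {"dashboard": model, "overwrite": overwrite}


if __name__ == "__main__":
    # Hypothetical exported model, reduced to a few fields.
    exported = {"id": 42, "uid": "data-gateway", "title": "Data Gateway", "panels": []}
    print(json.dumps(build_import_payload(exported), indent=2))
```

Keeping the ''uid'' and setting ''overwrite'' lets a repeated import update the existing dashboard instead of failing on a name or uid conflict.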
===== Related Runbooks =====
  * [[.:prometheus|Prometheus]] - Data source
  * [[.:alerting|Alerting]] - Notifications
  * [[..:tagesgeschaeft:health-check|Health Check]] - Manual check
----
<< [[.:prometheus|<- Prometheus]] | [[.:alerting|-> Alerting]] >>
----
//Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional//
{{tag>operator runbook grafana dashboard visualisierung}}