====== Monitoring ======
**Target audience:** DevOps, SRE \\
**Contents:** Metrics, dashboards, alerting \\
**Tools:** Prometheus, Grafana, Alertmanager
Monitoring of the Data Gateway to detect errors proactively.
----
===== Workflow =====
flowchart LR
subgraph GATEWAY["DATA GATEWAY"]
G1["/metrics endpoint"]
G2["/health endpoint"]
end
subgraph COLLECT["COLLECTION"]
P[Prometheus]
end
subgraph VISUAL["VISUALIZATION"]
GR[Grafana]
end
subgraph ALERT["ALERTING"]
AM[Alertmanager]
E["Email/Slack"]
end
G1 --> P
G2 --> P
P --> GR
P --> AM
AM --> E
style G1 fill:#e3f2fd
style P fill:#fff3e0
style GR fill:#e8f5e9
style AM fill:#ffebee
----
===== Runbooks =====
^ Runbook ^ Description ^ Duration ^
| [[.:prometheus|Prometheus]] | Collecting metrics, scrape configuration | ~15 min |
| [[.:grafana-dashboard|Grafana Dashboard]] | Visualization, ready-made dashboards | ~20 min |
| [[.:alerting|Alerting]] | Thresholds, notifications | ~15 min |
----
===== Key metrics =====
^ Metric ^ Description ^ Threshold ^
| ''http_requests_total'' | Number of HTTP requests | - |
| ''http_request_duration_seconds'' | Response time | < 1 s |
| ''http_requests_in_progress'' | Active requests | < 100 |
| ''dotnet_gc_memory_total_available_bytes'' | Available memory | > 100 MB |
| ''process_cpu_seconds_total'' | CPU usage (derived from the rate of this counter) | < 80% |
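
The thresholds above can be checked against a raw ''/metrics'' scrape. The following is a minimal sketch, assuming the endpoint serves the Prometheus text exposition format and that the relevant samples appear without labels. The helper names ''parse_samples'', ''cpu_percent'' and ''check_thresholds'' are hypothetical and not part of the Data Gateway, and reading "100 MB" as 100 MiB is an assumption. Because ''process_cpu_seconds_total'' is a counter, the percentage is derived from two readings taken a known interval apart.

```python
from typing import Dict, List

# Assumption: the "100 MB" threshold is interpreted as 100 MiB.
MEMORY_FLOOR_BYTES = 100 * 1024 * 1024


def parse_samples(text: str) -> Dict[str, float]:
    """Parse unlabelled samples from Prometheus text exposition format.

    Comment lines (# HELP, # TYPE) and labelled samples are skipped
    to keep the sketch short.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return samples


def cpu_percent(prev_seconds: float, curr_seconds: float, interval_s: float) -> float:
    """CPU usage in percent of one core between two counter readings."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (curr_seconds - prev_seconds) / interval_s * 100.0


def check_thresholds(prev: Dict[str, float], curr: Dict[str, float],
                     interval_s: float) -> List[str]:
    """Return the names of metrics that violate the table's thresholds."""
    violations: List[str] = []
    if curr.get("http_requests_in_progress", 0.0) >= 100:
        violations.append("http_requests_in_progress")
    available = curr.get("dotnet_gc_memory_total_available_bytes")
    if available is not None and available <= MEMORY_FLOOR_BYTES:
        violations.append("dotnet_gc_memory_total_available_bytes")
    cpu = "process_cpu_seconds_total"
    if cpu in prev and cpu in curr:
        if cpu_percent(prev[cpu], curr[cpu], interval_s) >= 80.0:
            violations.append(cpu)
    return violations
```

In production these checks belong in Prometheus alerting rules; the sketch is only meant for ad-hoc verification of a single scrape.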
----
===== Quick test =====
# Health check
curl http://localhost:5000/health
# Metrics (if enabled)
curl http://localhost:5000/metrics
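
The same two requests can be scripted, for example as a smoke test after a deployment. This is a minimal sketch, assuming both endpoints answer with HTTP 200 when healthy; ''probe'' and ''quick_test'' are hypothetical helper names, and the fetch function is injectable so the logic can be exercised without a running gateway.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def probe(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def quick_test(base_url: str = "http://localhost:5000",
               fetch: Callable[[str], int] = probe) -> Dict[str, bool]:
    """Check /health and /metrics; True means the endpoint returned 200."""
    base = base_url.rstrip("/")
    return {path: fetch(base + path) == 200 for path in ("/health", "/metrics")}
```

Calling ''quick_test()'' returns a mapping from path to a boolean. A ''False'' for ''/metrics'' is not necessarily an error, since the metrics endpoint is only available if it has been enabled.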
----
===== Related runbooks =====
  * [[..:tagesgeschaeft:health-check|Health Check]] - Manual check
  * [[..:tagesgeschaeft:logs-pruefen|Checking logs]] - Error analysis
  * [[..:sicherheit:start|Security]] - TLS for the metrics endpoint
----
<< [[..:start|<- Operator manual]] | [[.:prometheus|-> Prometheus]] >>
----
//Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional//
{{tag>operator monitoring prometheus grafana}}