====== Monitoring ======

**Audience:** DevOps, SRE \\
**Content:** Metrics, dashboards, alerting \\
**Tools:** Prometheus, Grafana, Alertmanager

Monitoring the Data Gateway to detect errors proactively.

----

===== Workflow =====

<mermaid>
flowchart LR
    subgraph GATEWAY["DATA GATEWAY"]
        G1["/metrics Endpoint"]
        G2["/health Endpoint"]
    end
    subgraph COLLECT["COLLECTION"]
        P[Prometheus]
    end
    subgraph VISUAL["VISUALIZATION"]
        GR[Grafana]
    end
    subgraph ALERT["ALERTING"]
        AM[Alertmanager]
        E[E-Mail/Slack]
    end
    G1 --> P
    G2 --> P
    P --> GR
    P --> AM
    AM --> E
    style G1 fill:#e3f2fd
    style P fill:#fff3e0
    style GR fill:#e8f5e9
    style AM fill:#ffebee
</mermaid>

----

===== Runbooks =====

^ Runbook ^ Description ^ Duration ^
| [[.:prometheus|Prometheus]] | Collecting metrics, scrape config | ~15 min |
| [[.:grafana-dashboard|Grafana Dashboard]] | Visualization, ready-made dashboards | ~20 min |
| [[.:alerting|Alerting]] | Thresholds, notifications | ~15 min |

----

===== Key metrics =====

^ Metric ^ Description ^ Threshold ^
| ''http_requests_total'' | Number of HTTP requests | - |
| ''http_request_duration_seconds'' | Response time | < 1 s |
| ''http_requests_in_progress'' | Requests in progress | < 100 |
| ''dotnet_gc_memory_total_available_bytes'' | Available memory | > 100 MB |
| ''process_cpu_seconds_total'' | CPU usage (rate of this counter) | < 80% |

----

===== Quick test =====

<code bash>
# Health check
curl http://localhost:5000/health

# Metrics (if enabled)
curl http://localhost:5000/metrics
</code>

----

===== Related runbooks =====

  * [[..:tagesgeschaeft:health-check|Health Check]] - manual check
  * [[..:tagesgeschaeft:logs-pruefen|Checking logs]] - error analysis
  * [[..:sicherheit:start|Security]] - TLS for metrics

----

<< [[..:start|<- Operator Handbook]] | [[.:prometheus|-> Prometheus]] >>

----

//Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional//

{{tag>operator monitoring prometheus grafana}}
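
The gauge thresholds from the metrics table above can also be checked by hand against the text returned by ''/metrics''. The following is only a minimal sketch, not part of the Data Gateway: the function names ''parse_samples'' and ''check_thresholds'' are hypothetical, 100 MB is assumed to mean 100 * 1024 * 1024 bytes, and values of a metric that appears with several label sets are simply summed. Counter and histogram metrics such as ''process_cpu_seconds_total'' need a rate over time and are left out.

```python
"""Check gauge thresholds against Prometheus text exposition output (sketch)."""

# Thresholds copied from the metrics table. Interpreting "100 MB" as
# 100 * 1024 * 1024 bytes is an assumption.
THRESHOLDS = {
    "http_requests_in_progress": ("<", 100.0),
    "dotnet_gc_memory_total_available_bytes": (">", 100.0 * 1024 * 1024),
}


def parse_samples(text):
    """Return (name, value) pairs; comments, blank and malformed lines are skipped."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{labels} value [timestamp]
        if "}" in line:
            head, _, rest = line.rpartition("}")
            name = head.split("{", 1)[0]
        else:
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            samples.append((name, float(fields[0])))
        except ValueError:
            continue
    return samples


def check_thresholds(text, thresholds=THRESHOLDS):
    """Return one message per metric whose summed value breaks its threshold."""
    totals = {}
    for name, value in parse_samples(text):
        if name in thresholds:
            # Assumption: series of the same metric are added together.
            totals[name] = totals.get(name, 0.0) + value
    violations = []
    for name, (op, limit) in thresholds.items():
        if name not in totals:
            continue  # metric not exported, nothing to check
        value = totals[name]
        ok = value < limit if op == "<" else value > limit
        if not ok:
            violations.append(f"{name}={value:g} violates {op} {limit:g}")
    return violations


if __name__ == "__main__":
    # Made-up sample data; in practice paste the output of the /metrics request.
    sample = """\
# TYPE http_requests_in_progress gauge
http_requests_in_progress{method="GET"} 120
dotnet_gc_memory_total_available_bytes 5.0e8
"""
    for message in check_thresholds(sample):
        print(message)
```

For continuous monitoring, the same limits belong in alerting rules as described in the Alerting runbook; a script like this is only useful for a one-off spot check.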