====== Runbook: Health Check ======

**Duration:** ~2 minutes \\
**Role:** Gateway Operator \\
**Frequency:** Daily / automated

Verifies that the Gateway is reachable and working correctly.

----

===== Workflow =====

<mermaid>
flowchart TD
    A[Start health check] --> B["/health endpoint"]
    B --> C{Healthy?}
    C -->|Yes| D[API test]
    C -->|No| E[Check logs]
    D --> F{Data returned?}
    F -->|Yes| G[OK - done]
    F -->|No| H[Check DSN]
    E --> I[Restart server]
    H --> I
    style G fill:#e8f5e9
    style E fill:#ffebee
    style H fill:#fff3e0
</mermaid>

----

===== 1. Basic Health Check =====

<code bash>
# Simple health check (status code only)
curl -s -o /dev/null -w "%{http_code}" http://localhost:5000/health
# Expected response: 200

# With response body
curl -s http://localhost:5000/health
# Expected response: "Healthy"
</code>

----

===== 2. Extended Health Check =====

<code bash>
# Is Swagger reachable?
curl -s -o /dev/null -w "%{http_code}" http://localhost:5000/swagger
# Expected response: 200

# Check the API version
curl -s http://localhost:5000/api/v1/info | jq .
</code>

----

===== 3. DSN Connectivity =====

<code bash>
# Iterate over all DSNs
for dsn in demo produktion reporting; do
    echo "Testing $dsn..."
    curl -s -o /dev/null -w "$dsn: %{http_code}\n" \
        "http://localhost:5000/api/v1/dsn/$dsn/tables"
done
</code>

**PowerShell version:**

<code powershell>
$dsns = @("demo", "produktion", "reporting")
foreach ($dsn in $dsns) {
    try {
        $result = Invoke-WebRequest -Uri "http://localhost:5000/api/v1/dsn/$dsn/tables" -UseBasicParsing
        Write-Host "$dsn : $($result.StatusCode)"
    } catch {
        # Invoke-WebRequest throws on connection errors and non-success status codes
        Write-Host "$dsn : FAILED ($($_.Exception.Message))"
    }
}
</code>

----

===== 4. Measure Response Time =====

<code bash>
# Single request
curl -s -o /dev/null -w "Time: %{time_total}s\n" \
    "http://localhost:5000/api/v1/dsn/demo/tables/Products?\$top=10"

# Several runs, then the average
for i in {1..5}; do
    curl -s -o /dev/null -w "%{time_total}\n" \
        "http://localhost:5000/api/v1/dsn/demo/tables/Products?\$top=10"
done | awk '{sum+=$1} END {print "Average: " sum/NR "s"}'
</code>

----

===== 5. Checklist =====

| # | Check | Expected | Yes/No |
|---|-----------|-----------|---|
| 1 | /health | 200 + "Healthy" | - |
| 2 | /swagger | 200 | - |
| 3 | DSN "demo" reachable | 200 | - |
| 4 | Response time | < 1s | - |
| 5 | No errors in the logs | No ERROR entries | - |

----

===== Automated Health Check =====

**Cron (Linux):**

<code bash>
# /etc/cron.d/gateway-health
# Every 5 minutes: restart the service if /health fails or does not answer within 5 s
*/5 * * * * root curl -sf -o /dev/null --max-time 5 http://localhost:5000/health || systemctl restart data-gateway
</code>

**Scheduled Task (Windows):**

<code powershell>
# health-check.ps1
$healthy = $false
try {
    # Invoke-WebRequest throws on connection failures and non-success status codes
    $response = Invoke-WebRequest -Uri "http://localhost:5000/health" -UseBasicParsing -TimeoutSec 5
    $healthy = ($response.StatusCode -eq 200)
} catch { }
if (-not $healthy) {
    Restart-Service -Name "DataGateway" -Force
    # -From and -SmtpServer are placeholder values; adjust them to your mail setup
    Send-MailMessage -To "admin@example.com" -From "gateway@example.com" -SmtpServer "smtp.example.com" -Subject "Gateway Restart" -Body "The gateway was restarted automatically"
}
</code>

----

===== Troubleshooting =====

| Problem | Cause | Solution |
|---------|---------|--------|
| ''Connection refused'' | Gateway is not running | Start the server |
| ''503 Service Unavailable'' | Startup has not finished yet | Wait 30s, then check again |
| ''500 Internal Server Error'' | Configuration error | Check the logs |
| Timeout | Gateway is overloaded | Reduce the load or add resources |

----

===== Thresholds =====

| Metric | Green | Yellow | Red |
|--------|------|------|-----|
| Response time | < 500ms | 500ms-2s | > 2s |
| Error rate | < 1% | 1-5% | > 5% |
| CPU | < 50% | 50-80% | > 80% |
| Memory | < 70% | 70-90% | > 90% |

----

===== Related Runbooks =====

  * [[.:server-starten|Start the server]] - In case of an outage
  * [[.:logs-pruefen|Check logs]] - Error analysis
  * [[..:monitoring:prometheus|Prometheus]] - Automated monitoring

----

<< [[.:dsn-verwalten|<- DSN management]] | [[.:logs-pruefen|-> Check logs]] >>

----

//Wolfgang van der Stille @ EMSR DATA d.o.o. - Data Gateway Professional//

{{tag>operator runbook health check monitoring}}
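The response-time thresholds listed in this runbook (green below 500 ms, yellow from 500 ms to 2 s, red above 2 s) can be applied directly to the ''time_total'' value that curl reports. The following is a minimal sketch only: the function name ''classify_response_time'' is hypothetical and not part of the Gateway, and it assumes the value is given in seconds with a dot as the decimal separator.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Gateway): map a curl time_total
# value in seconds (e.g. 0.734) to the response-time thresholds of this runbook.
#   green:  below 0.5 s
#   yellow: 0.5 s up to and including 2 s
#   red:    above 2 s
classify_response_time() {
    awk -v t="$1" 'BEGIN {
        if (t < 0.5)      print "green"
        else if (t <= 2)  print "yellow"
        else              print "red"
    }'
}

# Example usage against the endpoint used elsewhere in this runbook:
#   t=$(curl -s -o /dev/null -w "%{time_total}" \
#       "http://localhost:5000/api/v1/dsn/demo/tables/Products?\$top=10")
#   echo "${t}s: $(classify_response_time "$t")"
```

The comparison is done in awk because bash arithmetic handles integers only. The usage lines are left as comments so that sourcing the file does not send any request.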