fe8338c2c5
feat(monitoring): add VM Overview Grafana dashboard
12-panel dashboard auto-provisioned via /var/lib/grafana/dashboards:
- 4 stat tiles (disk %, RAM avail, swap used, CPU steal) with
threshold colouring matching vm-health-check.sh
- 4 time-series (disk %, RAM trend, steal, sda write GB/hr) — 7d default
- 2 bargauge top-10 by RAM and CPU (cAdvisor container_memory_working_set_bytes,
container_cpu_usage_seconds_total)
- Load average (1/5/15) + network throughput (RX/TX, host interfaces)
uid: vm-overview. Picked up on next Grafana boot.
Closes Phase 5: "Add Grafana" item from VM observability roadmap.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 21:26:35 +00:00
saravanakumardb1
b8c0a73e89
feat(extraction): Phase 5 observability + error handling (5.7-5.12)
- 5.7: Enhanced structured logging with userId, productId, cacheHit, tokenCount
- 5.8: Metrics module (counters + histograms) + /extract/metrics endpoint
- 5.9: Grafana dashboard config for extraction-service (Loki queries)
- 5.10: Error mapping — sidecar errors → proper HTTP status codes (408, 429, 502, 503)
- 5.11: Circuit breaker for Python sidecar (5 failures → 30s OPEN)
- 5.12: Graceful degradation — circuit open returns 503, cached results still served
- 46 TS tests passing
2026-02-14 14:04:59 -08:00
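
Items 5.11 and 5.12 can be illustrated with a minimal sketch. It is not the extraction-service code: the names `CircuitBreaker`, `CircuitOpenError`, and `extractWithFallback`, the in-memory `Map` cache, and the choice to check the cache before the breaker are all assumptions made for illustration. Only the 5-failure threshold, the 30s OPEN window, and the 503 response come from the commit message.

```typescript
// Hypothetical sketch: all identifiers here are illustrative, not from the repo.
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitOpenError extends Error {
  status = 503; // circuit open maps to 503 Service Unavailable (item 5.12)
  constructor() {
    super("circuit open: sidecar unavailable");
    this.name = "CircuitOpenError";
  }
}

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: State = "CLOSED";
  private failureThreshold: number;
  private openMs: number;
  private now: () => number;

  // Defaults follow the commit message: 5 failures, 30s OPEN.
  // The clock is injectable so the timeout can be tested without waiting.
  constructor(failureThreshold = 5, openMs = 30000, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  // Moves OPEN to HALF_OPEN once the open window has elapsed.
  getState(): State {
    if (this.state === "OPEN" && this.now() - this.openedAt >= this.openMs) {
      this.state = "HALF_OPEN";
    }
    return this.state;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.getState() === "OPEN") throw new CircuitOpenError();
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit again
      this.state = "CLOSED";
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed HALF_OPEN probe, or reaching the threshold, (re)opens the circuit.
      if (this.state === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

// Graceful degradation (assumed ordering): cached results are served even
// while the circuit is open; only cache misses get a 503.
async function extractWithFallback(
  key: string,
  cache: Map<string, string>,
  breaker: CircuitBreaker,
  callSidecar: () => Promise<string>,
): Promise<{ status: number; body: string }> {
  const cached = cache.get(key);
  if (cached !== undefined) return { status: 200, body: cached };
  try {
    const body = await breaker.call(callSidecar);
    cache.set(key, body);
    return { status: 200, body };
  } catch (err) {
    if (err instanceof Error && err.name === "CircuitOpenError") {
      return { status: 503, body: "sidecar unavailable" };
    }
    throw err; // other sidecar errors are left to the error mapping layer (5.10)
  }
}
```

In this sketch the breaker allows a single probe call once the 30s window has passed; a successful probe closes the circuit and a failed one reopens it for another 30s. Whether the real service uses a half-open probe is not stated in the commit.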