Commit Graph

11 Commits

Author SHA1 Message Date
fe8338c2c5 feat(monitoring): add VM Overview Grafana dashboard
12-panel dashboard auto-provisioned via /var/lib/grafana/dashboards:
  - 4 stat tiles (disk %, RAM avail, swap used, CPU steal) with
    threshold colouring matching vm-health-check.sh
  - 4 time-series (disk %, RAM trend, steal, sda write GB/hr) — 7d default
  - 2 bargauge top-10 by RAM and CPU (cAdvisor container_memory_working_set,
    container_cpu_usage)
  - Load average (1/5/15) + network throughput (RX/TX, host interfaces)

uid: vm-overview. Picked up on next Grafana boot.

Closes Phase 5: "Add Grafana" item from VM observability roadmap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 21:26:35 +00:00
e9a70edb8b chore(monitoring): document health-check output
What changed:
- Documented the monitoring health-check script as a CLI/standalone output surface.
- Kept console output unchanged because it is the command's user interface.

Warning impact:
- @lysnrai/monitoring scoped warnings: 5 -> 0.
- Workspace warning total: 155 -> 150.

Verification:
- pnpm --filter @lysnrai/monitoring exec tsc --noEmit
- pnpm --filter @lysnrai/monitoring exec eslint . --ext .ts,.tsx
- pnpm lint
2026-05-04 16:34:27 -07:00
root
b8661392c6 feat(observability): add phase 2 monitoring and valkey services 2026-03-31 06:57:12 +00:00
saravanakumardb1
6f7299aa7a fix(monitoring): update health-check endpoints for consolidated services
- Remove defunct growth-service (4001), billing-service (4002), tracker-service (4004)
- Add backend API (8000), extraction sidecar (4006), all 3 dashboards (3001-3003)
- Reorder: backend → services → dashboards → infra
2026-02-17 20:53:37 -08:00
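
The endpoint ordering described in 6f7299aa7a (backend → services → dashboards → infra) could be sketched roughly as below. This is an illustration only, not the contents of the actual health-check script: the port numbers come from the commit message, while the endpoint names, the `/health` path, the `Probe` type, and the helper names are all hypothetical. No infra entries are listed because the commit does not name their ports.

```typescript
// Hypothetical sketch; only the port numbers and the group order come from the commit message.
type Group = "backend" | "services" | "dashboards" | "infra";

interface Endpoint {
  name: string; // illustrative label, not a real service name
  port: number;
  group: Group;
}

// Order stated in the commit: backend, then services, then dashboards, then infra.
const GROUP_ORDER: Group[] = ["backend", "services", "dashboards", "infra"];

// Deliberately listed out of order to show that the sort does the work.
const ENDPOINTS: Endpoint[] = [
  { name: "dashboard-3002", port: 3002, group: "dashboards" },
  { name: "extraction-sidecar", port: 4006, group: "services" },
  { name: "dashboard-3001", port: 3001, group: "dashboards" },
  { name: "backend-api", port: 8000, group: "backend" },
  { name: "dashboard-3003", port: 3003, group: "dashboards" },
];

// Returns a copy sorted by group order, then by port within a group.
function orderedEndpoints(endpoints: Endpoint[]): Endpoint[] {
  return [...endpoints].sort(
    (a, b) =>
      GROUP_ORDER.indexOf(a.group) - GROUP_ORDER.indexOf(b.group) ||
      a.port - b.port,
  );
}

// A probe resolves to an HTTP status code; injecting it keeps the sketch free of network code.
type Probe = (url: string) => Promise<number>;

interface CheckResult {
  name: string;
  ok: boolean;
}

// Polls each endpoint in order; any 2xx status counts as healthy (an assumption).
async function checkAll(
  endpoints: Endpoint[],
  probe: Probe,
  host: string = "localhost",
  path: string = "/health", // assumed path, not confirmed by the commit
): Promise<CheckResult[]> {
  const results: CheckResult[] = [];
  for (const ep of orderedEndpoints(endpoints)) {
    let ok = false;
    try {
      const status = await probe(`http://${host}:${ep.port}${path}`);
      ok = status >= 200 && status < 300;
    } catch {
      ok = false; // connection errors count as unhealthy
    }
    results.push({ name: ep.name, ok });
  }
  return results;
}
```

In a real script the probe would presumably wrap an HTTP client call. Keeping it injectable means the ordering logic can be checked without any services running.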
saravanakumardb1
fb3bc750eb fix: update .env.example comments, Grafana dashboard, and debug-service.md for consolidated services 2026-02-14 22:01:55 -08:00
saravanakumardb1
81609e9358 fix: remove stale port references from monitoring, docs, and AI.dev skills 2026-02-14 21:48:21 -08:00
16bc06d84a Add local health-check script; mark health verification 2026-02-14 18:59:01 -08:00
e9b33fb518 feat(monitoring): add @bytelyst/monitoring package 2026-02-14 15:57:41 -08:00
saravanakumardb1
b8c0a73e89 feat(extraction): Phase 5 observability + error handling (5.7-5.12)
- 5.7: Enhanced structured logging with userId, productId, cacheHit, tokenCount
- 5.8: Metrics module (counters + histograms) + /extract/metrics endpoint
- 5.9: Grafana dashboard config for extraction-service (Loki queries)
- 5.10: Error mapping — sidecar errors → proper HTTP status codes (408, 429, 502, 503)
- 5.11: Circuit breaker for Python sidecar (5 failures → 30s OPEN)
- 5.12: Graceful degradation — circuit open returns 503, cached results still served
- 46 TS tests passing
2026-02-14 14:04:59 -08:00
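
Items 5.11 and 5.12 in b8c0a73e89 describe a circuit breaker that opens for 30s after 5 failures, returns 503 while open, and still serves cached results. A minimal sketch of that behaviour follows. It assumes a simplified two-state breaker (no half-open state) that counts failures cumulatively until a success resets them. The class, function, and parameter names are hypothetical, and the single 502 used for sidecar failures stands in for the fuller error mapping in item 5.10. It is not the code from the extraction service.

```typescript
// Hypothetical sketch; only "5 failures" and "30s OPEN" come from the commit message.
type BreakerState = "CLOSED" | "OPEN";

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly openMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number = 5,
    openMs: number = 30_000,
    now: () => number = Date.now, // injectable clock, useful in tests
  ) {
    this.failureThreshold = failureThreshold;
    this.openMs = openMs;
    this.now = now;
  }

  // Reports the current state; once the open window has elapsed the breaker closes again.
  state(): BreakerState {
    if (this.openedAt !== null && this.now() - this.openedAt >= this.openMs) {
      this.openedAt = null;
      this.failures = 0;
    }
    return this.openedAt === null ? "CLOSED" : "OPEN";
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }
}

interface ExtractResponse {
  status: number;
  body: string;
  fromCache: boolean;
}

// Graceful degradation: the cache is consulted first, so cached results are
// served even while the breaker is open; uncached requests get 503.
async function handleExtract(
  key: string,
  breaker: CircuitBreaker,
  cache: Map<string, string>,
  callSidecar: (key: string) => Promise<string>,
): Promise<ExtractResponse> {
  const cached = cache.get(key);
  if (cached !== undefined) {
    return { status: 200, body: cached, fromCache: true };
  }
  if (breaker.state() === "OPEN") {
    return { status: 503, body: "sidecar unavailable", fromCache: false };
  }
  try {
    const body = await callSidecar(key);
    breaker.recordSuccess();
    cache.set(key, body);
    return { status: 200, body, fromCache: false };
  } catch {
    breaker.recordFailure();
    // Simplification: every sidecar failure maps to 502 in this sketch.
    return { status: 502, body: "sidecar error", fromCache: false };
  }
}
```

Checking the cache before the breaker is one way to satisfy "cached results still served". The real service may order these checks differently.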
saravanakumardb1
90b9cf93d8 fix(common): configure ESLint 9 and fix lint issues
- Added @eslint/js dependency
- Updated eslint.config.js for ESLint 9 compatibility
- Added required globals (crypto, localStorage, React, etc.)
- Fixed unused imports and variables
- Disabled sort-imports temporarily
- Formatted all files with Prettier
2026-02-12 16:37:30 -08:00
saravanakumardb1
c97e697097 feat(services): add monitoring (Loki + Grafana config, health-check)
- Copied as-is from learning_voice_ai_agent/services/monitoring
- Grafana dashboards + provisioning for Loki datasource
- health-check.ts for service health polling
- Updated pnpm-workspace.yaml to include services/*
2026-02-12 11:39:24 -08:00