Exports fleet observability to Prometheus/Grafana (previously JSON-only).
- GET /api/fleet/metrics/prom: global, product-labelled Prometheus exposition
(queue depth, blocked/active, per-stage histogram, factory health/seats/
utilization, active alerts, budget spent/ceiling/projected) plus process-wide
reaper/GC counters and engine circuit-breaker state. Pure renderer
(renderFleetMetricsProm) is unit-tested; route auth accepts a FLEET_METRICS_TOKEN
bearer (scrape path) or an admin JWT — never world-readable by default.
- Infra: add a prometheus container to docker-compose + a platform-service-fleet
scrape job; pin the Prometheus Grafana datasource uid; add a provisioned
"Fleet Overview" dashboard (breakers, dead-letter, stale factories, alerts,
queue depth, utilization, budget burn, reaper rate) with a product template var.
- Document FLEET_METRICS_TOKEN + the fleet feature flags in .env.example.
No default behavior change: the endpoint is additive and the new container is
opt-in via the compose stack.
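
For reviewers unfamiliar with the exposition format, here is a minimal sketch of what a product-labelled gauge and the scrape-path bearer check could look like. It is not the actual renderFleetMetricsProm or route code. The names `GaugeSample`, `renderGauge`, `escapeLabelValue`, `isAuthorizedScrape` and the metric name `fleet_queue_depth` are assumptions made for illustration only.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical sample shape; the real renderer's input type is not shown in this change.
interface GaugeSample {
  product: string;
  value: number;
}

// Escape a label value as required by the Prometheus text exposition format
// (backslash, double quote, and newline).
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one gauge family, one line per product. Pure: no I/O, no clock.
function renderGauge(name: string, help: string, samples: GaugeSample[]): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} gauge`];
  for (const s of samples) {
    lines.push(`${name}{product="${escapeLabelValue(s.product)}"} ${s.value}`);
  }
  return lines.join("\n") + "\n";
}

// Bearer check for the scrape path (assumed behavior): constant-time comparison,
// and fail closed when no token is configured so the route is never open by default.
function isAuthorizedScrape(
  authHeader: string | undefined,
  expectedToken: string | undefined,
): boolean {
  if (!expectedToken || !authHeader?.startsWith("Bearer ")) return false;
  const given = Buffer.from(authHeader.slice("Bearer ".length));
  const want = Buffer.from(expectedToken);
  return given.length === want.length && timingSafeEqual(given, want);
}
```

Keeping the renderer pure is what makes it straightforward to unit-test: the test passes in samples and compares the returned string.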
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Verified: full workspace build (tsc) green across all packages/services/dashboards;
fleet+items tests pass. Compile-time only.
- Copied as-is from learning_voice_ai_agent/services/monitoring
- Grafana dashboards + provisioning for Loki datasource
- health-check.ts for service health polling
- Updated pnpm-workspace.yaml to include services/*
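
As a rough illustration of the health-polling idea, here is a sketch of a single polling pass. This is not the copied health-check.ts, whose contents are not shown here. `Probe`, `httpProbe`, `pollOnce`, and the 2000 ms default timeout are hypothetical, and the sketch assumes Node 18+ for the global `fetch` and `AbortSignal.timeout`.

```typescript
type Health = "up" | "down";

// A probe resolves to true when the service answered healthily within the timeout.
type Probe = (url: string, timeoutMs: number) => Promise<boolean>;

// Default probe (assumption): HTTP GET with a timeout; any error or non-2xx counts as down.
const httpProbe: Probe = async (url, timeoutMs) => {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok;
  } catch {
    return false;
  }
};

// Poll every target once, concurrently, and report a status per service name.
async function pollOnce(
  targets: Record<string, string>,
  probe: Probe = httpProbe,
  timeoutMs = 2000,
): Promise<Record<string, Health>> {
  const entries = await Promise.all(
    Object.entries(targets).map(async ([name, url]) => {
      const ok = await probe(url, timeoutMs);
      return [name, ok ? "up" : "down"] as const;
    }),
  );
  return Object.fromEntries(entries);
}
```

Making the probe injectable lets the polling logic be tested without a network; a caller would typically wrap `pollOnce` in a `setInterval` loop.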