Commit Graph

8 Commits

Author SHA1 Message Date
Saravanakumar D
4ac499f301 feat: add multi-reviewer human gate (review-policy routing)
Implements the §14 Phase 3 review gate. requestReview() routes a building
job into the review stage (fencing any worker), carrying a normalized policy
(requiredApprovals + reviewer allowlist) and clearing prior decisions.
submitReview() records one decision per reviewer (last-write-wins, identity-
normalized), advances the job to testing once distinct approvals reach the
quorum, and treats any reject as a veto that returns the job to queued for
rework. Adds POST /fleet/jobs/:id/review/request and POST /fleet/jobs/:id/review,
a typed client, and a review-gate card on the job-detail page.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 19:04:24 -07:00
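
The review gate described in 4ac499f301 can be sketched as a small in-memory state machine. This is a minimal illustration, not the committed code: only requestReview, submitReview and requiredApprovals come from the commit message. The Job shape, normalizeIdentity, normalizePolicy, the "empty allowlist means anyone may review" rule and the quorum cap are assumptions, and worker fencing is omitted.

```typescript
// Hypothetical types for illustration; the real job model is not shown in the commit.
type Stage = "queued" | "building" | "review" | "testing";
type Decision = "approve" | "reject";

interface ReviewPolicy {
  requiredApprovals: number;
  reviewers: string[]; // allowlist; empty is treated as "anyone may review" (assumption)
}

interface Job {
  id: string;
  stage: Stage;
  policy?: ReviewPolicy;
  decisions: Map<string, Decision>;
}

// Assumed normalization: trim and lowercase so "Alice " and "alice" are one reviewer.
const normalizeIdentity = (who: string): string => who.trim().toLowerCase();

function normalizePolicy(p: Partial<ReviewPolicy>): ReviewPolicy {
  const reviewers = Array.from(new Set((p.reviewers ?? []).map(normalizeIdentity)))
    .filter((r) => r.length > 0);
  const wanted = Math.max(1, Math.floor(p.requiredApprovals ?? 1));
  // Assumption: a quorum larger than the allowlist could never be met, so cap it.
  const requiredApprovals = reviewers.length > 0 ? Math.min(wanted, reviewers.length) : wanted;
  return { requiredApprovals, reviewers };
}

function requestReview(job: Job, policy: Partial<ReviewPolicy>): void {
  if (job.stage !== "building") {
    throw new Error(`cannot request review from stage ${job.stage}`);
  }
  job.stage = "review";
  job.policy = normalizePolicy(policy);
  job.decisions = new Map(); // clear decisions from any earlier review round
}

function submitReview(job: Job, reviewer: string, decision: Decision): Stage {
  if (job.stage !== "review" || !job.policy) throw new Error("job is not in review");
  const who = normalizeIdentity(reviewer);
  const { reviewers, requiredApprovals } = job.policy;
  if (reviewers.length > 0 && !reviewers.includes(who)) {
    throw new Error(`${who} is not an allowed reviewer`);
  }
  job.decisions.set(who, decision); // last write wins per reviewer
  if (decision === "reject") {
    job.stage = "queued"; // any reject is a veto: back to the queue for rework
    return job.stage;
  }
  const approvals = Array.from(job.decisions.values()).filter((d) => d === "approve").length;
  if (approvals >= requiredApprovals) job.stage = "testing";
  return job.stage;
}
```

Keying decisions by the normalized identity is what makes approvals "distinct": a reviewer who approves twice overwrites their own entry instead of adding to the count.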
Saravanakumar D
c9c2c174db feat: add fleet metrics + alerting (GET /fleet/metrics)
Adds coordinator.fleetMetrics() computing queue depth, stage histogram,
oldest-queued age (starvation signal), factory health and seat utilization,
plus derived alerts (no_live_capacity, all_factories_down, queue_starvation,
saturated, stale_factories). Exposed via GET /fleet/metrics and surfaced as a
metrics+alerts panel on the fleet overview. Thresholds injectable for tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:51:59 -07:00
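
The metrics and alerts listed in c9c2c174db could be derived along the following lines. This is a sketch under stated assumptions: the alert names and the injectable thresholds come from the commit message, but the input shapes (FleetJob, Factory), the threshold fields and default values, and the exact condition behind each alert are guesses made for illustration.

```typescript
// Hypothetical input shapes; timestamps are epoch milliseconds.
interface FleetJob { stage: string; queuedAt?: number }
interface Factory {
  id: string;
  healthy: boolean;
  lastHeartbeatAt: number;
  seatsTotal: number;
  seatsBusy: number;
}
interface Thresholds { starvationMs: number; staleMs: number; saturationRatio: number }
interface FleetMetrics {
  queueDepth: number;
  stageHistogram: Record<string, number>;
  oldestQueuedAgeMs: number;
  seatUtilization: number;
  alerts: string[];
}

// Assumed defaults; passing a different object is how tests would inject thresholds.
const DEFAULT_THRESHOLDS: Thresholds = {
  starvationMs: 600000,
  staleMs: 60000,
  saturationRatio: 0.9,
};

function fleetMetrics(
  jobs: FleetJob[],
  factories: Factory[],
  now: number,
  t: Thresholds = DEFAULT_THRESHOLDS,
): FleetMetrics {
  const stageHistogram: Record<string, number> = {};
  for (const j of jobs) stageHistogram[j.stage] = (stageHistogram[j.stage] ?? 0) + 1;

  const queued = jobs.filter((j) => j.stage === "queued");
  // Age of the longest-waiting queued job; 0 when the queue is empty.
  const oldestQueuedAgeMs = queued.reduce(
    (max, j) => Math.max(max, now - (j.queuedAt ?? now)),
    0,
  );

  const healthy = factories.filter((f) => f.healthy);
  // "Live" here means healthy and heard from within the stale window (assumption).
  const live = healthy.filter((f) => now - f.lastHeartbeatAt <= t.staleMs);
  const seatsTotal = live.reduce((n, f) => n + f.seatsTotal, 0);
  const seatsBusy = live.reduce((n, f) => n + f.seatsBusy, 0);
  const seatUtilization = seatsTotal > 0 ? seatsBusy / seatsTotal : 0;

  const alerts: string[] = [];
  if (factories.length > 0 && healthy.length === 0) alerts.push("all_factories_down");
  if (healthy.length > live.length) alerts.push("stale_factories");
  if (seatsTotal === 0) alerts.push("no_live_capacity");
  else if (seatUtilization >= t.saturationRatio) alerts.push("saturated");
  if (oldestQueuedAgeMs > t.starvationMs) alerts.push("queue_starvation");

  return { queueDepth: queued.length, stageHistogram, oldestQueuedAgeMs, seatUtilization, alerts };
}
```

Taking `now` and the thresholds as parameters keeps the function pure, so a test can pin the clock and tighten a threshold without waiting for real time to pass.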
Saravanakumar D
d780739cbe test: add fleet control-plane Playwright e2e coverage
New e2e/fleet.spec.ts with a method- and URL-aware /api/fleet/** mock that
holds mutable state so operator actions and budget toggles reflect in
follow-up GETs. Covers: fleet overview (factory cards + recent jobs), jobs
table + stage filter, job detail requeue (stage building->queued) with the
SSE-driven Live badge, and budget pause/resume. All 4 specs green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:44:53 -07:00
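
The stateful mock described in d780739cbe might look roughly like the handler below: a pure function over mutable state, so a POST changes what the next GET returns. Everything here is hypothetical except the /api/fleet prefix and the building->queued requeue. The endpoint paths for requeue and budget, the MockState shape and the handler name are assumptions. In a real spec it would presumably be wired in through Playwright's page.route, passing route.request().method() and route.request().url() to the handler and answering with route.fulfill.

```typescript
// Hypothetical mock state; the real spec's fixtures are not shown in the commit.
interface MockJob { id: string; stage: string }
interface MockState { jobs: MockJob[]; budgetPaused: boolean }
interface MockResponse { status: number; body: unknown }

function handleFleetRequest(state: MockState, method: string, rawUrl: string): MockResponse {
  const url = new URL(rawUrl, "http://localhost");
  const path = url.pathname.replace(/^\/api\/fleet/, "");

  // GET /api/fleet/jobs?stage=... : list jobs, optionally filtered by stage.
  if (method === "GET" && path === "/jobs") {
    const stage = url.searchParams.get("stage");
    const jobs = stage ? state.jobs.filter((j) => j.stage === stage) : state.jobs;
    return { status: 200, body: jobs };
  }

  // POST /api/fleet/jobs/:id/requeue (assumed path): move the job back to queued.
  const requeue = path.match(/^\/jobs\/([^/]+)\/requeue$/);
  if (method === "POST" && requeue) {
    const job = state.jobs.find((j) => j.id === requeue[1]);
    if (!job) return { status: 404, body: { error: "job not found" } };
    job.stage = "queued";
    return { status: 200, body: job };
  }

  // POST /api/fleet/budget/pause|resume (assumed paths): toggle the budget flag.
  const budget = path.match(/^\/budget\/(pause|resume)$/);
  if (method === "POST" && budget) {
    state.budgetPaused = budget[1] === "pause";
    return { status: 200, body: { paused: state.budgetPaused } };
  }
  if (method === "GET" && path === "/budget") {
    return { status: 200, body: { paused: state.budgetPaused } };
  }

  return { status: 404, body: { error: `unhandled ${method} ${path}` } };
}
```

Dispatching on both method and path is what lets one route pattern serve reads and writes, and keeping the state object outside the handler is what makes a follow-up GET observe an earlier operator action.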
Saravanakumar D
ea42602407 feat: add resumable SSE live event stream for fleet jobs
Backend: GET /fleet/jobs/:id/events/stream emits a snapshot (seq > Last-Event-ID)
then long-polls the append-only event log, closing after a bounded window so
EventSource-style clients reconnect cleanly. Honors Last-Event-ID resume,
keepalive comments, and a terminal error frame.

Frontend: subscribeJobEvents uses fetch streaming (to send auth + product
headers) with parseSseFrames, Last-Event-ID resume, reconnect backoff, and a
fatal-on-error-frame fallback to polling. Job detail page subscribes live
(deduped by seq), falls back to 4s polling on failure, and shows a Live badge;
refresh() now merges events so a slow snapshot can't clobber streamed ones.

Tests: +3 route (snapshot, resume cursor, append-after-connect), +5 client
(parseSseFrames x2, subscribe deliver/error/resume/error-frame). fleet 150, web 222.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:38:50 -07:00
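
Two client-side pieces from ea42602407 lend themselves to a short sketch: splitting a streamed text buffer into SSE frames, and merging events by seq so a late snapshot cannot overwrite streamed ones. The name parseSseFrames comes from the commit message, but its signature and return shape here are assumptions, as are mergeEvents and the JobEvent type. The parser is deliberately simplified: it handles LF and CRLF line endings only and drops frames that carry no data field.

```typescript
interface SseFrame { id: string | undefined; event: string; data: string }

// Splits a buffer into complete frames plus the unconsumed tail.
// The caller keeps `rest` and prepends it to the next chunk.
function parseSseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = blocks.pop() ?? ""; // last block is incomplete until a blank line arrives
  const frames: SseFrame[] = [];
  for (const block of blocks) {
    let id: string | undefined;
    let event = "message"; // default event type when no event field is present
    const data: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.startsWith(":")) continue; // ":" lines are comments (keepalives)
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // one optional leading space
      if (field === "id") id = value;
      else if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) frames.push({ id, event, data: data.join("\n") });
  }
  return { frames, rest };
}

// Hypothetical event shape; only `seq` is named in the commit message.
interface JobEvent { seq: number; type: string }

// Merge by seq, keeping the copy already held, and return events in seq order.
function mergeEvents(current: JobEvent[], incoming: JobEvent[]): JobEvent[] {
  const bySeq = new Map<number, JobEvent>();
  for (const e of current) bySeq.set(e.seq, e);
  for (const e of incoming) if (!bySeq.has(e.seq)) bySeq.set(e.seq, e);
  return Array.from(bySeq.values()).sort((a, b) => a.seq - b.seq);
}
```

On reconnect, the id of the last frame delivered would be sent back as the Last-Event-ID header so the server can resume with events whose seq is greater than that cursor.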
Saravanakumar D
1ae15a7755 docs: mark cost burndown complete
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:25:46 -07:00
Saravanakumar D
3f850b7b6f docs: mark scoring explainability complete
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:21:35 -07:00
Saravanakumar D
69f553d432 docs: mark operator job actions complete in TASKS_TO_COMPLETE
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:15:53 -07:00
Saravanakumar D
0f903b935a audit: document current Gigafactory completion state
- ROADMAP_COMPLETION_AUDIT.md: verified state vs GIGAFACTORY_ROADMAP source of truth
- TASKS_TO_COMPLETE.md: prioritized remaining work with acceptance criteria
- Key finding: roadmap §0 tracker is stale (P2 ~95%, P3 ~70% actual vs 80%/0% claimed)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 18:06:33 -07:00