docs(roadmap): tick verified-done Phase 3 boxes (395-400,402)

Phase 3 fleet control plane is implemented in learning_ai_common_plat:
fleet API client, fleet map page, job table/detail/DAG/SSE/actions, cost
burndown + multi-reviewer gate, scoring explainability, preemption, and
Playwright fleet e2e. Box 401 (TUI re-point) remains open.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Saravanakumar D 2026-05-30 19:13:25 -07:00
parent 4326001650
commit f1fe66fd4d

View File

@ -392,14 +392,14 @@ Each phase: **Goal → checklist → Exit criteria**. Don't start a phase until
### Phase 3 — Fleet control plane in tracker-web + DAG + budgets + scoring router
**Goal:** one browser control plane; smart routing + budgets live.
- [ ] `fleet` API client in `tracker-web` (reuse `/api/tracker`-style proxy → `/fleet`).
- [ ] Fleet map page (factories, load, health, capabilities) on `@bytelyst/*` components.
- [ ] Job table + job detail + **DAG view**; live log via **SSE**; approve/ship/reject/requeue actions.
- [ ] Cost burndown + budget kill-switch UI; multi-reviewer routing.
- [ ] Scoring router with configurable weights + explainability surfaced in UI.
- [ ] Preemption of low-priority by critical jobs (checkpoint + requeue).
- [x] `fleet` API client in `tracker-web` (reuse `/api/tracker`-style proxy → `/fleet`). *(common-plat `dashboards/tracker-web/src/lib/fleet-client.ts`: typed client over `/api/fleet`.)*
- [x] Fleet map page (factories, load, health, capabilities) on `@bytelyst/*` components. *(common-plat `app/dashboard/fleet/page.tsx`: health badges, load, capabilities, fleet metrics + alerts.)*
- [x] Job table + job detail + **DAG view**; live log via **SSE**; approve/ship/reject/requeue actions. *(common-plat `app/dashboard/fleet/jobs/page.tsx` + `jobs/[id]/page.tsx`: stage-filtered table, DAG via `getJobDag`, SSE event stream, ship/requeue/reject/requestReview.)*
- [x] Cost burndown + budget kill-switch UI; multi-reviewer routing. *(common-plat `app/dashboard/fleet/budget/page.tsx` burndown + pause/resume; `ReviewGateCard` multi-reviewer quorum gate via `requestReview`/`submitReview`.)*
- [x] Scoring router with configurable weights + explainability surfaced in UI. *(common-plat `fleet/scheduler.ts` tunable weights + `GET /fleet/jobs/:id/explain`; `ExplainPanel` breakdown in job detail.)*
- [x] Preemption of low-priority by critical jobs (checkpoint + requeue). *(common-plat `fleet/scheduler.ts` `selectPreemptionVictim` + coordinator eviction under `FLEET_PREEMPTION`; victim requeued with checkpoint + bumped epoch, `preempted` event.)*
- [ ] TUI dashboard re-pointed at `/fleet` API (parity).
- [ ] Web e2e (Playwright): fleet map, live log, ship, budget-pause.
- [x] Web e2e (Playwright): fleet map, live log, ship, budget-pause. *(common-plat `dashboards/tracker-web/e2e/fleet.spec.ts`: fleet overview, metrics, job detail, ship, budget-pause, review-gate specs green.)*
- **Exit criteria:** all boxes ✅; web `verify` (typecheck+lint+test+e2e) green; an operator runs the whole 3-repo parallel workload from the browser, including a budget pause + resume.
### Phase 4 — Message bus + autoscaling + cross-OS capability marketplace