Implements the §14 Phase 3 review gate. requestReview() routes a building job into the review stage (fencing any worker), carrying a normalized policy (requiredApprovals + reviewer allowlist) and clearing prior decisions. submitReview() records one decision per reviewer (last-write-wins, identity-normalized), advances the job to testing once distinct approvals reach the quorum, and treats any reject as a veto that returns the job to queued for rework. Adds POST /fleet/jobs/:id/review/request and POST /fleet/jobs/:id/review, a typed client, and a review-gate card on the job-detail page.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Gigafactory — Tasks to Complete

> Companion to `ROADMAP_COMPLETION_AUDIT.md`. Ordered by priority. Update checkboxes as work lands.

---

- [x] **Operator job actions — requeue / reject / cancel**
  - Priority: P0 (highest-impact safe slice; completes Phase-3 §14 "approve/ship/reject/requeue")
  - Current status: ✅ DONE (`operatorAction` + route + client + UI buttons + 8 tests); fleet 141 green
  - Files involved:
    - `services/platform-service/src/modules/fleet/coordinator.ts` (new `operatorAction`)
    - `services/platform-service/src/modules/fleet/routes.ts` (new `POST /fleet/jobs/:id/actions/:action`)
    - `services/platform-service/src/modules/fleet/coordinator.test.ts` (tests)
    - `dashboards/tracker-web/src/lib/fleet-client.ts` (client fn)
    - `dashboards/tracker-web/src/app/dashboard/fleet/jobs/[id]/page.tsx` (buttons)
  - Implementation plan: operator action does NOT require a held lease; it bumps `leaseEpoch`
    to fence any current holder (mirrors the reaper), preserves checkpoint, sets stage
    (requeue→queued/blocked, reject→dead_letter, cancel→failed), appends an event.
  - Acceptance criteria: requeue a building job → stage queued, epoch+1, zombie report fenced (409);
    reject → dead_letter; cancel → failed; unknown action → 400; flag-independent; all prior tests green.
  - Verification command: `pnpm --filter @lysnrai/platform-service exec vitest run src/modules/fleet`

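The operator-action plan and its acceptance criteria can be sketched in memory as below. This is an illustrative sketch only, not the code in `coordinator.ts`: the `Job` shape, the `blockedOn` field used to pick between `queued` and `blocked`, the `ActionResult` type, the `operator.<action>` event names, and the `reportStatus` helper are all assumptions.

```typescript
// Hypothetical shapes: the real fleet types may differ.
type Stage = "queued" | "blocked" | "building" | "dead_letter" | "failed";
type OperatorAction = "requeue" | "reject" | "cancel";

interface JobEvent { type: string; at: number }

interface Job {
  id: string;
  stage: Stage;
  leaseEpoch: number;
  checkpoint?: string;  // must survive an operator action
  blockedOn: string[];  // assumption: unmet dependencies decide queued vs blocked
  events: JobEvent[];
}

type ActionResult =
  | { ok: true; job: Job }
  | { ok: false; status: number; error: string };

function isOperatorAction(value: string): value is OperatorAction {
  return value === "requeue" || value === "reject" || value === "cancel";
}

// No lease check here: the operator does not need to hold the lease.
function operatorAction(job: Job, action: string, now: number): ActionResult {
  if (!isOperatorAction(action)) {
    return { ok: false, status: 400, error: `unknown action: ${action}` };
  }
  const stage: Stage =
    action === "requeue" ? (job.blockedOn.length > 0 ? "blocked" : "queued")
    : action === "reject" ? "dead_letter"
    : "failed";
  return {
    ok: true,
    job: {
      ...job,                          // checkpoint is carried over untouched
      stage,
      leaseEpoch: job.leaseEpoch + 1,  // fences whoever held the previous epoch
      events: [...job.events, { type: `operator.${action}`, at: now }],
    },
  };
}

// A worker report that carries a stale epoch is refused with 409.
function reportStatus(job: Job, reportEpoch: number): number {
  return reportEpoch === job.leaseEpoch ? 200 : 409;
}
```

Bumping the epoch rather than revoking the lease directly means a zombie worker needs no notification: its next report simply carries an epoch that no longer matches.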
- [x] **Scoring explainability surfaced in UI**
  - Priority: P1 (data already computed; Phase-3 §14)
  - Current status: ✅ DONE (`explainJob` + `GET /fleet/jobs/:id/explain` + ExplainPanel); fleet 144 green
  - Files involved: `scheduler.ts`, `coordinator.ts`, `routes.ts`, `fleet-client.ts`, fleet job detail page
  - Implementation plan: add `GET /fleet/jobs/:id/explain` returning the would-be score breakdown
    against current factories; render a "why this routes here" panel.
  - Acceptance criteria: endpoint returns per-factor contributions; UI shows them; degrade gracefully if absent.
  - Verification command: `pnpm --filter @lysnrai/platform-service exec vitest run src/modules/fleet`

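A per-factor score breakdown of the kind the explain endpoint returns might look like the sketch below. The factor names (`capability`, `capacity`, `health`), their weights, and the `Factory` / `JobSpec` shapes are hypothetical; the real scheduler's scoring inputs are not listed in this document.

```typescript
// Hypothetical factors and weights, chosen only to illustrate the breakdown.
type Factor = "capability" | "capacity" | "health";

interface Factory {
  id: string;
  capabilities: string[];
  freeSeats: number;
  totalSeats: number;
  healthy: boolean;
}

interface JobSpec { id: string; requiredCapabilities: string[] }

interface FactorContribution {
  factor: Factor;
  weight: number;
  value: number;         // normalized to the range 0..1
  contribution: number;  // weight * value
}

interface FactoryExplanation {
  factoryId: string;
  total: number;
  factors: FactorContribution[];
}

const WEIGHTS: Record<Factor, number> = { capability: 0.5, capacity: 0.25, health: 0.25 };
const FACTORS: Factor[] = ["capability", "capacity", "health"];

function explainJob(job: JobSpec, factories: Factory[]): FactoryExplanation[] {
  const explanations = factories.map((factory): FactoryExplanation => {
    const needed = job.requiredCapabilities;
    const matched = needed.filter((c) => factory.capabilities.indexOf(c) !== -1).length;
    const values: Record<Factor, number> = {
      capability: needed.length === 0 ? 1 : matched / needed.length,
      capacity: factory.totalSeats === 0 ? 0 : factory.freeSeats / factory.totalSeats,
      health: factory.healthy ? 1 : 0,
    };
    const factors = FACTORS.map((factor): FactorContribution => ({
      factor,
      weight: WEIGHTS[factor],
      value: values[factor],
      contribution: WEIGHTS[factor] * values[factor],
    }));
    const total = factors.reduce((sum, f) => sum + f.contribution, 0);
    return { factoryId: factory.id, total, factors };
  });
  // Highest total first: the top entry is where the job would route.
  return explanations.sort((a, b) => b.total - a.total);
}
```

Returning the individual contributions alongside the total is what lets a "why this routes here" panel show each factor's share instead of a single opaque number.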
- [x] **Cost burndown chart**
  - Priority: P1
  - Current status: ✅ DONE (`costBurndown` + `GET /fleet/budgets/:id/burndown` + BurndownChart); fleet 147 green
  - Files involved: `dashboards/tracker-web/src/app/dashboard/fleet/budget/page.tsx`, new client fn
  - Implementation plan: aggregate run cost by day from events/runs; render burndown vs ceiling overlay.
  - Acceptance criteria: per-day spend visible with ceiling line; empty state when no data.
  - Verification command: `pnpm --filter @bytelyst/tracker-web test`

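The per-day aggregation behind the burndown can be sketched as follows. The `Run` shape, the `finishedAt` and `cost` field names, and bucketing by the UTC date prefix of an ISO-8601 timestamp are assumptions for illustration; the actual `costBurndown` may read from events and use different fields.

```typescript
// Hypothetical input shape: one record per finished run.
interface Run {
  finishedAt: string; // ISO-8601 UTC timestamp, e.g. "2024-01-02T09:30:00Z"
  cost: number;       // in whatever unit the budget ceiling uses
}

interface BurndownPoint {
  day: string;        // "YYYY-MM-DD"
  spent: number;      // spend on that day
  cumulative: number; // running total up to and including that day
  remaining: number;  // ceiling minus cumulative, floored at 0
}

function costBurndown(runs: Run[], ceiling: number): BurndownPoint[] {
  const perDay: Record<string, number> = {};
  for (const run of runs) {
    const day = run.finishedAt.slice(0, 10); // date prefix of the timestamp
    perDay[day] = (perDay[day] ?? 0) + run.cost;
  }
  let cumulative = 0;
  // Lexicographic order on "YYYY-MM-DD" keys is also chronological order.
  return Object.keys(perDay).sort().map((day): BurndownPoint => {
    const spent = perDay[day] ?? 0;
    cumulative += spent;
    return { day, spent, cumulative, remaining: Math.max(0, ceiling - cumulative) };
  });
}
```

An empty `runs` array yields an empty result, which a chart can map to its empty state; the ceiling line itself is just the constant `ceiling` drawn across all days.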
- [x] **SSE live log streaming**
  - Priority: P2 (larger; §17 single-stream contract)
  - Current status: ✅ DONE (`GET /fleet/jobs/:id/events/stream`, resumable SSE, + `subscribeJobEvents`
    fetch-streaming consumer with Last-Event-ID resume, polling fallback, and a Live badge); fleet 150,
    web 222 green
  - Files involved: `services/platform-service/src/modules/fleet/routes.ts` (stream route + clampInt/delay),
    `dashboards/tracker-web/src/lib/fleet-client.ts` (`parseSseFrames`, `subscribeJobEvents`),
    job detail page (live subscribe + fallback + Live indicator), route + client tests
  - Implementation plan: `GET /fleet/jobs/:id/events/stream` (SSE) emitting appended events;
    UI subscribes via fetch streaming (auth headers) with polling fallback.
  - Acceptance criteria: new events appear without refresh; reconnect + fallback work.
  - Verification command: `pnpm --filter @lysnrai/platform-service test`

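Consuming SSE over fetch streaming means the client has to split the byte stream into frames itself and remember the last event id for resume. The sketch below shows one way to do that. It reuses the name `parseSseFrames` from the file list, but the signature and body are assumptions, as is the `resumeHeaders` helper; it also simplifies the SSE format by ignoring the `retry` field and dropping frames that carry no `data`.

```typescript
interface SseFrame { id?: string; event?: string; data: string }

// Splits buffered text into complete frames; the unfinished tail is returned
// as `rest` so the caller can prepend it to the next chunk.
function parseSseFrames(buffer: string): { frames: SseFrame[]; rest: string } {
  const chunks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = chunks.pop() ?? ""; // last piece has no terminating blank line yet
  const frames: SseFrame[] = [];
  for (const chunk of chunks) {
    const frame: SseFrame = { data: "" };
    const dataLines: string[] = [];
    for (const line of chunk.split("\n")) {
      if (line === "" || line.charAt(0) === ":") continue; // comment or keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // one optional leading space
      if (field === "id") frame.id = value;
      else if (field === "event") frame.event = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      frame.data = dataLines.join("\n");
      frames.push(frame);
    }
  }
  return { frames, rest };
}

// On reconnect, the id of the last frame seen is sent back so the server can
// resume after it instead of replaying the whole stream.
function resumeHeaders(lastEventId: string | undefined): Record<string, string> {
  const headers: Record<string, string> = { Accept: "text/event-stream" };
  if (lastEventId !== undefined) headers["Last-Event-ID"] = lastEventId;
  return headers;
}
```

A consumer would append each decoded chunk to its buffer, call `parseSseFrames`, keep `rest` as the new buffer, and record the `id` of the last frame for the next `resumeHeaders` call.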
- [x] **Fleet Playwright e2e**
  - Priority: P2 (Phase-3 exit gate)
  - Current status: ✅ DONE (`e2e/fleet.spec.ts`, 4 specs covering overview, jobs table, job-detail requeue +
    live badge, and budget pause/resume, against a method/URL-aware mocked `/api/fleet/**`); all green
  - Files involved: `dashboards/tracker-web/e2e/fleet.spec.ts`
  - Implementation plan: cover fleet map render, jobs table, job detail action, budget pause/resume
    against a mocked fleet API.
  - Acceptance criteria: e2e green in CI config.
  - Verification command: `pnpm --filter @bytelyst/tracker-web exec playwright test fleet`

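A "method/URL-aware" mock can be kept as a pure function that maps a request's method and URL to a canned response, which makes it testable without a browser. The sketch below is one possible shape; the `mockFleetApi` name, the fixture payloads, and the set of mocked paths are hypothetical and do not reproduce the contents of `e2e/fleet.spec.ts`.

```typescript
// Hypothetical fixtures: the real spec's payloads are not shown in this document.
interface MockResponse { status: number; body: unknown }

const JOB_ACTION = /^\/api\/fleet\/jobs\/([^/]+)\/actions\/([^/]+)$/;
const JOB_DETAIL = /^\/api\/fleet\/jobs\/([^/]+)$/;

function mockFleetApi(method: string, url: string): MockResponse {
  // Strip origin and query string so matching depends only on method + path.
  const path = url.replace(/^https?:\/\/[^/]+/, "").replace(/\?.*$/, "");

  const action = JOB_ACTION.exec(path);
  if (method === "POST" && action !== null) {
    return { status: 200, body: { id: action[1], action: action[2] } };
  }
  if (method === "GET" && path === "/api/fleet/jobs") {
    return { status: 200, body: [{ id: "job-1", stage: "building" }] };
  }
  const detail = JOB_DETAIL.exec(path);
  if (method === "GET" && detail !== null) {
    return { status: 200, body: { id: detail[1], stage: "building" } };
  }
  // Anything unmocked fails loudly instead of silently returning data.
  return { status: 404, body: { error: `no mock for ${method} ${path}` } };
}

// In a Playwright spec this could be wired up roughly like so:
//   await page.route("**/api/fleet/**", (route) => {
//     const res = mockFleetApi(route.request().method(), route.request().url());
//     return route.fulfill({
//       status: res.status,
//       contentType: "application/json",
//       body: JSON.stringify(res.body),
//     });
//   });
```

Dispatching on the method as well as the URL matters for routes such as the job actions endpoint, where a `GET` to the same path should not be answered as if it were the `POST`.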
- [ ] **Phase-1 `budget.wall` enforcement** — P3 — `agent-queue.sh` — wall-clock ceiling extending timeout.
- [ ] **Node `dash` tag surfacing** — P3 — `dashboard.mjs` — profile/priority/caps/tracker-item link.
- [ ] **Roadmap §14 reconciliation** — P3 — tick Phase-2/3 boxes in `learning_ai_devops_tools`.
- [x] **Fleet metrics + alerting** — P3 — ✅ DONE — `GET /fleet/metrics` (`coordinator.fleetMetrics`):
  queue depth, stage histogram, oldest-queued age (starvation), factory health/seat utilization, and
  derived alerts (`no_live_capacity`, `all_factories_down`, `queue_starvation`, `saturated`,
  `stale_factories`). Surfaced as a metrics+alerts panel on the fleet overview (`getFleetMetrics`).
  Files: `coordinator.ts`, `routes.ts`, `fleet-client.ts`, `dashboard/fleet/page.tsx` + tests + e2e.
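Deriving the alerts from the raw metrics could look like the sketch below. Only the five alert names come from this document; the conditions that trigger them, the `STARVATION_MS` and `SATURATION` thresholds, and the `FleetMetrics` / `FactorySnapshot` shapes are guesses made for illustration and may not match `coordinator.fleetMetrics`.

```typescript
type Alert =
  | "no_live_capacity"
  | "all_factories_down"
  | "queue_starvation"
  | "saturated"
  | "stale_factories";

// Hypothetical metric shapes.
interface FactorySnapshot {
  id: string;
  live: boolean;   // heartbeat seen recently
  stale: boolean;  // registered but heartbeat overdue
  seatsTotal: number;
  seatsUsed: number;
}

interface FleetMetrics {
  queueDepth: number;
  oldestQueuedAgeMs: number | null; // null when nothing is queued
  factories: FactorySnapshot[];
}

// Assumed thresholds, not taken from the source.
const STARVATION_MS = 15 * 60 * 1000;
const SATURATION = 0.9;

function deriveAlerts(m: FleetMetrics): Alert[] {
  const alerts: Alert[] = [];
  const live = m.factories.filter((f) => f.live);
  const seatsTotal = live.reduce((n, f) => n + f.seatsTotal, 0);
  const seatsUsed = live.reduce((n, f) => n + f.seatsUsed, 0);

  if (m.factories.length > 0 && live.length === 0) alerts.push("all_factories_down");
  // Work is waiting but no live factory has a free seat.
  if (m.queueDepth > 0 && seatsTotal - seatsUsed <= 0) alerts.push("no_live_capacity");
  if (m.oldestQueuedAgeMs !== null && m.oldestQueuedAgeMs > STARVATION_MS) {
    alerts.push("queue_starvation");
  }
  if (seatsTotal > 0 && seatsUsed / seatsTotal >= SATURATION) alerts.push("saturated");
  if (m.factories.some((f) => f.stale)) alerts.push("stale_factories");
  return alerts;
}
```

Keeping the alerts as a pure function of the metrics snapshot means the overview panel and any tests can evaluate them from the same payload without extra state.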
- [x] **Multi-reviewer routing** — P3 — ✅ DONE — review-policy human gate (§14). `requestReview`
  routes a building job into `review` (fences worker); `submitReview` records per-reviewer
  approve/reject (last-write-wins, identity-normalized) and advances to `testing` once distinct
  approvals reach the quorum, while any reject acts as a veto and returns the job to `queued` for rework.
  Routes: `POST /fleet/jobs/:id/review/request`, `POST /fleet/jobs/:id/review`. UI: review-gate card on
  job detail (`requestReview`/`submitReview`). Files: `types.ts`, `coordinator.ts`, `routes.ts`,
  `fleet-client.ts`, `dashboard/fleet/jobs/[id]/page.tsx` + coordinator/route/client tests + e2e.
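The review gate's quorum and veto rules can be illustrated with the in-memory sketch below. It follows the behaviour described for `requestReview` and `submitReview`, but the `Job`, `ReviewPolicy`, and `ReviewState` shapes, the trim-and-lowercase identity normalization, the clamping of the quorum to the allowlist size, and the treatment of an empty allowlist as "anyone may review" are assumptions rather than the actual `coordinator.ts` logic.

```typescript
type Stage = "queued" | "building" | "review" | "testing";
type Decision = "approve" | "reject";

interface ReviewPolicy {
  requiredApprovals: number;
  reviewers: string[]; // allowlist; empty is treated here as "anyone" (assumption)
}

interface ReviewState {
  policy: ReviewPolicy;
  decisions: Record<string, Decision>; // keyed by normalized reviewer identity
}

interface Job { id: string; stage: Stage; leaseEpoch: number; review?: ReviewState }

// Assumed normalization: trim and lowercase.
const normalizeIdentity = (who: string): string => who.trim().toLowerCase();

function normalizePolicy(p: Partial<ReviewPolicy>): ReviewPolicy {
  const reviewers: string[] = [];
  for (const raw of p.reviewers ?? []) {
    const id = normalizeIdentity(raw);
    if (id !== "" && reviewers.indexOf(id) === -1) reviewers.push(id);
  }
  const asked = p.requiredApprovals ?? 1;
  const wanted = isFinite(asked) ? Math.max(1, Math.floor(asked)) : 1;
  // A quorum larger than the allowlist could never be met, so clamp it.
  const requiredApprovals = reviewers.length > 0 ? Math.min(wanted, reviewers.length) : wanted;
  return { requiredApprovals, reviewers };
}

function requestReview(job: Job, policy: Partial<ReviewPolicy>): Job {
  if (job.stage !== "building") throw new Error(`cannot request review from ${job.stage}`);
  return {
    ...job,
    stage: "review",
    leaseEpoch: job.leaseEpoch + 1, // fences any worker still holding the lease
    review: { policy: normalizePolicy(policy), decisions: {} }, // prior decisions cleared
  };
}

function submitReview(job: Job, reviewer: string, decision: Decision): Job {
  const state = job.review;
  if (job.stage !== "review" || state === undefined) throw new Error("job is not in review");
  const who = normalizeIdentity(reviewer);
  const { policy } = state;
  if (policy.reviewers.length > 0 && policy.reviewers.indexOf(who) === -1) {
    throw new Error(`${who} is not an allowed reviewer`);
  }
  // Last write wins: a reviewer's new decision overwrites their previous one.
  const decisions: Record<string, Decision> = { ...state.decisions, [who]: decision };
  const review: ReviewState = { policy, decisions };
  if (decision === "reject") return { ...job, stage: "queued", review }; // veto
  const approvals = Object.keys(decisions).filter((k) => decisions[k] === "approve").length;
  return { ...job, stage: approvals >= policy.requiredApprovals ? "testing" : "review", review };
}
```

Because decisions are keyed by normalized identity, the same reviewer approving twice under different casing still counts as one approval toward the quorum.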
- [ ] **TUI re-point at `/fleet`** — P3 — Phase-3 §14.

### Phase 4 / 5 (post-MVP, tracked only)

- [ ] Message broker (NATS/Redis) push dispatch + backpressure
- [ ] Autoscaling hooks (ephemeral factories)
- [ ] Capability marketplace + cross-product fairness
- [ ] Load + chaos suite
- [ ] Outcome feature capture · offline eval harness · A/B weight tuning · recommendations