---
engine: devin
cwd: /Users/sd9235/code/mygh/learning_ai_devops_tools
yolo: true
lock: agent-queue
timeout: 4h
---

ROLE: Senior bash + distributed-systems engineer.

Implement PHASE 2 (FLEET FEATURE FLAGS + SHADOW / DUAL-RUN) for the agent-queue runner: a safe, reversible path to validate the fleet coordinator against the proven single-host (P1) behavior BEFORE any real cutover.

PARALLEL-SAFETY (another Devin is running in a DIFFERENT repo, learning_ai_common_plat, on enrollment/tokens; no file overlap with you. Stay within the agent-queue repo):
- You OWN: agent-queue/lib/fleet-client.sh, agent-queue/agent-queue.sh (the fleet hook points only), agent-queue/selftest.sh, agent-queue/README.md, agent-queue/docs/GIGAFACTORY/GIGAFACTORY_ROADMAP.md.
- Keep the offline git-queue path unchanged when fleet is off. All 60 existing selftest checks MUST stay green.

READ FIRST:
- agent-queue/lib/fleet-client.sh: the P2-S3 client (fleet_enabled, fleet_api, fleet_claim, fleet_report, lease renew/release, fleet_quarantine). You EXTEND this.
- agent-queue/agent-queue.sh: the run loop + the existing fleet hook points + the offline path (cmd_add/run_worker/ship). Study how AQ_FLEET gates everything today.
- agent-queue/docs/GIGAFACTORY/GIGAFACTORY_ROADMAP.md: §9 (split-brain / offline degrade), §16/§17 (feature flags fleet.enabled / fleet.route_via_service), §27 (cutover & rollback).

PREREQUISITE / BRANCHING: branch off CURRENT main → feat/gigafactory-p2-flags-shadow. Push + open PR. DO NOT merge.

FLAG MODEL (three explicit, independently-toggleable levels; document precedence):
- AQ_FLEET=0|1 master switch (exists). 0 ⇒ pure offline, zero coordinator calls.
- AQ_FLEET_ROUTE=0|1 route_via_service: when 1 (and AQ_FLEET=1) the coordinator is AUTHORITATIVE for claim/assignment (today's P2-S3 behavior). When 0, the LOCAL inbox is authoritative (coordinator not used to source work); this is the pre-cutover state.
- AQ_FLEET_SHADOW=0|1 shadow/dual-run: when 1 (requires AQ_FLEET=1, AQ_FLEET_ROUTE=0) the runner does its normal OFFLINE/local processing as the authoritative path, and IN PARALLEL queries the coordinator (shadow claim + shadow report) WITHOUT acting on its responses, purely to compare decisions and record divergence. Shadow NEVER ships, quarantines, or mutates real job state.

DELIVERABLES

1. fleet-client.sh additions (all guarded; no-ops unless their flag is on):
   - fleet_route_enabled / fleet_shadow_enabled helpers (precedence: SHADOW is only meaningful when ROUTE=0; if both ROUTE=1 and SHADOW=1, ROUTE wins and a warning is logged).
   - fleet_shadow_claim: asks the coordinator what it WOULD assign for this factory's caps, without claiming a lease for real (read-only / dry-run; if the API has no dry-run, claim then immediately release the lease, or use a shadow factoryId; pick the least-invasive option and document it). Returns the would-be job id (or none).
   - fleet_shadow_compare: given the LOCAL decision (the job the offline path actually ran) and the coordinator's would-be decision, classify AGREE / DIVERGE / COORD_EMPTY / LOCAL_EMPTY and append a structured line to a shadow log (agent-queue/queue/.state/fleet-shadow.log: ts, localJob, coordJob, verdict).
   - fleet_shadow_report: mirrors stage transitions to the coordinator as shadow events (clearly flagged shadow=1) so reporting is exercised, but divergence in the coordinator response is logged, never acted on.

2. agent-queue.sh wiring (minimal, flag-gated):
   - run loop: if SHADOW is on, after the local authoritative decision each iteration, call fleet_shadow_claim + fleet_shadow_compare (best-effort, error-swallowed; shadow must NEVER fail a real job).
   - ROUTE flag: thread it so claim sourcing honors it (ROUTE=1 ⇒ coordinator-sourced as today; ROUTE=0 ⇒ local inbox authoritative even when AQ_FLEET=1).
   - new subcommand `aq fleet-shadow-report`: summarize the shadow log (counts of AGREE/DIVERGE/…, last N divergences).
     Add to dispatch + help.
   - surface the three flags' resolved state in `aq status` / `aq fleet-status`.

3. Cutover safety: in README, document the recommended rollout ladder, namely (1) AQ_FLEET=1, ROUTE=0, SHADOW=1 (observe, zero risk) → (2) inspect agreement rate → (3) flip ROUTE=1 once agreement is high → rollback = set ROUTE=0 (and/or AQ_FLEET=0) at any time.

TESTS: extend selftest.sh (stub the coordinator like the P2-S3 fleet stub; all 60 prior checks stay green):
- flags off: AQ_FLEET=0 ⇒ zero coordinator calls (incl. shadow); offline flow identical.
- shadow agree: stub returns the same job the local path runs ⇒ shadow log records AGREE; the real job still ships via the offline/local path; coordinator state NOT mutated for real.
- shadow diverge: stub returns a different/empty job ⇒ DIVERGE/COORD_EMPTY logged; real job still completes; nothing quarantined.
- shadow is non-fatal: coordinator 5xx/timeout during shadow ⇒ real job still completes, exit 0, a shadow-error noted.
- ROUTE precedence: ROUTE=1 + SHADOW=1 ⇒ ROUTE path taken, warning logged, no shadow compare.
- ROUTE=0 + AQ_FLEET=1 ⇒ local inbox is authoritative (coordinator not used to source work).
- fleet-shadow-report summarizes the log counts correctly.

VERIFY GATE:
- bash agent-queue/selftest.sh (60 prior + new shadow/flag cases; none weakened)
- bash -n agent-queue/agent-queue.sh && bash -n agent-queue/lib/fleet-client.sh
- shellcheck --severity=error agent-queue/agent-queue.sh agent-queue/lib/fleet-client.sh
- node --check agent-queue/dashboard.mjs (if unchanged)

CONSTRAINTS: bash + curl + POSIX awk only (no jq/new deps); reuse P2-S3 helpers; shadow must be strictly side-effect-free on real job state; offline path unchanged when AQ_FLEET=0; never hardcode tokens; conventional commits (feat(agent-queue): ...); never weaken a test; do not edit the common-plat repo.
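
REFERENCE SKETCH (illustrative, non-binding): the flag precedence and the AGREE / DIVERGE / COORD_EMPTY / LOCAL_EMPTY classification described above could look roughly like the following. Several things here are assumptions, not facts about the existing code: the fleet_enabled stand-in (the real one lives in the P2-S3 client), unset flags defaulting to 0, the hypothetical helper fleet_shadow_verdict, the AQ_SHADOW_LOG override, the key=value log line layout, and reporting the both-empty case as LOCAL_EMPTY. Adapt it to whatever fleet-client.sh actually does.

```shell
#!/usr/bin/env bash
# Sketch only: stand-ins for helpers in agent-queue/lib/fleet-client.sh.
# Assumption: an unset flag is treated as 0.

fleet_enabled()       { [ "${AQ_FLEET:-0}" = "1" ]; }
fleet_route_enabled() { fleet_enabled && [ "${AQ_FLEET_ROUTE:-0}" = "1" ]; }

# Shadow is active only when AQ_FLEET=1, AQ_FLEET_ROUTE=0, AQ_FLEET_SHADOW=1.
# If ROUTE and SHADOW are both 1, ROUTE wins and a warning goes to stderr.
fleet_shadow_enabled() {
  fleet_enabled || return 1
  [ "${AQ_FLEET_SHADOW:-0}" = "1" ] || return 1
  if fleet_route_enabled; then
    echo "warn: AQ_FLEET_ROUTE=1 overrides AQ_FLEET_SHADOW=1; shadow disabled" >&2
    return 1
  fi
  return 0
}

# Hypothetical helper: classify the local vs coordinator decision.
# An empty string means "no job". Assumption: both empty => LOCAL_EMPTY.
fleet_shadow_verdict() {
  local local_job="$1" coord_job="$2"
  if [ -z "$local_job" ]; then
    echo "LOCAL_EMPTY"
  elif [ -z "$coord_job" ]; then
    echo "COORD_EMPTY"
  elif [ "$local_job" = "$coord_job" ]; then
    echo "AGREE"
  else
    echo "DIVERGE"
  fi
}

# Append one structured line to the shadow log. Always returns 0 so a
# logging failure can never fail the real job.
# AQ_SHADOW_LOG is a hypothetical override, handy for tests.
fleet_shadow_compare() {
  local log="${AQ_SHADOW_LOG:-agent-queue/queue/.state/fleet-shadow.log}"
  local verdict
  verdict="$(fleet_shadow_verdict "$1" "$2")"
  mkdir -p "$(dirname "$log")" 2>/dev/null || return 0
  {
    printf 'ts=%s localJob=%s coordJob=%s verdict=%s\n' \
      "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${1:--}" "${2:--}" "$verdict" >>"$log"
  } 2>/dev/null || true
  return 0
}
```

In the run loop this would be called as something like `fleet_shadow_enabled && fleet_shadow_compare "$local_job" "$coord_job"`, with the coordinator's would-be job id coming from fleet_shadow_claim. An empty job id is written as `-` so every log line keeps the same four fields.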
FINAL OUTPUT (report in EXACTLY this format):

## Implementation Report — Phase 2 Feature Flags + Shadow/Dual-run
### Branch & commits / PR
### Files changed
### What was implemented (flag model + precedence, shadow claim/compare/report, cutover ladder)
### Tests added (+ selftest summary = 60 prior + N new; esp. flags-off no-op, shadow non-fatal, ROUTE precedence)
### Verify gate results
### Deviations / assumptions (how shadow claim avoids real lease mutation)
### Suggested next slice
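
REFERENCE SKETCH (illustrative, non-binding, NOT part of the report format): one possible shape for the log summary behind `aq fleet-shadow-report`, using only bash + POSIX awk as the constraints require. The function name fleet_shadow_summary, its arguments, the output layout, and the assumed log line format (space-separated `ts=… localJob=… coordJob=… verdict=…` fields) are all hypothetical; the spec only fixes which fields the log carries. Here "last N divergences" is read as the last N lines whose verdict is DIVERGE, which is also an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: summarize a shadow log of key=value lines.
# Usage: fleet_shadow_summary <log-file> [last-n]

fleet_shadow_summary() {
  local log="$1" last_n="${2:-5}"
  if [ ! -r "$log" ]; then
    echo "no shadow log at $log"
    return 0
  fi
  awk -v n="$last_n" '
    {
      # Find the verdict=... field wherever it sits on the line.
      verdict = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^verdict=/) { verdict = substr($i, 9) }
      }
      if (verdict == "") { next }      # skip malformed lines
      count[verdict]++
      total++
      if (verdict == "DIVERGE") { div[++nd] = $0 }
    }
    END {
      printf "total=%d AGREE=%d DIVERGE=%d COORD_EMPTY=%d LOCAL_EMPTY=%d\n",
        total, count["AGREE"], count["DIVERGE"],
        count["COORD_EMPTY"], count["LOCAL_EMPTY"]
      # Print only the last n DIVERGE lines, oldest first.
      start = nd - n + 1
      if (start < 1) { start = 1 }
      for (i = start; i <= nd; i++) { print div[i] }
    }
  ' "$log"
}
```

Called as `fleet_shadow_summary agent-queue/queue/.state/fleet-shadow.log 10`, this prints one counts line followed by up to ten divergent entries. The agreement rate used in the rollout ladder can be derived from the AGREE and total counts.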