docs(gigafactory): consolidate gigafactory docs into docs/gigafactory/

Move GIGAFACTORY_ROADMAP.md and GIGAFACTORY_SYSTEM_OVERVIEW.md under
agent-queue/docs/gigafactory/ so the scattered top-level docs are easy to
discover. Update the README links, the overview code-map, and all phase
job-spec source-of-truth paths to the new location. Pure docs move; no
behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Saravanakumar D 2026-05-30 21:00:50 -07:00
parent 32162312a9
commit 257efcb4bc
18 changed files with 37 additions and 35 deletions
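
A docs move of this shape can be rehearsed in a throwaway directory before touching a real checkout. The sketch below is illustrative only: the scratch layout, the stand-in file contents, and the single sed pattern are assumptions, not the commands used for this commit, and it covers only the two GIGAFACTORY_* files named in the commit message.

```shell
#!/usr/bin/env bash
# Sketch only: rehearses the docs move in a scratch directory (hypothetical layout).
set -euo pipefail

work="$(mktemp -d)"                      # scratch area, NOT the real repo
mkdir -p "$work/agent-queue/docs"
cd "$work/agent-queue"

# Stand-ins for the real files (contents are placeholders).
echo "# roadmap"  > docs/GIGAFACTORY_ROADMAP.md
echo "# overview" > docs/GIGAFACTORY_SYSTEM_OVERVIEW.md
echo "See [roadmap](docs/GIGAFACTORY_ROADMAP.md)." > README.md

# 1. Move the docs. In a real checkout `git mv` would be used so history follows the files.
mkdir -p docs/gigafactory
mv docs/GIGAFACTORY_ROADMAP.md docs/GIGAFACTORY_SYSTEM_OVERVIEW.md docs/gigafactory/

# 2. Rewrite references. The pattern cannot match an already-rewritten path,
#    so running this step twice is harmless.
for f in README.md; do
  sed -e 's#docs/GIGAFACTORY_#docs/gigafactory/GIGAFACTORY_#g' "$f" > "$f.tmp"
  mv "$f.tmp" "$f"
done

# 3. Fail loudly if any stale path survives.
if grep -rn 'docs/GIGAFACTORY_' .; then
  echo "stale references remain" >&2
  exit 1
fi
echo "ok"
```

The final grep is the useful part: it turns "update all the links" into a check that either passes or lists the files still pointing at the old location.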


@ -8,7 +8,9 @@ and they get executed (in auto-approve mode) one slot at a time, moving through
> **Vision & roadmap:** where this is headed — a distributed multi-machine "gigafactory"
> (fleet of factories × tools × profiles, scheduler-routed, built on platform-service +
> tracker-web) — is specified as a checklist-driven implementation roadmap in
-> [`docs/GIGAFACTORY_ROADMAP.md`](docs/GIGAFACTORY_ROADMAP.md).
+> [`docs/gigafactory/GIGAFACTORY_ROADMAP.md`](docs/gigafactory/GIGAFACTORY_ROADMAP.md).
+> A full architecture overview, diagrams, code map and onboarding live alongside it in
+> [`docs/gigafactory/`](docs/gigafactory/).
**Build/ship lifecycle — auto-QA, manual ship:**
@ -95,7 +97,7 @@ already have a `---` block.
### Manifest fields (Gigafactory Phase 1)
-The runner parses the richer [gigafactory manifest](docs/GIGAFACTORY_ROADMAP.md#5-the-evolved-job-manifest-feature)
+The runner parses the richer [gigafactory manifest](docs/gigafactory/GIGAFACTORY_ROADMAP.md#5-the-evolved-job-manifest-feature)
**backward-compatibly** — a legacy `engine`/`cwd`/`yolo`-only `.md` behaves exactly as before.
Fields marked **RESERVED** are parsed, stored in `.state/<job>.meta`, and shown in `status`, but
are otherwise **no-ops until a later phase** (they do not yet affect execution).


@ -293,7 +293,7 @@ dashboards/tracker-web/src/
app/api/fleet/[...path]/route.ts proxy
e2e/fleet.spec.ts Playwright specs
lib/cosmos-init.ts container registration
-docs/gigafactory-phase3-progress.md / docs/FLEET_CONTROL_PLANE.md
+docs/gigafactory/gigafactory-phase3-progress.md / docs/gigafactory/FLEET_CONTROL_PLANE.md
```
**`learning_ai_devops_tools` (the factory agent + TUI + spec):**
@ -306,8 +306,8 @@ agent-queue/
profiles/*.md persona+capability catalog
demo/two-factory-demo.sh + coordinator-stub.sh parallel-fleet demo
selftest.sh ~75 dependency-light checks
-docs/GIGAFACTORY_ROADMAP.md source-of-truth spec & checklists
-docs/GIGAFACTORY_SYSTEM_OVERVIEW.md (this file)
+docs/gigafactory/GIGAFACTORY_ROADMAP.md source-of-truth spec & checklists
+docs/gigafactory/GIGAFACTORY_SYSTEM_OVERVIEW.md (this file)
```
---


@ -8,12 +8,12 @@ timeout: 3h
ROLE: Senior engineer. Implement Phase 1 — Slice 1 of the Agent Gigafactory roadmap.
-SOURCE OF TRUTH: agent-queue/docs/GIGAFACTORY_ROADMAP.md (read §4, §5, §6, §7, §14 Phase 1
+SOURCE OF TRUTH: agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md (read §4, §5, §6, §7, §14 Phase 1
first). This slice implements ONLY the items listed below.
STRICT SCOPE:
- Edit ONLY files under agent-queue/ (primarily agent-queue.sh, selftest.sh, README.md,
-docs/GIGAFACTORY_ROADMAP.md). DO NOT touch any other repo.
+docs/gigafactory/GIGAFACTORY_ROADMAP.md). DO NOT touch any other repo.
- DO NOT modify, move, or delete anything under agent-queue/queue/ — there are LIVE jobs
running there. DO NOT run `agent-queue.sh run`. selftest.sh uses its own temp queue
(AGENT_QUEUE_ROOT) — that is the only execution allowed.
@ -64,7 +64,7 @@ TESTS (selftest.sh — tests are sacred; only ADD, never weaken existing ones).
DOCS:
- README.md frontmatter table: add the new fields, clearly marking ACTIVE (Phase 1) vs RESERVED.
-- docs/GIGAFACTORY_ROADMAP.md: tick ONLY the Phase 1 checklist boxes you fully completed and
+- docs/gigafactory/GIGAFACTORY_ROADMAP.md: tick ONLY the Phase 1 checklist boxes you fully completed and
update the §0 progress % for Phase 1 (do not tick incomplete items).
CONSTRAINTS:


@ -8,7 +8,7 @@ timeout: 3h
ROLE: Senior engineer. Implement Phase 1 — Slice 2 (Profiles + deps/DAG, single host).
-SOURCE OF TRUTH: agent-queue/docs/GIGAFACTORY_ROADMAP.md (read §5 deps, §6 profiles,
+SOURCE OF TRUTH: agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md (read §5 deps, §6 profiles,
§14 Phase 1). This slice implements ONLY the items below.
PREREQUISITE / BRANCHING:
@ -21,7 +21,7 @@ PREREQUISITE / BRANCHING:
STRICT SCOPE:
- Edit ONLY under agent-queue/ (agent-queue.sh, selftest.sh, README.md, new
-profiles/ dir, docs/GIGAFACTORY_ROADMAP.md). No other repo.
+profiles/ dir, docs/gigafactory/GIGAFACTORY_ROADMAP.md). No other repo.
- DO NOT modify/delete anything under agent-queue/queue/ (live jobs). DO NOT run
`agent-queue.sh run`. selftest.sh uses its own temp AGENT_QUEUE_ROOT only.
- bash, single host. No service/Cosmos work (that is Phase 2).
@ -76,7 +76,7 @@ TESTS (selftest.sh — tests are sacred; only ADD):
DOCS:
- README: profiles section (catalog + resolution precedence) + deps/blocked
semantics.
-- docs/GIGAFACTORY_ROADMAP.md: tick the §6 boxes you fully completed and the §5
+- docs/gigafactory/GIGAFACTORY_ROADMAP.md: tick the §6 boxes you fully completed and the §5
`deps` box; bump §0 Phase 1 %.
CONSTRAINTS: bash style consistent with the existing script; no new runtime deps;


@ -10,7 +10,7 @@ ROLE: Senior engineer. Implement Phase 1 — Slice 3: RESILIENCE & INSIGHTS (sin
This is a LARGE, fully self-contained slice (git + log parsing only — NO network,
NO external service, NO credentials) so it runs end-to-end without blockers.
-SOURCE OF TRUTH: agent-queue/docs/GIGAFACTORY_ROADMAP.md (read §11 lifecycle/retry,
+SOURCE OF TRUTH: agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md (read §11 lifecycle/retry,
§25 durability/crash-recovery, §26 execution insights, §17 observability, §14 Phase 1).
Implement the SINGLE-HOST bash equivalents of §25 and §26.
@ -24,7 +24,7 @@ PREREQUISITE / BRANCHING:
STRICT SCOPE:
- Edit ONLY under agent-queue/ (agent-queue.sh, selftest.sh, README.md,
-docs/GIGAFACTORY_ROADMAP.md). No other repo.
+docs/gigafactory/GIGAFACTORY_ROADMAP.md). No other repo.
- DO NOT modify/delete anything under agent-queue/queue/ (live jobs). DO NOT run
`agent-queue.sh run` against the real queue. selftest.sh uses its own temp
AGENT_QUEUE_ROOT and temp git repos only.
@ -128,7 +128,7 @@ DOCS
and "Insights" section (metrics, `aq insights`, token caveat) + document the
`retry` frontmatter (now active) and the new result= values
(retries_exhausted). Update the manifest table: move `retry` from RESERVED to ACTIVE.
-- docs/GIGAFACTORY_ROADMAP.md: tick the single-host items you fully completed in
+- docs/gigafactory/GIGAFACTORY_ROADMAP.md: tick the single-host items you fully completed in
§11 (retry/dead-letter stand-in), §25 (orphan/WIP/retry — note "single-host
subset"), §26 (capture/insights — single-host subset); bump §0 Phase 1 %.


@ -10,7 +10,7 @@ ROLE: Senior engineer. Implement Phase 1 — Slice 4: TRACKER ADAPTER (single ho
This CLOSES Phase 1: a task in the tracker can become a job, and job outcomes echo
back to the tracker — the task<->job round-trip (§10, the last Phase-1 §14 item).
-SOURCE OF TRUTH: agent-queue/docs/GIGAFACTORY_ROADMAP.md (read §10 tracker
+SOURCE OF TRUTH: agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md (read §10 tracker
integration, §5 manifest incl. tracker-item + idempotency-key, §24.5 one-way echo).
PREREQUISITE / BRANCHING:
@ -22,7 +22,7 @@ PREREQUISITE / BRANCHING:
STRICT SCOPE:
- Edit ONLY under agent-queue/ (agent-queue.sh, selftest.sh, README.md,
-docs/GIGAFACTORY_ROADMAP.md). No other repo is modified.
+docs/gigafactory/GIGAFACTORY_ROADMAP.md). No other repo is modified.
- You MAY READ (not edit) ../learning_ai_common_plat/services/platform-service/
src/modules/items/{types,routes}.ts to match the real Item API contract
(paths, fields, auth header). Do not change that repo.
@ -93,7 +93,7 @@ DOCS:
- README: "Tracker integration" section — from-tracker/to-tracker, the env config,
label→manifest mapping table, the one-way-echo rule, AQ_TRACKER_AUTO, and a note
that real use needs platform-service running + a token.
-- docs/GIGAFACTORY_ROADMAP.md: tick the §10 single-host items + the §14 Phase-1
+- docs/gigafactory/GIGAFACTORY_ROADMAP.md: tick the §10 single-host items + the §14 Phase-1
"tracker adapter" item; set §0 Phase 1 → complete (or note the exact remaining %).
CONSTRAINTS: bash style consistent with the script; curl-only HTTP through the one


@ -28,7 +28,7 @@ READ FIRST:
- packages/blob (@bytelyst/blob) — the Azure Blob client + SAS token helpers. Learn the
exact API (upload, container/key conventions, SAS generation, the memory/dev fallback).
Use it the same way other consumers do (grep for existing @bytelyst/blob usage).
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §13 (fleet_artifacts
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §13 (fleet_artifacts
bullet) + §26 (insights/artifacts).
PREREQUISITE / BRANCHING: branch off CURRENT main → feat/gigafactory-p2-artifacts.


@ -23,7 +23,7 @@ CONTEXT TO READ FIRST:
- packages/datastore — the shared datastore abstraction + its Memory and Cosmos
providers. Find the update/replace method and how (if at all) it exposes
optimistic concurrency.
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §4 (atomic
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §4 (atomic
claim is THE core contract) + §13 (maps to Cosmos _etag / If-Match).
PREREQUISITE / BRANCHING:


@ -27,7 +27,7 @@ READ FIRST:
- modules/auth/** in platform-service AND ../../packages/auth — reuse the EXISTING token/
hashing primitives (bcrypt/sha-256 recovery-code pattern). Do NOT invent new crypto.
Tokens are stored HASHED at rest; the plaintext is returned exactly once at enroll/rotate.
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §12 (enrollment,
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §12 (enrollment,
scoped tokens, rotation, revocation) + §18 (trust boundary).
PREREQUISITE / BRANCHING: branch off CURRENT main → feat/gigafactory-p2-enrollment.


@ -14,7 +14,7 @@ PARALLEL-SAFETY (another Devin is running in a DIFFERENT repo — learning_ai_co
on enrollment/tokens; no file overlap with you. Stay within the agent-queue repo):
- You OWN: agent-queue/lib/fleet-client.sh, agent-queue/agent-queue.sh (the fleet hook
points only), agent-queue/selftest.sh, agent-queue/README.md,
-agent-queue/docs/GIGAFACTORY_ROADMAP.md.
+agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md.
- Keep the offline git-queue path unchanged when fleet is off. All 60 existing selftest
checks MUST stay green.
@ -23,7 +23,7 @@ READ FIRST:
fleet_claim, fleet_report, lease renew/release, fleet_quarantine. You EXTEND this.
- agent-queue/agent-queue.sh — the run loop + the existing fleet hook points + the offline
path (cmd_add/run_worker/ship). Study how AQ_FLEET gates everything today.
-- agent-queue/docs/GIGAFACTORY_ROADMAP.md §9 (split-brain / offline degrade), §16/§17
+- agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §9 (split-brain / offline degrade), §16/§17
(feature flags fleet.enabled / fleet.route_via_service), §27 (cutover & rollback).
PREREQUISITE / BRANCHING: branch off CURRENT main → feat/gigafactory-p2-flags-shadow.


@ -28,7 +28,7 @@ READ FIRST (this is NOT the platform-service you may assume — verify conventio
memory vs cosmos provider is selected (DB_PROVIDER).
- packages/cosmos container registry — how containers are registered.
- The fleet spec lives in the sibling devops-tools repo (read-only):
-../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md
+../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md
§4 (core contract: idempotency/atomic-claim/fencing/lease), §7 (scheduler/claim),
§8 (factory/lease/heartbeat), §13 (containers + fields), §18 (failure model),
§25 (durability/recovery), §26 (insights). Match these field names + semantics.


@ -27,7 +27,7 @@ READ FIRST:
tryClaimJob CAS exactly as-is).
- types.ts (read-only) — FleetJobDoc (priority, capabilities, budget, createdAt, deps,
stage), FleetFactoryDoc (capabilities, health, load, seatLimit).
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §7 (the formula
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §7 (the formula
+ tie-breaks + phasing note: Phase 2 = fixed weights; Phase 3 = tunable + preemption).
PREREQUISITE / BRANCHING: branch off CURRENT main → feat/gigafactory-p2-scheduler.


@ -20,7 +20,7 @@ READ FIRST (this is NOT the platform-service you may assume — verify conventio
this module pattern EXACTLY (types.ts -> repository.ts -> routes.ts, Zod schemas,
the cloud-agnostic datastore, productId on every doc, req.log/app.log).
- packages/cosmos (container registry) + how existing modules register containers.
-- The fleet container spec in the roadmap: agent-queue/docs/GIGAFACTORY_ROADMAP.md
+- The fleet container spec in the roadmap: agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md
§13 lives in the devops-tools repo at ../learning_ai_devops_tools — read it for
the field lists (fleet_jobs incl. bodyMd + checkpoint; fleet_runs incl. token/
cost/tool/diff insights; fleet_leases incl. leaseEpoch; fleet_factories;


@ -43,7 +43,7 @@ READ FIRST (verify the real contract — do not guess):
is the heartbeat upsert, and `fleet_events` are written SERVER-SIDE by the coordinator on
each PATCH/claim. The coordinator owns `leaseEpoch` fencing: a PATCH/renew carrying a stale
epoch is rejected (409/conflict).
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §7 (claim loop),
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §7 (claim loop),
§8 (factory/heartbeat/claim/report/drain), §9 (split-brain/offline-degrade), §18 (fencing).
PREREQUISITE / BRANCHING:


@ -32,7 +32,7 @@ READ FIRST (understand the contracts before writing):
API contract you mirror to: Item fields (id, productId, title/description, status,
labels[]), the status vocabulary, and the comment/note mechanism. Call the items
repository DIRECTLY in-process (no HTTP/curl) — this is the whole point of "direct wiring".
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §10 (tracker
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §10 (tracker
integration), §24.5 (echo rule), §14 Phase-2 checklist (the §10 box you will tick).
PREREQUISITE / BRANCHING: branch off CURRENT main -> feat/gigafactory-p2-tracker-wiring.
@ -105,7 +105,7 @@ do NOT change claimNextJob or the scheduler; conventional commits
(feat(platform-service): ...); do not edit the agent-queue repo.
DOCS: tick §10 "direct tracker->module calls" in
-../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §14 Phase-2 (note the
+../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §14 Phase-2 (note the
flag name + that it is the in-process round-trip; the agent-queue shell adapter remains the
single-host path).


@ -17,7 +17,7 @@ the tracker-wiring slice) — no overlap. In THIS repo you OWN a NEW demo direct
additive selftest/docs only:
- You OWN (create/edit): agent-queue/demo/two-factory-demo.sh (NEW),
agent-queue/demo/README.md (NEW), additive checks in agent-queue/selftest.sh,
-and the §14 Phase-2 demo/exit-criteria ticks in agent-queue/docs/GIGAFACTORY_ROADMAP.md.
+and the §14 Phase-2 demo/exit-criteria ticks in agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md.
- You MUST NOT change the behavior of agent-queue.sh or lib/fleet-client.sh. You may READ
them and CALL them; if a tiny additive hook is unavoidable, keep it flag-gated and prove
all 68 existing selftest checks still pass byte-for-byte.
@ -33,7 +33,7 @@ READ FIRST:
responder pattern). Reuse that exact stub style so the demo's selftest needs NO live service.
- ../learning_ai_common_plat/services/platform-service/src/modules/fleet/coordinator.ts —
the claim/lease/fence/reaper contract you are demonstrating (read-only; do not edit).
-- agent-queue/docs/GIGAFACTORY_ROADMAP.md §14 Phase-2 "Two-factory demo" + "Exit criteria".
+- agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §14 Phase-2 "Two-factory demo" + "Exit criteria".
DELIVERABLES


@ -36,7 +36,7 @@ GLOBAL GUARDRAILS (unattended danger mode — obey strictly):
and the tracker-web logger/telemetry pattern); every Cosmos doc carries `productId`;
reuse @bytelyst/* packages and existing module patterns (types.ts -> repository.ts ->
routes.ts). Do NOT hardcode colors/URLs/secrets.
-- CHECKPOINTING: maintain docs/gigafactory-phase3-progress.md on the branch. After each
+- CHECKPOINTING: maintain docs/gigafactory/gigafactory-phase3-progress.md on the branch. After each
slice, record: slice name, status (DONE/WIP/FAILED), commit sha, verify-gate result,
and any follow-ups. Commit it WITH the slice. If you resume after an interruption, read
it first and continue from the first not-DONE slice.
@ -48,7 +48,7 @@ progress.md with the exact failing output, and MOVE ON to the next slice that do
depend on it (dependencies noted per slice). Never thrash; never fake green.
READ FIRST:
-- ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md — §7 (scoring;
+- ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md — §7 (scoring;
Phase-3 = tunable weights + preemption), §5/§6 (DAG/deps), §11/§13 (budgets), §14
Phase-3 checklist + Exit criteria, §16 Definition-of-Done.
- services/platform-service/src/modules/fleet/{scheduler,coordinator,repository,routes,
@ -137,7 +137,7 @@ equivalent). Do not weaken its lint/e2e config.
--------------------------------------------------------------------------------
SLICE 5 — Docs + roadmap + Phase-3 exit criteria (depends on: S1-S4 outcomes)
--------------------------------------------------------------------------------
-- Update ../learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY_ROADMAP.md §14 Phase-3
+- Update ../learning_ai_devops_tools/agent-queue/docs/gigafactory/GIGAFACTORY_ROADMAP.md §14 Phase-3
checkboxes for every box you actually completed, with a one-line note + the flag names
(FLEET_PREEMPTION/FLEET_BUDGETS) and which are default-OFF. Tick the Phase-3 Exit-criteria
line ONLY if its conditions are genuinely met; otherwise note the exact remaining %.
@ -145,9 +145,9 @@ SLICE 5 — Docs + roadmap + Phase-3 exit criteria (depends on: S1-S4 outcomes)
learning_ai_devops_tools, OR include the roadmap delta as a patch file under
docs/ in THIS branch and note it for the operator — do NOT entangle the two repos'
git history. Prefer the patch-file note if a clean cross-repo PR isn't trivial.)
-- Update dashboards/tracker-web/README + a short docs/FLEET_CONTROL_PLANE.md (how to use
+- Update dashboards/tracker-web/README + a short docs/gigafactory/FLEET_CONTROL_PLANE.md (how to use
the new UI, the flags, the endpoints consumed).
-- Finalize docs/gigafactory-phase3-progress.md with the end-state of every slice.
+- Finalize docs/gigafactory/gigafactory-phase3-progress.md with the end-state of every slice.
FINAL OUTPUT — print ONE consolidated report in EXACTLY this format:
## Implementation Report — Phase 3 (overnight)