From a9c69b1dceeebd8274d2ae3a6b98bdde67711695 Mon Sep 17 00:00:00 2001
From: saravanakumardb1
Date: Fri, 29 May 2026 17:44:37 -0700
Subject: [PATCH] docs(agent-queue): manifest field table (active vs reserved) + tick Phase 1 Slice 1 (P1-S1)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- README: new "Manifest fields (Gigafactory Phase 1)" table marking
  ACTIVE vs RESERVED, capability-grammar table, idempotency-key
  semantics, copilot engine mapping, COPILOT_BIN, and
  capability_mismatch/no_engine result values.
- GIGAFACTORY_ROADMAP: tick only the fully-completed P1 boxes
  (frontmatter parsing, capability detect+match, priority,
  backward-compat, capability grammar, engine-class taxonomy,
  idempotency-key semantics, README/progress), annotate partials, and
  bump §0 Phase 1 to in-progress 35%.
---
 agent-queue/README.md                   | 53 +++++++++++++++++++++++--
 agent-queue/docs/GIGAFACTORY_ROADMAP.md | 26 ++++++------
 2 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/agent-queue/README.md b/agent-queue/README.md
index 073aacd..f91437b 100644
--- a/agent-queue/README.md
+++ b/agent-queue/README.md
@@ -76,7 +76,7 @@ which directory to run in, and whether to auto-approve:
 
 ```md
 ---
-engine: devin # devin | claude | codex (default: $AGENT_QUEUE_ENGINE)
+engine: devin # devin | claude | codex | copilot (default: $AGENT_QUEUE_ENGINE)
 cwd: /abs/path/to/repo # where the agent executes (default: cwd when added)
 yolo: true # auto-approve ALL tools (default: true)
 lock: my-repo # optional mutex key (default: cwd). Jobs sharing a key run serially
@@ -93,6 +93,50 @@ verify: pnpm -s test # optional auto-QA gate. Runs in cwd after rc=0:
 `add --engine/--cwd/--yolo` will inject this frontmatter for you if the file doesn't
 already have a `---` block.
 
+### Manifest fields (Gigafactory Phase 1)
+
+The runner parses the richer [gigafactory manifest](docs/GIGAFACTORY_ROADMAP.md#5-the-evolved-job-manifest-feature)
+**backward-compatibly** — a legacy `engine`/`cwd`/`yolo`-only `.md` behaves exactly as before.
+Fields marked **RESERVED** are parsed, stored in `.state/.meta`, and shown in `status`, but
+are otherwise **no-ops until a later phase** (they do not yet affect execution).
+
+| Field | Status | Default | Meaning |
+| ----- | ------ | ------- | ------- |
+| `engine` | active | `$AGENT_QUEUE_ENGINE` | explicit engine (`devin\|claude\|codex\|copilot`) — always wins over `engine-class` |
+| `cwd` / `yolo` / `lock` / `timeout` / `verify` | active | see above | Phase-0 behavior, unchanged |
+| `priority` | **active** | `medium` | `critical\|high\|medium\|low`. Inbox is picked **highest-priority first, then oldest** (was pure FIFO) |
+| `engine-class` | **active** | _(none)_ | used only when `engine` is unset: `agentic-coder`→`devin,claude,codex`; `chat-coder`→`copilot`. Picks the first **available** engine. No engine available → job fails `result=no_engine` |
+| `prefers-engine` | **active** | _(none)_ | optional order hint for `engine-class` resolution, e.g. `[claude, devin]` |
+| `capabilities` | **active** | _(none)_ | hard host requirements, e.g. `[os:any, node>=20, has:git]`. If the host can't satisfy them the job is sent to `failed/` with `result=capability_mismatch` **and the agent is never launched** (grammar below) |
+| `idempotency-key` | **active** | _(none)_ | dedupe on `add` (semantics below) |
+| `profile` | RESERVED | _(none)_ | role/persona + caps (profiles land in a later slice) |
+| `prefers` | RESERVED | _(none)_ | soft routing/affinity hints (e.g. `[factory:mac-2]`) |
+| `budget` | RESERVED | _(none)_ | `{ usd, tokens, wall }` ceilings (`wall` enforcement is a later slice) |
+| `deps` / `deps-mode` | RESERVED | _(none)_ | DAG dependencies (single-host blocking is a later slice) |
+| `retry` | RESERVED | _(none)_ | `{ max, backoff, on }` retry policy |
+| `review-policy` | RESERVED | _(none)_ | `auto\|manual\|reviewers:[…]` |
+| `artifacts` | RESERVED | _(none)_ | extra outputs to capture (coverage, screenshots) |
+| `tracker-item` | RESERVED | _(none)_ | link back to the originating tracker task |
+
+**Capability grammar** (a job matches a host iff **every** required token is satisfied):
+
+| Token form | Example | Satisfied when |
+| ---------- | ------- | -------------- |
+| `key` (bare presence) | `gpu` | the host advertises `key` in any form |
+| `key:value` (exact) | `os:mac`, `engine:devin`, `has:git` | the host advertises that exact token |
+| `key:any` (wildcard) | `os:any` | the host advertises any `key:*` (so `os:any` matches every host) |
+| `key<op>version` (`>=` `>` `=` `<=` `<`) | `node>=20` | numeric/semver-major compare vs the host's `key:` |
+
+The host advertises (via `detect_capabilities`): `os:`, `engine:`,
+`node:`, and `has:` when present.
+
+**`idempotency-key` semantics** (on `add`, hashing the frontmatter-stripped body):
+
+- same key **+ same body** → **no-op** (logged `duplicate, skipped`).
+- same key **+ different body**, prior job still in `inbox/` → **supersedes** it (replaces the queued file).
+- same key **+ different body**, prior job already past `inbox/` (building/review/testing/shipped) →
+  **rejected** with a clear error (use a new key, or requeue the existing job).
+
 ## Engine mapping
 
 | `engine:` | Command run | Auto-approve flag (`yolo: true`) |
@@ -100,6 +144,7 @@ already have a `---` block.
 | `devin` | `devin -p --prompt-file ` | `--permission-mode dangerous` |
 | `claude` | `claude -p` (body on **stdin**) | `--dangerously-skip-permissions` |
 | `codex` | `codex exec` (body on **stdin**) | `--dangerously-bypass-approvals-and-sandbox` |
+| `copilot` | `copilot -p` (body on **stdin**) | `--allow-all-tools` _(best-effort; chat-coder class target)_ |
 
 The frontmatter is **stripped** before the body reaches the agent, and
 claude/codex receive it on **stdin** so a body starting with `--` is never
@@ -178,7 +223,9 @@ queue/
 ```
 
 **`result=` values** written to `.meta`: `review`, `testing`, `shipped`,
-`failed`, `timeout`, `verify_failed`, `rejected`, `requeued`.
+`failed`, `timeout`, `verify_failed`, `rejected`, `requeued`, `capability_mismatch`
+(host missing a required capability — agent never launched), `no_engine`
+(an `engine-class` had no available engine).
 
 ## Config (env overrides)
 
@@ -190,7 +237,7 @@ queue/
 | `AGENT_QUEUE_POLL` | `3` | inbox poll interval (seconds) |
 | `AGENT_QUEUE_VERIFY` | _(empty)_ | default auto-QA verify command; per-job `verify:` overrides it |
 | `AGENT_QUEUE_STALL_MIN` | `10` | minutes of unchanged log before a worker is `⚠ stalled` |
-| `DEVIN_BIN` / `CLAUDE_BIN` / `CODEX_BIN` | autodetected | override CLI binary paths |
+| `DEVIN_BIN` / `CLAUDE_BIN` / `CODEX_BIN` / `COPILOT_BIN` | autodetected | override CLI binary paths |
 | `FLOCK_BIN` / `TIMEOUT_BIN` | autodetected | `flock` (lock hardening) and `timeout`/`gtimeout` (hard timeouts); absent on stock macOS — see notes |
 
 ## ⚠️ Safety
diff --git a/agent-queue/docs/GIGAFACTORY_ROADMAP.md b/agent-queue/docs/GIGAFACTORY_ROADMAP.md
index 818d029..6f9f4f2 100644
--- a/agent-queue/docs/GIGAFACTORY_ROADMAP.md
+++ b/agent-queue/docs/GIGAFACTORY_ROADMAP.md
@@ -11,7 +11,7 @@
 | Phase | Theme | Status | % | Gate |
 | ----- | ----- | ------ | - | ---- |
 | **0** | Baseline (today) | ✅ shipped | 100% | `selftest.sh` green |
-| **1** | Manifest + profiles + capabilities + tracker adapter (single host) | ☐ not started | 0% | adapter e2e + selftest |
+| **1** | Manifest + profiles + capabilities + tracker adapter (single host) | ◐ in progress | 35% | adapter e2e + selftest |
 | **2** | Coordinator as platform-service module + Cosmos + multi-factory leasing | ☐ not started | 0% | fleet e2e + module tests |
 | **3** | Fleet control plane in tracker-web + DAG deps + budgets + scoring router | ☐ not started | 0% | web e2e + router tests |
 | **4** | Message bus + autoscaling + cross-OS capability marketplace | ☐ not started | 0% | load/chaos suite |
@@ -138,10 +138,10 @@ tracker-item: ITEM-789 # link back to the originating tracker task
 ```
 
 - [ ] Define the manifest schema (Zod in the service; documented YAML for `.md`).
-- [ ] Backward-compat: a Phase-0 `.md` (only `engine/cwd/yolo`) parses with all new fields defaulted.
-- [ ] **Capability grammar** defined: tokens are `key` (presence, e.g. `has:xcode`), `key:value` (e.g. `os:mac`, `engine:devin`), or `key<op>version` with `op ∈ {>=,>,=,<=,<}` (e.g. `node>=20`). `os:any` is a wildcard that matches every factory. A job matches a factory iff every required token is satisfied by the factory descriptor.
-- [ ] **`engine-class` taxonomy** defined as an enum (`agentic-coder`, `chat-coder`, `review-only`) with a documented engine→class map (`devin,claude,codex → agentic-coder`; `copilot → chat-coder`). If `engine` is set it wins; else the scheduler picks any free engine in the class honoring `prefers-engine`.
-- [ ] **`idempotency-key` semantics:** `key + content-hash` identical ⇒ no-op (returns existing job). Same `key`, **different** content ⇒ **rejected with 409** unless the prior job is still `queued`/`blocked` (then it is superseded). A re-`run`/`retry` of an existing job is **not** a new submit and never trips dedupe.
+- [x] Backward-compat: a Phase-0 `.md` (only `engine/cwd/yolo`) parses with all new fields defaulted. *(P1-S1: bash runner; Zod schema still P2. selftest backward-compat case green.)*
+- [x] **Capability grammar** defined: tokens are `key` (presence, e.g. `has:xcode`), `key:value` (e.g. `os:mac`, `engine:devin`), or `key<op>version` with `op ∈ {>=,>,=,<=,<}` (e.g. `node>=20`). `os:any` is a wildcard that matches every factory. A job matches a factory iff every required token is satisfied by the factory descriptor. *(P1-S1: `caps_match`/`detect_capabilities` in `agent-queue.sh`.)*
+- [x] **`engine-class` taxonomy** defined as an enum (`agentic-coder`, `chat-coder`, `review-only`) with a documented engine→class map (`devin,claude,codex → agentic-coder`; `copilot → chat-coder`). If `engine` is set it wins; else the scheduler picks any free engine in the class honoring `prefers-engine`. *(P1-S1: `resolve_engine`; `review-only` mapping reserved.)*
+- [x] **`idempotency-key` semantics:** `key + content-hash` identical ⇒ no-op (returns existing job). Same `key`, **different** content ⇒ **rejected with 409** unless the prior job is still `queued`/`blocked` (then it is superseded). A re-`run`/`retry` of an existing job is **not** a new submit and never trips dedupe. *(P1-S1: add-time dedupe; bash maps "409" → clear error, `queued` → still in `inbox/` ⇒ superseded.)*
 - [ ] **`deps` semantics:** a dep is satisfied when it reaches `shipped` (default) or `testing` if `deps-mode: soft`. Submit-time **cycle detection** rejects cyclic graphs; unmet deps put the job in `blocked` (not `queued`). Cross-factory deps require the coordinator (P2); single-host deps work in P1.
 - **Acceptance:** a manifest fixture suite parses/validates; invalid manifests fail with precise errors; capability-grammar + dep-cycle + idempotency-conflict cases covered.
 - **Verify gate:** schema unit tests (≥ 1 per field incl. defaults + 5 invalid cases + grammar/cycle/409 cases).
@@ -340,17 +340,19 @@ Each phase: **Goal → checklist → Exit criteria**. Don't start a phase until
 ### Phase 1 — Manifest + profiles + capabilities + tracker adapter (single host)
 **Goal:** richer single-host runner that understands profiles/capabilities and bridges to tracker — no distributed infra yet.
 
-- [ ] Extend `agent-queue.sh` frontmatter parsing for all new manifest fields (§5), defaulted + backward-compatible.
+> **Slice progress — P1-S1 (this commit):** manifest parsing (all §5 fields, defaulted + backward-compatible), `priority` ordering, capability detection+match gate, `engine-class` resolution, and `idempotency-key` dedupe are **done** on the bash runner. Profiles, `deps` DAG, `retry`/`budget.wall`, `allowed-scope`, the tracker adapter, and dashboard surfacing remain **for later slices**.
+
+- [x] Extend `agent-queue.sh` frontmatter parsing for all new manifest fields (§5), defaulted + backward-compatible. *(P1-S1)*
 - [ ] Add `profiles/` directory + profile resolution (persona injection, default verify/caps/scope) (§6).
-- [ ] Local capability detection + a job/factory capability match check before launch (§8 subset).
-- [ ] `priority` ordering in the inbox pick (replace pure FIFO with priority-then-age).
-- [ ] `deps` (DAG) blocking on a single host; `idempotency-key` dedupe on `add`.
+- [x] Local capability detection + a job/factory capability match check before launch (§8 subset). *(P1-S1: `detect_capabilities` + `caps_match`; mismatch ⇒ `failed/` `result=capability_mismatch`, agent never launched.)*
+- [x] `priority` ordering in the inbox pick (replace pure FIFO with priority-then-age). *(P1-S1: `inbox_sorted`; per-lock serialization preserved.)*
+- [ ] `deps` (DAG) blocking on a single host; `idempotency-key` dedupe on `add`. *(P1-S1: `idempotency-key` dedupe DONE; `deps` DAG blocking still pending.)*
 - [ ] `retry` with backoff into `failed`/requeue; `budget.wall` enforced (extends `timeout`).
 - [ ] `allowed-scope` guardrail (warn-only this phase) + post-run diff report.
 - [ ] **Tracker adapter** `aq from-tracker ` + `aq to-tracker` event poster (§10 P1).
-- [ ] Dashboard shows profile + priority + capability tags + tracker-item link.
-- [ ] Update `selftest.sh` with: manifest parse fixtures, profile resolution, priority order, dep-block, idempotency, adapter round-trip (mock).
-- [ ] Update README + this doc's progress table.
+- [ ] Dashboard shows profile + priority + capability tags + tracker-item link. *(P1-S1: `status` shows priority/profile/caps/tracker-item; Node `dash` surfacing pending.)*
+- [ ] Update `selftest.sh` with: manifest parse fixtures, profile resolution, priority order, dep-block, idempotency, adapter round-trip (mock). *(P1-S1: added backward-compat, priority, capability-mismatch, engine-class, idempotency cases; profile/dep-block/adapter pending.)*
+- [x] Update README + this doc's progress table. *(P1-S1)*
 - **Exit criteria:** all boxes ✅; `selftest.sh` green; a tracker task → executed → tracker `done` with SHA comment, fully on one host; no regression to Phase-0 `.md` files.
 
 ### Phase 2 — Coordinator as platform-service module + Cosmos + multi-factory leasing
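
Review note on the capability grammar documented in the README hunk of this patch: the matching rules from that table can be sketched as below. This is a hypothetical Python illustration written for review purposes, not the `caps_match` bash code in `agent-queue.sh`. The names `token_satisfied` and `caps_match_sketch` are invented here, and the sketch assumes the host advertises versions as `key:` followed by a version string (for example `node:22`) and that only the major version is compared.

```python
import re

# Hypothetical sketch of the documented capability grammar.
# Assumption: hosts advertise versions as "key:<version>", e.g. "node:22".
_OPS = {
    ">=": lambda have, want: have >= want,
    "<=": lambda have, want: have <= want,
    ">": lambda have, want: have > want,
    "<": lambda have, want: have < want,
    "=": lambda have, want: have == want,
}
# Two-character operators come first so ">=" is not parsed as ">".
_VERSIONED = re.compile(r"^([A-Za-z_][\w-]*?)(>=|<=|>|<|=)v?(\d+)")


def _major(value):
    """Return the leading integer of a version string, or None."""
    m = re.match(r"v?(\d+)", value)
    return int(m.group(1)) if m else None


def token_satisfied(token, host):
    """Check one required token against the set of tokens a host advertises."""
    m = _VERSIONED.match(token)
    if m:
        # Versioned form: compare major versions against the host's key entry.
        key, op, want = m.group(1), m.group(2), int(m.group(3))
        for advertised in host:
            k, sep, v = advertised.partition(":")
            if sep and k == key:
                have = _major(v)
                if have is not None and _OPS[op](have, want):
                    return True
        return False
    key, sep, value = token.partition(":")
    if not sep:
        # Bare key: satisfied if the host advertises it in any form.
        return any(a == key or a.startswith(key + ":") for a in host)
    if value == "any":
        # Wildcard: satisfied by any "key:*" token on the host.
        return any(a.startswith(key + ":") for a in host)
    # Exact key:value token.
    return token in host


def caps_match_sketch(required, host):
    """A job matches a host only if every required token is satisfied."""
    host_set = set(host)
    return all(token_satisfied(t, host_set) for t in required)
```

For example, `caps_match_sketch(["os:any", "node>=20", "has:git"], {"os:mac", "node:22", "has:git"})` evaluates to `True`, while the same requirements against a host advertising `node:18` evaluate to `False`, which corresponds to the documented `result=capability_mismatch` path.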