bytelyst-devops-tools

Author	SHA1	Message	Date
saravanakumardb1	14308fc382	fix(agent-queue): explicit-engine availability check, shutdown lease release, cache GC Three runner-side robustness fixes (behavior-preserving, opt-out where relevant): - resolve_engine now availability-checks an EXPLICIT engine (mirroring the engine-class path): if the requested engine's binary isn't installed it emits the no-engine signal so the job is marked no_engine, instead of invoking a missing binary and surfacing a generic crash. - The run-loop INT/TERM trap now best-effort releases leases for in-flight building/ jobs (new fleet_release_all_active) so a stopped factory's jobs are reclaimable immediately rather than waiting out the ~900s lease TTL. - _cache_prune GCs cached repo checkouts under $STATE/repos not accessed in AQ_FLEET_CACHE_TTL_DAYS days (default 14; 0 disables), run once at run-loop startup, to stop unbounded disk growth. Guards against rm on an empty base path. bash -n passes on both files; ./selftest.sh PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 11:51:56 -07:00
saravanakumardb1	79e6a8db00	feat(agent-queue): honor a job's explicit engine on fleet claim When materializing a claimed fleet job, write `engine: <pick>` into the job frontmatter (resolve_engine then runs it). Only a KNOWN engine (devin/claude/codex/copilot) is honored — never the run's 'unknown'/class placeholder — so an engineless job still falls back to the factory default (AGENT_QUEUE_ENGINE). No behavior change for existing jobs. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 02:30:38 -07:00
saravanakumardb1	70c6d47a75	feat(agent-queue): report concrete engine + Devin session id on release parse_usage now always emits engine=<engine>, and the devin arm extracts the ATIF export's session_id. fleet-client includes engine + sessionId in the run insights it reports, so the coordinator/UI can show the real engine (not 'unknown') and a session handle for traceability/recovery (devin --resume <id>). Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 02:19:52 -07:00
saravanakumardb1	f9be343e32	feat(agent-queue): enable PR mode in the example fleet launcher Without AQ_FLEET_PR=1 + AQ_FLEET_REPO_BASE a job's repo is ignored and the agent just runs the prompt in the sandbox cwd (no PR). Add both (PR on by default, REPO_BASE = the repos' parent dir; FLEET_PR=0 to opt out) + a PRODUCTS subset-restart note so a busy factory can be left running. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 01:24:39 -07:00
saravanakumardb1	d574f5dda3	feat(agent-queue): macOS LaunchAgent boot-persistence (auto-start + KeepAlive) Adds agent-queue-boot.sh (PATH repair + ~/.agent-queue.env overrides + caffeinate wrap) and launchd/ (install.sh + README) so the run loop auto-starts on login and survives reboot/crash — the persistence layer tmux+caffeinate alone cannot give. No secrets tracked (host config lives in untracked ~/.agent-queue.env). Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:25:16 -07:00
saravanakumardb1	38d8e8e5cf	feat(agent-queue): add tracked example multi-product fleet launcher The operational _start_fleet.sh lives in a local (untracked) sandbox, so the gate + heartbeat-cadence settings weren't version-controlled anywhere. Add demo/start-fleet.example.sh: a parameterized, sanitized launcher (one agent-queue.sh run daemon per product against a live platform-service) that ships the two settings you must get right — AQ_FLEET_GATE=1 (M0 RU gate) and AQ_FLEET_LEASE_RENEW_SEC=30 (heartbeat cadence < the 90s stale threshold). No hardcoded paths/secrets; everything env-overridable. Documented in demo/README. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:18:26 -07:00
saravanakumardb1	2993994273	docs(gigafactory): reconcile overview + roadmap to current reality - System overview: mark Phase 4 in-progress (M0 RU gate shipped), add fleet_queue_state container + GET /fleet/queue-state, document the heartbeat cadence vs 90s stale gotcha, the tracker-web caps=build form bug, the missing deregister API, and the ended=-race fix; drop the now-false "roadmap §0 stale" and "boxes 384/386 unticked" claims (both reconciled); link the redesign doc. - Roadmap: §0 Phase 4 -> in progress (M0); align the Phase-2 §8 spec endpoint sketches to the as-built API (/fleet/factories/enroll, /factories/heartbeat, /fleet/claim) + note the heartbeat cadence and the M0 gate. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:02:45 -07:00
saravanakumardb1	fa1f1d1b30	fix(agent-queue): write ended= after PR/report so --once can't exit early run_worker marked the job ended (testing) right after moving to testing/, BEFORE opening/merging the PR and reporting to the coordinator. Once ended= is written, _meta_active returns false, active_workers drops to 0, and "run --once" could drain-exit (and callers could observe completion) while the background worker was still opening the PR — a real race that made the PR-mode selftest flaky and could free a concurrency slot prematurely in production. Move the ended= write to the end of the success path (after PR open/merge + testing/shipped reports). No behavior change on the autoship/ship path. Full selftest now passes deterministically across repeated runs. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 23:36:51 -07:00
saravanakumardb1	41d8067724	feat(agent-queue): M0 RU gate — skip the claim when the queue is unchanged Adds AQ_FLEET_GATE (default OFF): the run loop point-reads the cheap per-product queue version (GET /fleet/queue-state) and SKIPS the expensive /fleet/claim while the version is unchanged and it is not mid-drain, with a periodic safety backstop and fail-open-on-read-error so work is never stranded. Keeps POLL_SECONDS for local job responsiveness rather than raising it globally. selftest 39b covers the gate decisions; reconciles the M0 section of the dispatch redesign doc. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 23:19:01 -07:00
saravanakumardb1	29afe59604	docs(gigafactory): v4 coverage audit — roadmap maps 1:1 to design, no gaps Adds a coverage matrix + M-prep (decisions/§10, schema, containers, RBAC) and closes plan gaps: correlation filter + dispatcher budget (M1); small messages, token re-check, alerting (M2); plus Testing and Rollback & flags blocks. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:47:54 -07:00
saravanakumardb1	9f24a7fdd0	docs(gigafactory): add error-handling & cleanup section + v3 review fixes Adds §5.5 (lease-release-on-failure, branch/worktree GC, same-repo worktree clobber) with target invariants, plus a §12 checklist block. v3 review: unify targetFactoryId, reconcile §5.3 with complete-on-claim, align §6 token scoping with per-factory subscriptions, M0 wording. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:45:21 -07:00
saravanakumardb1	8217a864e9	docs(gigafactory): add Phase-4 fleet dispatch redesign (broker + on-demand) Proposes moving fleet work-dispatch off Cosmos busy-polling onto Azure Service Bus in a coordinator-owns-scheduling / broker-owns-delivery hybrid, fixing the product-as-queue routing smell and the idle-poll RU cost. Includes phased migration (M0 RU quick win -> shadow -> cutover -> scale-to-zero) with a ticked checklist. Self-reviewed (v2) for the outbox/change-feed, message-size, long-job lock, idempotency, and routing-model consistency issues. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:24:55 -07:00
saravanakumardb1	b8f0369f63	feat(agent-queue): approximate Devin run cost from tokens (model price map) Devin's export has tokens but no USD cost; estimate cost_usd from a per-model $/1M price map (Opus/Sonnet/Haiku) and flag usage_estimated so the dashboard shows it as approx.	2026-05-31 15:58:35 -07:00
saravanakumardb1	c2dbbaf188	feat(agent-queue): report PR state (open/merged) on the run	2026-05-31 13:56:46 -07:00
saravanakumardb1	d6fa1d9e28	feat(agent-queue): PR mode uses existing local repo via git worktree (no clone) When AQ_FLEET_REPO_BASE/<repo> is an existing checkout, create a git worktree off it for branch aq/job/<id> (shares objects + remotes, leaves the main checkout untouched) instead of cloning. Falls back to clone for remote-only repos. selftest exercises the worktree path. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 06:28:36 -07:00
saravanakumardb1	b442b95728	feat(agent-queue): per-repo verify + opt-in auto-merge for PR jobs Claim now carries verify (drives the existing verify gate -> PR opens only if it passes) and autoMerge (squash-merge via gh pr merge after the PR opens, non-fatal). selftest covers both. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 06:17:28 -07:00
saravanakumardb1	e634d4915f	feat(agent-queue): agent authors PR title + description (.aq_pr.md) In PR mode the agent is asked to write .aq_pr.md (line 1 = PR title, then a markdown description) based on the task + the diff it produced. The factory reads it for `gh pr create` (via --body-file) and removes it before committing (never part of the PR). Falls back to a derived title if absent. selftest asserts the authored title is used and .aq_pr.md is not committed. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:48:37 -07:00
saravanakumardb1	d0e800247c	feat(agent-queue): PR mode clones from local repo base (AQ_FLEET_REPO_BASE) MVP: when AQ_FLEET_REPO_BASE/<repo> is an existing local checkout, use it as the clone source (fast, no network) and push/PR to its GitHub origin — embedded creds in the local origin URL are stripped (gh credential helper handles auth). Selftest PASS (full-path bare-repo fallback unchanged). Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:36:46 -07:00
saravanakumardb1	cfbcc2da9d	feat(agent-queue): PR mode — open a PR per fleet job (AQ_FLEET_PR) When AQ_FLEET_PR=1 and a claimed fleet job carries a `repo`, run the agent in an isolated checkout on branch aq/job/<fleetJobId> (off baseBranch), then on a passing verify commit/push and `gh pr create`. The PR URL + branch are recorded in the meta and reported on lease release (-> the coordinator stores them on the run). - fleet-client: parse repo/baseBranch from the claim, carry them in frontmatter; fleet_report_insights now sends prUrl/branch. - _fleet_pr_prepare (clone/fetch + branch, local-path aware, identity fallback) and _fleet_pr_open (commit/push/gh pr create). WIP checkpointing is skipped for PR jobs (the pushed branch is the durable artifact). - New flags: AQ_FLEET_PR, AQ_FLEET_REPOS_DIR, GH_BIN. README documented. - selftest: +1 case (bare-repo origin + gh stub) — branch pushed, PR opened, prUrl reported on release. Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:27:41 -07:00
saravanakumardb1	315e9317cc	docs(agent-queue): document AQ_FLEET_AUTOSHIP (testing -> shipped) Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 04:32:04 -07:00
saravanakumardb1	df65b7a245	feat(agent-queue): report testing + optional autoship to the fleet (close testing->shipped) Previously the factory reported up to `review` and "shipping is always manual", so a coordinator job never reached a terminal stage autonomously. - On a passing local verify, always report `testing` to the coordinator so its stage reflects that QA passed (was stuck at `review`). - New AQ_FLEET_AUTOSHIP=1: the factory's verify gate IS the test phase, so advance the coordinator job testing -> shipped and land it in shipped/ locally. This closes the testing->shipped gap for an autonomous submit -> shipped pipeline. Default off keeps the human review gate authoritative (job rests at testing). selftest: +2 cases (autoship reports testing+shipped + lands in shipped/; autoship OFF reports testing but withholds shipped). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 04:21:44 -07:00
saravanakumardb1	8085501506	feat(agent-queue): extract Devin token usage from the conversation export Devin does not surface token/cost in its stdout or local log, so parse_usage previously emitted nothing for the devin engine (runs showed no metrics). Devin DOES expose per-step usage in its ATIF conversation export. - build_agent_cmd: pass `--export <path>` for the devin engine (path derived from the job log path so parse_usage can find it; harmless 4th arg for other engines). - parse_usage devin: read the export and sum per-step metadata.metrics input_tokens / output_tokens / cache_read_tokens; take model from agent.model_name. Pure grep/awk, no new dependency. USD cost is left unset (the export carries token counts but not cost) — the dashboard shows tokens + model, cost stays blank. These feed fleet_report_insights, so live devin fleet runs now report tokens + model to the coordinator (verified live: model "Claude Opus 4.8", tokensIn/out + cache populated on a real run). selftest: +1 case (parse_usage devin sums per-step tokens + model from --export). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 02:55:11 -07:00
saravanakumardb1	57831e3e7a	feat(agent-queue): report run insights to the fleet + normalize API base #1 fleet_report_insights: on a successful fleet run the factory now reports the parsed cost/token/effort metrics (model, tokensIn/Out/cached, costUsd, turns, toolCalls) plus the run result onto the coordinator run via POST .../lease/release (which also frees the lease). parse_usage already extracted these into the job meta; they were never sent. Engines that do not expose usage locally (devin) still land result + endedAt. #2 normalize AQ_FLEET_API: platform-service mounts fleet under /api, so a base without it silently returned 404 on every call. Strip a trailing slash and append /api unless already present, so AQ_FLEET_API=http://host:4003 works too. selftest: +2 cases (insights reported via lease/release; API-base normalization). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 02:27:51 -07:00
saravanakumardb1	dcf017a0de	docs(agent-queue): add run policy (isolated worktrees, least-privilege) Document how the daemon + agents must run after a review found jobs executing in --yolo/dangerous mode directly against live working trees (the root cause of repo dirtiness + duplicate commits). Policy: per-job worktree off origin/main, branch-per-task + PR, yolo:false by default (dangerous only in disposable sandboxes), clean-tree contract, one writer per repo. Linked from the README. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 23:47:46 -07:00
Saravanakumar D	237481247e	docs(gigafactory): uppercase GIGAFACTORY folder + add index README Rename agent-queue/docs/gigafactory/ to docs/GIGAFACTORY/ and update every reference (README, system-overview code-map, and all phase job specs). Add an index README that lists the docs and points to the companion docs in learning_ai_common_plat. Docs-only; no behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 21:21:31 -07:00
Saravanakumar D	257efcb4bc	docs(gigafactory): consolidate gigafactory docs into docs/gigafactory/ Move GIGAFACTORY_ROADMAP.md and GIGAFACTORY_SYSTEM_OVERVIEW.md under agent-queue/docs/gigafactory/ so the scattered top-level docs are easy to discover. Update the README links, the overview code-map, and all phase job-spec source-of-truth paths to the new location. Pure docs move; no behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 21:01:23 -07:00
saravanakumardb1	1bcea394f5	chore(agent-queue): gitignore transient queue runtime state Jobs move through .state/inbox/building/testing/review/failed/shipped/logs at runtime, which constantly dirtied the repo and blocked clean rebases. Ignore the per-job lifecycle files (keeping each dir via .gitkeep) and stop tracking the consumed inbox job instances. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 20:29:49 -07:00
Saravanakumar D	71e5ad6923	docs(gigafactory): add system overview with architecture diagrams; sync roadmap status Add GIGAFACTORY_SYSTEM_OVERVIEW.md — a current-state companion to the roadmap spec covering: what the Agent Gigafactory is, a completion snapshot, three Mermaid diagrams (component architecture, job-lifecycle state machine, atomic claim + lease-fencing sequence), the Cosmos data model, the scoring router, subsystem map, full /fleet REST surface, feature flags, the two control planes, a cross-repo code map, test coverage, next steps (Phase 4/5), and an honest bugs/gaps/risks section. All three Mermaid blocks validated with mermaid.parse. Also correct documentation drift in GIGAFACTORY_ROADMAP.md found during the review: - §0 progress table showed Phase 3 as "0% not started" while every Phase-3 box is ticked; updated phases 1-3 to done with realistic percentages. - Phase-2 boxes "scheduler/router wired into assignment", "tracker adapter direct call", and "factory enrollment + scoped tokens" are implemented in common-plat (coordinator.ts uses selectJob; routes.ts enforces enrollment.enforceFactoryToken; tracker-bridge.ts) but were left unticked — ticked with evidence and refreshed the stale "remaining for 100%" notes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 20:11:02 -07:00
Saravanakumar D	66c91233da	feat(agent-queue): re-point TUI dashboard at /fleet API (parity) Add an opt-in fleet mode to the dashboard so an operator can drive the coordinator fleet from the same TUI used for the local folder queue. - lib/fleet-dash.mjs: dependency-injectable read/act adapter over the platform-service /fleet REST surface (jobs, metrics, factories, events, ship/requeue/reject). Pure-ish + fully unit-testable without a live service. - dashboard.mjs: render + act in fleet mode when AQ_FLEET_DASH=1 — board with counts, factories (per-factory rows or metrics aggregate), alerts, running (by lease/factory), actionable JOBS with manifest tags, recent, and a per-job events log. Single-flight async refresh keeps the last good board on failure; ship re-GETs a fresh leaseEpoch before PATCH; run/stop/promote are disabled (no safe server contract). Local mode is byte-for-byte unchanged. - lib/fleet-dash.test.mjs: 22 node:assert assertions (config, stage mapping, toBoard, fetch headers/timeout/errors, board assembly + graceful degradation, events, job actions) wired into selftest.sh. - docs: tick the Phase 3 "TUI re-pointed at /fleet" roadmap boxes. Verified: selftest.sh green (incl. new fleet-dash checks); live non-TTY render smoke against a stub /fleet server (both factories and metrics-aggregate paths); local mode unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:47:56 -07:00
Saravanakumar D	8a2270e0a6	feat(dashboard): surface manifest tags (priority/profile/caps/tracker) on the board Render a per-job tags line on the RUNNING workers and JOBS lists showing the routing inputs operators care about: priority, profile, capabilities, and the tracker-item reference. Tags come from the launched meta, falling back to the job's .md frontmatter for never-launched inbox jobs (new readManifest parser). The tracker-item becomes a clickable terminal hyperlink when AQ_TRACKER_WEB is set. Also renders the new budget_exceeded result as a failed RECENT row. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:27:41 -07:00
Saravanakumar D	7f77e9abc7	feat(agent-queue): enforce budget.wall as a hard wall-clock ceiling Parse the wall ceiling from the budget manifest map (budget: { wall: <dur> }) and arm it alongside the per-run timeout. Whichever ceiling fires first binds; the kill is recorded as result=timeout or result=budget_exceeded accordingly. budget.wall extends timeout: a job with only a budget.wall (no timeout) is now hard-killed at the ceiling. budget_exceeded is a terminal, non-retryable class by default and maps to the failed tracker status. Adds _budget_wall_secs + _effective_kill helpers (pure, unit-tested) and live selftest coverage; usd/tokens remain best-effort and are not enforced here. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:21:49 -07:00
Saravanakumar D	f1fe66fd4d	docs(roadmap): tick verified-done Phase 3 boxes (395-400,402) Phase 3 fleet control plane is implemented in learning_ai_common_plat: fleet API client, fleet map page, job table/detail/DAG/SSE/actions, cost burndown + multi-reviewer gate, scoring explainability, preemption, and Playwright fleet e2e. Box 401 (TUI re-point) remains open. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:13:25 -07:00
saravanakumardb1	a075a6ff30	Merge: Phase 2 two-factory parallel demo — exit criteria (§14) (#demo)	2026-05-30 01:58:55 -07:00
saravanakumardb1	0cde7def6a	feat(agent-queue): two-factory parallel demo — Phase 2 exit criteria (§14) Close the final Phase-2 exit-criteria box: >=2 factories executing jobs in parallel through one coordinator, proving the concurrency guarantees end-to-end. This is a DEMO HARNESS over the existing runtime — agent-queue.sh and lib/fleet-client.sh are unchanged (read + called, not modified). demo/two-factory-demo.sh: starts two real `agent-queue.sh run` daemons (mac-1 + ubuntu-1, separate queues/cwds) that compete ONLY through the coordinator, then asserts: (a) no double-assign — each of 3 jobs executed by exactly one factory; (b) fencing + reclaim — kill a factory mid-job, the reaper returns its job, the survivor reclaims + completes it, and the dead worker's late/zombie report (stale leaseEpoch) is FENCED (HTTP 409, never shipped); (c) parallelism — both factories hold active jobs concurrently. Dual-mode: CI-safe stateful stub by default; live platform-service when AQ_FLEET_API/AQ_FLEET_TOKEN set. demo/coordinator-stub.sh: stateful, mkdir-lock-guarded, file-backed coordinator implementing claim/lease/fence/renew/release + reaper-reclaim via the existing AQ_FLEET_API_CMD seam — the selftest stub pattern extended with shared state so >=2 processes coordinate through one coordinator. demo/README.md: stub + real invocations, env knobs, what each guarantee proves, what-to-watch guide. selftest.sh: +3 headless stub-mode checks (existing 68 unchanged byte-for-byte -> 71 total green). docs/GIGAFACTORY_ROADMAP.md: tick the §14 two-factory-demo box; annotate Phase-2 exit criteria; bump §0 Phase 2 to 80% (remaining: scheduler-core wiring [common-plat PR #31], tracker-direct call, factory enrollment). bash 3.2 + awk/sed/grep/pgrep only; mac+linux safe; no new runtime deps. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 01:53:36 -07:00
saravanakumardb1	2d76af916d	docs(agent-queue): add Phase 3 overnight (10h) job — tunable scoring+preemption, DAG, budgets, tracker-web control plane	2026-05-30 01:48:39 -07:00
saravanakumardb1	08d8d715a1	docs(agent-queue): add Dependabot dependency-triage prompt for common-plat	2026-05-30 00:56:55 -07:00
saravanakumardb1	24fe1567f6	docs(agent-queue): draft Phase 2 next prompts — direct tracker->module wiring (§10) + two-factory parallel demo (exit criteria)	2026-05-30 00:40:21 -07:00
saravanakumardb1	fbecbe82b6	feat(agent-queue): fleet feature flags + shadow/dual-run (Phase 2) Add a safe, reversible path to validate the fleet coordinator against the proven single-host path BEFORE cutover, via three independently-toggleable flags: AQ_FLEET=0 pure offline (zero coordinator calls; offline path unchanged) AQ_FLEET_ROUTE=1 route_via_service: coordinator authoritative for claim (default = P2-S3) AQ_FLEET_ROUTE=0 local inbox authoritative (coordinator not used to source work) AQ_FLEET_SHADOW=1 dual-run (needs AQ_FLEET=1 + ROUTE=0): query coordinator in parallel, record divergence, NEVER act on it Precedence: SHADOW only when ROUTE=0; if ROUTE=1 + SHADOW=1, ROUTE wins (one-shot warning). lib/fleet-client.sh: fleet_route_enabled / fleet_shadow_enabled / fleet_flags_warn_once / fleet_flags_state; fleet_shadow_claim (read-only — isolated `-shadow` factoryId + dryRun, releases any real lease, never materializes), fleet_shadow_compare (AGREE/DIVERGE/COORD_EMPTY/LOCAL_EMPTY → .state/fleet-shadow.log), fleet_shadow_report (shadow:true, response never acted on), cmd_fleet_shadow_report (counts + agreement rate). agent-queue.sh: ROUTE-gate claim sourcing (claim only when route_via_service); shadow hook after the local authoritative decision each iteration (best-effort, error-swallowed — shadow can never fail a real job); `fleet-shadow-report` subcommand + help; resolved flags surfaced in `status`/`fleet-status`. tryClaim/fence/offline paths unchanged. Strictly side-effect-free on real job state: shadow never ships, quarantines, or mutates real jobs. Offline path byte-for-byte unchanged when AQ_FLEET=0. selftest.sh: +8 checks (shadow AGREE/DIVERGE/COORD_EMPTY, non-fatal 5xx, ROUTE precedence, ROUTE=0 local-authoritative, fleet-shadow-report summary, shadow_report unit). 60 prior checks unchanged → 68 total green. README + GIGAFACTORY_ROADMAP document the flag model + cutover ladder. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 00:22:48 -07:00
saravanakumardb1	5c0ae020c0	docs(agent-queue): draft P2 prompts — factory enrollment+tokens (§12) + feature flags/shadow-dualrun	2026-05-29 23:52:14 -07:00
saravanakumardb1	21ebf8b1b7	docs(agent-queue): fleet integration section + roadmap P2-S3 ticks README: "Fleet integration (Phase 2)" — AQ_FLEET flag, env table, claim/heartbeat/ report/fence/renew protocol, offline-degrade + quarantine, offline-vs-fleet explainer. Roadmap: tick the Phase-2 §14 factory-agent item, add a P2-S3 slice note, bump §0 Phase 2 -> 55%.	2026-05-29 22:45:44 -07:00
saravanakumardb1	064dbf3d8f	test(agent-queue): fleet integration selftest cases (P2-S3) Adds 7 stub-driven fleet cases (AQ_FLEET_API_CMD stub, no live coordinator); never weakens the prior 53 (full suite now 60 green): - flag OFF (default): zero coordinator calls; offline job completes unchanged - register(heartbeat)+claim -> coordinator job materialized + executed to review/ - report+checkpoint: PATCH carries stage+leaseEpoch (+ wipBranch on building) - FENCING: stale-epoch 409 -> self-abort + quarantine (never shipped) - lease renew (unit): POST .../lease/renew with current leaseEpoch - offline-degrade: coordinator 5xx -> job completes locally (degraded), not quarantined - no-leak: bodyMd/token never appear in report payloads	2026-05-29 22:45:44 -07:00
saravanakumardb1	1d84712b47	feat(agent-queue): wire runner to fleet coordinator at minimal hook points (P2-S3) Sources lib/fleet-client.sh and adds a few fleet_enabled-gated hooks so the offline git-queue path is byte-for-byte unchanged when AQ_FLEET is unset/0: - cmd_run: register at loop start; per-iteration heartbeat (cadence) + lease renew for in-flight fleet jobs + claim one coordinator job into inbox when capacity. - meta: persist fleet_job_id + fleet_lease_epoch (from claim frontmatter). - run_worker: report `building` (with WIP checkpoint) after WIP setup and `review` before accepting the agent's output — a FENCED (stale-epoch/409) report self-aborts and quarantines (never ships); 5xx/unreachable degrades (finish locally). - _auto_echo: for fleet jobs route the outcome echo through the coordinator (fleet_events) instead of the direct tracker echo; offline jobs unchanged. - cmd_ship: fence-check before shipping a fleet job; release lease after. - status: show factory id + per-job fleet=<id>@e<epoch>; insights lists fleet_* fields. - dispatch + help: `fleet-status` command + a FLEET env section.	2026-05-29 22:45:44 -07:00
saravanakumardb1	a10d4003e6	feat(agent-queue): fleet coordinator client library (lib/fleet-client.sh, P2-S3) New sourced library implementing the factory side of the Phase-2 `fleet` coordinator contract — curl-only + POSIX awk, reusing the Slice-4 HTTP/JSON helper patterns, no new deps. Every function is a no-op unless AQ_FLEET=1. - fleet_enabled / fleet_api (AQ_FLEET_API_CMD test seam) / _fleet_call - fleet_detect_caps (reuses detect_capabilities) -> JSON caps array - fleet_heartbeat (+ _maybe cadence): registration == first heartbeat - fleet_claim: POST /fleet/claim, parse job id/bodyMd/leaseEpoch, materialize a transient local .md (fleet-job-id + fleet-lease-epoch in frontmatter) - fleet_report: PATCH fenced stage transition {stage, leaseEpoch, checkpoint?}; returns ok / FENCED(2, stale epoch -> self-abort) / degraded(1, unreachable) - fleet_lease_renew / fleet_lease_release / fleet_renew_active (fenced) - fleet_quarantine: park a reclaimed (fenced) job in failed/ for human triage - cmd_fleet_status: register + print factory identity/caps Report payloads carry only stage/epoch/checkpoint — never prompt/bodyMd/token.	2026-05-29 22:45:44 -07:00
saravanakumardb1	10395983e7	docs(agent-queue): draft parallel P2 prompts — scheduler/router core (§7) + fleet artifacts blob wiring (§13)	2026-05-29 22:32:41 -07:00
saravanakumardb1	9a073ef225	docs(agent-queue): draft P2-S3 factory-agent integration prompt (claim/heartbeat/report/fence behind AQ_FLEET)	2026-05-29 22:03:12 -07:00
saravanakumardb1	8ae504ca30	docs(agent-queue): tracker integration + close Phase 1 §10/§14 adapter (P1-S4) README: Tracker integration section (from-tracker/to-tracker, env config, label->manifest table, one-way-echo rule, AQ_TRACKER_AUTO, real-use note). Roadmap: tick §10 Phase-1 adapter items + the §14 tracker-adapter item; add P1-S4 slice note; §0 Phase 1 -> 95% (remaining: budget.wall + Node dash surfacing).	2026-05-29 21:35:16 -07:00
saravanakumardb1	1e0a17bbc0	test(agent-queue): tracker adapter selftest cases (P1-S4) Adds (never weakens) 7 stub-driven cases (AQ_TRACKER_API_CMD stub, no live service): from-tracker create + label mapping + idempotent; to-tracker shipped echo (PATCH done + metrics comment, asserts NO prompt body sent) + idempotent; HTTP 500 non-fatal; AQ_TRACKER_AUTO auto-echo on run. Full suite green (53 checks).	2026-05-29 21:35:16 -07:00
saravanakumardb1	b7a9ea1b7a	feat(agent-queue): tracker adapter — task <-> job round-trip (P1-S4) Implements §10 single-host tracker integration, closing the last Phase-1 §14 item: - tracker_api: one curl-only HTTP wrapper (base URL + bearer + productId header), overridable via AQ_TRACKER_API_CMD so tests need no live service. Emits the response body + a trailing HTTP-code line; _api_call splits into API_BODY/API_CODE. - aq from-tracker <ITEM_ID>: GET the Item, map title/description -> job body, labels (engine-class:/profile:/priority:/cap:) + Item priority -> frontmatter, and stamp tracker-item + a stable idempotency-key tracker-<id>. Materializes a .md into inbox/ via cmd_add; idempotent (Slice 1 dedupe) so a re-pull never dups. JSON parsed with POSIX awk (no jq) — mac + linux safe. - aq to-tracker <job>: one-way echo (child -> tracker, §24.5). PATCHes the Item status (building/review/testing->in_progress, shipped->done, failures->wont_fix, all overridable) and posts a metrics-only comment (result/attempts/duration/ tokens/cost/diff — NEVER prompt content or secrets). Idempotent via meta tracker_echoed; an echo failure (e.g. HTTP 500) is logged and non-fatal — the tracker is downstream, never authoritative for execution. - Opt-in auto-echo (AQ_TRACKER_AUTO=1, default OFF): the worker echoes on each transition (building via cmd_run, review/testing/failed via run_worker, shipped via ship/promote); never blocks or fails a job. - status + insights surface tracker-item and the last echoed status. curl-only HTTP; no new runtime deps; conventional + backward-compatible.	2026-05-29 21:35:06 -07:00
saravanakumardb1	d0348f23de	docs(agent-queue): P0 atomic-claim resolved (PR #29) — tick §4/§13/§14 fleet items	2026-05-29 21:05:38 -07:00
saravanakumardb1	2e9bd4dd1e	docs(agent-queue): record P2 Foundation merged + track P0 atomic-claim hardening (§4) - §4: implementation-status note — fleet module merged (PR #28); atomic claim NOT yet concurrency-safe (rev-CAS over unconditional write, sequential-only test) - add phase2-atomic-claim-hardening.md: updateIfMatch in @bytelyst/datastore (Cosmos If-Match + process-atomic memory) + concurrent claim tests	2026-05-29 20:43:28 -07:00
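
Commit 57831e3e7a above describes normalizing AQ_FLEET_API by stripping a trailing slash and appending /api unless it is already present. A minimal bash sketch of that rule follows. The function name `normalize_fleet_api` is hypothetical and chosen only for illustration; this is not the repository's actual implementation, which is not shown in this log.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name assumed, not taken from the repository):
# normalize a fleet API base URL as described in commit 57831e3e7a.
#   1. strip one trailing slash
#   2. append /api unless the base already ends in /api
normalize_fleet_api() {
  local base="${1%/}"          # drop a single trailing "/" if present
  case "$base" in
    */api) ;;                  # already ends in /api: leave unchanged
    *) base="$base/api" ;;     # otherwise append /api
  esac
  printf '%s\n' "$base"
}
```

Under these assumptions, `normalize_fleet_api "http://host:4003"` and `normalize_fleet_api "http://host:4003/api/"` would both print `http://host:4003/api`, which is the behavior the commit message says avoids the silent 404s. The sketch uses only parameter expansion and `case`, so it should also run on the bash 3.2 baseline mentioned elsewhere in the log.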


87 Commits