bytelyst-devops-tools

Author	SHA1	Message	Date
saravanakumardb1	f7999fb11b	feat(scripts): deploy-gigafactory --full/--with-tracker + tracker-web launch Extends deploy-gigafactory.sh to optionally start the web tracker (tracker-web) alongside platform-service and registered factories: adds --full (backend + register + tracker), --with-tracker, --tracker-only, a per-process pid file with child-aware --stop, and waits for both to be healthy. Gitignore the new runtime tracker pid. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:25:16 -07:00
saravanakumardb1	38d8e8e5cf	feat(agent-queue): add tracked example multi-product fleet launcher The operational _start_fleet.sh lives in a local (untracked) sandbox, so the gate + heartbeat-cadence settings weren't version-controlled anywhere. Add demo/start-fleet.example.sh: a parameterized, sanitized launcher (one agent-queue.sh run daemon per product against a live platform-service) that ships the two settings you must get right — AQ_FLEET_GATE=1 (M0 RU gate) and AQ_FLEET_LEASE_RENEW_SEC=30 (heartbeat cadence < the 90s stale threshold). No hardcoded paths/secrets; everything env-overridable. Documented in demo/README. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:18:26 -07:00
saravanakumardb1	2993994273	docs(gigafactory): reconcile overview + roadmap to current reality - System overview: mark Phase 4 in-progress (M0 RU gate shipped), add fleet_queue_state container + GET /fleet/queue-state, document the heartbeat cadence vs 90s stale gotcha, the tracker-web caps=build form bug, the missing deregister API, and the ended=-race fix; drop the now-false "roadmap §0 stale" and "boxes 384/386 unticked" claims (both reconciled); link the redesign doc. - Roadmap: §0 Phase 4 -> in progress (M0); align the Phase-2 §8 spec endpoint sketches to the as-built API (/fleet/factories/enroll, /factories/heartbeat, /fleet/claim) + note the heartbeat cadence and the M0 gate. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-06-01 00:02:45 -07:00
saravanakumardb1	fa1f1d1b30	fix(agent-queue): write ended= after PR/report so --once can't exit early run_worker marked the job ended (testing) right after moving to testing/, BEFORE opening/merging the PR and reporting to the coordinator. Once ended= is written, _meta_active returns false, active_workers drops to 0, and "run --once" could drain-exit (and callers could observe completion) while the background worker was still opening the PR — a real race that made the PR-mode selftest flaky and could free a concurrency slot prematurely in production. Move the ended= write to the end of the success path (after PR open/merge + testing/shipped reports). No behavior change on the autoship/ship path. Full selftest now passes deterministically across repeated runs. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 23:36:51 -07:00
saravanakumardb1	41d8067724	feat(agent-queue): M0 RU gate — skip the claim when the queue is unchanged Adds AQ_FLEET_GATE (default OFF): the run loop point-reads the cheap per-product queue version (GET /fleet/queue-state) and SKIPS the expensive /fleet/claim while the version is unchanged and it is not mid-drain, with a periodic safety backstop and fail-open-on-read-error so work is never stranded. Keeps POLL_SECONDS for local job responsiveness rather than raising it globally. selftest 39b covers the gate decisions; reconciles the M0 section of the dispatch redesign doc. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 23:19:01 -07:00
saravanakumardb1	29afe59604	docs(gigafactory): v4 coverage audit — roadmap maps 1:1 to design, no gaps Adds a coverage matrix + M-prep (decisions/§10, schema, containers, RBAC) and closes plan gaps: correlation filter + dispatcher budget (M1); small messages, token re-check, alerting (M2); plus Testing and Rollback & flags blocks. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:47:54 -07:00
saravanakumardb1	9f24a7fdd0	docs(gigafactory): add error-handling & cleanup section + v3 review fixes Adds §5.5 (lease-release-on-failure, branch/worktree GC, same-repo worktree clobber) with target invariants, plus a §12 checklist block. v3 review: unify targetFactoryId, reconcile §5.3 with complete-on-claim, align §6 token scoping with per-factory subscriptions, M0 wording. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:45:21 -07:00
saravanakumardb1	8217a864e9	docs(gigafactory): add Phase-4 fleet dispatch redesign (broker + on-demand) Proposes moving fleet work-dispatch off Cosmos busy-polling onto Azure Service Bus in a coordinator-owns-scheduling / broker-owns-delivery hybrid, fixing the product-as-queue routing smell and the idle-poll RU cost. Includes phased migration (M0 RU quick win -> shadow -> cutover -> scale-to-zero) with a ticked checklist. Self-reviewed (v2) for the outbox/change-feed, message-size, long-job lock, idempotency, and routing-model consistency issues. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 22:24:55 -07:00
Saravanakumar D	5a22899da5	chore: enforce LF line endings via .gitattributes Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-31 20:25:17 -07:00
saravanakumardb1	b8f0369f63	feat(agent-queue): approximate Devin run cost from tokens (model price map) Devin's export has tokens but no USD cost; estimate cost_usd from a per-model $/1M price map (Opus/Sonnet/Haiku) and flag usage_estimated so the dashboard shows it as approx.	2026-05-31 15:58:35 -07:00
saravanakumardb1	c2dbbaf188	feat(agent-queue): report PR state (open/merged) on the run	2026-05-31 13:56:46 -07:00
Hermes VM	09b16c4b19	fix: complete devops theme compatibility	2026-05-31 20:06:22 +00:00
saravanakumardb1	d6fa1d9e28	feat(agent-queue): PR mode uses existing local repo via git worktree (no clone) When AQ_FLEET_REPO_BASE/<repo> is an existing checkout, create a git worktree off it for branch aq/job/<id> (shares objects + remotes, leaves the main checkout untouched) instead of cloning. Falls back to clone for remote-only repos. selftest exercises the worktree path. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 06:28:36 -07:00
saravanakumardb1	b442b95728	feat(agent-queue): per-repo verify + opt-in auto-merge for PR jobs Claim now carries verify (drives the existing verify gate -> PR opens only if it passes) and autoMerge (squash-merge via gh pr merge after the PR opens, non-fatal). selftest covers both. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 06:17:28 -07:00
saravanakumardb1	e634d4915f	feat(agent-queue): agent authors PR title + description (.aq_pr.md) In PR mode the agent is asked to write .aq_pr.md (line 1 = PR title, then a markdown description) based on the task + the diff it produced. The factory reads it for `gh pr create` (via --body-file) and removes it before committing (never part of the PR). Falls back to a derived title if absent. selftest asserts the authored title is used and .aq_pr.md is not committed. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:48:37 -07:00
saravanakumardb1	d0e800247c	feat(agent-queue): PR mode clones from local repo base (AQ_FLEET_REPO_BASE) MVP: when AQ_FLEET_REPO_BASE/<repo> is an existing local checkout, use it as the clone source (fast, no network) and push/PR to its GitHub origin — embedded creds in the local origin URL are stripped (gh credential helper handles auth). Selftest PASS (full-path bare-repo fallback unchanged). Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:36:46 -07:00
saravanakumardb1	cfbcc2da9d	feat(agent-queue): PR mode — open a PR per fleet job (AQ_FLEET_PR) When AQ_FLEET_PR=1 and a claimed fleet job carries a `repo`, run the agent in an isolated checkout on branch aq/job/<fleetJobId> (off baseBranch), then on a passing verify commit/push and `gh pr create`. The PR URL + branch are recorded in the meta and reported on lease release (-> the coordinator stores them on the run). - fleet-client: parse repo/baseBranch from the claim, carry them in frontmatter; fleet_report_insights now sends prUrl/branch. - _fleet_pr_prepare (clone/fetch + branch, local-path aware, identity fallback) and _fleet_pr_open (commit/push/gh pr create). WIP checkpointing is skipped for PR jobs (the pushed branch is the durable artifact). - New flags: AQ_FLEET_PR, AQ_FLEET_REPOS_DIR, GH_BIN. README documented. - selftest: +1 case (bare-repo origin + gh stub) — branch pushed, PR opened, prUrl reported on release. Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 05:27:41 -07:00
Hermes VM	94d55a3d4a	fix: repair devops shell interactions	2026-05-31 11:52:38 +00:00
saravanakumardb1	315e9317cc	docs(agent-queue): document AQ_FLEET_AUTOSHIP (testing -> shipped) Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 04:32:04 -07:00
saravanakumardb1	df65b7a245	feat(agent-queue): report testing + optional autoship to the fleet (close testing->shipped) Previously the factory reported up to `review` and "shipping is always manual", so a coordinator job never reached a terminal stage autonomously. - On a passing local verify, always report `testing` to the coordinator so its stage reflects that QA passed (was stuck at `review`). - New AQ_FLEET_AUTOSHIP=1: the factory's verify gate IS the test phase, so advance the coordinator job testing -> shipped and land it in shipped/ locally. This closes the testing->shipped gap for an autonomous submit -> shipped pipeline. Default off keeps the human review gate authoritative (job rests at testing). selftest: +2 cases (autoship reports testing+shipped + lands in shipped/; autoship OFF reports testing but withholds shipped). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 04:21:44 -07:00
Hermes VM	2d40daf72f	fix: include shared ui styles in devops build	2026-05-31 11:19:59 +00:00
Hermes VM	076449268b	docs(interview): add Senior Agentic RAG Architect prep kit 7-doc kit mapping the JD competency matrix to the ByteLyst ecosystem: ecosystem-as-RAG-fabric architecture, competency deep-dives, STAR bank, enhancement roadmap, banking blueprints, and a glossary quick-ref. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 10:48:52 +00:00
Hermes VM	1b957cf6d9	feat: align hermes tasks with shared ui	2026-05-31 10:43:23 +00:00
saravanakumardb1	8085501506	feat(agent-queue): extract Devin token usage from the conversation export Devin does not surface token/cost in its stdout or local log, so parse_usage previously emitted nothing for the devin engine (runs showed no metrics). Devin DOES expose per-step usage in its ATIF conversation export. - build_agent_cmd: pass `--export <path>` for the devin engine (path derived from the job log path so parse_usage can find it; harmless 4th arg for other engines). - parse_usage devin: read the export and sum per-step metadata.metrics input_tokens / output_tokens / cache_read_tokens; take model from agent.model_name. Pure grep/awk, no new dependency. USD cost is left unset (the export carries token counts but not cost) — the dashboard shows tokens + model, cost stays blank. These feed fleet_report_insights, so live devin fleet runs now report tokens + model to the coordinator (verified live: model "Claude Opus 4.8", tokensIn/out + cache populated on a real run). selftest: +1 case (parse_usage devin sums per-step tokens + model from --export). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 02:55:11 -07:00
Hermes VM	1a28fd541f	fix: add devops dashboard favicon	2026-05-31 09:33:59 +00:00
saravanakumardb1	57831e3e7a	feat(agent-queue): report run insights to the fleet + normalize API base #1 fleet_report_insights: on a successful fleet run the factory now reports the parsed cost/token/effort metrics (model, tokensIn/Out/cached, costUsd, turns, toolCalls) plus the run result onto the coordinator run via POST .../lease/release (which also frees the lease). parse_usage already extracted these into the job meta; they were never sent. Engines that do not expose usage locally (devin) still land result + endedAt. #2 normalize AQ_FLEET_API: platform-service mounts fleet under /api, so a base without it silently returned 404 on every call. Strip a trailing slash and append /api unless already present, so AQ_FLEET_API=http://host:4003 works too. selftest: +2 cases (insights reported via lease/release; API-base normalization). Full self-test PASS. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 02:27:51 -07:00
Hermes VM	d04c8021ad	docs: update devops dashboard deploy timestamp	2026-05-31 09:12:43 +00:00
Hermes VM	fa71b1ff08	docs: make app bookmark URLs clickable	2026-05-31 08:42:13 +00:00
Hermes VM	02b362399b	feat: complete hermes telemetry dashboard wiring	2026-05-31 08:28:26 +00:00
saravanakumardb1	38aefb05e4	docs(deploy): v2 review pass — correct findings after full script/compose audit - D6: memory limits already exist (deploy.resources.limits); reframe as RAM right-sizing + disk hygiene rather than "limits missing" - D2: down/--force-recreate is invttrdg-only; clock/notes already differential - D4: broaden BuildKit gap to all docker compose build paths; fix accuracy - D8 (new): deploy-script drift across per-product scripts + dashboard/deploy.sh - add Phase 0 (unify scripts) as prerequisite; update quick-ref + ordering Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 00:44:52 -07:00
saravanakumardb1	f837512026	docs(deploy): add deployment optimization roadmap Document a phased roadmap for the single-VM deployment layer (build-off-VM, recreate-in-place to cut downtime, change-detection + BuildKit guarantee, image slimming + resource caps, artifact-based rollback). Scoped to deploy orchestration; defers image-build internals to docker-build-optimization-roadmap. Register the doc in repo-map. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-31 00:40:15 -07:00
Saravanakumar D	9d871282c3	docs: explain Gitea registry vs workspace package resolution + the registry-offline trap Document the two ways @bytelyst/* packages resolve (local workspace links vs Gitea npm registry for Docker/CI), the common 'registry offline' local-dev failure and its fix (sibling directory layout, not a token), and the deploy-side 'package not published' / token issues with remediation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-31 00:05:31 -07:00
saravanakumardb1	dcf017a0de	docs(agent-queue): add run policy (isolated worktrees, least-privilege) Document how the daemon + agents must run after a review found jobs executing in --yolo/dangerous mode directly against live working trees (the root cause of repo dirtiness + duplicate commits). Policy: per-job worktree off origin/main, branch-per-task + PR, yolo:false by default (dangerous only in disposable sandboxes), clean-tree contract, one writer per repo. Linked from the README. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 23:47:46 -07:00
saravanakumardb1	abc8a0f517	fix(tracker-seed): cap dedupe list at limit=100 + auto-register products Two bugs caused duplicate items on re-run: the dedupe list used limit=500 (server caps at 100 -> 400 -> silent empty set -> dupes), and meta productIds weren't registered so GET /items 400'd ("Unknown product"). Now registers every referenced product first (idempotent) and lists with limit=100; dedupe failures are logged loudly. Verified idempotent: re-run skips all 16. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 23:45:16 -07:00
saravanakumardb1	ae7909018a	feat(scripts): one-shot gigafactory deploy + product registration deploy-gigafactory.sh loads platform-service/.env, starts the fleet backend, waits for /health, and registers the ecosystem products (idempotent) so live /api/fleet/* calls resolve. Supports --stop / --register-only / --no-register. Registered the 11 ecosystem products against the configured Cosmos during a live run; note fleet metrics needs a composite index on real Azure Cosmos. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 22:51:27 -07:00
saravanakumardb1	50d6424b85	docs: surface "cut tracker items" workflow in README + CLAUDE Add a Work Tracking entry to README Primary Entry Points and a short pointer in CLAUDE.md, both routing to scripts/tracker-seed/ and the AGENTS.md section. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 21:22:43 -07:00
saravanakumardb1	6d28e1307e	docs(agents): document "cut tracker items" workflow Add a "Cutting Tracker Items" section to AGENTS.md and register scripts/tracker-seed/ in docs/repo-map.md so future "cut items to track" requests route to the seed tooling instead of ad-hoc API calls. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 21:21:36 -07:00
Saravanakumar D	237481247e	docs(gigafactory): uppercase GIGAFACTORY folder + add index README Rename agent-queue/docs/gigafactory/ to docs/GIGAFACTORY/ and update every reference (README, system-overview code-map, and all phase job specs). Add an index README that lists the docs and points to the companion docs in learning_ai_common_plat. Docs-only; no behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 21:21:31 -07:00
saravanakumardb1	eb4e755c5f	feat(tracker-seed): seed script + payloads for engineering-review work items Files the ENGINEERING_REVIEW_SCORECARD.md P0-P3 action plan as tracker items (one per affected product) via the platform-service POST /api/items API. Dependency-free Node seeder mints an HS256 token from $JWT_SECRET, dedupes by title, and supports --dry-run. No live writes performed (stack is down); run the script once the platform stack is up. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 21:14:12 -07:00
Saravanakumar D	257efcb4bc	docs(gigafactory): consolidate gigafactory docs into docs/gigafactory/ Move GIGAFACTORY_ROADMAP.md and GIGAFACTORY_SYSTEM_OVERVIEW.md under agent-queue/docs/gigafactory/ so the scattered top-level docs are easy to discover. Update the README links, the overview code-map, and all phase job-spec source-of-truth paths to the new location. Pure docs move; no behavior change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 21:01:23 -07:00
saravanakumardb1	32162312a9	docs: add workspace engineering review & scorecard Read-only, evidence-based review of the ~38-repo workspace produced from docs/prompts/engineering-review-scorecard.md: per-repo breakdown, 1-10 category scorecard (weighted overall ~7.0, beta-grade), prioritized P0-P3 action plan, safe auto-fix candidates, delegate-to-agent queue, and an agent SOP. No code changes. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 20:53:44 -07:00
saravanakumardb1	1bcea394f5	chore(agent-queue): gitignore transient queue runtime state Jobs move through .state/inbox/building/testing/review/failed/shipped/logs at runtime, which constantly dirtied the repo and blocked clean rebases. Ignore the per-job lifecycle files (keeping each dir via .gitkeep) and stop tracking the consumed inbox job instances. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 20:29:49 -07:00
saravanakumardb1	92479113d0	docs(prompts): add engineering review & scorecard master prompt Reusable evidence-based review prompt covering repos, code, architecture, DevOps, testing, security, product-readiness, and AI-agent practices, with a 1-10 scorecard and prioritized action plan output. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>	2026-05-30 20:29:49 -07:00
Saravanakumar D	71e5ad6923	docs(gigafactory): add system overview with architecture diagrams; sync roadmap status Add GIGAFACTORY_SYSTEM_OVERVIEW.md — a current-state companion to the roadmap spec covering: what the Agent Gigafactory is, a completion snapshot, three Mermaid diagrams (component architecture, job-lifecycle state machine, atomic claim + lease-fencing sequence), the Cosmos data model, the scoring router, subsystem map, full /fleet REST surface, feature flags, the two control planes, a cross-repo code map, test coverage, next steps (Phase 4/5), and an honest bugs/gaps/risks section. All three Mermaid blocks validated with mermaid.parse. Also correct documentation drift in GIGAFACTORY_ROADMAP.md found during the review: - §0 progress table showed Phase 3 as "0% not started" while every Phase-3 box is ticked; updated phases 1-3 to done with realistic percentages. - Phase-2 boxes "scheduler/router wired into assignment", "tracker adapter direct call", and "factory enrollment + scoped tokens" are implemented in common-plat (coordinator.ts uses selectJob; routes.ts enforces enrollment.enforceFactoryToken; tracker-bridge.ts) but were left unticked — ticked with evidence and refreshed the stale "remaining for 100%" notes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 20:11:02 -07:00
Saravanakumar D	66c91233da	feat(agent-queue): re-point TUI dashboard at /fleet API (parity) Add an opt-in fleet mode to the dashboard so an operator can drive the coordinator fleet from the same TUI used for the local folder queue. - lib/fleet-dash.mjs: dependency-injectable read/act adapter over the platform-service /fleet REST surface (jobs, metrics, factories, events, ship/requeue/reject). Pure-ish + fully unit-testable without a live service. - dashboard.mjs: render + act in fleet mode when AQ_FLEET_DASH=1 — board with counts, factories (per-factory rows or metrics aggregate), alerts, running (by lease/factory), actionable JOBS with manifest tags, recent, and a per-job events log. Single-flight async refresh keeps the last good board on failure; ship re-GETs a fresh leaseEpoch before PATCH; run/stop/promote are disabled (no safe server contract). Local mode is byte-for-byte unchanged. - lib/fleet-dash.test.mjs: 22 node:assert assertions (config, stage mapping, toBoard, fetch headers/timeout/errors, board assembly + graceful degradation, events, job actions) wired into selftest.sh. - docs: tick the Phase 3 "TUI re-pointed at /fleet" roadmap boxes. Verified: selftest.sh green (incl. new fleet-dash checks); live non-TTY render smoke against a stub /fleet server (both factories and metrics-aggregate paths); local mode unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:47:56 -07:00
saravanakumardb1	1a5b791e2e	fix(aliases): harden longrun error handling (missing tmux/caffeinate, dup session, bad command, unwritable log)	2026-05-30 19:36:43 -07:00
Saravanakumar D	8a2270e0a6	feat(dashboard): surface manifest tags (priority/profile/caps/tracker) on the board Render a per-job tags line on the RUNNING workers and JOBS lists showing the routing inputs operators care about: priority, profile, capabilities, and the tracker-item reference. Tags come from the launched meta, falling back to the job's .md frontmatter for never-launched inbox jobs (new readManifest parser). The tracker-item becomes a clickable terminal hyperlink when AQ_TRACKER_WEB is set. Also renders the new budget_exceeded result as a failed RECENT row. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:27:41 -07:00
saravanakumardb1	75653b7c6e	feat(aliases): add longrun helper + awake alias for overnight agent runs	2026-05-30 19:25:50 -07:00
Saravanakumar D	7f77e9abc7	feat(agent-queue): enforce budget.wall as a hard wall-clock ceiling Parse the wall ceiling from the budget manifest map (budget: { wall: <dur> }) and arm it alongside the per-run timeout. Whichever ceiling fires first binds; the kill is recorded as result=timeout or result=budget_exceeded accordingly. budget.wall extends timeout: a job with only a budget.wall (no timeout) is now hard-killed at the ceiling. budget_exceeded is a terminal, non-retryable class by default and maps to the failed tracker status. Adds _budget_wall_secs + _effective_kill helpers (pure, unit-tested) and live selftest coverage; usd/tokens remain best-effort and are not enforced here. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:21:49 -07:00
Saravanakumar D	f1fe66fd4d	docs(roadmap): tick verified-done Phase 3 boxes (395-400,402) Phase 3 fleet control plane is implemented in learning_ai_common_plat: fleet API client, fleet map page, job table/detail/DAG/SSE/actions, cost burndown + multi-reviewer gate, scoring explainability, preemption, and Playwright fleet e2e. Box 401 (TUI re-point) remains open. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-30 19:13:25 -07:00
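
Commit 7f77e9abc7 above describes arming `budget.wall` alongside the per-run timeout, with whichever ceiling fires first deciding the recorded result. The sketch below is one possible reading of that rule, not the repo's `_effective_kill` helper: the function name `effective_kill`, its argument order, the "0 means unset" convention, and the tie-break in favour of `timeout` are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative only: effective_kill is a hypothetical name, not the repo's helper.
# Usage: effective_kill <timeout_secs> <wall_secs>   (0 is assumed to mean "not set")
# Prints "<seconds> <result>" for the ceiling that would fire first.
effective_kill() {
  local timeout_secs="${1:-0}" wall_secs="${2:-0}"
  if [ "$timeout_secs" -gt 0 ] && { [ "$wall_secs" -le 0 ] || [ "$timeout_secs" -le "$wall_secs" ]; }; then
    # The per-run timeout is the only ceiling, or it fires first
    # (ties are assumed to go to timeout; the log does not say).
    printf '%s timeout\n' "$timeout_secs"
  elif [ "$wall_secs" -gt 0 ]; then
    # budget.wall is the only ceiling, or it fires first.
    printf '%s budget_exceeded\n' "$wall_secs"
  else
    # Neither ceiling is set: nothing to arm.
    printf '0 none\n'
  fi
}

effective_kill 600 3600   # timeout is lower, so it binds
effective_kill 0 3600     # only budget.wall is set, so the job is still hard-killed
```

Keeping the decision in a pure function that only prints its answer mirrors the commit's note that the helpers are pure and unit-tested; the actual kill would happen elsewhere.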


314 Commits
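
The M0 RU gate from commit 41d8067724 (`AQ_FLEET_GATE`) skips the expensive `/fleet/claim` call while the per-product queue version read from `GET /fleet/queue-state` is unchanged, unless the loop is mid-drain, a periodic backstop is due, or the version read failed. A minimal sketch of that decision follows. It assumes the version is an opaque string and that an empty value signals a failed read; `should_claim`, its arguments, and the poll-count backstop are hypothetical and may differ from what agent-queue.sh actually does.

```shell
#!/usr/bin/env bash
# Illustrative only: should_claim and its arguments are hypothetical names.
# Usage: should_claim <last_version> <current_version> <mid_drain> <skipped_polls> <backstop_polls>
# Returns 0 when the run loop should call /fleet/claim, 1 when it may skip.
should_claim() {
  local last="$1" current="$2" mid_drain="$3" skipped="$4" backstop="$5"
  if [ -z "$current" ]; then return 0; fi              # queue-state read failed: fail open
  if [ "$mid_drain" = "1" ]; then return 0; fi          # still draining: keep claiming
  if [ "$current" != "$last" ]; then return 0; fi       # queue version changed
  if [ "$skipped" -ge "$backstop" ]; then return 0; fi  # periodic safety backstop
  return 1                                              # unchanged and idle: skip the claim
}

if should_claim "v7" "v7" 0 2 10; then echo "claim"; else echo "skip"; fi
```

Every uncertain case resolves to "claim", which matches the commit's stated goal that work is never stranded; only the fully idle, unchanged case saves the request.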