saravanakumardb1 e6611cae1a docs(fleet): runbook to run a Devin fleet job end-to-end (local)

Add docs/runbooks/FLEET_DEVIN_LOCAL_RUN.md: how developers and coding agents
spin up platform-service + tracker-web + an agent-queue factory so a submitted
job is claimed and run autonomously by the Devin CLI against a target repo
(worked example: learning_ai_notes), pushing a branch and opening a real PR.

Covers: architecture + lifecycle, prerequisites incl. fresh-machine setup
(clone both repos, .env/Cosmos, pnpm -r build so host-run resolves @bytelyst/*
from dist/), all-localhost (no Docker) path as primary + Docker as the
Grafana/Prometheus option, local JWT minting, job submit, factory launch, observe,
PR-state reconcile, safety/cost, teardown, troubleshooting, and a copy-paste
quickstart.

Calls out two gotchas learned in practice: set AQ_FLEET_LEASE_RENEW_SEC < 90 so
the factory heartbeat beats the coordinator's 90s stale-factory reclaim window
(else a busy single-slot factory's in-flight lease is reclaimed mid-run and the
final report is fenced), and a WSL-on-Windows differences section (run inside
WSL, repos off /mnt/c, LF endings, gh/devin/node in WSL, localhost forwarding).

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>

2026-06-02 00:50:29 -07:00


Runbook — Run a Devin Fleet Job End‑to‑End (local)

Audience: developers and coding agents. Goal: stand up platform-service + tracker-web + a fleet factory (the agent-queue runner) so a submitted job is claimed and executed autonomously by the Devin CLI against a target repo (worked example: learning_ai_notes), pushing a branch and opening a real pull request.

⚠️ This is a real, cost‑incurring, side‑effecting operation. The factory runs an autonomous coding agent (Devin) that consumes API credits, can run for a long time, pushes a branch, and opens a real PR on GitHub. Read §9 Safety & cost before launching. For unattended local prototyping only — not a production deployment guide.

1. Architecture (what talks to what)

 you ──▶ tracker-web (:3003) ─┐
                              │  REST + SSE (/api/fleet/*)
 coding agent / curl ────────┼─▶ platform-service (:4003) ──▶ Azure Cosmos (jobs/runs/leases/events)
                              │        ▲   ▲
            Prometheus (:9090)┘        │   │ claim / lease-renew / report  (Bearer JWT + X-Product-Id)
            Grafana (:3000) ───────────┘   │
                                           │
                       agent-queue FACTORY (fleet mode) ──▶ Devin CLI ──▶ git push + gh pr create
                       (learning_ai_devops_tools/agent-queue)             (target repo, e.g. learning_ai_notes)

platform-service — the fleet coordinator. Owns the job lifecycle (queued → assigned → building → review → testing → shipped|failed|dead_letter), atomic claim, leases, events, budgets, metrics. Code: services/platform-service/src/modules/fleet/.
tracker-web (:3003) — submit/inspect jobs (/dashboard/fleet/jobs/...).
factory — learning_ai_devops_tools/agent-queue in fleet mode. Polls POST /api/fleet/claim, runs the agent CLI in an isolated checkout, reports back, and (PR mode) opens the PR.
Prometheus/Grafana — fleet metrics + the "Fleet Overview" dashboard.

Lifecycle the factory drives:

queued ─▶ assigned ─▶ building ─▶ review ─▶ testing ─▶ shipped
            (claim)    (agent     (rc=0)    (verify    (manual/auto ship)
                        running)             passed)
                            └─ agent rc≠0 / timeout / verify fail ─▶ failed ─▶ (retry|dead_letter)
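
The diagram above can be read as a small transition table. A minimal sketch in shell, assuming hypothetical event names (claim, start, agent_ok, verify_ok, ship, fail, exhausted) that are labels for this illustration only and are not coordinator API names; the retry branch is left out because the diagram does not say which stage a retry returns to:

```shell
# next_stage CURRENT EVENT -> prints the stage that follows, per the lifecycle diagram.
# Event names are hypothetical labels for this sketch, not coordinator identifiers.
next_stage() {
  case "$1:$2" in
    queued:claim)               echo assigned ;;
    assigned:start)             echo building ;;
    building:agent_ok)          echo review ;;       # agent exited with rc=0
    review:verify_ok)           echo testing ;;      # verify passed
    testing:ship)               echo shipped ;;      # manual or auto ship
    building:fail|review:fail)  echo failed ;;       # rc != 0, timeout, or verify fail
    failed:exhausted)           echo dead_letter ;;  # assumed: no retries left
    *) echo "invalid transition: $1 + $2" >&2; return 1 ;;
  esac
}
```

For example, `next_stage building agent_ok` prints `review`, while `next_stage shipped claim` is rejected.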

2. Prerequisites

| Tool | Why | Check |
| --- | --- | --- |
| Node ≥ 20 + `pnpm` (corepack) | host-run service, scripts, tracker-web, build | `node -v && pnpm -v` |
| `git` + `gh` (authenticated) | factory clones target repo, pushes branch, opens PR; `gh pr merge`/reconcile | `gh auth status` |
| `devin` CLI (authenticated) | the agent the factory runs | `devin --version` |
| Both repos cloned side‑by‑side | coordinator/dashboards + the factory | see below |
| repo `.env` (root of `learning_ai_common_plat`) | `JWT_SECRET`, Cosmos creds, `FLEET_METRICS_TOKEN` | `test -f .env` |
| Docker | optional; only for the Docker path (§3 Option B) / Grafana+Prometheus | `docker info` |

Node version: the Docker image pins node 22; for the host path any Node ≥ 20 works. Use one Node (nvm/asdf) for both repos to avoid native-module surprises.

2.1 First‑time setup (fresh machine)

Clone both repos as siblings (the factory clones targets relative to a shared parent):

mkdir -p ~/code && cd ~/code
git clone <host>/learning_ai_common_plat.git
git clone <host>/learning_ai_devops_tools.git      # contains agent-queue (the factory)

Create and fill .env at the root of learning_ai_common_plat:

cd ~/code/learning_ai_common_plat
cp .env.example .env
# then edit .env — minimum for the fleet flow:
#   JWT_SECRET=<any strong secret; tokens are minted+verified with THIS value>
#   FLEET_METRICS_TOKEN=changeme-fleet-metrics-token      # only needed for Prometheus
#   COSMOS_*  / connection vars  -> see note below

JWT_SECRET — HS256 secret platform-service verifies tokens with. Any strong value; it only needs to be internally consistent on this machine (the token you mint in §5 and the running service must share it). Required.
Cosmos — the default prototype talks to a real Azure Cosmos account (no emulator in the default compose). On a new machine you must either (a) point .env at the same Cosmos account (to see/share existing jobs) or (b) point at your own DB and set COSMOS_AUTO_INIT=true so containers are created on boot. Without valid Cosmos creds the service starts but every fleet call fails.
FLEET_METRICS_TOKEN — only needed if you run Prometheus (§4); must match services/monitoring/prometheus/prometheus.yml (credentials:).
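
Because a missing or empty JWT_SECRET only surfaces later as failing fleet calls, it can help to check the file before starting anything. A minimal sketch, assuming a plain KEY=value .env layout; `check_env_keys` is a hypothetical helper, and the Cosmos variable names are deliberately not listed because they depend on your .env.example:

```shell
# check_env_keys FILE KEY... -> prints each KEY that is missing or empty in FILE.
# Returns 0 if all keys are set, 1 if any is missing, 2 if FILE does not exist.
check_env_keys() {
  env_file=$1; shift
  [ -f "$env_file" ] || { echo "no such file: $env_file" >&2; return 2; }
  status=0
  for key in "$@"; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: $key"
      status=1
    fi
  done
  return "$status"
}
```

Usage might look like `check_env_keys ~/code/learning_ai_common_plat/.env JWT_SECRET`, adding FLEET_METRICS_TOKEN to the key list only if you run Prometheus.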

2.2 Install + build the workspace (required for the host path)

Host-run resolves @bytelyst/* workspace packages from their dist/ (the exports field points at dist), so you must build them once before tsx/Next can import them:

cd ~/code/learning_ai_common_plat
pnpm install
pnpm -r build          # builds all workspace packages (incl. @bytelyst/* → dist/)
# (faster, just the platform-service closure:)
# pnpm -r --filter @lysnrai/platform-service... build

Skipping this is the #1 fresh-machine failure: tsx watch crashes with Cannot find module '@bytelyst/.../dist/index.js'. Re-run pnpm -r build after pulling changes to shared packages.

3. Bring up platform-service + tracker-web

Two ways. Option A (all localhost, no Docker) is recommended for a single dev Mac / WSL box — everything runs on the host, so gh-backed features work out of the box. Option B (Docker) is for when you also want the Grafana/Prometheus stack.

Option A — all localhost, no Docker (recommended)

Two long‑lived processes, each in its own terminal. Both assume §2.1/§2.2 are done (.env filled, pnpm -r build run).

Terminal 1 — coordinator (platform-service, :4003):

cd ~/code/learning_ai_common_plat/services/platform-service
pnpm exec tsx watch --env-file=../../.env src/server.ts

tsx watch hot-reloads on source changes. Use the explicit --env-file=../../.env (the bare pnpm dev script does not load the root .env, so JWT_SECRET/Cosmos would be missing). FLEET_METRICS_TOKEN is already in .env if you set it in §2.1.

Terminal 2 — dashboard (tracker-web, :3003):

cd ~/code/learning_ai_common_plat/dashboards/tracker-web
pnpm dev          # serves http://localhost:3003 (proxies /api → :4003)

That's the whole coordinator + UI. Monitoring (Grafana/Prometheus) is optional on the host path — GET /api/fleet/metrics (JSON), GET /api/fleet/autoscale, and the tracker-web job pages cover observability without it. To get the Grafana "Fleet Overview" dashboard you need Prometheus + Grafana (run them via Docker — Option B — or Homebrew binaries pointed at services/monitoring/...).

Because everything is on the host, gh is on PATH → the PR‑state reconcile (§8) and ship‑time gh pr merge work (unlike the Docker container, which has no gh).

Health checks:

curl -s -o /dev/null -w '%{http_code}\n' http://localhost:4003/health   # 200
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:3003          # 200
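
Both services take a few seconds to come up, so a one-shot curl can report a false failure. A minimal sketch of a polling wrapper around the same curl call; `wait_for_200` and its default of 30 tries, 2 seconds apart, are assumptions for illustration:

```shell
# wait_for_200 URL [TRIES] [DELAY_SEC] -> returns 0 once URL answers HTTP 200.
wait_for_200() {
  url=$1; tries=${2:-30}; delay=${3:-2}
  i=0
  code=000
  while [ "$i" -lt "$tries" ]; do
    # Same probe as the manual health check; a refused connection counts as 000.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url" 2>/dev/null) || code=000
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up on $url after $tries tries (last status: $code)" >&2
  return 1
}
```

For example, `wait_for_200 http://localhost:4003/health && wait_for_200 http://localhost:3003` blocks until both the coordinator and the dashboard respond.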

Option B — Docker (adds Grafana + Prometheus)

cd ~/code/learning_ai_common_plat
# targeted fleet subset that always builds cleanly:
docker compose up -d --build platform-service prometheus grafana
# (full stack: bash scripts/prototype-up.sh)

Starts platform-service (:4003), prometheus (:9090), grafana (:3000, admin/lysnrai) + deps. Still run tracker-web from source (Option A, Terminal 2).

Docker caveats:

prototype-up.sh may fail building the dashboard images when corepack prepare pnpm@… can't fetch pnpm on a restricted network → use the targeted subset above.

gh is NOT in the container → coordinator‑side gh pr merge and PR‑reconcile (§8) are no‑ops in Docker. Use the host path (Option A) if you need them.

Don't run both: the container and a host tsx both bind :4003 (docker compose stop platform-service before host‑running).

4. Make Prometheus auth work (only if running Prometheus)

Skip this on the host path unless you also run Prometheus. prometheus.yml scrapes /api/fleet/metrics/prom with a bearer, so the running platform-service must see the same FLEET_METRICS_TOKEN:

cd ~/code/learning_ai_common_plat
grep -q '^FLEET_METRICS_TOKEN=' .env || \
  printf '\nFLEET_METRICS_TOKEN=changeme-fleet-metrics-token\n' >> .env
# host path: restart Terminal-1 tsx so it re-reads .env
# docker path: docker compose up -d platform-service

Verify (if Prometheus is up): http://localhost:9090/api/v1/targets → platform-service-fleet is up. The value must equal credentials: in services/monitoring/prometheus/prometheus.yml.

5. Mint a local API token (dev only)

platform-service verifies HS256 JWTs signed with JWT_SECRET and requires type: "access". The tracker-web UI obtains one via login; for scripts/agents and the factory, mint one directly. Local dev only — never commit tokens or the secret.

Save mint-token.mjs (resolve jose from the workspace):

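import { readFileSync } from 'node:fs';
// adjust the jose path to your checkout if needed:
import { SignJWT } from '/ABS/PATH/learning_ai_common_plat/node_modules/.pnpm/jose@5.10.0/node_modules/jose/dist/node/esm/index.js';

// Read JWT_SECRET from the same root .env the service loads, so both sides share it.
const env = readFileSync('/ABS/PATH/learning_ai_common_plat/.env', 'utf8');
const match = env.match(/^JWT_SECRET=(.*)$/m);
if (!match || !match[1].trim()) {
  console.error('JWT_SECRET is missing or empty in .env');
  process.exit(1);
}
// Strip optional surrounding quotes so the value matches what --env-file loads.
const secret = new TextEncoder().encode(match[1].trim().replace(/^(['"])(.*)\1$/, '$2'));
const ttl = process.argv[2] || '15m'; // e.g. '15m' for scripts, '24h' for a factory
process.stdout.write(
  // type: 'access' is required by platform-service; role: 'admin' is for local dev only.
  await new SignJWT({ sub: 'local-dev', role: 'admin', type: 'access' })
    .setProtectedHeader({ alg: 'HS256' })
    .setIssuedAt()
    .setExpirationTime(ttl)
    .sign(secret)
);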

node mint-token.mjs 15m > /tmp/tok        # short-lived, for API calls
node mint-token.mjs 24h > /tmp/factok     # longer-lived, for the factory daemon

Find the jose path with: find . -path '*/node_modules/jose/dist/*/esm/index.js' | head -1.

Requests must also carry the product header X-Product-Id: <productId> (e.g. notelett). role: admin bypasses tenant ownership checks when FLEET_TENANT_ENFORCEMENT is on (it's off by default).
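
When a call returns 401, it is often quicker to look at the token's claims than to re-mint blindly. A minimal sketch that decodes the payload segment of a JWT without verifying the signature; `jwt_payload` is a hypothetical helper, and it assumes a `base64` that accepts `-d` (GNU coreutils and recent macOS):

```shell
# jwt_payload TOKEN -> prints the decoded (unverified) JSON payload of a JWT.
jwt_payload() {
  # The payload is the second dot-separated segment, base64url-encoded without padding.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that base64url strips, then decode as plain base64.
  case $(( ${#seg} % 4 )) in
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Running `jwt_payload "$(cat /tmp/tok)"` should show the sub, role, type, iat, and exp claims set by mint-token.mjs; compare exp against `date +%s` to see whether the token has expired.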

6. Submit a job

Via tracker-web (preferred)

Open http://localhost:3003/dashboard/fleet/jobs and click "New job". Set the correct product first (the product selector): a job is partitioned by productId, so submitting under the wrong product misattributes cost/metrics/ownership, and the factory won't see the job under the product it polls.

PR‑mode fields that matter:

repo — must be owner/name (e.g. saravanakumardb1/learning_ai_notes) or a clone URL, not a bare name (the factory feeds it to gh).
baseBranch — e.g. main.
engine — devin (pins the agent; otherwise the factory's default/engineClass).
autoMerge — leave false for a human merge gate (recommended for large PRs).
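
The repo rule above is easy to check before submitting. A minimal sketch, assuming that "clone URL" means the usual HTTPS or SSH forms; `is_valid_repo` is a hypothetical helper and is not part of the factory:

```shell
# is_valid_repo VALUE -> returns 0 for "owner/name" or a clone URL, 1 otherwise.
is_valid_repo() {
  case "$1" in
    https://*/*/* | git@*:*/*) return 0 ;;   # clone URL (HTTPS or SSH form)
    '' | /* | */ | */*/*)      return 1 ;;   # empty, leading/trailing slash, too many parts
    */*)                       return 0 ;;   # owner/name
    *)                         return 1 ;;   # bare name such as learning_ai_notes
  esac
}
```

For example, `is_valid_repo saravanakumardb1/learning_ai_notes` succeeds, while `is_valid_repo learning_ai_notes` fails.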

Via API

JOB=$(curl -s -X POST http://localhost:4003/api/fleet/jobs \
  -H "Authorization: Bearer $(cat /tmp/tok)" \
  -H "X-Product-Id: notelett" -H 'Content-Type: application/json' \
  -d '{
    "idempotencyKey": "notelett-demo-1",
    "bodyMd": "# Task\n…full prompt…",
    "priority": "high",
    "engine": "devin",
    "repo": "saravanakumardb1/learning_ai_notes",
    "baseBranch": "main",
    "autoMerge": false
  }')
echo "$JOB"   # → { outcome: "created", job: { id: "fjob_…", stage: "queued", ... } }

The job is now queued and claimable. It will not run until a factory polls for its product (next step).
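
Later steps need the job id from that response. A minimal sketch that pulls it out without a JSON parser, assuming the id keeps the fjob_ prefix shown above and consists of letters, digits, underscores, and hyphens (the character set is an assumption); `extract_job_id` is a hypothetical helper:

```shell
# extract_job_id -> reads a JSON response on stdin and prints the first fjob_ id found.
extract_job_id() {
  grep -oE 'fjob_[A-Za-z0-9_-]+' | head -n 1
}
```

Usage might look like `JOB_ID=$(echo "$JOB" | extract_job_id)`, after which `$JOB_ID` can be substituted for `<jobId>` in the event and reconcile calls.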

7. Start the factory (agent-queue, fleet mode)

The factory lives in a separate repo: learning_ai_devops_tools/agent-queue. Run it on the host (needs devin + gh). Read its docs/RUN_POLICY.md first.

7a. Sanity‑check connectivity (safe — registers + heartbeats only)

cd ~/code/learning_ai_devops_tools/agent-queue
./agent-queue.sh init   # idempotent

AQ_FLEET=1 AQ_FLEET_ROUTE=1 \
AQ_FLEET_API=http://localhost:4003/api \
AQ_PRODUCT_ID=notelett \
AQ_FLEET_TOKEN="$(cat /tmp/factok)" \
AQ_FACTORY_ID=mac-local-1 \
AQ_FLEET_CAPS=engine:devin \
AQ_FLEET_LEASE_RENEW_SEC=60 \
  ./agent-queue.sh fleet-status      # → "heartbeat OK (registered)."

7b. Launch the run loop (claims + runs the agent)

cd ~/code/learning_ai_devops_tools/agent-queue
AQ_FLEET=1 AQ_FLEET_ROUTE=1 AQ_FLEET_PR=1 \
AQ_FLEET_API=http://localhost:4003/api \
AQ_PRODUCT_ID=notelett \
AQ_FLEET_TOKEN="$(cat /tmp/factok)" \
AQ_FACTORY_ID=mac-local-1 \
AQ_FLEET_CAPS=engine:devin \
AQ_FLEET_LEASE_RENEW_SEC=60 \
  ./agent-queue.sh run --max 1

⚠️ Set AQ_FLEET_LEASE_RENEW_SEC below 90 (e.g. 60). This is the heartbeat/lease‑renew cadence. The coordinator's reaper marks a factory stale after 90s (DEFAULT_STALE_FACTORY_MS, a constant with no env knob) and reclaims its in‑flight lease. The default cadence is 300s, so a busy single‑slot factory looks stale for most of every cycle and its running job gets requeued mid‑run: leaseEpoch climbs, stage flips back to queued, and the final report is fenced, so the job never advances to review/shipped. 60s keeps it comfortably live. (The §7a fleet-status check sets the same value for consistency.)
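
The arithmetic behind that warning: with a 300s cadence and a 90s window, the factory looks live for 90s after each heartbeat and stale for the remaining 210s, i.e. about 70% of every cycle. A minimal sketch of a pre-launch guard; `check_lease_renew` is a hypothetical helper, and the 90s value is copied from the warning rather than read from the coordinator:

```shell
STALE_WINDOW_SEC=90   # coordinator's stale-factory window, per the warning above

# check_lease_renew SEC -> returns 0 if SEC beats the stale window,
# otherwise prints how much of each cycle the factory would look stale and returns 1.
check_lease_renew() {
  sec=$1
  case "$sec" in
    '' | *[!0-9]*) echo "not a whole number of seconds: $sec" >&2; return 2 ;;
  esac
  if [ "$sec" -lt "$STALE_WINDOW_SEC" ]; then
    return 0
  fi
  # Live for the window after each heartbeat, stale until the next one arrives.
  stale_pct=$(( (sec - STALE_WINDOW_SEC) * 100 / sec ))
  echo "renew every ${sec}s >= ${STALE_WINDOW_SEC}s window: stale ~${stale_pct}% of each cycle"
  return 1
}
```

For example, `check_lease_renew 60` passes silently, while `check_lease_renew 300` reports roughly 70% and fails, which matches the mid-run requeue described above.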

Key fleet env vars (see lib/fleet-client.sh):

| Var | Meaning |
| --- | --- |
| `AQ_FLEET=1` | master switch: enable coordinator calls (0 = pure offline) |
| `AQ_FLEET_ROUTE=1` | coordinator is authoritative for claim (pulls work from platform-service) |
| `AQ_FLEET_PR=1` | PR mode: open a PR for jobs that target a `repo` |
| `AQ_FLEET_API` | base URL including `/api` (`http://localhost:4003/api`) |
| `AQ_FLEET_TOKEN` | Bearer JWT (mint per §5; ≥ run duration, e.g. 24h) |
| `AQ_PRODUCT_ID` | product to poll, sent as `X-Product-Id` (must match the job's product) |
| `AQ_FACTORY_ID` | this factory's id (registered/heartbeated) |
| `AQ_FLEET_CAPS` | advertised capabilities, e.g. `engine:devin` |
| `AQ_FLEET_LEASE_RENEW_SEC` | set `<90` (e.g. `60`): heartbeat/renew cadence vs the 90s stale window (see warning) |
| `AQ_FLEET_REPO_BASE` | (optional) dir of local checkouts; if `…/<repo>/.git` exists it uses a git worktree, else it `git clone`s `https://github.com/<repo>.git` into its cache |
| `AQ_FLEET_AUTOSHIP=1` | (optional) auto-advance to `shipped` (skips the manual gate) |

The run loop claims the job (queued → assigned → building), runs Devin in an isolated checkout, heartbeats and renews the lease (lease_renewed events) so the reaper doesn't reclaim it, then on a clean agent exit (rc=0) moves the job to review and (PR mode) opens the PR. With autoMerge:false it stops at the human merge gate.

Repo checkout: the job's repo is owner/name, so by default the factory git clones https://github.com/<owner>/<name>.git into its own cache (queue/.state/repos/…) — clean isolation, nothing touches your working copies. To reuse an existing local clone via a git worktree instead, set AQ_FLEET_REPO_BASE=<parent> where <parent>/<owner>/<name>/.git exists.
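
The checkout decision described above comes down to one existence test. A minimal sketch of that rule as stated in this runbook, not a copy of the factory's code; `resolve_checkout` is a hypothetical helper that only prints which path would be taken:

```shell
# resolve_checkout REPO [BASE] -> prints "worktree <path>" or "clone <url>".
# REPO is owner/name; BASE plays the role of AQ_FLEET_REPO_BASE (may be empty).
resolve_checkout() {
  repo=$1; base=${2:-}
  # -e rather than -d: .git is a directory in a normal clone but a file in a worktree.
  if [ -n "$base" ] && [ -e "$base/$repo/.git" ]; then
    echo "worktree $base/$repo"
  else
    echo "clone https://github.com/$repo.git"
  fi
}
```

For example, `resolve_checkout saravanakumardb1/learning_ai_notes` with no base prints the clone URL, which is the default isolated path.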

8. Observe progress

tracker-web: http://localhost:3003/dashboard/fleet/jobs/<jobId> — live event stream (SSE), runs, PR link + state.

Events/API:

curl -s http://localhost:4003/api/fleet/jobs/<jobId>/events \
  -H "Authorization: Bearer $(cat /tmp/tok)" -H "X-Product-Id: notelett"

Metrics: GET /api/fleet/metrics (JSON, per product) · GET /api/fleet/metrics/prom (Prometheus, all products; needs FLEET_METRICS_TOKEN) · Grafana Fleet Overview (http://localhost:3000/d/fleet-overview).
Autoscale signal: GET /api/fleet/autoscale (this product) / …/autoscale/all.

PR‑state reconcile (externally‑merged PRs)

If you merge the PR in the GitHub UI, the coordinator doesn't know until told. Trigger a reconcile (flips run prState → merged when gh pr view reports MERGED):

UI: "Refresh PR status" button on the job's PR section, or
API: POST /api/fleet/jobs/<jobId>/pr/reconcile.

Requires gh where platform-service runs → use the host path (§3 Option A); it's a no‑op in the Docker container (no gh).

9. Safety & cost

Billable + autonomous + long‑running. Each run consumes Devin credits and can run for a long time unattended. Scope jobs deliberately; very large multi‑workstream specs are better split into several jobs.
Real PR. PR mode pushes a branch and opens a PR on the target repo. Keep autoMerge:false so a human reviews/merges; gh pr merge (auto) only fires when the job opts in or FLEET_SHIP_MERGES_PR=1.
Isolation. The factory works in an isolated worktree/clone, never your main checkout (per agent-queue/docs/RUN_POLICY.md). Avoid blanket --yolo on live trees.
Stopping the daemon. Stopping mid‑run lets the lease expire; the coordinator's reaper then reclaims and requeues the job, so partial work may be retried. Only stop it deliberately.
Tokens/secrets: the minted JWT and JWT_SECRET are sensitive — never commit them or paste into shared logs. .env is git‑ignored; keep it that way.

10. Teardown

# stop the factory: Ctrl-C the run loop
# host path: Ctrl-C the tsx (Terminal 1) and pnpm dev (Terminal 2)
# docker path:
#   cd ~/code/learning_ai_common_plat && docker compose down     # keep volumes
#   docker compose down -v                                       # also drop volumes
rm -f /tmp/tok /tmp/factok     # discard minted tokens

11. Troubleshooting

| Symptom | Cause → Fix |
| --- | --- |
| `Cannot find module '@bytelyst/…/dist/index.js'` on `tsx`/Next start | workspace packages not built → `pnpm -r build` (§2.2). |
| `401 {"error":"Invalid or expired token"}` | JWT expired/mis‑signed → re‑mint (§5); ensure same `JWT_SECRET` as the running service. |
| Job claimed then flips back to `queued` mid‑run; `leaseEpoch` keeps climbing; final report fenced; PR opens but job never reaches `review`/`shipped` | factory heartbeat cadence (`AQ_FLEET_LEASE_RENEW_SEC`, default 300s) > reaper stale window (90s) → set `AQ_FLEET_LEASE_RENEW_SEC=60` (§7). To recover the record after the fact, reconcile PR state (§8). |
| Job stays `queued`, never claimed | No factory for that product → `fleet-status` shows it registered? `AQ_PRODUCT_ID` must equal the job's product. Check `GET /api/fleet/factories` (X‑Product‑Id) for `0 live`. |
| `POST …/pr/reconcile` or ship auto‑merge does nothing | `gh` not present where platform-service runs (Docker container) → run the host path (§3 Option A). |
| Prometheus target `platform-service-fleet` = `down (401)` | service missing `FLEET_METRICS_TOKEN` → §4 (restart host `tsx` / recreate container). |
| `prototype-up.sh` build fails on `corepack prepare pnpm` | dashboard image network issue → use the targeted subset, or just use the host path (Option A). |
| `POST …/actions/<x>` returns 500 "Body cannot be empty" | sent `Content-Type: application/json` with no body → omit the header or send `{}`. |
| Port `4003` conflict | host `tsx watch` and a `platform-service` container both bind `4003` → run only one. |
| `gh pr create` fails | `repo` is a bare name → must be `owner/name` or a clone URL; confirm `gh auth status`. |
| PR/cost attributed to wrong product | job submitted under the wrong `productId` partition → resubmit under the right product and cancel the stray (`POST …/actions/cancel`). |
| `vitest` exits non‑zero with `kill EPERM` after all suites pass | worker‑pool teardown artifact (sandbox), not a test failure → re‑run; all suites already passed. |

12. Copy‑paste quickstart — all localhost (notelett → learning_ai_notes)

Assumes §2.1/§2.2 done (.env filled, pnpm -r build run). Four terminals.

# Terminal 1 — coordinator
cd ~/code/learning_ai_common_plat/services/platform-service
pnpm exec tsx watch --env-file=../../.env src/server.ts

# Terminal 2 — dashboard
cd ~/code/learning_ai_common_plat/dashboards/tracker-web && pnpm dev   # :3003

# Terminal 3 — tokens + submit (save mint-token.mjs from §5; fix ABS paths)
node mint-token.mjs 15m > /tmp/tok
node mint-token.mjs 24h > /tmp/factok
curl -s -X POST http://localhost:4003/api/fleet/jobs \
  -H "Authorization: Bearer $(cat /tmp/tok)" -H "X-Product-Id: notelett" \
  -H 'Content-Type: application/json' \
  -d '{"idempotencyKey":"notelett-demo-1","bodyMd":"# Task…","priority":"high","engine":"devin","repo":"saravanakumardb1/learning_ai_notes","baseBranch":"main","autoMerge":false}'

# Terminal 4 — factory (runs Devin → opens a real PR). NOTE the <90s heartbeat cadence.
cd ~/code/learning_ai_devops_tools/agent-queue && ./agent-queue.sh init
AQ_FLEET=1 AQ_FLEET_ROUTE=1 AQ_FLEET_PR=1 AQ_FLEET_API=http://localhost:4003/api \
AQ_PRODUCT_ID=notelett AQ_FLEET_TOKEN="$(cat /tmp/factok)" \
AQ_FACTORY_ID=mac-local-1 AQ_FLEET_CAPS=engine:devin AQ_FLEET_LEASE_RENEW_SEC=60 \
  ./agent-queue.sh run --max 1

13. WSL on Windows — differences to note

The flow is identical inside a WSL2 (Ubuntu) shell, with these adjustments. Treat WSL as "the Linux host" — install and run everything inside WSL, not Windows.

Keep repos on the WSL filesystem, not /mnt/c. Clone under e.g. ~/code inside WSL. On /mnt/c (the Windows drive over 9p) tsx watch/Next file‑watching is unreliable (inotify doesn't fire) and git/pnpm are far slower. This is the single most important difference.
Install the toolchain inside WSL (Linux builds): node/pnpm (nvm), git, gh, and the devin CLI. Run gh auth login and Devin auth inside WSL too. A gh or devin installed on Windows is not visible to the WSL bash factory.
Line endings. Clone inside WSL (don't reuse a Windows checkout with core.autocrlf=true) so the *.sh scripts stay LF — CRLF breaks agent-queue.sh (bad interpreter/\r). If needed: git config --global core.autocrlf input.
Reaching the UI from the Windows browser. WSL2 forwards localhost, so http://localhost:3003 / :4003 usually work from a Windows browser. If they don't (older Windows / mirrored‑networking off), use the WSL IP (hostname -I) or set networkingMode=mirrored in .wslconfig.
Ports. Make sure nothing on the Windows side already binds 3003/4003 (WSL2 publishes to the same localhost). Stop the Windows process or change ports.
Docker (Option B), if used. Use Docker Desktop with the WSL2 backend and run docker compose from inside the WSL shell. host.docker.internal resolves from containers to the host as on Mac.
/tmp token paths (/tmp/tok, /tmp/factok) are the WSL /tmp — fine; just keep all four terminals in the same WSL distro so they share it.
Clock skew. If WSL's clock drifts after sleep, JWT iat/exp checks can fail (Invalid or expired token) — sudo hwclock -s (or restart WSL) to resync.
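
Of the differences above, CRLF line endings are the hardest to spot by eye. A minimal sketch that detects them, assuming a POSIX shell with grep; `has_crlf` is a hypothetical helper:

```shell
# has_crlf FILE -> returns 0 if FILE contains at least one carriage return.
has_crlf() {
  cr=$(printf '\r')   # command substitution keeps the CR; only trailing newlines are stripped
  grep -q "$cr" "$1"
}
```

Usage might look like `has_crlf agent-queue.sh && echo "CRLF found: re-clone inside WSL"`, run from the agent-queue directory.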

Everything else — env vars, pnpm -r build, tsx --env-file, the factory env incl. AQ_FLEET_LEASE_RENEW_SEC=60, token minting — is identical to the Mac host path.

Reference

Coordinator routes: services/platform-service/src/modules/fleet/routes.ts
Coordinator logic: services/platform-service/src/modules/fleet/coordinator.ts
Factory fleet client: learning_ai_devops_tools/agent-queue/lib/fleet-client.sh
Factory runner + PR mode: learning_ai_devops_tools/agent-queue/agent-queue.sh
Gigafactory spec/roadmap: learning_ai_devops_tools/agent-queue/docs/GIGAFACTORY/
Prometheus scrape config: services/monitoring/prometheus/prometheus.yml
Grafana dashboard: services/monitoring/grafana/dashboards/fleet-overview.json
