feat(agent-queue): build/ship lifecycle with auto-QA verify gate + manual ship
Redesign the kanban runner stages from inbox->doing->done/failed to inbox->building->review->testing->shipped (+ failed): - worker: agent rc=0 lands in review/, then runs the configurable verify command (frontmatter verify: / AGENT_QUEUE_VERIFY) in cwd; pass -> testing/ (QA), fail -> failed/, none -> parks in review/ - new commands: ship (testing->shipped, manual gate), promote (advance one stage), reject (review/testing->failed); requeue now also pulls from review/testing - status + dashboard.mjs render all six stages; RECENT panel labels shipped/testing/review/verify_failed/timeout/rejected - README: new lifecycle diagram, verify: frontmatter, result= glossary, command table + folder layout - selftest: assert no-verify->review, verify-pass->testing->ship->shipped, verify-fail->failed - rename queue/doing->building, queue/done->review; add testing/ shipped/
This commit is contained in:
parent
27feba36fa
commit
af1bc6904e
@ -3,7 +3,25 @@
|
||||
A zero-dependency **folder "kanban" runner** for headless coding-agent CLIs —
|
||||
**Devin**, **Claude Code**, and **OpenAI Codex**. Drop prompt `.md` files into a folder,
|
||||
and they get executed (in auto-approve mode) one slot at a time, moving through
|
||||
`inbox → doing → done/failed` with live status.
|
||||
`inbox → building → review → testing → shipped` (plus `failed`) with live status.
|
||||
|
||||
**Build/ship lifecycle — auto-QA, manual ship:**
|
||||
|
||||
```
|
||||
inbox ─▶ building ─▶ review ─▶ testing ─▶ shipped
|
||||
(queued) (agent (rc=0; (verify (you ran
|
||||
running) awaiting passed — `ship`)
|
||||
verify) QA gate)
|
||||
│
|
||||
agent rc≠0 / │ verify fails
|
||||
timeout ──────────┴──────────────▶ failed
|
||||
```
|
||||
|
||||
- **Auto:** agent exits 0 → `review/`. If a `verify:` command is configured it runs
|
||||
automatically: **pass → `testing/` (QA)**, **fail → `failed/`**. No `verify:` →
|
||||
the job parks in `review/` for a manual `promote`.
|
||||
- **Manual:** you `ship` a `testing/` job → `shipped/` (the human gate). Shipping is
|
||||
never automatic.
|
||||
|
||||
> **Why this exists:** the agent CLIs ship a minimal local interface (no built-in
|
||||
> batch/queue/dashboard — that lives in their *cloud* products). This is the
|
||||
@ -37,7 +55,7 @@ In a **second terminal**, watch progress:
|
||||
|
||||
```
|
||||
AGENT QUEUE /…/agent-queue/queue
|
||||
inbox 3 doing 2 done 5 failed 0 running 2/2
|
||||
inbox 3 building 2 review 1 testing 2 shipped 5 failed 0 running 2/2
|
||||
|
||||
RUNNING
|
||||
20260528-2130__UX-2 devin 4m12s pid 51234 ⏺ Edited src/app/dashboard/items/page.tsx
|
||||
@ -58,6 +76,9 @@ cwd: /abs/path/to/repo # where the agent executes (default: cwd when added)
|
||||
yolo: true # auto-approve ALL tools (default: true)
|
||||
lock: my-repo # optional mutex key (default: cwd). Jobs sharing a key run serially
|
||||
timeout: 45m # optional. 90s|45m|2h|1d. On expiry → failed (result=timeout)
|
||||
verify: pnpm -s test # optional auto-QA gate. Runs in cwd after rc=0:
|
||||
# pass → testing/ (QA), fail → failed/
|
||||
# (omit to park in review/ for manual promote)
|
||||
---
|
||||
|
||||
# Your task / roadmap goes here
|
||||
@ -91,10 +112,13 @@ misparsed as a flag.
|
||||
| `run [--max N] [--engine E] [--once]` | process the inbox (foreground loop) |
|
||||
| `status` | kanban counts + running-worker table (marks `⚠ stalled` workers) |
|
||||
| `watch [interval]` | live `status` (bash), redrawn every N seconds (default 2) |
|
||||
| `dash [--interval N]` | richer **Node** live dashboard — running workers (engine, elapsed, last log line, stall) + recent done/failed |
|
||||
| `dash [--interval N]` | richer **Node** live dashboard — running workers (engine, elapsed, last log line, stall) + recent shipped/failed |
|
||||
| `stop` | kill running workers + the run loop |
|
||||
| `logs <job> [-f]` | print / follow a job's log |
|
||||
| `requeue <job>` | move a failed job back to `inbox/` for a fresh run |
|
||||
| `promote <job>` | advance one stage forward: `review → testing → shipped` |
|
||||
| `ship <job>` | **manual gate:** move a `testing/` (QA) job → `shipped/` |
|
||||
| `reject <job>` | send a `review/` or `testing/` job → `failed/` |
|
||||
| `requeue <job>` | move a `failed`/`review`/`testing` job back to `inbox/` for a fresh run |
|
||||
| `clean [--keep N]` | archive finished logs+meta beyond the newest N (default 50) into `queue/.archive/` |
|
||||
|
||||
Only one `run` loop may be active per queue — a second `run` against the same
|
||||
@ -114,15 +138,20 @@ Wired into the repo's unified CLI (no GitHub token required for this subcommand)
|
||||
```
|
||||
queue/
|
||||
inbox/ # drop / queued .md files (oldest eligible picked first)
|
||||
doing/ # currently executing
|
||||
done/ # exited 0
|
||||
failed/ # non-zero exit, bad cwd, or timeout (result=timeout)
|
||||
logs/ # <job>.log — full agent output
|
||||
building/ # currently executing (agent running)
|
||||
review/ # agent exited 0 — awaiting the auto-QA verify gate (or manual promote)
|
||||
testing/ # verify passed (QA) — awaiting manual `ship`
|
||||
shipped/ # manually shipped — the terminal success stage
|
||||
failed/ # non-zero exit, bad cwd, timeout, verify failure, or manual reject
|
||||
logs/ # <job>.log — full agent + verify output
|
||||
locks/ # per-key flock files (Linux hardening; unused on macOS)
|
||||
.state/ # <job>.meta heartbeats + daemon.pid (runtime only)
|
||||
.archive/ # <ts>/ — logs+meta moved here by `clean`
|
||||
```
|
||||
|
||||
**`result=` values** written to `<job>.meta`: `review`, `testing`, `shipped`,
|
||||
`failed`, `timeout`, `verify_failed`, `rejected`, `requeued`.
|
||||
|
||||
## Config (env overrides)
|
||||
|
||||
| Var | Default | Meaning |
|
||||
@ -131,6 +160,7 @@ queue/
|
||||
| `AGENT_QUEUE_MAX` | `2` | max concurrent agents |
|
||||
| `AGENT_QUEUE_ENGINE` | `devin` | default engine when none in frontmatter |
|
||||
| `AGENT_QUEUE_POLL` | `3` | inbox poll interval (seconds) |
|
||||
| `AGENT_QUEUE_VERIFY` | _(empty)_ | default auto-QA verify command; per-job `verify:` overrides it |
|
||||
| `AGENT_QUEUE_STALL_MIN` | `10` | minutes of unchanged log before a worker is `⚠ stalled` |
|
||||
| `DEVIN_BIN` / `CLAUDE_BIN` / `CODEX_BIN` | autodetected | override CLI binary paths |
|
||||
| `FLOCK_BIN` / `TIMEOUT_BIN` | autodetected | `flock` (lock hardening) and `timeout`/`gtimeout` (hard timeouts); absent on stock macOS — see notes |
|
||||
@ -160,6 +190,7 @@ run shell commands, and commit unattended. Mitigate:
|
||||
- [x] Per-job `timeout:` with hard kill (or bash watchdog fallback).
|
||||
- [x] Stall detection in `status`/`dash`.
|
||||
- [x] `requeue` failed jobs + `clean`/archive old runs.
|
||||
- [x] Build/ship lifecycle: `review → testing → shipped` with auto-QA `verify:` gate + manual `ship`.
|
||||
- [ ] `--push` opt-in policy + commit review gate.
|
||||
- [ ] Optional notifications (Slack/desktop) on done/failed/stall.
|
||||
- [ ] Persisted run-loop as a daemon/service with auto-restart.
|
||||
|
||||
@ -4,10 +4,15 @@
|
||||
#
|
||||
# Drop a prompt .md file into queue/inbox/, and `agent-queue run` will:
|
||||
# 1. pick the oldest file (respecting --max concurrency),
|
||||
# 2. move it inbox/ -> doing/,
|
||||
# 2. move it inbox/ -> building/,
|
||||
# 3. launch the chosen agent CLI (devin | claude | codex) in --yolo mode,
|
||||
# 4. on success move doing/ -> done/, on failure -> failed/,
|
||||
# 5. write a per-job log + live state so `status`/`watch` can show progress.
|
||||
# 4. on agent rc=0 move building/ -> review/, then run the auto-QA verify gate:
|
||||
# verify pass -> testing/ verify fail -> failed/ (no verify -> stays in review/)
|
||||
# 5. on agent failure/timeout move building/ -> failed/,
|
||||
# 6. you manually `ship` testing/ -> shipped/ (the human gate),
|
||||
# 7. write a per-job log + live state so `status`/`watch` can show progress.
|
||||
#
|
||||
# Lifecycle: inbox -> building -> review -> testing -> shipped (+ failed)
|
||||
#
|
||||
# Per-task config travels in YAML-ish frontmatter at the top of the .md:
|
||||
# ---
|
||||
@ -16,7 +21,8 @@
|
||||
# yolo: true # auto-approve all tools (default: true)
|
||||
# ---
|
||||
#
|
||||
# Subcommands: init | add | run | status | watch | stop | logs | help
|
||||
# Subcommands: init | add | run | status | watch | dash | stop | logs |
|
||||
# promote | ship | reject | requeue | clean | help
|
||||
#
|
||||
set -uo pipefail
|
||||
|
||||
@ -24,8 +30,10 @@ set -uo pipefail
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
QUEUE_ROOT="${AGENT_QUEUE_ROOT:-$SCRIPT_DIR/queue}"
|
||||
INBOX="$QUEUE_ROOT/inbox"
|
||||
DOING="$QUEUE_ROOT/doing"
|
||||
DONE="$QUEUE_ROOT/done"
|
||||
BUILDING="$QUEUE_ROOT/building"
|
||||
REVIEW="$QUEUE_ROOT/review"
|
||||
TESTING="$QUEUE_ROOT/testing"
|
||||
SHIPPED="$QUEUE_ROOT/shipped"
|
||||
FAILED="$QUEUE_ROOT/failed"
|
||||
LOGS="$QUEUE_ROOT/logs"
|
||||
STATE="$QUEUE_ROOT/.state"
|
||||
@ -38,6 +46,11 @@ POLL_SECONDS="${AGENT_QUEUE_POLL:-3}"
|
||||
# A running worker is flagged "stalled" if its log has not changed in this many
|
||||
# minutes (no new agent output) — surfaced in status + dash.
|
||||
STALL_MIN="${AGENT_QUEUE_STALL_MIN:-10}"
|
||||
# Auto-QA verify command. After an agent exits 0 the job lands in review/; if a
|
||||
# verify command is set (frontmatter `verify:` overrides this default) it runs in
|
||||
# the job's cwd: pass -> testing/ (QA), fail -> failed/. Empty default = jobs park
|
||||
# in review/ for manual `promote`. Shipping (testing -> shipped) is always manual.
|
||||
DEFAULT_VERIFY="${AGENT_QUEUE_VERIFY:-}"
|
||||
|
||||
# flock is used for cross-process lock hardening when available (Linux). macOS
|
||||
# has no flock; mutual exclusion there relies on the single run-loop (see cmd_run).
|
||||
@ -63,7 +76,7 @@ err() { printf '%s[agent-queue]%s %s\n' "$C_RED" "$C_RESET" "$*" >&2; }
|
||||
die() { err "$*"; exit 1; }
|
||||
|
||||
# ── Init ────────────────────────────────────────────────────────────
|
||||
ensure_dirs() { mkdir -p "$INBOX" "$DOING" "$DONE" "$FAILED" "$LOGS" "$STATE" "$LOCKS"; }
|
||||
ensure_dirs() { mkdir -p "$INBOX" "$BUILDING" "$REVIEW" "$TESTING" "$SHIPPED" "$FAILED" "$LOGS" "$STATE" "$LOCKS"; }
|
||||
|
||||
# ── Frontmatter parsing ─────────────────────────────────────────────
|
||||
# fm_get <file> <key> <default>
|
||||
@ -291,19 +304,45 @@ run_worker() {
|
||||
fi
|
||||
rm -f "$tmo_flag"
|
||||
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "exit=$rc" >> "$metaf"
|
||||
if $timed_out; then
|
||||
mv "$doing_file" "$FAILED/" 2>/dev/null
|
||||
echo "result=timeout" >> "$metaf"
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "TIMED OUT after ${tmo}s (rc=$rc): $(date)" >> "$logf"
|
||||
elif [[ $rc -eq 0 ]]; then
|
||||
mv "$doing_file" "$DONE/" 2>/dev/null
|
||||
echo "result=done" >> "$metaf"
|
||||
echo "completed OK (rc=0): $(date)" >> "$logf"
|
||||
# Agent succeeded: land in review/, then run the auto-QA verify gate. The
|
||||
# worker is still alive here so the concurrency slot stays held through
|
||||
# verification — `ended=` is written only once we reach a resting stage.
|
||||
mv "$doing_file" "$REVIEW/" 2>/dev/null
|
||||
local review_file="$REVIEW/$job.md"
|
||||
echo "completed OK (rc=0): landed in review — $(date)" >> "$logf"
|
||||
local verify; verify=$(fm_get "$review_file" verify "$DEFAULT_VERIFY")
|
||||
if [[ -z "$verify" ]]; then
|
||||
echo "result=review" >> "$metaf"
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "no verify command — parked in review for manual promote: $(date)" >> "$logf"
|
||||
else
|
||||
echo "----- verify: $verify -----" >> "$logf"
|
||||
local vrc=0
|
||||
( cd "$cwd" && bash -c "$verify" ) >> "$logf" 2>&1 || vrc=$?
|
||||
echo "verify_exit=$vrc" >> "$metaf"
|
||||
if [[ $vrc -eq 0 ]]; then
|
||||
mv "$review_file" "$TESTING/" 2>/dev/null
|
||||
echo "result=testing" >> "$metaf"
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "VERIFY PASSED — promoted to testing (QA): $(date)" >> "$logf"
|
||||
else
|
||||
mv "$review_file" "$FAILED/" 2>/dev/null
|
||||
echo "result=verify_failed" >> "$metaf"
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "VERIFY FAILED (rc=$vrc): $(date)" >> "$logf"
|
||||
fi
|
||||
fi
|
||||
else
|
||||
mv "$doing_file" "$FAILED/" 2>/dev/null
|
||||
echo "result=failed" >> "$metaf"
|
||||
echo "ended=$(date +%s)" >> "$metaf"
|
||||
echo "FAILED (rc=$rc): $(date)" >> "$logf"
|
||||
fi
|
||||
}
|
||||
@ -385,7 +424,7 @@ cmd_run() {
|
||||
[[ -z "$next" ]] && break
|
||||
|
||||
local job; job=$(basename "$next"); job=${job%.md}
|
||||
local doing_file="$DOING/$(basename "$next")"
|
||||
local doing_file="$BUILDING/$(basename "$next")"
|
||||
mv "$next" "$doing_file"
|
||||
local w_eng w_cwd w_yolo w_key
|
||||
w_eng=$(fm_get "$doing_file" engine "$DEFAULT_ENGINE")
|
||||
@ -420,14 +459,16 @@ _count() { ls -1 "$1"/*.md 2>/dev/null | wc -l | tr -d ' '; }
|
||||
|
||||
cmd_status() {
|
||||
ensure_dirs
|
||||
local ib dg dn fl
|
||||
ib=$(_count "$INBOX"); dg=$(_count "$DOING"); dn=$(_count "$DONE"); fl=$(_count "$FAILED")
|
||||
local ib bd rv ts sh fl
|
||||
ib=$(_count "$INBOX"); bd=$(_count "$BUILDING"); rv=$(_count "$REVIEW")
|
||||
ts=$(_count "$TESTING"); sh=$(_count "$SHIPPED"); fl=$(_count "$FAILED")
|
||||
local running; running=$(active_workers)
|
||||
echo
|
||||
printf '%s AGENT QUEUE %s %s\n' "$C_BOLD" "$C_DIM$QUEUE_ROOT$C_RESET" ""
|
||||
printf ' %sinbox%s %-3s %sdoing%s %-3s %sdone%s %-3s %sfailed%s %-3s %srunning%s %s/%s\n\n' \
|
||||
"$C_BLUE" "$C_RESET" "$ib" "$C_YEL" "$C_RESET" "$dg" \
|
||||
"$C_GREEN" "$C_RESET" "$dn" "$C_RED" "$C_RESET" "$fl" \
|
||||
printf ' %sinbox%s %-3s %sbuilding%s %-3s %sreview%s %-3s %stesting%s %-3s %sshipped%s %-3s %sfailed%s %-3s %srunning%s %s/%s\n\n' \
|
||||
"$C_BLUE" "$C_RESET" "$ib" "$C_YEL" "$C_RESET" "$bd" \
|
||||
"$C_CYAN" "$C_RESET" "$rv" "$C_CYAN" "$C_RESET" "$ts" \
|
||||
"$C_GREEN" "$C_RESET" "$sh" "$C_RED" "$C_RESET" "$fl" \
|
||||
"$C_BOLD" "$C_RESET" "$running" "$MAX_CONCURRENCY"
|
||||
|
||||
# running table
|
||||
@ -489,19 +530,77 @@ cmd_logs() {
|
||||
if [[ -n "$follow" ]]; then tail -f "$lf"; else cat "$lf"; fi
|
||||
}
|
||||
|
||||
# requeue <job> — move a failed job back to inbox/ for a fresh run.
|
||||
# _find_job <job> <dir...> — echo the first matching .md across the given dirs
|
||||
# (exact "<job>.md" preferred, else newest fuzzy match). Empty if none found.
|
||||
_find_job() {
|
||||
local job=$1; shift
|
||||
local d f
|
||||
for d in "$@"; do
|
||||
[[ -f "$d/$job.md" ]] && { printf '%s' "$d/$job.md"; return; }
|
||||
done
|
||||
for d in "$@"; do
|
||||
f=$(ls -1t "$d"/*"$job"*.md 2>/dev/null | head -1)
|
||||
[[ -f "$f" ]] && { printf '%s' "$f"; return; }
|
||||
done
|
||||
}
|
||||
|
||||
# requeue <job> — move a job back to inbox/ for a fresh run (from failed/review/testing).
|
||||
cmd_requeue() {
|
||||
ensure_dirs
|
||||
local job="${1:-}"
|
||||
[[ -n "$job" ]] || die "usage: requeue <job>"
|
||||
local f="$FAILED/$job.md"
|
||||
[[ -f "$f" ]] || f=$(ls -1t "$FAILED"/*"$job"*.md 2>/dev/null | head -1)
|
||||
[[ -f "$f" ]] || die "no failed job matching '$job'"
|
||||
local base name; base=$(basename "$f"); name=${base%.md}
|
||||
local f; f=$(_find_job "$job" "$FAILED" "$REVIEW" "$TESTING")
|
||||
[[ -n "$f" ]] || die "no failed/review/testing job matching '$job'"
|
||||
local base name from; base=$(basename "$f"); name=${base%.md}; from=$(basename "$(dirname "$f")")
|
||||
mv "$f" "$INBOX/$base"
|
||||
# drop stale state so it re-runs cleanly
|
||||
rm -f "$STATE/$name.meta" "$STATE/$name.body.md" "$STATE/$name.timedout"
|
||||
log "requeued $C_BOLD$base$C_RESET (failed → inbox)"
|
||||
log "requeued $C_BOLD$base$C_RESET ($from → inbox)"
|
||||
}
|
||||
|
||||
# ship <job> — manual promotion testing/ (QA) → shipped/. The human gate.
|
||||
cmd_ship() {
|
||||
ensure_dirs
|
||||
local job="${1:-}"
|
||||
[[ -n "$job" ]] || die "usage: ship <job>"
|
||||
local f; f=$(_find_job "$job" "$TESTING")
|
||||
[[ -n "$f" ]] || die "no job in testing/ matching '$job' (only QA-passed jobs can ship)"
|
||||
local base name; base=$(basename "$f"); name=${base%.md}
|
||||
mv "$f" "$SHIPPED/$base"
|
||||
[[ -f "$STATE/$name.meta" ]] && echo "result=shipped" >> "$STATE/$name.meta"
|
||||
log "shipped $C_BOLD$base$C_RESET (testing → shipped)"
|
||||
}
|
||||
|
||||
# promote <job> — advance one stage forward: review → testing → shipped.
|
||||
cmd_promote() {
|
||||
ensure_dirs
|
||||
local job="${1:-}"
|
||||
[[ -n "$job" ]] || die "usage: promote <job>"
|
||||
local f; f=$(_find_job "$job" "$REVIEW" "$TESTING")
|
||||
[[ -n "$f" ]] || die "no job in review/ or testing/ matching '$job'"
|
||||
local base name from dest result; base=$(basename "$f"); name=${base%.md}
|
||||
from=$(basename "$(dirname "$f")")
|
||||
case "$from" in
|
||||
review) dest="$TESTING"; result="testing";;
|
||||
testing) dest="$SHIPPED"; result="shipped";;
|
||||
*) die "promote: '$base' is in '$from' — nothing to promote";;
|
||||
esac
|
||||
mv "$f" "$dest/$base"
|
||||
[[ -f "$STATE/$name.meta" ]] && echo "result=$result" >> "$STATE/$name.meta"
|
||||
log "promoted $C_BOLD$base$C_RESET ($from → $result)"
|
||||
}
|
||||
|
||||
# reject <job> — move a review/testing job to failed/ (manual gate rejection).
|
||||
cmd_reject() {
|
||||
ensure_dirs
|
||||
local job="${1:-}"
|
||||
[[ -n "$job" ]] || die "usage: reject <job>"
|
||||
local f; f=$(_find_job "$job" "$REVIEW" "$TESTING")
|
||||
[[ -n "$f" ]] || die "no job in review/ or testing/ matching '$job'"
|
||||
local base name from; base=$(basename "$f"); name=${base%.md}; from=$(basename "$(dirname "$f")")
|
||||
mv "$f" "$FAILED/$base"
|
||||
[[ -f "$STATE/$name.meta" ]] && echo "result=rejected" >> "$STATE/$name.meta"
|
||||
log "rejected $C_BOLD$base$C_RESET ($from → failed)"
|
||||
}
|
||||
|
||||
# clean [--keep N] — archive finished jobs' logs+meta beyond the newest N
|
||||
@ -556,14 +655,19 @@ ${C_BOLD}COMMANDS${C_RESET}
|
||||
process inbox/ (foreground loop; Ctrl-C to stop)
|
||||
status show kanban counts + running workers
|
||||
watch [interval] live status (default 2s, bash)
|
||||
dash [--interval N] richer live Node dashboard (recent done/failed too)
|
||||
dash [--interval N] richer live Node dashboard (recent shipped/failed too)
|
||||
stop kill running workers + the run loop
|
||||
logs <job> [-f] print (or follow) a job's log
|
||||
requeue <job> move a failed job back to inbox/
|
||||
promote <job> advance one stage (review → testing → shipped)
|
||||
ship <job> manual gate: testing (QA) → shipped
|
||||
reject <job> send a review/testing job to failed/
|
||||
requeue <job> move a failed/review/testing job back to inbox/
|
||||
clean [--keep N] archive finished logs+meta beyond newest N (default 50)
|
||||
help this message
|
||||
|
||||
${C_BOLD}KANBAN${C_RESET} inbox → doing → done / failed (logs/ + .state/ alongside)
|
||||
${C_BOLD}KANBAN${C_RESET} inbox → building → review → testing → shipped (+ failed; logs/ + .state/ alongside)
|
||||
auto: agent rc=0 → review; verify pass → testing; verify fail → failed
|
||||
manual: ship (testing → shipped)
|
||||
|
||||
${C_BOLD}TASK FRONTMATTER${C_RESET} (top of each .md)
|
||||
---
|
||||
@ -572,11 +676,13 @@ ${C_BOLD}TASK FRONTMATTER${C_RESET} (top of each .md)
|
||||
yolo: true
|
||||
lock: my-repo # optional; defaults to cwd. Jobs sharing a key run serially
|
||||
timeout: 45m # optional; 90s|45m|2h|1d. On expiry -> failed (result=timeout)
|
||||
verify: pnpm -s test # optional; auto-QA gate. pass -> testing, fail -> failed
|
||||
---
|
||||
|
||||
${C_BOLD}ENV${C_RESET}
|
||||
AGENT_QUEUE_ROOT (=$QUEUE_ROOT) AGENT_QUEUE_MAX (=$MAX_CONCURRENCY)
|
||||
AGENT_QUEUE_ENGINE (=$DEFAULT_ENGINE) DEVIN_BIN / CLAUDE_BIN / CODEX_BIN
|
||||
AGENT_QUEUE_ENGINE (=$DEFAULT_ENGINE) AGENT_QUEUE_VERIFY (default verify cmd)
|
||||
DEVIN_BIN / CLAUDE_BIN / CODEX_BIN
|
||||
EOF
|
||||
}
|
||||
|
||||
@ -591,6 +697,9 @@ main() {
|
||||
dash|dashboard) cmd_dash "$@";;
|
||||
stop) cmd_stop "$@";;
|
||||
logs) cmd_logs "$@";;
|
||||
promote) cmd_promote "$@";;
|
||||
ship) cmd_ship "$@";;
|
||||
reject) cmd_reject "$@";;
|
||||
requeue) cmd_requeue "$@";;
|
||||
clean) cmd_clean "$@";;
|
||||
help|-h|--help) usage;;
|
||||
|
||||
@ -3,7 +3,9 @@
|
||||
//
|
||||
// Reads the same queue/ state written by agent-queue.sh and re-renders a board
|
||||
// every interval: kanban counts, running workers (engine, elapsed, last log line),
|
||||
// and the most recent done/failed jobs.
|
||||
// and the most recent shipped/failed jobs.
|
||||
//
|
||||
// Lifecycle: inbox → building → review → testing → shipped (+ failed)
|
||||
//
|
||||
// Usage: node dashboard.mjs [--interval 2] [--root /path/to/queue]
|
||||
// AGENT_QUEUE_ROOT=/path node dashboard.mjs
|
||||
@ -28,8 +30,10 @@ const STALL_MIN = Math.max(1, parseInt(process.env.AGENT_QUEUE_STALL_MIN || '10'
|
||||
|
||||
const DIRS = {
|
||||
inbox: path.join(ROOT, 'inbox'),
|
||||
doing: path.join(ROOT, 'doing'),
|
||||
done: path.join(ROOT, 'done'),
|
||||
building: path.join(ROOT, 'building'),
|
||||
review: path.join(ROOT, 'review'),
|
||||
testing: path.join(ROOT, 'testing'),
|
||||
shipped: path.join(ROOT, 'shipped'),
|
||||
failed: path.join(ROOT, 'failed'),
|
||||
logs: path.join(ROOT, 'logs'),
|
||||
state: path.join(ROOT, '.state'),
|
||||
@ -112,8 +116,9 @@ function render() {
|
||||
.sort((a, b) => Number(b.ended) - Number(a.ended));
|
||||
|
||||
const counts = {
|
||||
inbox: count(DIRS.inbox), doing: count(DIRS.doing),
|
||||
done: count(DIRS.done), failed: count(DIRS.failed),
|
||||
inbox: count(DIRS.inbox), building: count(DIRS.building),
|
||||
review: count(DIRS.review), testing: count(DIRS.testing),
|
||||
shipped: count(DIRS.shipped), failed: count(DIRS.failed),
|
||||
};
|
||||
|
||||
const out = [];
|
||||
@ -123,8 +128,10 @@ function render() {
|
||||
out.push('');
|
||||
out.push(
|
||||
` ${c('blue', '▢ inbox')} ${String(counts.inbox).padEnd(3)}` +
|
||||
` ${c('yellow', '◧ doing')} ${String(counts.doing).padEnd(3)}` +
|
||||
` ${c('green', '▣ done')} ${String(counts.done).padEnd(3)}` +
|
||||
` ${c('yellow', '◧ building')} ${String(counts.building).padEnd(3)}` +
|
||||
` ${c('cyan', '◔ review')} ${String(counts.review).padEnd(3)}` +
|
||||
` ${c('cyan', '◕ testing')} ${String(counts.testing).padEnd(3)}` +
|
||||
` ${c('green', '▣ shipped')} ${String(counts.shipped).padEnd(3)}` +
|
||||
` ${c('red', '✕ failed')} ${String(counts.failed).padEnd(3)}` +
|
||||
` ${C.bold}running ${running.length}${C.reset}`
|
||||
);
|
||||
@ -161,13 +168,24 @@ function render() {
|
||||
out.push(` ${c('dim', 'nothing finished yet')}`);
|
||||
} else {
|
||||
for (const m of recent) {
|
||||
const ok = m.result === 'done';
|
||||
const res = m.result || '';
|
||||
const failedRes = res === 'failed' || res === 'timeout' || res === 'verify_failed' || res === 'rejected';
|
||||
const ok = !failedRes;
|
||||
const mark = ok ? c('green', '▣') : c('red', '✕');
|
||||
const when = m.ended ? new Date(Number(m.ended) * 1000).toLocaleTimeString() : '';
|
||||
let label;
|
||||
if (res === 'shipped') label = c('green', 'shipped');
|
||||
else if (res === 'testing') label = c('cyan', 'testing (QA)');
|
||||
else if (res === 'review') label = c('cyan', 'review');
|
||||
else if (res === 'verify_failed') label = c('red', 'verify failed');
|
||||
else if (res === 'timeout') label = c('red', 'timeout');
|
||||
else if (res === 'rejected') label = c('red', 'rejected');
|
||||
else if (res === 'failed') label = c('red', 'failed rc=' + (m.exit || '?'));
|
||||
else label = c('gray', res || '?');
|
||||
out.push(
|
||||
` ${mark} ${trunc(m.job || '?', 34).padEnd(34)} ` +
|
||||
`${c('gray', (m.engine || '').padEnd(7))} ` +
|
||||
`${ok ? c('green', 'done') : c('red', 'failed rc=' + (m.exit || '?'))} ${c('gray', when)}`
|
||||
`${label} ${c('gray', when)}`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
0
agent-queue/queue/shipped/.gitkeep
Normal file
0
agent-queue/queue/shipped/.gitkeep
Normal file
0
agent-queue/queue/testing/.gitkeep
Normal file
0
agent-queue/queue/testing/.gitkeep
Normal file
@ -59,12 +59,45 @@ printf '%s\n' '---' 'engine: devin' "cwd: $work" 'yolo: true' '---' '' '# self-t
|
||||
DEVIN_BIN="$stub" "$AQ" add "$task" >/dev/null
|
||||
DEVIN_BIN="$stub" "$AQ" run --once >/dev/null 2>&1
|
||||
|
||||
if ls "$AGENT_QUEUE_ROOT"/done/*.md >/dev/null 2>&1; then
|
||||
pass "init/add/run --once → task landed in done/"
|
||||
if ls "$AGENT_QUEUE_ROOT"/review/*.md >/dev/null 2>&1; then
|
||||
pass "no-verify cycle → task parked in review/"
|
||||
else
|
||||
echo "--- queue state ---" >&2
|
||||
ls -R "$AGENT_QUEUE_ROOT" >&2 || true
|
||||
fail "no-op cycle did not complete (expected a file in done/)"
|
||||
fail "no-op cycle did not complete (expected a file in review/)"
|
||||
fi
|
||||
|
||||
# 5. verify-pass gate: rc=0 + passing verify → testing/, then manual ship → shipped/
|
||||
task2="$tmp/task-verify.md"
|
||||
printf '%s\n' '---' 'engine: devin' "cwd: $work" 'yolo: true' 'verify: true' '---' '' '# self-test verify-pass task' > "$task2"
|
||||
DEVIN_BIN="$stub" "$AQ" add "$task2" >/dev/null
|
||||
DEVIN_BIN="$stub" "$AQ" run --once >/dev/null 2>&1
|
||||
if ls "$AGENT_QUEUE_ROOT"/testing/*.md >/dev/null 2>&1; then
|
||||
pass "verify-pass cycle → task promoted to testing/"
|
||||
else
|
||||
echo "--- queue state ---" >&2
|
||||
ls -R "$AGENT_QUEUE_ROOT" >&2 || true
|
||||
fail "verify-pass cycle did not reach testing/ (expected a file in testing/)"
|
||||
fi
|
||||
shipjob="$(basename "$(ls -1t "$AGENT_QUEUE_ROOT"/testing/*.md | head -1)" .md)"
|
||||
"$AQ" ship "$shipjob" >/dev/null 2>&1
|
||||
if ls "$AGENT_QUEUE_ROOT"/shipped/*.md >/dev/null 2>&1; then
|
||||
pass "manual ship → task landed in shipped/"
|
||||
else
|
||||
fail "ship did not move job to shipped/"
|
||||
fi
|
||||
|
||||
# 6. verify-fail gate: rc=0 + failing verify → failed/
|
||||
task3="$tmp/task-verifyfail.md"
|
||||
printf '%s\n' '---' 'engine: devin' "cwd: $work" 'yolo: true' 'verify: false' '---' '' '# self-test verify-fail task' > "$task3"
|
||||
DEVIN_BIN="$stub" "$AQ" add "$task3" >/dev/null
|
||||
DEVIN_BIN="$stub" "$AQ" run --once >/dev/null 2>&1
|
||||
if ls "$AGENT_QUEUE_ROOT"/failed/*verifyfail*.md >/dev/null 2>&1; then
|
||||
pass "verify-fail cycle → task routed to failed/"
|
||||
else
|
||||
echo "--- queue state ---" >&2
|
||||
ls -R "$AGENT_QUEUE_ROOT" >&2 || true
|
||||
fail "verify-fail cycle did not route to failed/"
|
||||
fi
|
||||
|
||||
# status must not error
|
||||
|
||||
Loading…
Reference in New Issue
Block a user