agent-queue
A zero-dependency folder "kanban" runner for headless coding-agent CLIs —
Devin, Claude Code, and OpenAI Codex. Drop prompt .md files into a folder,
and they get executed (in auto-approve mode) one slot at a time, moving through
inbox → building → review → testing → shipped (plus failed) with live status.
Vision & roadmap: the planned direction is a distributed, multi-machine "gigafactory": a fleet of factories × tools × profiles, scheduler-routed and built on platform-service + tracker-web. It is specified as a checklist-driven implementation roadmap in
docs/GIGAFACTORY_ROADMAP.md.
Build/ship lifecycle — auto-QA, manual ship:
inbox ─▶ building ─▶ review ─▶ testing ─▶ shipped
(queued) (agent      (rc=0;    (verify    (you ran
          running)    awaiting  passed;    `ship`)
                      verify)   QA gate)
                        │
  agent rc≠0 /          │ verify fails
  timeout ──────────────┴──────────────▶ failed
- Auto: agent exits 0 → `review/`. If a `verify:` command is configured it runs automatically: pass → `testing/` (QA), fail → `failed/`. No `verify:` → the job parks in `review/` for a manual `promote`.
- Manual: you `ship` a `testing/` job → `shipped/` (the human gate). Shipping is never automatic.
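The routing rules above can be sketched as a small bash function. This is only an illustration, not code taken from `agent-queue.sh`: the name `next_stage` and its argument order are assumptions made for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from agent-queue.sh): decide which folder a job
# moves to once the agent has exited.
# Args: $1 = agent exit code, $2 = verify command (may be empty).
next_stage() {
  local rc=$1 verify_cmd=${2:-}
  if [ "$rc" -ne 0 ]; then
    echo failed        # agent rc≠0 (a timeout also ends up here)
  elif [ -z "$verify_cmd" ]; then
    echo review        # no verify: park for a manual promote
  elif bash -c "$verify_cmd" >/dev/null 2>&1; then
    echo testing       # verify passed: QA stage, awaiting manual ship
  else
    echo failed        # verify failed
  fi
}

next_stage 0 ""       # prints: review
next_stage 0 "true"   # prints: testing
next_stage 0 "false"  # prints: failed
next_stage 1 "true"   # prints: failed
```

Note that the function never returns `shipped`: that transition stays behind the manual `ship` command.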
Why this exists: the agent CLIs ship a minimal local interface (no built-in batch/queue/dashboard — that lives in their cloud products). This is the zero-dependency bash glue that turns "run one prompt interactively" into "queue many and walk away."
Quick start
cd learning_ai_devops_tools/agent-queue
chmod +x agent-queue.sh
./agent-queue.sh init
# queue a roadmap for Devin, running in the tracker-web repo, auto-approving everything
./agent-queue.sh add ~/roadmaps/UX-2.md \
--engine devin \
--cwd /Users/sd9235/code/mygh/learning_ai_common_plat/dashboards/tracker-web \
--yolo
# start processing (foreground; Ctrl-C to stop). Run up to 3 agents at once (default).
./agent-queue.sh run --max 3
In a second terminal, watch progress:
./agent-queue.sh watch
AGENT QUEUE /…/agent-queue/queue
inbox 3 building 2 review 1 testing 2 shipped 5 failed 0 running 2/3
RUNNING
20260528-2130__UX-2 devin 4m12s pid 51234 ⏺ Edited src/app/dashboard/items/page.tsx
20260528-2131__UX-3 claude 1m02s pid 51290 Running: pnpm typecheck
How a task is configured
Each .md carries optional frontmatter telling the runner which engine to use,
which directory to run in, and whether to auto-approve:
---
engine: devin # devin | claude | codex (default: $AGENT_QUEUE_ENGINE)
cwd: /abs/path/to/repo # where the agent executes (default: cwd when added)
yolo: true # auto-approve ALL tools (default: true)
lock: my-repo # optional mutex key (default: cwd). Jobs sharing a key run serially
timeout: 45m # optional. 90s|45m|2h|1d. On expiry → failed (result=timeout)
verify: pnpm -s test # optional auto-QA gate. Runs in cwd after rc=0:
# pass → testing/ (QA), fail → failed/
# (omit to park in review/ for manual promote)
---
# Your task / roadmap goes here
...
add --engine/--cwd/--yolo will inject this frontmatter for you if the file doesn't
already have a --- block.
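For readers curious how such a frontmatter block can be read with nothing but standard tools, here is a minimal awk-based sketch. The helper names `fm_get` and `fm_body` are hypothetical and the real script may parse the block differently. It assumes flat `key: value` lines and treats whitespace followed by `#` as the start of a comment, so a value that itself contains ` #` would be truncated.

```bash
#!/usr/bin/env bash
# fm_get FILE KEY: print KEY's value from a leading --- block, without any
# trailing "# comment". Prints nothing if the key or the block is absent.
fm_get() {
  awk -v key="$2" '
    NR == 1 && $0 != "---" { exit }      # file has no frontmatter block
    NR == 1 { next }
    $0 == "---" { exit }                 # end of the frontmatter
    index($0, key ":") == 1 {
      val = substr($0, length(key) + 2)
      sub(/[ \t]+#.*$/, "", val)         # drop the inline comment
      gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim surrounding whitespace
      print val
      exit
    }
  ' "$1"
}

# fm_body FILE: print everything after the frontmatter (whole file if none).
fm_body() {
  awk '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---"   { in_fm = 0; next }
    !in_fm                 { print }
  ' "$1"
}

job=$(mktemp)
printf '%s\n' '---' 'engine: claude   # devin | claude | codex' \
  'timeout: 45m' '---' '# Task' 'Fix the bug' > "$job"
fm_get "$job" engine    # prints: claude
fm_get "$job" timeout   # prints: 45m
fm_body "$job"          # prints the two body lines
rm -f "$job"
```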
Engine mapping
| `engine:` | Command run | Auto-approve flag (`yolo: true`) |
|---|---|---|
| `devin` | `devin -p --prompt-file <body>` | `--permission-mode dangerous` |
| `claude` | `claude -p` (body on stdin) | `--dangerously-skip-permissions` |
| `codex` | `codex exec` (body on stdin) | `--dangerously-bypass-approvals-and-sandbox` |
The frontmatter is stripped before the body reaches the agent, and
claude/codex receive it on stdin so a body starting with -- is never
misparsed as a flag.
Flags drift between CLI versions. If one changes, edit `build_agent_cmd()` in `agent-queue.sh`, the single place where each engine is mapped.
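To make the mapping concrete, here is a rough stand-in written as a `case` statement. It is not the actual `build_agent_cmd()`: the function name `sketch_agent_cmd`, its arguments, and the position of the auto-approve flag on the command line are assumptions. It only prints the command it would run, using the commands and flags listed in the table.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the engine mapping; the real build_agent_cmd()
# in agent-queue.sh may be structured differently.
# Args: $1 = engine, $2 = yolo (true|false), $3 = path to the stripped body.
sketch_agent_cmd() {
  local engine=$1 yolo=$2 body=$3 flag=""
  case "$engine" in
    devin)
      if [ "$yolo" = true ]; then flag=" --permission-mode dangerous"; fi
      echo "devin -p --prompt-file $body$flag" ;;
    claude)
      if [ "$yolo" = true ]; then flag=" --dangerously-skip-permissions"; fi
      echo "claude -p$flag < $body" ;;          # body arrives on stdin
    codex)
      if [ "$yolo" = true ]; then flag=" --dangerously-bypass-approvals-and-sandbox"; fi
      echo "codex exec$flag < $body" ;;         # body arrives on stdin
    *)
      echo "unknown engine: $engine" >&2
      return 1 ;;
  esac
}

sketch_agent_cmd devin true /tmp/body.md
# prints: devin -p --prompt-file /tmp/body.md --permission-mode dangerous
sketch_agent_cmd claude false /tmp/body.md
# prints: claude -p < /tmp/body.md
```

Keeping every engine in one `case` is what makes a flag change a one-line edit.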
Commands
| Command | What it does |
|---|---|
| `init` | create the `queue/` folders |
| `add <file> [--engine E] [--cwd P] [--yolo\|--no-yolo]` | queue a prompt into `inbox/` |
| `run [--max N] [--engine E] [--once]` | process the inbox (foreground loop) |
| `status` | kanban counts + running-worker table (marks ⚠ stalled workers) |
| `watch [interval]` | live status (bash), redrawn every N seconds (default 2) |
| `dash [--interval N]` | interactive Node dashboard: navigable numbered job list with single-key actions (see below) |
| `stop` | kill running workers + the run loop |
| `logs <job> [-f]` | print / follow a job's log |
| `promote <job>` | advance one stage forward: review → testing → shipped |
| `ship <job>` | manual gate: move a `testing/` (QA) job → `shipped/` |
| `reject <job>` | send a `review/` or `testing/` job → `failed/` |
| `requeue <job>` | move a failed/review/testing job back to `inbox/` for a fresh run |
| `clean [--keep N]` | archive finished logs+meta beyond the newest N (default 50) into `queue/.archive/` |
Only one run loop may be active per queue — a second run against the same
queue is refused while the first is alive (a stale daemon.pid is cleared).
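A PID-file guard of this kind can be sketched as follows. This is an illustrative assumption about how such a check might look, not the script's actual code; the function name `claim_run_loop` is hypothetical. The check-then-write sequence is not atomic and `kill -0` also fails for processes owned by another user, so treat it as a single-user sketch.

```bash
#!/usr/bin/env bash
# Hypothetical guard: returns 0 and records our PID when no live loop owns
# PIDFILE; returns 1 when the recorded process is still alive.
claim_run_loop() {
  local pidfile=$1 old
  if [ -f "$pidfile" ]; then
    old=$(cat "$pidfile")
    if [ -n "$old" ] && kill -0 "$old" 2>/dev/null; then
      echo "run loop already active (pid $old)" >&2
      return 1
    fi
    rm -f "$pidfile"     # stale: the recorded process is gone
  fi
  echo "$$" > "$pidfile"
}

state=$(mktemp -d)
claim_run_loop "$state/daemon.pid" && echo "claimed"    # prints: claimed
claim_run_loop "$state/daemon.pid" || echo "refused"    # prints: refused
rm -rf "$state"
```

The second call is refused because the PID on file belongs to the still-running shell itself.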
Interactive dashboard (dash)
dash is a single-script, menu-driven control panel (think a tiny "glassbox").
It shows the kanban counts, live RUNNING workers (engine, elapsed, last log
line, stall), a navigable numbered JOBS list, and RECENT finished jobs — and
lets you act on jobs without leaving the screen. Every action shells out to
agent-queue.sh, so the script stays the single source of truth.
| Key | Action |
|---|---|
| `↑`/`↓`, `j`/`k`, `1`–`9` | select a job in the JOBS list |
| `enter` / `l` | view the selected job's log (live, auto-refreshing) |
| `p` | promote (review → testing → shipped) |
| `s` | ship (testing/QA → shipped, the manual gate) |
| `x` | reject (review/testing → failed), asks y/n |
| `u` | requeue (failed/review/testing → inbox), asks y/n |
| `r` | start the run loop (detached → `logs/run-loop.log`) |
| `S` | stop the run loop + running workers |
| `g` | refresh now · `?`/`h` help · `q`/Ctrl-C quit |
The header shows a ● run loop pid N / ○ run loop stopped indicator. Run it in
a TTY for the interactive mode; piped/non-TTY it falls back to a read-only live view.
Via bytelyst-cli.sh
Wired into the repo's unified CLI (no GitHub token required for this subcommand):
./bytelyst-cli.sh agent-queue run --max 3 # full passthrough
./bytelyst-cli.sh aq status # short alias
Folder layout
queue/
inbox/ # drop / queued .md files (oldest eligible picked first)
building/ # currently executing (agent running)
review/ # agent exited 0 — awaiting the auto-QA verify gate (or manual promote)
testing/ # verify passed (QA) — awaiting manual `ship`
shipped/ # manually shipped — the terminal success stage
failed/ # non-zero exit, bad cwd, timeout, verify failure, or manual reject
logs/ # <job>.log — full agent + verify output
locks/ # per-key flock files (Linux hardening; unused on macOS)
.state/ # <job>.meta heartbeats + daemon.pid (runtime only)
.archive/ # <ts>/ — logs+meta moved here by `clean`
result= values written to <job>.meta: review, testing, shipped,
failed, timeout, verify_failed, rejected, requeued.
Config (env overrides)
| Var | Default | Meaning |
|---|---|---|
| `AGENT_QUEUE_ROOT` | `./queue` | where the kanban folders live |
| `AGENT_QUEUE_MAX` | `3` | max concurrent agents (override per-run with `run --max N`) |
| `AGENT_QUEUE_ENGINE` | `devin` | default engine when none in frontmatter |
| `AGENT_QUEUE_POLL` | `3` | inbox poll interval (seconds) |
| `AGENT_QUEUE_VERIFY` | (empty) | default auto-QA verify command; per-job `verify:` overrides it |
| `AGENT_QUEUE_STALL_MIN` | `10` | minutes of unchanged log before a worker is ⚠ stalled |
| `DEVIN_BIN` / `CLAUDE_BIN` / `CODEX_BIN` | autodetected | override CLI binary paths |
| `FLOCK_BIN` / `TIMEOUT_BIN` | autodetected | `flock` (lock hardening) and `timeout`/`gtimeout` (hard timeouts); absent on stock macOS, see notes |
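The usual bash idiom for env-overridable defaults like these, plus one way the stall threshold could be applied, is sketched below. The default values come from the table; the `is_stalled` helper and the use of `find -mmin` are assumptions for illustration, not a description of how `agent-queue.sh` does it.

```bash
#!/usr/bin/env bash
# Defaults mirror the table; a value already set in the environment wins.
: "${AGENT_QUEUE_ROOT:=./queue}"
: "${AGENT_QUEUE_MAX:=3}"
: "${AGENT_QUEUE_ENGINE:=devin}"
: "${AGENT_QUEUE_POLL:=3}"
: "${AGENT_QUEUE_VERIFY:=}"
: "${AGENT_QUEUE_STALL_MIN:=10}"

# Autodetect a hard-timeout binary; stays empty when neither is installed.
: "${TIMEOUT_BIN:=$(command -v timeout || command -v gtimeout || true)}"

# is_stalled LOG (hypothetical helper): succeeds when LOG was last modified
# more than AGENT_QUEUE_STALL_MIN minutes ago.
is_stalled() {
  [ -n "$(find "$1" -mmin +"$AGENT_QUEUE_STALL_MIN" 2>/dev/null)" ]
}

log=$(mktemp)
is_stalled "$log" || echo "fresh log: not stalled"
touch -t 202001010000 "$log"          # pretend the last write was in 2020
is_stalled "$log" && echo "old log: stalled"
rm -f "$log"
```

An override then looks like `AGENT_QUEUE_STALL_MIN=30 ./agent-queue.sh status`.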
⚠️ Safety
Running agents with `yolo: true` means no approval prompts: they will edit files,
run shell commands, and commit unattended. Mitigate:
- Prefer scope-locked prompt files (e.g. "edit only under `dashboards/tracker-web/`").
- Tell prompts not to `git push`, and review commits before they leave your machine.
- Same-repo safety is automatic: jobs sharing a `cwd` (or `lock:` key) are serialized, so two agents never run in one repo at once, even at `--max 2+`.
- Set a `timeout:` on long jobs so a wedged agent can't run forever.
- Watch cost: each job is a full agent session.
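The same-repo serialization mentioned in the list above amounts to "pick the oldest inbox job whose lock key is not already held". A minimal sketch of that selection follows. The function `pick_next` and its tab-separated input format are assumptions for this example, and it presumes lock keys contain no spaces.

```bash
#!/usr/bin/env bash
# pick_next BUSY_KEYS (hypothetical helper): read "job<TAB>lock_key" lines,
# oldest first, on stdin and print the first job whose key is not in the
# space-separated BUSY_KEYS list. Returns 1 when nothing is eligible.
pick_next() {
  local busy=" $1 " job key
  while IFS=$'\t' read -r job key; do
    case "$busy" in
      *" $key "*) continue ;;   # a running job holds this lock: skip
    esac
    echo "$job"
    return 0
  done
  return 1
}

# tracker-web is busy, so both UX jobs are skipped and API-1 is chosen.
printf 'UX-2\ttracker-web\nUX-3\ttracker-web\nAPI-1\tplatform\n' \
  | pick_next "tracker-web"     # prints: API-1
```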
Portability notes
- macOS has no `flock`/`timeout`; locking relies on the single run loop (enforced by the second-run refusal) and timeouts use a pure-bash watchdog. Install coreutils (`gtimeout`) for hard process-tree kills.
- Linux (incl. Gitea CI) uses `flock` + `timeout` for cross-process hardening.
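As a rough idea of what a pure-bash watchdog can look like, here is a sketch that converts a `timeout:`-style duration to seconds and terminates a command that overruns. Both function names are hypothetical and the real fallback may differ. The sketch signals only the direct child, not its whole process tree, and it rejects durations without a unit suffix.

```bash
#!/usr/bin/env bash
# to_seconds DURATION: convert 90s|45m|2h|1d to a number of seconds.
to_seconds() {
  local n=${1%[smhd]} unit=${1#"${1%?}"}
  case "$n" in ''|*[!0-9]*) echo "bad duration: $1" >&2; return 1 ;; esac
  case "$unit" in
    s) echo $((10#$n)) ;;           # 10# avoids octal parsing of e.g. 08
    m) echo $((10#$n * 60)) ;;
    h) echo $((10#$n * 3600)) ;;
    d) echo $((10#$n * 86400)) ;;
    *) echo "bad duration: $1" >&2; return 1 ;;
  esac
}

# run_with_timeout SECS CMD...: run CMD in the background and send it
# SIGTERM after SECS seconds. Returns CMD's exit status (143 when killed).
run_with_timeout() {
  local secs=$1 pid watchdog rc=0
  shift
  "$@" &
  pid=$!
  (
    remaining=$secs
    while [ "$remaining" -gt 0 ]; do sleep 1; remaining=$((remaining - 1)); done
    kill -TERM "$pid" 2>/dev/null
  ) >/dev/null 2>&1 &
  watchdog=$!
  wait "$pid" || rc=$?
  kill "$watchdog" 2>/dev/null      # command finished first: cancel watchdog
  wait "$watchdog" 2>/dev/null
  return "$rc"
}

to_seconds 45m                                          # prints: 2700
run_with_timeout "$(to_seconds 1s)" sleep 5 || echo "exit code $?"
# prints: exit code 143
```

The watchdog sleeps in one-second steps so that cancelling it leaves at most a one-second `sleep` behind.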
Roadmap / nice-to-haves
- Per-repo lock to serialize same-repo jobs automatically (`lock:` / cwd).
- Per-job `timeout:` with hard kill (or bash watchdog fallback).
- Stall detection in `status`/`dash`.
- `requeue` failed jobs + `clean`/archive old runs.
- Build/ship lifecycle: `review → testing → shipped` with auto-QA `verify:` gate + manual `ship`.
- `--push` opt-in policy + commit review gate.
- Optional notifications (Slack/desktop) on done/failed/stall.
- Persisted run-loop as a daemon/service with auto-restart.