Gigafactory — Agent-Queue Docs
Source-of-truth specs and the system overview for Agent Gigafactory, the
fleet-coordination layer that turns the single-host agent-queue runner into a
multi-host factory of autonomous coding agents.
Contents
| Doc | What it is |
|---|---|
| GIGAFACTORY_ROADMAP.md | The canonical source-of-truth spec: architecture, the evolved job manifest, scoring formula, lifecycle/retry, enrollment, and the phased checklists (§1–§17). Job specs in ../jobs/ point here. |
| GIGAFACTORY_SYSTEM_OVERVIEW.md | A narrative overview of how the pieces fit together end-to-end, with a code-map of the relevant files across both repos. |
| FLEET_DISPATCH_REDESIGN.md | Phase-4 design proposal (no code): broker-backed (Azure Service Bus) dispatch + on-demand factories that fixes the product-as-queue routing smell and the idle-poll Cosmos RU cost. Phased migration starting with a zero-infra RU quick win. |
Related docs in the other repo
The platform-service backend and the tracker-web UI live in
learning_ai_common_plat. Its Gigafactory docs (roadmap-completion audit,
remaining-task checklist, Phase-3 progress, and the fleet control-plane guide)
are under docs/GIGAFACTORY/ there.