docs(agent-queue): harden gigafactory roadmap after principal review
Fix correctness/distributed-systems bugs and fill gaps in place:
- atomic claim (optimistic concurrency/_etag), fencing token (leaseEpoch),
coordinator-authoritative time added to core contract + scheduler + factory
- lease reclaim via coordinator reaper, not Cosmos TTL (TTL only GCs rows)
- split-brain/partition safety: fencing + distributed lock + quarantine
- budget: wall is the only hard ceiling; usd/tokens best-effort (provider metering)
- SSE live logs cannot use the buffering tracker proxy; use a streaming route +
blob log storage (fleet_artifacts container)
- manifest: capability grammar, engine-class enum, idempotency 409 + deps-satisfied
semantics, dep cycle detection
- tracker status mapping table + PR-flow ship semantics (merged+green vs pr-opened)
- station/seat capacity, factory health definition, enrollment/bootstrap auth
- Cosmos RU/indexing + claim-loop poll cost; add new sections: rollout/rollback &
data migration (§21), capacity planning & cost (§22), ownership & RACI (§23)
- success metrics now carry provisional SLO targets; Phase 2 checklist + index synced