# Fleet Control Plane — Operational Guide

> Phase 3 of the Agent Gigafactory. Adds tunable scoring, preemption, DAG decomposition, per-product budgets, and a tracker-web UI.

## Feature Flags

All Phase 3 features are **gated behind environment variables** (default OFF) for safe rollout:

| Flag               | Default | Effect                                                                        |
| ------------------ | ------- | ----------------------------------------------------------------------------- |
| `FLEET_PREEMPTION` | `""`    | Enables seat-limit enforcement + critical-job preemption                      |
| `FLEET_BUDGETS`    | `""`    | Enables per-product USD ceiling enforcement. Pauses jobs when budget exceeded |

Set to any truthy value (`"1"`, `"true"`, `"yes"`) to enable.

## Tunable Scoring Weights

Scoring determines which queued job a factory picks up next. The formula:

```
score = w.age * ageMinutes
      + w.priority * priorityOrder
      + w.retries * attempts
      + w.capabilities * capabilityBonus
```

### Weight Resolution Order

1. **Per-request override** — `weights` field in `POST /fleet/jobs/:id/claim` body
2. **Product registry** — set via `setWeightRegistry({ [productId]: weights })`
3. **Defaults** — `{ age: 1, priority: 10, retries: -2, capabilities: 5 }`

Each level does a **per-field merge** (not full object replacement).

## Preemption

When `FLEET_PREEMPTION` is enabled and a factory is at its `seatLimit`:

1. A critical-priority job arrives in `claimNextJob`
2. `selectPreemptionVictim(runningJobs, incomingJob)` picks the lowest-scoring running job
3. The victim is evicted: its lease is released with `checkpoint: true`, ensuring the job can resume
4. The critical job takes the freed seat
5. An event `{ type: 'preempted', victim, preemptor }` is recorded

**Rules:**

- Only `critical` priority can trigger preemption
- Never preempts jobs of equal or higher priority
- Capability mismatch disqualifies a factory from preemption

## DAG Job Decomposition

Submit a composite job with children for parallel fan-out:

```http
POST /fleet/jobs

{
  "idempotencyKey": "parent-job",
  "kind": "composite",
  "children": [
    { "idempotencyKey": "child-1", "bodyMd": "..." },
    { "idempotencyKey": "child-2", "bodyMd": "..." }
  ]
}
```

Or add children later:

```http
POST /fleet/jobs/:parentId/children

{
  "children": [
    { "idempotencyKey": "child-3", "bodyMd": "..." }
  ]
}
```

**Behavior:**

- Parent is automatically blocked until all children complete (children's idempotency keys become parent deps)
- Children unblock parent via `maybeUnblockParent()` when transitioning to `shipped`/`done`
- View the full DAG: `GET /fleet/jobs/:id/dag`

## Per-Product Budgets

Control spend per product with USD ceilings:

```http
PUT /fleet/budgets/:productId

{ "ceilingUsd": 100, "window": "monthly" }
```

| Endpoint                           | Method | Effect                  |
| ---------------------------------- | ------ | ----------------------- |
| `/fleet/budgets/:productId`        | GET    | Read current budget     |
| `/fleet/budgets/:productId`        | PUT    | Create/update ceiling   |
| `/fleet/budgets/:productId/pause`  | POST   | Manually pause spending |
| `/fleet/budgets/:productId/resume` | POST   | Resume spending         |

**Enforcement:** When `FLEET_BUDGETS` is enabled, `claimNextJob` checks budget status FIRST. If paused or ceiling exceeded → returns null (no job scan).

**Auto-pause:** `accrueSpend(productId, amount)` auto-pauses when `spentUsd >= ceilingUsd`.

## Fleet Control Plane UI (tracker-web)

Navigate to **Dashboard → Fleet** in tracker-web.
### Pages

| Route                        | Description                                     |
| ---------------------------- | ----------------------------------------------- |
| `/dashboard/fleet`           | Overview — factory health cards + recent jobs   |
| `/dashboard/fleet/jobs`      | Job list with stage filter tabs                 |
| `/dashboard/fleet/jobs/[id]` | Job detail — events, runs, artifacts, DAG, SHIP |
| `/dashboard/fleet/budget`    | Budget view — spend bar, pause/resume controls  |

### Graceful Degradation

The UI calls platform-service fleet endpoints via `/api/fleet/[...path]` proxy. If the fleet module returns 404 (flags off), pages display informational empty states instead of errors.

### Configuration

| Env Var            | Default                 | Purpose                             |
| ------------------ | ----------------------- | ----------------------------------- |
| `PLATFORM_API_URL` | `http://localhost:4003` | Platform-service base URL for proxy |

## API Reference Summary

| Endpoint                           | Method | Phase | Notes                                              |
| ---------------------------------- | ------ | ----- | -------------------------------------------------- |
| `/fleet/jobs`                      | GET    | 2     | List jobs (query: stage, productId, limit, offset) |
| `/fleet/jobs`                      | POST   | 2     | Submit job (+ optional children[] for DAG)         |
| `/fleet/jobs/:id`                  | GET    | 2     | Get job                                            |
| `/fleet/jobs/:id`                  | PATCH  | 2     | Update stage (fenced)                              |
| `/fleet/jobs/:id/claim`            | POST   | 2     | Factory claims next job                            |
| `/fleet/jobs/:id/children`         | POST   | 3     | Add children to existing job                       |
| `/fleet/jobs/:id/dag`              | GET    | 3     | Get DAG subtree                                    |
| `/fleet/factories`                 | GET    | 2     | List factories                                     |
| `/fleet/factories/:id/heartbeat`   | POST   | 2     | Factory heartbeat                                  |
| `/fleet/budgets/:productId`        | GET    | 3     | Get budget                                         |
| `/fleet/budgets/:productId`        | PUT    | 3     | Upsert budget                                      |
| `/fleet/budgets/:productId/pause`  | POST   | 3     | Pause budget                                       |
| `/fleet/budgets/:productId/resume` | POST   | 3     | Resume budget                                      |

## Architecture Decisions

1. **Feature flags default OFF** — zero breaking changes to Phase 2 behavior
2. **Budget checked first** — avoids expensive job scan when budget is exhausted
3. **DAG via deps array** — reuses existing dependency resolution; no new scheduler logic needed
4. **Preemption requires seat limit** — only triggers when factory genuinely can't take more work
5. **UI degrades gracefully** — all API calls handle 404 → null/empty; no hard failures
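
## Example: Weight Resolution and Scoring (sketch)

To make the per-field merge from "Weight Resolution Order" concrete, here is a minimal TypeScript sketch of how the three layers could combine and feed the scoring formula. The names `ScoringWeights`, `JobSignals`, `resolveWeights`, and `scoreJob` are hypothetical and chosen for illustration; only the default values and the formula come from this guide. The actual fleet module may be structured differently.

```typescript
// Hypothetical types; field names mirror the scoring formula in this guide.
interface ScoringWeights {
  age: number;
  priority: number;
  retries: number;
  capabilities: number;
}

interface JobSignals {
  ageMinutes: number;
  priorityOrder: number;
  attempts: number;
  capabilityBonus: number;
}

// Defaults as documented under "Weight Resolution Order".
const DEFAULT_WEIGHTS: ScoringWeights = { age: 1, priority: 10, retries: -2, capabilities: 5 };

const WEIGHT_FIELDS = ["age", "priority", "retries", "capabilities"] as const;

// Per-field merge: layers are passed in increasing precedence
// (registry first, then per-request). A layer overrides only the
// fields it actually defines; everything else falls through.
function resolveWeights(
  ...layers: Array<Partial<ScoringWeights> | undefined>
): ScoringWeights {
  const resolved: ScoringWeights = { ...DEFAULT_WEIGHTS };
  for (const layer of layers) {
    if (!layer) continue;
    for (const field of WEIGHT_FIELDS) {
      const value = layer[field];
      if (value !== undefined) resolved[field] = value;
    }
  }
  return resolved;
}

// Direct transcription of the documented formula.
function scoreJob(job: JobSignals, w: ScoringWeights): number {
  return (
    w.age * job.ageMinutes +
    w.priority * job.priorityOrder +
    w.retries * job.attempts +
    w.capabilities * job.capabilityBonus
  );
}

// Usage: the registry raises `priority`, the request zeroes out `retries`.
const registryWeights: Partial<ScoringWeights> = { priority: 20 };
const requestWeights: Partial<ScoringWeights> = { retries: 0 };
const weights = resolveWeights(registryWeights, requestWeights);
// weights is { age: 1, priority: 20, retries: 0, capabilities: 5 }

const job: JobSignals = { ageMinutes: 30, priorityOrder: 2, attempts: 1, capabilityBonus: 1 };
console.log(scoreJob(job, weights));         // 75
console.log(scoreJob(job, DEFAULT_WEIGHTS)); // 53
```

Because the merge is per-field, a request that overrides a single weight leaves the registry and default values for the other three untouched, which is the behavior the "not full object replacement" note describes.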