Ecosystem Agent Runtime Contract
Status: Phase 5 baseline implemented
Owner: learning_ai_common_plat
Reference inputs: claw-code-oss, claw-cowork, learning_ai_trails, learning_ai_flowmonk, learning_ai_jarvis_jr
Purpose: Standardize session state, task state, resume behavior, dispatch semantics, approvals, and audit hooks across agent-capable products.
1. Problem
The ecosystem already has multiple agent-runtime ideas:
- claw-code runtime sessions, todos, project memory, resume, MCP lifecycle
- claw-cowork task orchestration, dispatch, scheduling, approvals, audit logging
- FlowMonk planning/execution
- JarvisJr coaching/delegation concepts
- ActionTrail review and replay
Without a shared runtime contract:
- each repo reinvents session models
- handoff and resume become inconsistent
- audit/replay becomes lossy
- approvals cannot be shared cleanly
2. Goals
- Define the canonical runtime state model.
- Define session continuity and resume semantics.
- Define dispatch and handoff metadata.
- Define approval checkpoints and audit hooks.
- Allow multiple implementations while preserving one contract.
3. Non-Goals
- Forcing all agent products to use one codebase.
- Standardizing UI/UX across all agent surfaces.
- Replacing product-specific orchestration logic.
4. Core Entities
The shared runtime contract should define:
- AgentSession
- AgentTask
- AgentTodo
- AgentRun
- AgentApprovalCheckpoint
- AgentDispatchRequest
- AgentHandoff
- AgentActionLog
5. Minimum Session Shape
type AgentSession = {
  sessionId: string;
  productId: string;
  userId: string;
  status: 'active' | 'paused' | 'waiting-approval' | 'completed' | 'failed' | 'cancelled';
  startedAt: string;
  updatedAt: string;
  resumable: boolean;
  currentTaskId?: string | null;
  memoryRefs: string[];
  artifactRefs: string[];
  approvalRefs: string[];
  dispatchContext?: AgentDispatchContext | null;
};

type AgentTask = {
  taskId: string;
  sessionId: string;
  title: string;
  intent: string;
  status: 'queued' | 'running' | 'blocked' | 'completed' | 'failed' | 'cancelled';
  priority?: string;
  createdAt: string;
  updatedAt: string;
};

type AgentTodo = {
  todoId: string;
  sessionId: string;
  text: string;
  status: 'open' | 'in-progress' | 'done' | 'dropped';
  createdAt: string;
  updatedAt: string;
};
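As an illustration of how a session and a task relate, the sketch below starts a session and attaches one queued task to it. The helpers `newSession` and `attachTask`, the ID values, and the ISO 8601 timestamp format are assumptions for illustration; the shapes above only require the timestamp fields to be strings.

```typescript
// Trimmed copies of the shapes above so the sketch is self-contained.
type AgentSession = {
  sessionId: string;
  productId: string;
  userId: string;
  status: 'active' | 'paused' | 'waiting-approval' | 'completed' | 'failed' | 'cancelled';
  startedAt: string;
  updatedAt: string;
  resumable: boolean;
  currentTaskId?: string | null;
  memoryRefs: string[];
  artifactRefs: string[];
  approvalRefs: string[];
};

type AgentTask = {
  taskId: string;
  sessionId: string;
  title: string;
  intent: string;
  status: 'queued' | 'running' | 'blocked' | 'completed' | 'failed' | 'cancelled';
  createdAt: string;
  updatedAt: string;
};

// Hypothetical helper: starts an active session with empty reference lists.
function newSession(sessionId: string, productId: string, userId: string, at: string): AgentSession {
  return {
    sessionId,
    productId,
    userId,
    status: 'active',
    startedAt: at,
    updatedAt: at,
    resumable: false, // stays false until the product can restore state (section 11)
    currentTaskId: null,
    memoryRefs: [],
    artifactRefs: [],
    approvalRefs: [],
  };
}

// Hypothetical helper: queues a task and returns an updated copy of the session pointing at it.
function attachTask(
  current: AgentSession,
  taskId: string,
  title: string,
  intent: string,
  at: string,
): { updatedSession: AgentSession; queuedTask: AgentTask } {
  const queuedTask: AgentTask = {
    taskId,
    sessionId: current.sessionId,
    title,
    intent,
    status: 'queued',
    createdAt: at,
    updatedAt: at,
  };
  return { updatedSession: { ...current, currentTaskId: taskId, updatedAt: at }, queuedTask };
}

const exampleTime = '2024-01-01T00:00:00.000Z'; // fixed example timestamp
const startedSession = newSession('sess-1', 'cowork', 'user-1', exampleTime);
const { updatedSession, queuedTask } = attachTask(
  startedSession, 'task-1', 'Draft summary', 'summarize the weekly notes', exampleTime,
);
```

Returning a copy rather than mutating the session is a choice made for this sketch only; the contract does not say how implementations store or update state.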
6. Required Runtime Behaviors
Every compliant implementation should support:
- session creation
- resumable state checkpoints
- todo/task updates during execution
- approval checkpoints
- action-log emission
- artifact emission
- dispatch metadata when execution originates elsewhere
- replayability in ActionTrail
7. Dispatch Model
The contract should support:
- browser-originated requests
- mobile-originated requests
- desktop-originated requests
- inter-product dispatch
- trusted desktop executor dispatch
Example:
type AgentDispatchContext = {
  originSurface: 'browser' | 'mobile' | 'desktop' | 'web' | 'product-api';
  originProductId: string;
  dispatchMode: 'interactive' | 'queued' | 'scheduled' | 'remote';
  initiatedAt: string;
};
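To make the shape concrete, the sketch below builds a dispatch context for work that one product queues into another. The helper name `makeDispatchContext`, the product ID value, and the ISO 8601 timestamp format are assumptions; the field names and allowed values come from the type above.

```typescript
type AgentDispatchContext = {
  originSurface: 'browser' | 'mobile' | 'desktop' | 'web' | 'product-api';
  originProductId: string;
  dispatchMode: 'interactive' | 'queued' | 'scheduled' | 'remote';
  initiatedAt: string;
};

// Hypothetical helper: stamps the context with an ISO 8601 time (the format is an assumption).
function makeDispatchContext(
  originSurface: AgentDispatchContext['originSurface'],
  originProductId: string,
  dispatchMode: AgentDispatchContext['dispatchMode'],
  at: Date = new Date(),
): AgentDispatchContext {
  return { originSurface, originProductId, dispatchMode, initiatedAt: at.toISOString() };
}

// Inter-product dispatch: another product's API queues work for this runtime.
const interProductDispatch = makeDispatchContext(
  'product-api',
  'flowmonk', // example product ID, not a confirmed identifier
  'queued',
  new Date(Date.UTC(2024, 0, 1)),
);
```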
8. First Implementations
The first conforming runtime integrations should target:
- oss/learning_ai_claw-cowork
- learning_ai_trails
- learning_ai_flowmonk
- learning_ai_jarvis_jr
Later:
- learning_voice_ai_agent transformation workflows
- shared operator tools in learning_ai_common_plat
9. Key Open Decisions
- How much of claw-code todo/session semantics should be adopted directly vs normalized?
- Should scheduled runs create new sessions or new runs under one session?
- What is the minimum checkpoint payload required for resume-anywhere?
- Which runtime actions must always emit ActionTrail logs?
- How should worktree-isolated code tasks be represented vs non-code tasks?
10. Lifecycle Boundaries
The current runtime model now uses these boundaries:
- AgentSession: A durable container for related work over time. Sessions can outlive individual runs and can stay resumable even after one run finishes.
- AgentRun: A concrete execution instance. A run is the thing that can be queued, running, paused, waiting-approval, completed, failed, or cancelled.
- AgentTask: A user-meaningful unit of intent inside a session. Tasks should remain stable enough to describe the work, even when execution is retried or rescheduled.
- AgentTodo: A smaller actionable checklist item. When a product has no separate checklist model yet, it may temporarily project todos from its native task backlog, but that mapping must be called out explicitly.
Interpretation rules:
- queued means execution has not started yet.
- paused means execution started and is intentionally halted or deferred.
- waiting-approval means the run is blocked on human review.
- a session may contain multiple runs over time
- a task may survive multiple runs if execution is retried, resumed, or rescheduled
- todos should never imply a separate execution history unless the product truly tracks that internally
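The interpretation rules above can be sketched as small helpers over run records. The minimal `AgentRun` record below is hypothetical (this document does not give its fields), and treating completed, failed, and cancelled as terminal is an assumption.

```typescript
type AgentRunStatus =
  | 'queued'
  | 'running'
  | 'paused'
  | 'waiting-approval'
  | 'completed'
  | 'failed'
  | 'cancelled';

// Hypothetical minimal run record; only the linking fields matter for this sketch.
type AgentRun = {
  runId: string;
  sessionId: string;
  taskId: string;
  status: AgentRunStatus;
};

// waiting-approval is the state that means "blocked on human review".
function needsHumanReview(run: AgentRun): boolean {
  return run.status === 'waiting-approval';
}

// Assumption: completed, failed, and cancelled are terminal states.
function isTerminal(run: AgentRun): boolean {
  return run.status === 'completed' || run.status === 'failed' || run.status === 'cancelled';
}

// A task may survive multiple runs: collect every run recorded for one task.
function runsForTask(allRuns: AgentRun[], taskId: string): AgentRun[] {
  return allRuns.filter((run) => run.taskId === taskId);
}

// One session with three runs; task-1 was retried after a failure.
const exampleRuns: AgentRun[] = [
  { runId: 'run-1', sessionId: 'sess-1', taskId: 'task-1', status: 'failed' },
  { runId: 'run-2', sessionId: 'sess-1', taskId: 'task-1', status: 'waiting-approval' },
  { runId: 'run-3', sessionId: 'sess-1', taskId: 'task-2', status: 'queued' },
];

const retriesOfTaskOne = runsForTask(exampleRuns, 'task-1');
const awaitingReview = exampleRuns.filter(needsHumanReview);
```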
Current product mappings:
- Cowork
  - session: persisted/resumable workspace session
  - task: orchestrator task
  - run: orchestrator execution of that task
  - todo: interim task-backed projection until a first-class todo source exists
- FlowMonk
  - session: user planning workspace
  - task: planning backlog task
  - run: scheduled entry
  - approval: agent-suggested schedule entry awaiting or receiving confirmation
  - todo: task-backed projection until a distinct checklist primitive exists
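A task-backed todo projection of the kind both products use could look like the sketch below. The status mapping, the `todo:` ID prefix, and the function name `projectTodoFromTask` are assumptions for illustration, not the Cowork or FlowMonk implementation; each product would need to document its own mapping.

```typescript
type AgentTask = {
  taskId: string;
  sessionId: string;
  title: string;
  intent: string;
  status: 'queued' | 'running' | 'blocked' | 'completed' | 'failed' | 'cancelled';
  createdAt: string;
  updatedAt: string;
};

type AgentTodo = {
  todoId: string;
  sessionId: string;
  text: string;
  status: 'open' | 'in-progress' | 'done' | 'dropped';
  createdAt: string;
  updatedAt: string;
};

// Assumed mapping; the contract does not prescribe one.
const TODO_STATUS_FOR_TASK: Record<AgentTask['status'], AgentTodo['status']> = {
  queued: 'open',
  running: 'in-progress',
  blocked: 'in-progress',
  completed: 'done',
  failed: 'open',
  cancelled: 'dropped',
};

// Hypothetical projection: the todo reuses the task's timestamps so it does not
// imply an execution history of its own.
function projectTodoFromTask(task: AgentTask): AgentTodo {
  return {
    todoId: `todo:${task.taskId}`, // prefix marks the todo as task-backed (assumption)
    sessionId: task.sessionId,
    text: task.title,
    status: TODO_STATUS_FOR_TASK[task.status],
    createdAt: task.createdAt,
    updatedAt: task.updatedAt,
  };
}

const projectedTodo = projectTodoFromTask({
  taskId: 'task-1',
  sessionId: 'sess-1',
  title: 'Plan the week',
  intent: 'schedule planning',
  status: 'running',
  createdAt: '2024-01-01T00:00:00.000Z',
  updatedAt: '2024-01-02T00:00:00.000Z',
});
```

Typing the mapping as a `Record` over every task status means the compiler flags any task status that is added later without a matching todo status.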
11. Checkpoint And Resume Semantics
Every runtime implementation that claims resumable: true should be able to produce a stable
checkpoint envelope for the current session or run.
Minimum checkpoint shape:
type AgentCheckpoint = {
  checkpointId: string;
  sessionId: string;
  runId?: string | null;
  productId: string;
  userId: string;
  createdAt: string;
  statusAtCapture:
    | 'queued'
    | 'running'
    | 'paused'
    | 'waiting-approval'
    | 'completed'
    | 'failed'
    | 'cancelled';
  currentTaskId?: string | null;
  todoIds: string[];
  artifactRefs: string[];
  memoryRefs: string[];
  approvalRefs: string[];
  dispatchContext?: AgentDispatchContext | null;
  resumeToken?: string | null;
  stateSummary: {
    title: string;
    summary: string;
    lastActionAt?: string | null;
  };
};
Required semantics:
- a checkpoint captures enough context to resume work without re-deriving user intent from scratch
- a checkpoint may point at a runId, but it must always belong to one sessionId
- resumeToken is product-defined, but it must be stable enough for the same product runtime to reopen the session safely
- a resumed run should preserve the same sessionId and should create a new runId only if the product treats the resumed execution as a new execution instance
- checkpoint creation should append to action/audit history instead of overwriting earlier state
- a session must not be marked resumable: true unless the product can actually restore from the latest checkpoint or equivalent persisted state
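The semantics above can be sketched with an append-only checkpoint log and a resume helper. `CheckpointLog`, `canMarkResumable`, and `resumeSession` are hypothetical names, the checkpoint type is trimmed to the fields the sketch reads, and the sketch takes the branch where a resumed execution always gets a new runId. Checking only that a checkpoint exists is a simplification of "can actually restore".

```typescript
type CheckpointStatus =
  | 'queued' | 'running' | 'paused' | 'waiting-approval'
  | 'completed' | 'failed' | 'cancelled';

// Trimmed checkpoint: only the fields this sketch reads.
type AgentCheckpoint = {
  checkpointId: string;
  sessionId: string;
  runId?: string | null;
  createdAt: string;
  statusAtCapture: CheckpointStatus;
  resumeToken?: string | null;
};

// Hypothetical append-only store: checkpoints are added, never overwritten.
class CheckpointLog {
  private readonly entries: AgentCheckpoint[] = [];

  append(checkpoint: AgentCheckpoint): void {
    this.entries.push(checkpoint);
  }

  // Assumes entries are appended in capture order, so the last match is the latest.
  latestFor(sessionId: string): AgentCheckpoint | undefined {
    const matches = this.entries.filter((entry) => entry.sessionId === sessionId);
    return matches[matches.length - 1];
  }

  countFor(sessionId: string): number {
    return this.entries.filter((entry) => entry.sessionId === sessionId).length;
  }
}

// Simplified guard: a session may claim resumable: true only if a checkpoint exists.
function canMarkResumable(log: CheckpointLog, sessionId: string): boolean {
  return log.latestFor(sessionId) !== undefined;
}

// Keeps the sessionId and starts a new run from the latest checkpoint.
function resumeSession(
  log: CheckpointLog,
  sessionId: string,
  newRunId: string,
): { sessionId: string; runId: string; resumedFrom: string } {
  const latest = log.latestFor(sessionId);
  if (latest === undefined) {
    throw new Error(`no checkpoint for session ${sessionId}`);
  }
  return { sessionId: latest.sessionId, runId: newRunId, resumedFrom: latest.checkpointId };
}

const checkpointLog = new CheckpointLog();
checkpointLog.append({
  checkpointId: 'cp-1', sessionId: 'sess-1', runId: 'run-1',
  createdAt: '2024-01-01T00:00:00.000Z', statusAtCapture: 'running',
});
checkpointLog.append({
  checkpointId: 'cp-2', sessionId: 'sess-1', runId: 'run-1',
  createdAt: '2024-01-01T00:05:00.000Z', statusAtCapture: 'paused', resumeToken: 'opaque-token',
});
const resumedRun = resumeSession(checkpointLog, 'sess-1', 'run-2');
```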
Current product expectations:
- Cowork
  - the persisted workspace/session state is the effective checkpoint source
  - a resumed task may create a new run while preserving the same session
- FlowMonk
  - the schedule/planning workspace is the effective checkpoint source
  - scheduled entries can resume planning context even when no long-running executor is active
12. Acceptance Criteria
- A dispatched Cowork task can be resumed after interruption without losing audit continuity.
- A FlowMonk execution can emit task/todo state using the same contract.
- ActionTrail can replay a run using the shared action-log structure.
- Approval checkpoints can be handed off to Auth App without losing run context.
- Product-specific runtimes can remain different internally while still producing the same contract externally.
13. Implementation Checklist
- finalize entity list and minimum required fields
- define run vs session vs task boundaries
- define checkpoint/resume semantics
- define dispatch payload contract
- define action-log hook points
- define ActionTrail replay requirements
- define first conforming implementation plan for Cowork and FlowMonk
Commits:
- eae3409 drafted the initial stub
- 3f2482b added the baseline runtime schemas for dispatch, session, task, todo, run, approval, and action logs
- 97b731e added the Cowork task-backed AgentTodo projection
- faf93ec added FlowMonk direct AgentApprovalCheckpoint and task-backed AgentTodo projections
- ff8c5eb promoted queued to a first-class AgentRun state