docs(runtime): define checkpoint and resume semantics

Saravana Achu Mac 2026-04-04 11:44:26 -07:00
parent 6eaf5980a7
commit 2ad979ce19
3 changed files with 65 additions and 5 deletions

@ -203,7 +203,68 @@ Current product mappings:
---
## 11. Acceptance Criteria
## 11. Checkpoint And Resume Semantics
Every runtime implementation that claims `resumable: true` should be able to produce a stable
checkpoint envelope for the current session or run.
Minimum checkpoint shape:
```ts
type AgentCheckpoint = {
  checkpointId: string;
  sessionId: string;
  runId?: string | null;
  productId: string;
  userId: string;
  createdAt: string;
  statusAtCapture:
    | 'queued'
    | 'running'
    | 'paused'
    | 'waiting-approval'
    | 'completed'
    | 'failed'
    | 'cancelled';
  currentTaskId?: string | null;
  todoIds: string[];
  artifactRefs: string[];
  memoryRefs: string[];
  approvalRefs: string[];
  dispatchContext?: AgentDispatchContext | null;
  resumeToken?: string | null;
  stateSummary: {
    title: string;
    summary: string;
    lastActionAt?: string | null;
  };
};
```
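Because checkpoints are typically read back from persisted storage, a runtime may want to check the required fields before trusting a value as an `AgentCheckpoint`. The following is a minimal sketch of such a check, not part of the contract: `hasMinimumCheckpointShape` and the constant names are hypothetical, and the check covers only the required fields, ignoring optional ones such as `runId`, `dispatchContext`, and `resumeToken`.

```ts
// Hypothetical helper (not defined by the contract): verifies that an untyped
// value, e.g. the result of JSON.parse, carries the required checkpoint fields.
const CHECKPOINT_STATUSES = [
  'queued',
  'running',
  'paused',
  'waiting-approval',
  'completed',
  'failed',
  'cancelled',
] as const;

const REQUIRED_STRING_FIELDS = [
  'checkpointId',
  'sessionId',
  'productId',
  'userId',
  'createdAt',
] as const;

const REQUIRED_REF_LISTS = [
  'todoIds',
  'artifactRefs',
  'memoryRefs',
  'approvalRefs',
] as const;

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

function hasMinimumCheckpointShape(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;

  // Required scalar identifiers and timestamp must all be strings.
  const stringsOk = REQUIRED_STRING_FIELDS.every(
    (key) => typeof record[key] === 'string',
  );
  // Required reference lists must be arrays of strings (empty is allowed).
  const listsOk = REQUIRED_REF_LISTS.every((key) => isStringArray(record[key]));
  // statusAtCapture must be one of the seven statuses in the contract.
  const statusOk = CHECKPOINT_STATUSES.some(
    (status) => status === record['statusAtCapture'],
  );

  const summary = record['stateSummary'];
  if (typeof summary !== 'object' || summary === null) return false;
  const summaryRecord = summary as Record<string, unknown>;
  const summaryOk =
    typeof summaryRecord['title'] === 'string' &&
    typeof summaryRecord['summary'] === 'string';

  return stringsOk && listsOk && statusOk && summaryOk;
}
```

A product that already uses a schema validation library could express the same check there; the hand-written guard is used here only to keep the sketch dependency-free.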
Required semantics:
1. a checkpoint captures enough context to resume work without re-deriving user intent from scratch
2. a checkpoint may point at a `runId`, but it must always belong to one `sessionId`
3. `resumeToken` is product-defined, but it must be stable enough for the same product runtime to
reopen the session safely
4. a resumed run should preserve the same `sessionId` and should create a new `runId` only if the
product treats the resumed execution as a new execution instance
5. checkpoint creation should append to action/audit history instead of overwriting earlier state
6. a session must not be marked `resumable: true` unless the product can actually restore from the
latest checkpoint or equivalent persisted state
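Rules 2, 4, 5, and 6 above can be sketched as a small append-only history with a resume planner. This is an illustration under stated assumptions rather than a prescribed implementation: `CheckpointRef`, `ResumePlan`, `appendCheckpoint`, `planResume`, and the `newRunPerResume` / `mintRunId` options are hypothetical names, and the sketch assumes the history array is ordered oldest to newest.

```ts
// Trimmed, hypothetical view of the AgentCheckpoint fields this sketch needs.
type CheckpointRef = {
  checkpointId: string;
  sessionId: string;
  runId?: string | null;
  createdAt: string;
};

type ResumePlan = {
  sessionId: string;
  runId: string | null;
  fromCheckpointId: string;
};

// Rule 5: creating a checkpoint appends to history and never overwrites.
function appendCheckpoint(
  history: readonly CheckpointRef[],
  next: CheckpointRef,
): CheckpointRef[] {
  // Rule 2: every checkpoint in one history belongs to exactly one session.
  const first = history[0];
  if (first !== undefined && first.sessionId !== next.sessionId) {
    throw new Error('checkpoint belongs to a different session');
  }
  return [...history, next];
}

// Rule 6: only report a plan when there is a latest checkpoint to restore from.
// Rule 4: keep the sessionId; mint a new runId only if the product treats a
// resumed execution as a new execution instance (assumed to be a product flag).
function planResume(
  history: readonly CheckpointRef[],
  options: { newRunPerResume: boolean; mintRunId: () => string },
): ResumePlan | null {
  const latest = history[history.length - 1];
  if (latest === undefined) return null; // nothing persisted, not resumable

  const runId = options.newRunPerResume
    ? options.mintRunId()
    : latest.runId ?? null;

  return {
    sessionId: latest.sessionId,
    runId,
    fromCheckpointId: latest.checkpointId,
  };
}
```

Under this sketch, a session would report `resumable: true` only while `planResume` returns a non-null plan. A product whose effective checkpoint source is persisted workspace state rather than an explicit history would substitute its own lookup for the last-element read.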
Current product expectations:
- Cowork
  - the persisted workspace/session state is the effective checkpoint source
  - a resumed task may create a new run while preserving the same session
- FlowMonk
  - the schedule/planning workspace is the effective checkpoint source
  - scheduled entries can resume planning context even when no long-running executor is active
---
## 12. Acceptance Criteria
1. A dispatched Cowork task can be resumed after interruption without losing audit continuity.
2. A FlowMonk execution can emit task/todo state using the same contract.
@ -213,11 +274,11 @@ Current product mappings:
---
## 12. Implementation Checklist
## 13. Implementation Checklist
- [x] finalize entity list and minimum required fields
- [x] define run vs session vs task boundaries
- [ ] define checkpoint/resume semantics
- [x] define checkpoint/resume semantics
- [x] define dispatch payload contract
- [x] define action-log hook points
- [ ] define ActionTrail replay requirements

@ -272,7 +272,7 @@ These should be resolved before claiming the ecosystem docs are fully implementa
- Cowork: add `AgentTodo` direct projection once the product exposes first-class todo entities.
- FlowMonk: replace the current derived approval/todo projections once the product exposes first-class approval/todo primitives.
- Shared runtime: define checkpoint/resume payload semantics and ActionTrail replay requirements.
- Shared runtime: define ActionTrail replay requirements.
### 6.2 Explicit Blockers And Questions

@ -112,7 +112,6 @@ Observed baseline:
- Cowork now emits shared runtime projections from cowork-service, preserves Rust-side canonical event IDs on approval/audit records, and still lacks a first-class native `AgentTodo` product source behind the current task-backed projection.
- FlowMonk now emits direct runtime projections for planning sessions, tasks, todos, runs, approvals, and action logs, but it still lacks distinct native checklist and approval entities behind those projections.
- checkpoint/resume payload semantics still need to be formalized in the contract.
- ActionTrail replay requirements still need to be formalized in the contract.
## 8. Explicit Blockers And Questions