learning_ai_common_plat/services
saravanakumardb1 33c1d8d5fa fix(platform-service): make fleet job claim truly atomic via datastore updateIfMatch
The foundation's revUpdateJob/revUpdateLease did a read -> rev-check -> write with
await points between the steps, so two CONCURRENT claims could both read the same rev,
both pass the check, and both write: a double-assignment that the old (sequential)
race test could not catch.

Rewire revUpdateJob/revUpdateLease to delegate to the datastore's updateIfMatch,
which performs the compare and the write as one indivisible operation (Cosmos
If-Match; a synchronous compare-and-set in the memory datastore). The coordinator's
tryClaimJob keeps identical external behavior (ok/conflict) but is now genuinely
single-winner.
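
A minimal sketch of the compare-and-set idea for the in-memory case, not the actual platform-service code. Only the names updateIfMatch and tryClaimJob come from the commit message; MemoryStore, Versioned, Job, the method signatures, and the rev numbering are assumptions made for illustration.

```typescript
// Hypothetical types: the real datastore interface is not shown in the commit.
interface Versioned<T> {
  rev: number;
  value: T;
}

type UpdateResult = "ok" | "conflict";

class MemoryStore<T> {
  private docs = new Map<string, Versioned<T>>();

  put(id: string, value: T): void {
    this.docs.set(id, { rev: 0, value });
  }

  get(id: string): Versioned<T> | undefined {
    return this.docs.get(id);
  }

  // The compare and the write run in one synchronous step. There is no await
  // between them, so no other task can interleave on a single-threaded event loop.
  updateIfMatch(id: string, expectedRev: number, next: T): UpdateResult {
    const current = this.docs.get(id);
    if (!current || current.rev !== expectedRev) return "conflict";
    this.docs.set(id, { rev: expectedRev + 1, value: next });
    return "ok";
  }
}

interface Job {
  assignedTo: string | null;
}

// Assumed shape of a claim: read a snapshot, then let the store decide the winner.
async function tryClaimJob(
  store: MemoryStore<Job>,
  jobId: string,
  worker: string,
): Promise<UpdateResult> {
  const snapshot = store.get(jobId);
  if (!snapshot || snapshot.value.assignedTo !== null) return "conflict";
  await Promise.resolve(); // stand-in for an I/O gap between read and write
  return store.updateIfMatch(jobId, snapshot.rev, { assignedTo: worker });
}
```

In this sketch the await sits before updateIfMatch rather than inside it, so a stale snapshot is rejected by the rev comparison instead of overwriting a newer document.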

Upgrade the coordinator tests to prove atomicity under TRUE concurrency:
- two contenders via Promise.all -> exactly one ok, one conflict; assigned once;
  one run; one lease; leaseEpoch 1.
- N-claimer (15) stress via Promise.all -> one ok, N-1 conflicts, no double-assignment.
- N concurrent claimNextJob for one job -> exactly one non-null claim.
- N concurrent lease renewals -> exactly one wins.

Verified these concurrent tests FAIL against the old read-check-write (double-assign)
and pass after the fix.
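
The difference between the two claim shapes under Promise.all can be sketched as follows. This is an illustrative reconstruction, not the repository's test suite: JobDoc, racyClaim, atomicClaim, countWinners, and the standalone updateIfMatch function are hypothetical stand-ins, and the datastore hop is simulated with a resolved promise.

```typescript
type Outcome = "ok" | "conflict";

interface JobDoc {
  rev: number;
  assignedTo: string | null;
}

// Simulated async hop (stands in for a datastore round trip).
const tick = (): Promise<void> => Promise.resolve();

// Old shape: read -> rev-check -> write, with an await between check and write.
async function racyClaim(job: JobDoc, worker: string): Promise<Outcome> {
  const seenRev = job.rev;
  await tick();
  if (job.rev !== seenRev || job.assignedTo !== null) return "conflict";
  await tick(); // other contenders can pass the same check during this gap
  job.rev = seenRev + 1;
  job.assignedTo = worker;
  return "ok";
}

// New shape: the compare and the write are one synchronous step.
function updateIfMatch(job: JobDoc, expectedRev: number, worker: string): Outcome {
  if (job.rev !== expectedRev) return "conflict";
  job.rev = expectedRev + 1;
  job.assignedTo = worker;
  return "ok";
}

async function atomicClaim(job: JobDoc, worker: string): Promise<Outcome> {
  const seenRev = job.rev;
  await tick();
  return updateIfMatch(job, seenRev, worker);
}

// Launch n contenders for one fresh job and count how many report "ok".
async function countWinners(
  claim: (job: JobDoc, worker: string) => Promise<Outcome>,
  n: number,
): Promise<number> {
  const job: JobDoc = { rev: 0, assignedTo: null };
  const results = await Promise.all(
    Array.from({ length: n }, (_, i) => claim(job, `worker-${i}`)),
  );
  return results.filter((r) => r === "ok").length;
}
```

Under these assumptions, countWinners(racyClaim, 15) reports more than one winner, while countWinners(atomicClaim, 15) reports exactly one. A sequential test that awaits each claim before starting the next would see a single winner in both cases, which is why it cannot expose the race.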
2026-05-29 20:59:08 -07:00
cowork-service chore(cowork-service): type test doubles 2026-05-04 15:25:43 -07:00
extraction-service chore(extraction): document sidecar dev alerts 2026-05-04 16:42:25 -07:00
mcp-server test(mcp-server): cover chronomind tool proxies 2026-05-18 09:33:30 +00:00
monitoring feat(monitoring): add VM Overview Grafana dashboard 2026-05-29 21:26:35 +00:00
platform-service fix(platform-service): make fleet job claim truly atomic via datastore updateIfMatch 2026-05-29 20:59:08 -07:00