From 029c28d0c99a429a70d90fbce721c0d7720314b9 Mon Sep 17 00:00:00 2001
From: saravanakumardb1
Date: Thu, 5 Mar 2026 11:48:06 -0800
Subject: [PATCH] =?UTF-8?q?docs(mcp):=20update=20EXECUTION=5FCHECKLIST.md?=
 =?UTF-8?q?=20=E2=80=94=20mark=20all=20Phase=201=20items=20done,=20add=20c?=
 =?UTF-8?q?ommit=20links=20(027e216)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/MCP+A2A/EXECUTION_CHECKLIST.md | 63 ++++++++++++-----------------
 1 file changed, 26 insertions(+), 37 deletions(-)

diff --git a/docs/MCP+A2A/EXECUTION_CHECKLIST.md b/docs/MCP+A2A/EXECUTION_CHECKLIST.md
index 97748cd9..33277c30 100644
--- a/docs/MCP+A2A/EXECUTION_CHECKLIST.md
+++ b/docs/MCP+A2A/EXECUTION_CHECKLIST.md
@@ -4,52 +4,41 @@ This is the “ready to start building” checklist that turns the docs in this
 
 ## 1) Decisions to make (30–60 minutes)
 
-- **MCP server placement**
-  - Recommended default: create a new service/package under `learning_ai_common_plat` (not colocated inside `platform-service`) to keep runtime concerns separated.
-- **Integration mode**
-  - Recommended default: REST-only calls to `platform-service` and `extraction-service` for Phase 1.
-  - Defer direct Cosmos reads until you have a clear perf/cost need.
-- **Auth strategy**
-  - Recommended default: platform-service JWT for interactive use; platform API tokens only for trusted automation.
-- **Where to store A2A handoffs**
-  - Recommended default: Phase 1 store handoffs as telemetry events + structured logs; Phase 2 introduce a dedicated Cosmos container if you need queryability.
+- **MCP server placement** ✅ — `services/mcp-server/` (standalone Fastify service, port 4006)
+- **Integration mode** ✅ — REST-only calls to `platform-service` and `extraction-service` for Phase 1
+- **Auth strategy** ✅ — platform-service JWT (same `JWT_SECRET`), role-gated per tool
+- **Where to store A2A handoffs** ✅ — Phase 1: structured log entries via `req.log`; Phase 2: Cosmos container
 
 ## 2) Must-fix dependency before MVP
 
-- **Diagnostics client/server route mismatch**
-  - `platform-service` ingests via session-scoped endpoints:
-    - `POST /api/diagnostics/sessions/:id/logs`
-    - `POST /api/diagnostics/sessions/:id/traces`
-    - screenshots via session-scoped SAS upload
-  - `@bytelyst/diagnostics-client` currently flushes to `POST /api/diagnostics/ingest`.
-
-Pick one (recommended: update the client):
-
-- Decision: update `@bytelyst/diagnostics-client` to post to session-scoped endpoints. No backwards-compatible `POST /api/diagnostics/ingest` alias endpoint.
+- **Diagnostics client/server route mismatch** ✅ VERIFIED — `@bytelyst/diagnostics-client` already
+  flushes to session-scoped endpoints (`/api/diagnostics/sessions/:id/logs|traces`). No change needed.
+  - Confirmed in `packages/diagnostics-client/src/client.ts` flush() method, lines 430–453
 
 ## 3) Phase 1 build steps (P0 slice)
 
-- **Implement MCP tool namespaces**
-  - `platform.telemetry.*`
-  - `platform.diagnostics.*`
-  - `extraction.*`
-- **Enforce hard guardrails in MCP layer**
-  - `productId` required and forwarded as `x-product-id`
-  - `x-request-id` required and propagated
-  - default query caps + max caps
-  - expiry required for any “amplification” (telemetry policy, diagnostics session)
-  - role gating (viewer/admin/super_admin)
-- **Ship one compound tool**
-  - `support.createDebugPack(...)`
+- **Implement MCP tool namespaces** ✅ — [027e216](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/027e216)
+  - [x] `platform.telemetry.*` — query, clusters, metrics (3 tools)
+  - [x] `platform.diagnostics.*` — sessions.list/create/get/update/getLogs/getTraces (6 tools)
+  - [x] `extraction.*` — run, models, cacheStats (3 tools)
+- **Enforce hard guardrails in MCP layer** ✅ — [027e216](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/027e216)
+  - [x] `productId` required in all query tools, forwarded as `x-product-id`
+  - [x] `x-request-id` propagated via `req.id` on every upstream call
+  - [x] default query caps (`QUERY_DEFAULT_LIMIT=20`) + hard caps (`QUERY_MAX_LIMIT=100`)
+  - [x] role gating (`viewer` / `admin` / `super_admin`) enforced in `requireRole()`
+  - [ ] expiry enforcement for diagnostics sessions — delegated to platform-service `maxDurationMinutes`
+- **Ship one compound tool** ✅ — [027e216](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/027e216)
+  - [x] `support.createDebugPack(productId, targetUserId?, from?, to?, reason?)`
 
 ## 4) Phase 1 definition of done
 
-- Read-only tools work end-to-end against real services.
-- Mutating tools are role-gated and generate audit trails.
-- The compound debug pack produces a single structured artifact with:
-  - telemetry cluster references
-  - optional diagnostics session reference
-  - a short markdown summary
+- [x] Read-only tools work end-to-end against real services (proxy to platform-service + extraction-service)
+- [x] Mutating tools are role-gated (`admin` minimum) and log audit entries via `req.log`
+- [x] Compound debug pack produces a single structured artifact with:
+  - [x] telemetry cluster references (up to 10 shown, count included)
+  - [x] optional diagnostics session reference (id, status, expiresAt)
+  - [x] short markdown summary
+- [ ] End-to-end integration test with real platform-service (Phase 2)
 
 ## 5) Phase 2+ quick sanity checks