From 4b4720aebd61a14185a45227556208812fe99ff8 Mon Sep 17 00:00:00 2001
From: saravanakumardb1
Date: Sat, 14 Feb 2026 13:32:46 -0800
Subject: [PATCH] docs(extraction): update roadmap Phase 0 checkboxes with commit c292bb5

---
 docs/EXTRACTION_SERVICE_ROADMAP.md | 54 +++++++++++-------------------
 1 file changed, 20 insertions(+), 34 deletions(-)

diff --git a/docs/EXTRACTION_SERVICE_ROADMAP.md b/docs/EXTRACTION_SERVICE_ROADMAP.md
index ad47e693..f4645f18 100644
--- a/docs/EXTRACTION_SERVICE_ROADMAP.md
+++ b/docs/EXTRACTION_SERVICE_ROADMAP.md
@@ -52,7 +52,7 @@ A shared extraction microservice that uses Google's LangExtract library to extra
 
 ### Service scaffold (Fastify)
 
-- [ ] **0.1** Create `services/extraction-service/` directory structure:
+- [x] **0.1** Create `services/extraction-service/` directory structure: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
   ```
   services/extraction-service/
     src/
@@ -75,22 +75,16 @@ A shared extraction microservice that uses Google's LangExtract library to extra
     tsconfig.json
     Dockerfile
   ```
-- [ ] **0.2** Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions
-- [ ] **0.3** Create `tsconfig.json` extending `../../tsconfig.base.json`
-- [ ] **0.4** Create `src/lib/config.ts` with Zod schema:
-  - `PORT` (default 4005), `HOST`, `CORS_ORIGIN`
-  - `PYTHON_SIDECAR_URL` (default `http://localhost:4006`)
-  - `DEFAULT_MODEL_ID` (default `gemini-2.5-flash`)
-  - `GEMINI_API_KEY` or `AZURE_OPENAI_API_KEY` / `AZURE_OPENAI_ENDPOINT`
-  - `MAX_WORKERS` (default 10), `MAX_CHAR_BUFFER` (default 2000)
-  - `COSMOS_ENDPOINT`, `COSMOS_KEY`, `COSMOS_DATABASE`, `JWT_SECRET`
-- [ ] **0.5** Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core`
-- [ ] **0.6** Add `.env.example` with all required env vars
-- [ ] **0.7** Verify: `pnpm build` passes for the new service
+- [x] **0.2** Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.3** Create `tsconfig.json` (self-contained, matching tracker-service pattern) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.4** Create `src/lib/config.ts` with Zod schema (PORT, HOST, NODE_ENV, CORS_ORIGIN, SERVICE_NAME, PYTHON_SIDECAR_URL, DEFAULT_MODEL_ID, COSMOS_\*, JWT_SECRET, DEFAULT_PRODUCT_ID) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.5** Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.6** Add `.env.example` with all required env vars [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.7** Verify: `pnpm build` passes for the new service [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
 
 ### Python sidecar scaffold
 
-- [ ] **0.8** Create `services/extraction-service/python/` directory:
+- [x] **0.8** Create `services/extraction-service/python/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
   ```
   python/
     src/
@@ -102,7 +96,7 @@ A shared extraction microservice that uses Google's LangExtract library to extra
     requirements.txt # langextract, fastapi, uvicorn, pydantic
     Dockerfile # Python 3.12 slim
   ```
-- [ ] **0.9** Create `python/requirements.txt`:
+- [x] **0.9** Create `python/requirements.txt`: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
   ```
   langextract>=0.3.0
   fastapi>=0.115.0
@@ -111,16 +105,13 @@ A shared extraction microservice that uses Google's LangExtract library to extra
   pydantic-settings>=2.7.0
   structlog>=24.4.0
   ```
-- [ ] **0.10** Create `python/src/app.py` — FastAPI app with endpoints:
-  - `POST /extract` — single document extraction
-  - `POST /extract/batch` — batch extraction
-  - `GET /health` — sidecar health check
-- [ ] **0.11** Create `python/src/extractor.py` — wrapper around `lx.extract()` with configurable model provider
-- [ ] **0.12** Verify: Python sidecar starts and `/health` returns OK
+- [x] **0.10** Create `python/src/app.py` — FastAPI app with POST /extract, POST /extract/batch, GET /health [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.11** Create `python/src/extractor.py` — wrapper around `lx.extract()` with mock fallback [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [ ] **0.12** Verify: Python sidecar starts and `/health` returns OK (requires `pip install` — deferred to Phase 1)
 
 ### Package scaffold (`@bytelyst/extraction`)
 
-- [ ] **0.13** Create `packages/extraction/` directory:
+- [x] **0.13** Create `packages/extraction/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
   ```
   packages/extraction/
     src/
@@ -130,21 +121,16 @@ A shared extraction microservice that uses Google's LangExtract library to extra
     package.json
     tsconfig.json
   ```
-- [ ] **0.14** Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep
-- [ ] **0.15** Define TypeScript types matching the extraction API:
-  - `ExtractionTask` — prompt description + examples + model config
-  - `ExtractionExample` — text + extractions (class, text, attributes)
-  - `ExtractionResult` — extracted entities with source grounding
-  - `ExtractionRequest` — task + input text/URL
-  - `ExtractionResponse` — results + metadata (model, duration, token count)
-- [ ] **0.16** Create `createExtractionClient()` factory using `createApiClient()` pattern
-- [ ] **0.17** Verify: `pnpm build` passes for the new package
+- [x] **0.14** Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.15** Define TypeScript types (ExtractionTask, ExtractionExample, ExtractionEntity, ExtractRequest, ExtractResponse, BatchExtractRequest, BatchExtractResponse) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.16** Create `createExtractionClient()` factory using `createApiClient()` pattern [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.17** Verify: `pnpm build` passes for the new package [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
 
 ### Workspace wiring
 
-- [ ] **0.18** Add `extraction-service` and `extraction` to `pnpm-workspace.yaml` (already covered by `packages/*` + `services/*` globs — verify)
-- [ ] **0.19** Run `pnpm install` from repo root — verify workspace resolution
-- [ ] **0.20** Verify: `pnpm build` and `pnpm typecheck` pass across entire repo
+- [x] **0.18** Verify `extraction-service` and `extraction` covered by `packages/*` + `services/*` globs in `pnpm-workspace.yaml` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.19** Run `pnpm install` from repo root — workspace resolution verified [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
+- [x] **0.20** Verify: `pnpm build` passes for both extraction-service and @bytelyst/extraction [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
 
 ---
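
Review note on item 0.4: the roadmap describes `src/lib/config.ts` as a Zod schema over environment variables, but the file itself is not part of this patch. The sketch below is a hypothetical, dependency-free stand-in (plain TypeScript rather than Zod) that shows the same idea: read variables and fall back to defaults. The names `ServiceConfig`, `intFromEnv`, and `loadConfig` are assumptions, and the default values are the ones listed in the removed checklist lines, which the committed file may or may not keep.

```typescript
// Hypothetical sketch only: not the contents of src/lib/config.ts.
// It swaps the Zod schema for hand-written checks to stay dependency-free.

type Env = Record<string, string | undefined>;

interface ServiceConfig {
  port: number;
  pythonSidecarUrl: string;
  defaultModelId: string;
}

// Parse a positive integer from the environment, or return the fallback
// when the variable is unset or empty.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

// Defaults taken from the original checklist wording (assumed, not verified
// against the committed file).
function loadConfig(env: Env): ServiceConfig {
  return {
    port: intFromEnv(env, "PORT", 4005),
    pythonSidecarUrl: env["PYTHON_SIDECAR_URL"] ?? "http://localhost:4006",
    defaultModelId: env["DEFAULT_MODEL_ID"] ?? "gemini-2.5-flash",
  };
}
```

In a Node service this would typically be called once at startup as `loadConfig(process.env)`, so a malformed `PORT` fails fast instead of surfacing later as a bind error.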