docs(extraction): update roadmap Phase 0 checkboxes with commit c292bb5

saravanakumardb1 2026-02-14 13:32:46 -08:00
parent c292bb5cc1
commit 4b4720aebd

@ -52,7 +52,7 @@ A shared extraction microservice that uses Google's LangExtract library to extra
### Service scaffold (Fastify)
- [ ] **0.1** Create `services/extraction-service/` directory structure:
- [x] **0.1** Create `services/extraction-service/` directory structure: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
services/extraction-service/
src/
@ -75,22 +75,16 @@ A shared extraction microservice that uses Google's LangExtract library to extra
tsconfig.json
Dockerfile
```
- [ ] **0.2** Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions
- [ ] **0.3** Create `tsconfig.json` extending `../../tsconfig.base.json`
- [ ] **0.4** Create `src/lib/config.ts` with Zod schema:
- `PORT` (default 4005), `HOST`, `CORS_ORIGIN`
- `PYTHON_SIDECAR_URL` (default `http://localhost:4006`)
- `DEFAULT_MODEL_ID` (default `gemini-2.5-flash`)
- `GEMINI_API_KEY` or `AZURE_OPENAI_API_KEY` / `AZURE_OPENAI_ENDPOINT`
- `MAX_WORKERS` (default 10), `MAX_CHAR_BUFFER` (default 2000)
- `COSMOS_ENDPOINT`, `COSMOS_KEY`, `COSMOS_DATABASE`, `JWT_SECRET`
- [ ] **0.5** Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core`
- [ ] **0.6** Add `.env.example` with all required env vars
- [ ] **0.7** Verify: `pnpm build` passes for the new service
- [x] **0.2** Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.3** Create `tsconfig.json` (self-contained, matching tracker-service pattern) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.4** Create `src/lib/config.ts` with Zod schema (`PORT`, `HOST`, `NODE_ENV`, `CORS_ORIGIN`, `SERVICE_NAME`, `PYTHON_SIDECAR_URL`, `DEFAULT_MODEL_ID`, `COSMOS_*`, `JWT_SECRET`, `DEFAULT_PRODUCT_ID`) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.5** Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.6** Add `.env.example` with all required env vars [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.7** Verify: `pnpm build` passes for the new service [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
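
As a rough illustration of what item 0.4 covers, the sketch below parses a few of the listed variables with the defaults named in the roadmap (`PORT` 4005, `PYTHON_SIDECAR_URL` `http://localhost:4006`, `DEFAULT_MODEL_ID` `gemini-2.5-flash`). It is deliberately dependency-free and is not the committed `src/lib/config.ts`, which uses a Zod schema. The `ServiceConfig` shape, the `loadConfig` name, and the `HOST` default are assumptions made for this example.

```typescript
// Hypothetical, dependency-free stand-in for the Zod schema in src/lib/config.ts.
// Only a subset of the variables from item 0.4 is shown.
interface ServiceConfig {
  PORT: number;
  HOST: string;
  PYTHON_SIDECAR_URL: string;
  DEFAULT_MODEL_ID: string;
}

// Takes the environment as a parameter so the function stays easy to test.
function loadConfig(env: Record<string, string | undefined>): ServiceConfig {
  const port = Number(env.PORT ?? "4005");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  const sidecarUrl = env.PYTHON_SIDECAR_URL ?? "http://localhost:4006";
  if (!/^https?:\/\//.test(sidecarUrl)) {
    throw new Error(`Invalid PYTHON_SIDECAR_URL: ${sidecarUrl}`);
  }

  return {
    PORT: port,
    HOST: env.HOST ?? "0.0.0.0", // assumed default, not stated in the roadmap
    PYTHON_SIDECAR_URL: sidecarUrl,
    DEFAULT_MODEL_ID: env.DEFAULT_MODEL_ID ?? "gemini-2.5-flash",
  };
}
```

In a Node service this would typically be called once at startup as `loadConfig(process.env)`, so that a bad value fails fast before the server binds to its port.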
### Python sidecar scaffold
- [ ] **0.8** Create `services/extraction-service/python/` directory:
- [x] **0.8** Create `services/extraction-service/python/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
python/
src/
@ -102,7 +96,7 @@ A shared extraction microservice that uses Google's LangExtract library to extra
requirements.txt # langextract, fastapi, uvicorn, pydantic
Dockerfile # Python 3.12 slim
```
- [ ] **0.9** Create `python/requirements.txt`:
- [x] **0.9** Create `python/requirements.txt`: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
langextract>=0.3.0
fastapi>=0.115.0
@ -111,16 +105,13 @@ A shared extraction microservice that uses Google's LangExtract library to extra
pydantic-settings>=2.7.0
structlog>=24.4.0
```
- [ ] **0.10** Create `python/src/app.py` — FastAPI app with endpoints:
- `POST /extract` — single document extraction
- `POST /extract/batch` — batch extraction
- `GET /health` — sidecar health check
- [ ] **0.11** Create `python/src/extractor.py` — wrapper around `lx.extract()` with configurable model provider
- [ ] **0.12** Verify: Python sidecar starts and `/health` returns OK
- [x] **0.10** Create `python/src/app.py` — FastAPI app with POST /extract, POST /extract/batch, GET /health [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.11** Create `python/src/extractor.py` — wrapper around `lx.extract()` with mock fallback [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [ ] **0.12** Verify: Python sidecar starts and `/health` returns OK (requires `pip install` — deferred to Phase 1)
### Package scaffold (`@bytelyst/extraction`)
- [ ] **0.13** Create `packages/extraction/` directory:
- [x] **0.13** Create `packages/extraction/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
packages/extraction/
src/
@ -130,21 +121,16 @@ A shared extraction microservice that uses Google's LangExtract library to extra
package.json
tsconfig.json
```
- [ ] **0.14** Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep
- [ ] **0.15** Define TypeScript types matching the extraction API:
- `ExtractionTask` — prompt description + examples + model config
- `ExtractionExample` — text + extractions (class, text, attributes)
- `ExtractionResult` — extracted entities with source grounding
- `ExtractionRequest` — task + input text/URL
- `ExtractionResponse` — results + metadata (model, duration, token count)
- [ ] **0.16** Create `createExtractionClient()` factory using `createApiClient()` pattern
- [ ] **0.17** Verify: `pnpm build` passes for the new package
- [x] **0.14** Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.15** Define TypeScript types (ExtractionTask, ExtractionExample, ExtractionEntity, ExtractRequest, ExtractResponse, BatchExtractRequest, BatchExtractResponse) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.16** Create `createExtractionClient()` factory using `createApiClient()` pattern [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.17** Verify: `pnpm build` passes for the new package [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
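
To make the type list in item 0.15 more concrete, here is one possible shape for the request-side types, following the descriptions in the roadmap (a task is a prompt description plus examples plus model config; an example is text plus extractions with class, text, and attributes). All field names and the `buildExtractRequest` helper are hypothetical and may differ from what `@bytelyst/extraction` actually exports.

```typescript
// Hypothetical field names; only the type names come from the roadmap.
interface ExtractionEntity {
  extractionClass: string;
  text: string;
  attributes?: Record<string, string>;
}

interface ExtractionExample {
  text: string;
  extractions: ExtractionEntity[];
}

interface ExtractionTask {
  promptDescription: string;
  examples: ExtractionExample[];
  modelId?: string; // would fall back to the service's DEFAULT_MODEL_ID when omitted
}

interface ExtractRequest {
  task: ExtractionTask;
  text: string;
}

// Small helper (an assumption for this sketch) that rejects obviously unusable input
// before a request is sent to the service.
function buildExtractRequest(task: ExtractionTask, text: string): ExtractRequest {
  if (text.trim().length === 0) {
    throw new Error("Input text must not be empty");
  }
  if (task.examples.length === 0) {
    throw new Error("At least one example is required");
  }
  return { task, text };
}

const medicationTask: ExtractionTask = {
  promptDescription: "Extract medication names and dosages.",
  examples: [
    {
      text: "Take 200 mg ibuprofen twice a day.",
      extractions: [
        { extractionClass: "medication", text: "ibuprofen", attributes: { dosage: "200 mg" } },
      ],
    },
  ],
};

const request = buildExtractRequest(medicationTask, "Aspirin 100 mg daily.");
```

Keeping the request types in the shared package lets the Fastify service and its callers compile against the same definitions.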
### Workspace wiring
- [ ] **0.18** Add `extraction-service` and `extraction` to `pnpm-workspace.yaml` (already covered by `packages/*` + `services/*` globs — verify)
- [ ] **0.19** Run `pnpm install` from repo root — verify workspace resolution
- [ ] **0.20** Verify: `pnpm build` and `pnpm typecheck` pass across entire repo
- [x] **0.18** Verify `extraction-service` and `extraction` covered by `packages/*` + `services/*` globs in `pnpm-workspace.yaml` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.19** Run `pnpm install` from repo root — workspace resolution verified [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.20** Verify: `pnpm build` passes for both extraction-service and @bytelyst/extraction [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
---