learning_ai_common_plat/docs/EXTRACTION_SERVICE_ROADMAP.md
saravanakumardb1 b035908a5a docs(extraction): update roadmap Phase 3-4 checkboxes with commit links
- Phase 3: LysnrAI admin extraction-client (944609a), MindLyst web extraction-client (b545244)
- Phase 4: docker-compose (bdd9bb1), .env.example updates (bdd9bb1, 944609a)
- Deferred items clearly marked for Phase 5
2026-02-14 13:41:56 -08:00

# Extraction Service — Roadmap & Task Checklist
> **Service:** `@lysnrai/extraction-service` (port 4005)
> **Package:** `@bytelyst/extraction` (shared types + client)
> **Core dependency:** [google/langextract](https://github.com/google/langextract) (Python)
>
> **Companion docs:** [ECOSYSTEM_ARCHITECTURE.md](./ECOSYSTEM_ARCHITECTURE.md) · [ROADMAP.md](./ROADMAP.md)
---
## Overview
A shared extraction microservice that uses Google's LangExtract library to extract structured information from unstructured text. Both LysnrAI and MindLyst consume this service for their respective extraction needs.
**Architecture:** Fastify (routing, auth, validation, request tracing) + Python sidecar (LangExtract). The Fastify layer keeps the service consistent with the other four platform services (growth, billing, platform, tracker). The Python process handles the actual LLM-powered extraction.
```
┌──────────────────────────────────────────────────────────┐
│                    extraction-service                    │
│                       (port 4005)                        │
│                                                          │
│  ┌─────────────────────┐   ┌──────────────────────────┐  │
│  │ Fastify (TS)        │   │ Python Sidecar           │  │
│  │                     │   │                          │  │
│  │ - Auth middleware   │──►│ - LangExtract wrapper    │  │
│  │ - Zod validation    │◄──│ - Task registry          │  │
│  │ - x-request-id      │   │ - Model provider config  │  │
│  │ - Rate limiting     │   │ - Result caching         │  │
│  │ - /health           │   │                          │  │
│  └─────────────────────┘   └──────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
             ▲                            ▲
             │                            │
          REST API             FastAPI (internal :4006)
         (external)              or subprocess stdio
```
### Consumers
| Product | Use Case | Entry Point |
| ----------------------------- | ------------------------------------------------------------------------ | -------------------------------------------------- |
| **LysnrAI** — Desktop/Backend | Post-transcription extraction (action items, decisions, dates, people) | `backend/src/clients/extraction_client.py` |
| **LysnrAI** — Admin Dashboard | Transcript analytics, entity review | `admin-dashboard-web/src/lib/extraction-client.ts` |
| **MindLyst** — KMP/Web | Triage pipeline (brain routing, entity extraction, topic classification) | `mindlyst-native/web/src/pages/api/triage.ts` |
| **MindLyst** — Web Dashboard | Brain insight generation, reflection enrichment | Direct API calls via `@bytelyst/api-client` |
---
## Phase 0 — Foundation & Scaffolding
> **Goal:** Set up the service skeleton, Python environment, and build pipeline.
### Service scaffold (Fastify)
- [x] **0.1** Create `services/extraction-service/` directory structure: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
services/extraction-service/
  src/
    lib/
      config.ts            # Zod config schema (PORT, HOST, CORS, PYTHON_SIDECAR_URL, etc.)
      errors.ts            # Re-export from @bytelyst/errors
      cosmos.ts            # Re-export from @bytelyst/cosmos (for task registry persistence)
      product-config.ts    # Re-export from @bytelyst/config
      python-bridge.ts     # HTTP client to Python sidecar
    modules/
      extract/
        types.ts           # Zod schemas: ExtractionTask, ExtractionExample, ExtractionResult
        routes.ts          # POST /api/extract, POST /api/extract/batch, GET /api/tasks
      tasks/
        types.ts           # Predefined task definitions (triage, transcript, etc.)
        repository.ts      # Cosmos CRUD for custom task definitions
        routes.ts          # CRUD endpoints for task management
    server.ts              # createServiceApp + route registration
  package.json
  tsconfig.json
  Dockerfile
```
- [x] **0.2** Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.3** Create `tsconfig.json` (self-contained, matching tracker-service pattern) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.4** Create `src/lib/config.ts` with Zod schema (`PORT`, `HOST`, `NODE_ENV`, `CORS_ORIGIN`, `SERVICE_NAME`, `PYTHON_SIDECAR_URL`, `DEFAULT_MODEL_ID`, `COSMOS_*`, `JWT_SECRET`, `DEFAULT_PRODUCT_ID`) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.5** Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.6** Add `.env.example` with all required env vars [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.7** Verify: `pnpm build` passes for the new service [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
### Python sidecar scaffold
- [x] **0.8** Create `services/extraction-service/python/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
python/
  src/
    __init__.py
    app.py               # FastAPI app (internal, port 4006)
    extractor.py         # LangExtract wrapper
    task_registry.py     # Built-in task definitions
    models.py            # Pydantic models matching TS Zod schemas
  requirements.txt       # langextract, fastapi, uvicorn, pydantic
  Dockerfile             # Python 3.12 slim
```
- [x] **0.9** Create `python/requirements.txt`: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
langextract>=0.3.0
fastapi>=0.115.0
uvicorn>=0.34.0
pydantic>=2.10.0
pydantic-settings>=2.7.0
structlog>=24.4.0
```
- [x] **0.10** Create `python/src/app.py` — FastAPI app with POST /extract, POST /extract/batch, GET /health [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.11** Create `python/src/extractor.py` — wrapper around `lx.extract()` with mock fallback [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [ ] **0.12** Verify: Python sidecar starts and `/health` returns OK (requires `pip install` — deferred to Phase 1)
### Package scaffold (`@bytelyst/extraction`)
- [x] **0.13** Create `packages/extraction/` directory: [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
```
packages/extraction/
  src/
    index.ts       # Public API
    types.ts       # Shared TypeScript types
    client.ts      # createExtractionClient() factory
  package.json
  tsconfig.json
```
- [x] **0.14** Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.15** Define TypeScript types (ExtractionTask, ExtractionExample, ExtractionEntity, ExtractRequest, ExtractResponse, BatchExtractRequest, BatchExtractResponse) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.16** Create `createExtractionClient()` factory using `createApiClient()` pattern [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.17** Verify: `pnpm build` passes for the new package [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
### Workspace wiring
- [x] **0.18** Verify `extraction-service` and `extraction` covered by `packages/*` + `services/*` globs in `pnpm-workspace.yaml` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.19** Run `pnpm install` from repo root — workspace resolution verified [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **0.20** Verify: `pnpm build` passes for both extraction-service and @bytelyst/extraction [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
---
## Phase 1 — Core Extraction API
> **Goal:** Working extraction endpoint that accepts text + task definition and returns structured results via LangExtract.
### Python extractor implementation
- [x] **1.1** Implement `extractor.py` — LangExtract wrapper with mock fallback, configurable model_id, extraction_passes, max_workers, max_char_buffer [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.2** Model provider configuration — Gemini default via DEFAULT_MODEL_ID env var, model_id passthrough to lx.extract() [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.3** structlog logging in extractor.py and app.py (extraction_complete, extraction_failed, extract_request) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.4** Request timeout in python-bridge.ts (DEFAULT_TIMEOUT_MS = 120s, configurable per-call) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
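
Items 1.1 and 1.2 describe a wrapper that calls `lx.extract()` with a configurable model and falls back to a mock when LangExtract cannot be used. The sketch below shows one way that shape could look. It is not the contents of the real `extractor.py`: `run_extraction`, `mock_extract`, the `use_mock` flag, and the returned dict layout are hypothetical. The keyword arguments passed to `lx.extract()` follow the LangExtract README, and the defaults mirror the env var table in this document.

```python
import os

DEFAULT_MODEL_ID = os.environ.get("DEFAULT_MODEL_ID", "gemini-2.5-flash")


def mock_extract(text, classes):
    """Hypothetical mock: one placeholder entity per class, no LLM call."""
    snippet = text[:40]
    return [
        {"extraction_class": cls, "extraction_text": snippet, "mock": True}
        for cls in classes
    ]


def run_extraction(text, prompt, examples, classes, model_id=None, use_mock=None,
                   extraction_passes=1, max_workers=10, max_char_buffer=2000):
    """Call LangExtract when it is importable, otherwise return mock entities."""
    model_id = model_id or DEFAULT_MODEL_ID
    if use_mock is None:
        try:
            import langextract  # noqa: F401  (availability probe only)
            use_mock = False
        except ImportError:
            use_mock = True
    if use_mock:
        return {"model_id": model_id, "mock": True,
                "entities": mock_extract(text, classes)}

    import langextract as lx

    # `examples` must be LangExtract example objects on this path.
    result = lx.extract(
        text_or_documents=text,
        prompt_description=prompt,
        examples=examples,
        model_id=model_id,
        extraction_passes=extraction_passes,
        max_workers=max_workers,
        max_char_buffer=max_char_buffer,
    )
    entities = [
        {"extraction_class": e.extraction_class, "extraction_text": e.extraction_text}
        for e in result.extractions
    ]
    return {"model_id": model_id, "mock": False, "entities": entities}
```

Passing `use_mock=True` keeps local development and tests independent of an API key; leaving it as `None` lets the import probe decide.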
### Fastify routes
- [x] **1.5** Implement `src/modules/extract/types.ts` — ExtractRequestSchema, ExtractResponseSchema, BatchExtractRequestSchema (Zod) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.6** Implement `src/modules/extract/routes.ts` — POST /extract, POST /extract/batch, GET /extract/models, GET /extract/sidecar-health [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.7** Implement `src/lib/python-bridge.ts` — sidecarExtract, sidecarExtractBatch, sidecarHealth, waitForSidecar with x-request-id forwarding [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **1.8** Rate limiting on extract routes (30 req/min per IP via @fastify/rate-limit) [`0a87d19`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0a87d19)
### Tests
- [x] **1.9** Unit tests for Zod schemas — 13 extract tests + 8 task tests (21 total) [`0a87d19`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0a87d19)
- [ ] **1.10** Integration tests for extract routes (mock Python sidecar responses) — deferred to Phase 3
- [ ] **1.11** Python unit tests for `extractor.py` — deferred (requires pip install in CI)
- [x] **1.12** Verify: `pnpm test` passes (21 tests) [`0a87d19`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0a87d19)
---
## Phase 2 — Predefined Task Library
> **Goal:** Ship a curated set of extraction task definitions that LysnrAI and MindLyst can use out-of-the-box.
### Task definitions
- [x] **2.1** Define `transcript-extraction` task (6 classes, few-shot examples) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.2** Define `triage` task (MindLyst) — 6 classes incl. brain_signal with brain/confidence attributes [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.3** Define `memory-insight` task (MindLyst) — 4 classes [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.4** Define `reflection-enrichment` task (MindLyst) — 4 classes [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.5** Define `bug-report-extraction` task (Tracker) — 5 classes [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
### Task registry (Cosmos DB)
- [x] **2.6** Cosmos container `extraction_tasks` (partition `/productId`) — created on first access via repository [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.7** Implement `src/modules/tasks/repository.ts` — listTasks, getTask, createTask, updateTask, deleteTask, upsertTask [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.8** Implement `src/modules/tasks/routes.ts` — GET/POST/PUT/DELETE /tasks [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.9** Seed built-in tasks on startup via `seed.ts` (idempotent upsert, 5 tasks) [`6a49823`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/6a49823)
- [x] **2.10** `productId` on all task documents (DEFAULT_PRODUCT_ID from env) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
### Python task registry
- [x] **2.11** Implement `task_registry.py` — BUILTIN_TASKS with full definitions inline [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **2.12** Task definitions stored inline in `task_registry.py` (no separate JSON needed) [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [ ] **2.13** Task validation: verify examples follow LangExtract best practices — deferred to Phase 5
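
Item 2.13 leaves open what "LangExtract best practices" means in code. One plausible reading, based on the guidance in the LangExtract README that example extractions use exact text and appear in order, is a check like the one sketched here. `TaskDefinition`, `Example`, and `validate_task` are hypothetical names, and the demo task is invented for illustration; none of this reflects the actual `BUILTIN_TASKS` layout.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    text: str
    extractions: list  # list of (class_name, span) tuples


@dataclass
class TaskDefinition:
    task_id: str
    classes: list
    examples: list = field(default_factory=list)


def validate_task(task):
    """Return a list of human-readable problems; an empty list means the task passed."""
    problems = []
    if not task.examples:
        problems.append(f"{task.task_id}: no few-shot examples")
    for index, example in enumerate(task.examples):
        cursor = 0  # spans must appear at or after this offset
        for cls, span in example.extractions:
            where = f"{task.task_id} example {index} ({cls!r}, {span!r})"
            if cls not in task.classes:
                problems.append(f"{where}: class not declared on the task")
            position = example.text.find(span, cursor)
            if position >= 0:
                cursor = position + len(span)
            elif span in example.text:
                problems.append(f"{where}: out of order or overlapping")
            else:
                problems.append(f"{where}: span is not verbatim text")
    return problems
```

Running this over every built-in task at sidecar startup would surface a bad few-shot example before it reaches the model.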
### Tests
- [x] **2.14** Tests for task schemas (8 tests in types.test.ts) [`0a87d19`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0a87d19)
- [x] **2.15** Tests for task seeding (7 tests in seed.test.ts) [`6a49823`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/6a49823)
- [x] **2.16** Verify: all 28 tests pass [`6a49823`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/6a49823)
---
## Phase 3 — Consumer Integration
> **Goal:** Wire LysnrAI and MindLyst to call the extraction service.
### `@bytelyst/extraction` package finalization
- [x] **3.1** `createExtractionClient()` with extract(), extractBatch(), listTasks(), getTask() [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **3.2** Export all types from `src/index.ts` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
- [x] **3.3** `pnpm build` passes for `@bytelyst/extraction` [`c292bb5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c292bb5)
### LysnrAI integration
- [x] **3.4** Add `@bytelyst/extraction` to `admin-dashboard-web/package.json` (via `file:` ref) [`944609a`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/944609a)
- [x] **3.5** Create `admin-dashboard-web/src/lib/extraction-client.ts` — extractText, extractTranscript, extractBatch, listTasks, getTask, getSidecarHealth [`944609a`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/944609a)
- [ ] **3.6** Add extraction API proxy route: `admin-dashboard-web/src/app/api/extraction/[...path]/route.ts` — deferred (client calls service directly for now)
- [ ] **3.7** Python extraction client in `backend/src/clients/extraction_client.py` — deferred to Phase 5
- [ ] **3.8** Post-transcription extraction endpoint `POST /api/transcripts/{id}/extract` — deferred to Phase 5
- [ ] **3.9** Extraction results UI in admin dashboard — deferred to Phase 5
### MindLyst integration
- [x] **3.10** MindLyst web extraction client (standalone, no @bytelyst deps needed) [`b545244`](https://github.com/saravanakumardb1/learning_multimodal_memory_agents/commit/b545244)
- [x] **3.11** Create `mindlyst-native/web/src/lib/extraction-client.ts` — triageExtract, memoryInsightExtract, reflectionExtract, isExtractionAvailable [`b545244`](https://github.com/saravanakumardb1/learning_multimodal_memory_agents/commit/b545244)
- [ ] **3.12** Create API route `src/pages/api/extract.ts` — deferred (client ready, route integration next)
- [ ] **3.13** Wire triage flow to use extraction results — deferred to Phase 5
- [ ] **3.14** Wire brain insights to `memory-insight` task — deferred to Phase 5
- [ ] **3.15** Wire reflections to `reflection-enrichment` task — deferred to Phase 5
### Tests
- [ ] **3.16** Integration tests for LysnrAI extraction — deferred to Phase 5
- [ ] **3.17** Integration tests for MindLyst triage-via-extraction — deferred to Phase 5
- [ ] **3.18** Verify `npx tsc --noEmit` across all dashboards — deferred to Phase 5
---
## Phase 4 — Docker & DevOps
> **Goal:** Containerize, add to docker-compose, update run scripts.
### Dockerfile
- [ ] **4.1** Create multi-stage `Dockerfile` for extraction-service — deferred (hybrid TS+Python needs two-container approach)
- [ ] **4.2** Create `supervisord.conf` — deferred (see 4.1)
- [ ] **4.3** Verify: `docker build` succeeds — deferred
### Docker Compose
- [x] **4.4** Add `extraction-service` to `docker-compose.yml` (port 4005, Traefik, Loki, healthcheck) [`bdd9bb1`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/bdd9bb1)
- [ ] **4.5** Add to LysnrAI `docker-compose.yml` — deferred
### Run scripts
- [ ] **4.6** Add extraction-service to `run-local-all-services.sh` — deferred
- [ ] **4.7** Add extraction-service to `.windsurf/workflows/start-all-services.md` — deferred
- [x] **4.8** Add `EXTRACTION_SERVICE_URL` to LysnrAI `.env.example` [`944609a`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/944609a)
- [x] **4.9** Add extraction service env vars to common platform `.env.example` [`bdd9bb1`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/bdd9bb1)
### CI
- [ ] **4.10** Create `.github/workflows/ci-extraction-service.yml` — deferred
- [ ] **4.11** Verify: CI workflow passes — deferred
---
## Phase 5 — Production Hardening
> **Goal:** Caching, cost controls, observability, and error handling. (Per-IP rate limiting already shipped in 1.8.)
### Caching
- [ ] **5.1** Add result caching in Python sidecar:
  - Cache key: hash(task_id + input_text + model_id)
  - TTL: configurable (default 24h)
  - Storage: in-memory LRU (dev) or Redis (prod)
- [ ] **5.2** Add cache hit/miss headers to Fastify response (`X-Extraction-Cache: HIT/MISS`)
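
The caching plan in 5.1 (a key hashed from task, input, and model; a TTL; an in-memory LRU for development) can be sketched with the standard library alone. `cache_key` and `TTLCache` are hypothetical names, SHA-256 is an assumed choice of hash, and the length prefixes are an added detail so that `("ab", "c")` and `("a", "bc")` never collide. A Redis-backed variant for production is out of scope for this sketch.

```python
import hashlib
import time
from collections import OrderedDict


def cache_key(task_id, input_text, model_id):
    """Stable key over task, model, and input; each part is length-prefixed."""
    digest = hashlib.sha256()
    for part in (task_id, model_id, input_text):
        data = part.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


class TTLCache:
    """In-memory LRU with per-entry expiry (dev-only stand-in for Redis)."""

    def __init__(self, max_entries=1024, ttl_seconds=86400, clock=time.monotonic):
        self._items = OrderedDict()  # key -> (expires_at, value)
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict least recently used
```

A `get` that returns `None` corresponds to the `MISS` value of the `X-Extraction-Cache` header in 5.2, and any other result to `HIT`.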
### Cost controls
- [ ] **5.3** Add per-user daily extraction quota (configurable per plan tier):
  - Free: 10 extractions/day
  - Pro: 100 extractions/day
  - Enterprise: unlimited
- [ ] **5.4** Track usage in Cosmos `extraction_usage` container (partition: `/userId`)
- [ ] **5.5** Return `429 Too Many Requests` with quota info when exceeded
- [ ] **5.6** Add usage reporting endpoint: `GET /api/extract/usage` (admin)
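
The quota rules in 5.3 to 5.5 amount to a per-user, per-day counter checked before each extraction. The sketch below keeps the counter in a dict as a stand-in for the planned Cosmos `extraction_usage` container. `QuotaTracker`, `check_and_consume`, and the response fields are hypothetical; only the tier limits and the 429 status come from the checklist.

```python
from datetime import date

# Daily limits per plan tier from item 5.3; None means unlimited.
DAILY_LIMITS = {"free": 10, "pro": 100, "enterprise": None}


class QuotaTracker:
    """In-memory usage counter keyed by (user_id, day)."""

    def __init__(self):
        self._counts = {}

    def check_and_consume(self, user_id, plan, today):
        limit = DAILY_LIMITS[plan]
        key = (user_id, today.isoformat())
        used = self._counts.get(key, 0)
        if limit is not None and used >= limit:
            # Quota exhausted: report it without consuming anything.
            return {"status": 429, "error": "Too Many Requests",
                    "limit": limit, "used": used, "remaining": 0}
        self._counts[key] = used + 1
        remaining = None if limit is None else limit - used - 1
        return {"status": 200, "limit": limit,
                "used": used + 1, "remaining": remaining}
```

Because the key includes the day, counters reset implicitly at the date boundary; which time zone defines that boundary is a decision the checklist does not make.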
### Observability
- [ ] **5.7** Add structured logging for every extraction:
  - Request: task_id, input_length, model_id, user_id, product_id
  - Response: entity_count, duration_ms, token_count, cache_hit
- [ ] **5.8** Add Prometheus metrics (via `fastify-metrics`):
  - `extraction_requests_total` (labels: task_id, model_id, product_id, status)
  - `extraction_duration_seconds` (histogram)
  - `extraction_entities_extracted` (histogram)
  - `extraction_cache_hit_total`
- [ ] **5.9** Add Grafana dashboard for extraction service (in `services/monitoring/grafana/dashboards/`)
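
The field lists in 5.7 translate directly into two small record builders. The sidecar uses structlog, so treat this stdlib-only version as an illustration of the fields rather than the logging setup itself; `request_fields`, `response_fields`, and `log_line` are hypothetical helpers.

```python
import json
import time


def request_fields(task_id, input_text, model_id, user_id, product_id):
    """Request-side fields from 5.7; records the input length, never the text."""
    return {
        "task_id": task_id,
        "input_length": len(input_text),
        "model_id": model_id,
        "user_id": user_id,
        "product_id": product_id,
    }


def response_fields(entities, started_at, token_count, cache_hit, clock=time.monotonic):
    """Response-side fields from 5.7; duration is measured from `started_at`."""
    return {
        "entity_count": len(entities),
        "duration_ms": round((clock() - started_at) * 1000),
        "token_count": token_count,
        "cache_hit": cache_hit,
    }


def log_line(event, **fields):
    """Serialize one event as a single JSON line with stable key order."""
    return json.dumps({"event": event, **fields}, sort_keys=True)
```

Logging `input_length` instead of the input keeps transcript and note content out of the log stream.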
### Error handling
- [ ] **5.10** Map LangExtract errors to `@bytelyst/errors`:
  - Model timeout → `408 Request Timeout`
  - Rate limit (upstream LLM) → `429 Too Many Requests` with retry-after
  - Invalid task definition → `400 Bad Request`
  - Model unavailable → `503 Service Unavailable`
- [ ] **5.11** Add circuit breaker for Python sidecar (fail fast if sidecar is down)
- [ ] **5.12** Add graceful degradation: return partial results if some chunks fail
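
Items 5.10 and 5.11 describe a status mapping and a fail-fast guard in front of the sidecar. The sketch below shows both ideas in Python for consistency with the sidecar code, although in the real service they would live in the Fastify layer and use `@bytelyst/errors`. The exception classes, `to_http_error`, `CircuitBreaker`, the 500 default for unmapped errors, and the threshold values are all assumptions.

```python
import time


class ModelTimeout(Exception):
    pass


class UpstreamRateLimited(Exception):
    def __init__(self, message, retry_after_seconds):
        super().__init__(message)
        self.retry_after_seconds = retry_after_seconds


class InvalidTaskDefinition(Exception):
    pass


class ModelUnavailable(Exception):
    pass


# Status codes from item 5.10.
STATUS_BY_ERROR = {
    ModelTimeout: 408,
    UpstreamRateLimited: 429,
    InvalidTaskDefinition: 400,
    ModelUnavailable: 503,
}


def to_http_error(exc):
    """Map an exception to a response body; unmapped errors become 500 (assumption)."""
    status = next((code for cls, code in STATUS_BY_ERROR.items()
                   if isinstance(exc, cls)), 500)
    body = {"status": status, "message": str(exc)}
    if isinstance(exc, UpstreamRateLimited):
        body["retry_after"] = exc.retry_after_seconds
    return body


class CircuitBreaker:
    """Opens after N consecutive failures, allows a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_after = reset_after_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allow(self):
        if self._opened_at is None:
            return True  # closed: calls pass through
        # Open: fail fast until the cool-down has elapsed (then half-open).
        return self._clock() - self._opened_at >= self._reset_after

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()
```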
---
## Phase 6 — Advanced Features (Future)
> **Goal:** Power-user features, visualization, and batch processing.
### Visualization
- [ ] **6.1** Expose LangExtract's HTML visualization:
  - `GET /api/extract/:requestId/visualization` — returns interactive HTML
  - Embed in admin dashboard for extraction quality review
- [ ] **6.2** Store visualization artifacts in Azure Blob Storage (`extractions` container)
### Batch & async processing
- [ ] **6.3** Add async extraction endpoint:
  - `POST /api/extract/async` — returns job ID immediately
  - `GET /api/extract/jobs/:id` — poll for status + results
  - Webhook callback when complete
- [ ] **6.4** Add Vertex AI batch processing support (for high-volume MindLyst triage)
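
The submit-then-poll flow in 6.3 can be prototyped with a thread pool before committing to a queue. In this sketch `JobStore` is a hypothetical in-process job table: `submit` plays the role of `POST /api/extract/async` and `status` the role of `GET /api/extract/jobs/:id`. The status strings are assumptions, and the webhook callback is left out.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """In-process job table; jobs are lost on restart, so this is prototype-only."""

    def __init__(self, extract_fn, max_workers=2):
        self._extract_fn = extract_fn  # callable(text, task_id) -> result
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, text, task_id):
        """Queue the extraction and return a job ID immediately."""
        job_id = uuid.uuid4().hex
        self._futures[job_id] = self._executor.submit(self._extract_fn, text, task_id)
        return job_id

    def status(self, job_id):
        """Report the job state, with the result or error once finished."""
        future = self._futures.get(job_id)
        if future is None:
            return {"status": "not_found"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}

    def shutdown(self):
        """Wait for in-flight jobs, then release the worker threads."""
        self._executor.shutdown(wait=True)
```

A durable version would persist job state outside the process so that a restart does not drop pending work.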
### Custom model support
- [ ] **6.5** Add Ollama provider for local/air-gapped deployments
- [ ] **6.6** Add model benchmarking endpoint: run same task across models, compare quality + cost
### Multi-language extraction
- [ ] **6.7** Test and validate extraction across languages (LangExtract supports multi-language via LLM)
- [ ] **6.8** Add language detection to extraction pipeline (auto-detect input language)
---
## Env Vars Summary
| Variable | Service | Default | Description |
| ------------------------ | ------------------ | ----------------------- | ----------------------------------- |
| `PORT` | extraction-service | `4005` | Fastify listen port |
| `HOST` | extraction-service | `0.0.0.0` | Fastify listen host |
| `CORS_ORIGIN` | extraction-service | `*` | Allowed origins |
| `PYTHON_SIDECAR_URL` | extraction-service | `http://localhost:4006` | Python sidecar URL |
| `DEFAULT_MODEL_ID` | extraction-service | `gemini-2.5-flash` | Default LLM model |
| `GEMINI_API_KEY` | python sidecar | — | Google Gemini API key |
| `AZURE_OPENAI_API_KEY` | python sidecar | — | Azure OpenAI key (alternative) |
| `AZURE_OPENAI_ENDPOINT` | python sidecar | — | Azure OpenAI endpoint (alternative) |
| `MAX_WORKERS` | python sidecar | `10` | Parallel extraction workers |
| `MAX_CHAR_BUFFER` | python sidecar | `2000` | Chunk size for long docs |
| `EXTRACTION_CACHE_TTL` | python sidecar | `86400` | Cache TTL in seconds |
| `COSMOS_ENDPOINT` | extraction-service | — | Azure Cosmos DB endpoint |
| `COSMOS_KEY` | extraction-service | — | Azure Cosmos DB key |
| `COSMOS_DATABASE` | extraction-service | `lysnrai` | Database name |
| `JWT_SECRET` | extraction-service | — | JWT validation secret |
| `EXTRACTION_SERVICE_URL` | consumers | `http://localhost:4005` | Used by dashboards/backends |
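
For the sidecar-side variables in the table above, a loader along the following lines would apply the documented defaults. The sidecar's `requirements.txt` lists `pydantic-settings`, which is the more likely tool in practice; this stdlib version is only a sketch, and `SidecarSettings` and `load_sidecar_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SidecarSettings:
    default_model_id: str
    gemini_api_key: Optional[str]
    max_workers: int
    max_char_buffer: int
    extraction_cache_ttl: int


def load_sidecar_settings(env=None):
    """Read sidecar settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    return SidecarSettings(
        default_model_id=env.get("DEFAULT_MODEL_ID", "gemini-2.5-flash"),
        gemini_api_key=env.get("GEMINI_API_KEY"),  # no default: stays None if unset
        max_workers=int(env.get("MAX_WORKERS", "10")),
        max_char_buffer=int(env.get("MAX_CHAR_BUFFER", "2000")),
        extraction_cache_ttl=int(env.get("EXTRACTION_CACHE_TTL", "86400")),
    )
```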
---
## Port Allocation
| Service | Port |
| -------------------------------------------- | -------- |
| growth-service | 4001 |
| billing-service | 4002 |
| platform-service | 4003 |
| tracker-service | 4004 |
| **extraction-service** | **4005** |
| extraction-service python sidecar (internal) | 4006 |
---
## Dependency Graph
```
@bytelyst/extraction (package)
└── @bytelyst/api-client (peer dep)
@lysnrai/extraction-service (service)
├── @bytelyst/fastify-core
├── @bytelyst/auth
├── @bytelyst/config
├── @bytelyst/cosmos
├── @bytelyst/errors
├── fastify, zod, jose (direct deps)
└── python sidecar
└── langextract, fastapi, uvicorn, structlog
```
---
## Estimated Effort
| Phase | Effort | Dependencies |
| ------------------------------ | -------- | ------------ |
| Phase 0 — Foundation           | 2-3 days | None         |
| Phase 1 — Core API             | 2-3 days | Phase 0      |
| Phase 2 — Task Library         | 2 days   | Phase 1      |
| Phase 3 — Consumer Integration | 3-4 days | Phase 2      |
| Phase 4 — Docker & DevOps      | 1-2 days | Phase 1      |
| Phase 5 — Production Hardening | 2-3 days | Phase 3      |
| Phase 6 — Advanced (future)    | Ongoing  | Phase 5      |

**Total MVP (Phases 0-4): ~10-14 days**
---
## Rollback Strategy
- The extraction-service is **additive** — no existing code is modified until Phase 3
- Phase 3 consumer integration uses new endpoints/routes — existing triage/transcript flows remain untouched
- If extraction-service is down, consumers fall back to their existing behavior (MindLyst mock triage, LysnrAI raw transcripts)
- The `@bytelyst/extraction` package is optional — dashboards only import it for new extraction features
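
The fallback behavior described above (consumers keep working when the extraction service is down) could look roughly like this on the consumer side. The `/api/extract` path and the default URL come from this document; `extract_or_fallback`, the request body field names, the five-second timeout, and the injectable `opener` are assumptions made for the sketch.

```python
import json
import urllib.request


def extract_or_fallback(text, task_id, fallback,
                        base_url="http://localhost:4005",
                        timeout_seconds=5.0,
                        opener=urllib.request.urlopen):
    """Try the extraction service; on any network or parse error, use `fallback`."""
    # Body field names are illustrative, not the service's confirmed schema.
    payload = json.dumps({"text": text, "taskId": task_id}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/extract",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=timeout_seconds) as response:
            result = json.loads(response.read())
        return {"source": "extraction-service", "result": result}
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, refused connections, and timeouts;
        # ValueError covers a malformed JSON body.
        return {"source": "fallback", "result": fallback(text)}
```

For MindLyst the `fallback` callable would be the existing mock triage, and for LysnrAI it would simply return the raw transcript.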