saravanakumardb1 6c71255d19 docs: update documentation

2026-02-14 13:22:25 -08:00

Extraction Service — Roadmap & Task Checklist

Service: @lysnrai/extraction-service (port 4005)
Package: @bytelyst/extraction (shared types + client)
Core dependency: google/langextract (Python)

Companion docs: ECOSYSTEM_ARCHITECTURE.md · ROADMAP.md

Overview

A shared extraction microservice that uses Google's LangExtract library to extract structured information from unstructured text. Both LysnrAI and MindLyst consume this service for their respective extraction needs.

Architecture: Fastify (routing, auth, validation, request tracing) + Python sidecar (LangExtract). The Fastify layer keeps the service consistent with the four existing services (growth, billing, platform, tracker); the Python process handles the actual LLM-powered extraction.

┌──────────────────────────────────────────────────────────┐
│                   extraction-service                      │
│                      (port 4005)                          │
│                                                           │
│  ┌─────────────────────┐    ┌──────────────────────────┐ │
│  │   Fastify (TS)      │    │   Python Sidecar         │ │
│  │                     │    │                          │ │
│  │  - Auth middleware  │───►│  - LangExtract wrapper   │ │
│  │  - Zod validation   │◄───│  - Task registry         │ │
│  │  - x-request-id     │    │  - Model provider config │ │
│  │  - Rate limiting    │    │  - Result caching        │ │
│  │  - /health          │    │                          │ │
│  └─────────────────────┘    └──────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
        ▲                              ▲
        │                              │
   REST API                     FastAPI (internal :4006)
   (external)                   or subprocess stdio

Consumers

Product	Use Case	Entry Point
LysnrAI — Desktop/Backend	Post-transcription extraction (action items, decisions, dates, people)	`backend/src/clients/extraction_client.py`
LysnrAI — Admin Dashboard	Transcript analytics, entity review	`admin-dashboard-web/src/lib/extraction-client.ts`
MindLyst — KMP/Web	Triage pipeline (brain routing, entity extraction, topic classification)	`mindlyst-native/web/src/pages/api/triage.ts`
MindLyst — Web Dashboard	Brain insight generation, reflection enrichment	Direct API calls via `@bytelyst/api-client`

Phase 0 — Foundation & Scaffolding

Goal: Set up the service skeleton, Python environment, and build pipeline.

Service scaffold (Fastify)

0.1 Create services/extraction-service/ directory structure:

services/extraction-service/
  src/
    lib/
      config.ts            # Zod config schema (PORT, HOST, CORS, PYTHON_SIDECAR_URL, etc.)
      errors.ts            # Re-export from @bytelyst/errors
      cosmos.ts            # Re-export from @bytelyst/cosmos (for task registry persistence)
      product-config.ts    # Re-export from @bytelyst/config
      python-bridge.ts     # HTTP client to Python sidecar
    modules/
      extract/
        types.ts           # Zod schemas: ExtractionTask, ExtractionExample, ExtractionResult
        routes.ts          # POST /api/extract, POST /api/extract/batch, GET /api/tasks
      tasks/
        types.ts           # Predefined task definitions (triage, transcript, etc.)
        repository.ts      # Cosmos CRUD for custom task definitions
        routes.ts          # CRUD endpoints for task management
    server.ts              # createServiceApp + route registration
  package.json
  tsconfig.json
  Dockerfile

0.2 Create package.json (@lysnrai/extraction-service, port 4005) matching existing service conventions
0.3 Create tsconfig.json extending ../../tsconfig.base.json
0.4 Create src/lib/config.ts with Zod schema:
- PORT (default 4005), HOST, CORS_ORIGIN
- PYTHON_SIDECAR_URL (default http://localhost:4006)
- DEFAULT_MODEL_ID (default gemini-2.5-flash)
- GEMINI_API_KEY or AZURE_OPENAI_API_KEY / AZURE_OPENAI_ENDPOINT
- MAX_WORKERS (default 10), MAX_CHAR_BUFFER (default 2000)
- COSMOS_ENDPOINT, COSMOS_KEY, COSMOS_DATABASE, JWT_SECRET
0.5 Create src/server.ts using createServiceApp() + startService() from @bytelyst/fastify-core
0.6 Add .env.example with all required env vars
0.7 Verify: pnpm build passes for the new service

Python sidecar scaffold

0.8 Create services/extraction-service/python/ directory:

python/
  src/
    __init__.py
    app.py                 # FastAPI app (internal, port 4006)
    extractor.py           # LangExtract wrapper
    task_registry.py       # Built-in task definitions
    models.py              # Pydantic models matching TS Zod schemas
  requirements.txt         # langextract, fastapi, uvicorn, pydantic
  Dockerfile               # Python 3.12 slim

0.9 Create python/requirements.txt:

langextract>=0.3.0
fastapi>=0.115.0
uvicorn>=0.34.0
pydantic>=2.10.0
pydantic-settings>=2.7.0
structlog>=24.4.0

0.10 Create python/src/app.py — FastAPI app with endpoints:
- POST /extract — single document extraction
- POST /extract/batch — batch extraction
- GET /health — sidecar health check
0.11 Create python/src/extractor.py — wrapper around lx.extract() with configurable model provider
0.12 Verify: Python sidecar starts and /health returns OK
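
The check in item 0.12 could be automated with a small polling helper. The sketch below is only an illustration: the function names `http_probe` and `wait_for_health`, the attempt count, and the delay are assumptions, and the only thing taken from this roadmap is the sidecar's `/health` endpoint on port 4006.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_for_health(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds or `attempts` tries have been made."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay_s)  # no sleep after the final failed attempt
    return False
```

A startup script might call `wait_for_health(lambda: http_probe("http://localhost:4006/health"))` and exit non-zero when it returns `False`. The probe and the sleep function are injected so the loop can be tested without a running sidecar.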

Package scaffold (`@bytelyst/extraction`)

0.13 Create packages/extraction/ directory:

packages/extraction/
  src/
    index.ts               # Public API
    types.ts               # Shared TypeScript types
    client.ts              # createExtractionClient() factory
  package.json
  tsconfig.json

0.14 Create package.json (@bytelyst/extraction) with @bytelyst/api-client as peer dep
0.15 Define TypeScript types matching the extraction API:
- ExtractionTask — prompt description + examples + model config
- ExtractionExample — text + extractions (class, text, attributes)
- ExtractionResult — extracted entities with source grounding
- ExtractionRequest — task + input text/URL
- ExtractionResponse — results + metadata (model, duration, token count)
0.16 Create createExtractionClient() factory using createApiClient() pattern
0.17 Verify: pnpm build passes for the new package

Workspace wiring

0.18 Confirm that extraction-service and extraction are picked up by pnpm-workspace.yaml (the existing packages/* and services/* globs should already cover them; add explicit entries only if they do not)
0.19 Run pnpm install from repo root — verify workspace resolution
0.20 Verify: pnpm build and pnpm typecheck pass across entire repo

Phase 1 — Core Extraction API

Goal: Working extraction endpoint that accepts text + task definition and returns structured results via LangExtract.

Python extractor implementation

1.1 Implement extractor.py:
- Accept task definition (prompt, examples, model config)
- Accept input text (string or URL)
- Call lx.extract() with configurable parameters (model_id, extraction_passes, max_workers, max_char_buffer)
- Return structured results with source grounding (extraction_class, extraction_text, attributes, char offsets)
- Handle errors gracefully (model timeout, rate limit, invalid input)
1.2 Implement model provider configuration:
- Gemini (default): API key from env
- Azure OpenAI: endpoint + key from env
- Ollama (local dev): configurable base URL
1.3 Add request/response logging via structlog (never print())
1.4 Add request timeout configuration (default 120s for long documents)
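
A minimal sketch of the wrapper described in item 1.1, under stated assumptions. The keyword arguments passed to `lx.extract()` follow the parameter names listed above; the result shape (`extractions`, `char_interval.start_pos`, `char_interval.end_pos`) is what the LangExtract README shows at the time of writing and should be checked against the installed version. `ExtractorOptions`, `ExtractedEntity`, and `run_extraction` are hypothetical names, and `extract_fn` is an injection point added here so the wrapper can be unit-tested without the library (item 1.11).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ExtractorOptions:
    # Defaults mirror the config values listed in item 0.4.
    model_id: str = "gemini-2.5-flash"
    extraction_passes: int = 1
    max_workers: int = 10
    max_char_buffer: int = 2000


@dataclass
class ExtractedEntity:
    extraction_class: str
    extraction_text: str
    attributes: dict[str, Any] = field(default_factory=dict)
    start: Optional[int] = None  # char offsets for source grounding, if available
    end: Optional[int] = None


def run_extraction(
    text: str,
    prompt: str,
    examples: list,
    options: Optional[ExtractorOptions] = None,
    extract_fn: Optional[Callable[..., Any]] = None,
) -> list[ExtractedEntity]:
    """Call the extractor and flatten its result into plain dataclasses."""
    options = options or ExtractorOptions()
    if extract_fn is None:
        # Imported lazily so this module loads (and tests run) without langextract.
        import langextract as lx
        extract_fn = lx.extract

    result = extract_fn(
        text_or_documents=text,
        prompt_description=prompt,
        examples=examples,
        model_id=options.model_id,
        extraction_passes=options.extraction_passes,
        max_workers=options.max_workers,
        max_char_buffer=options.max_char_buffer,
    )

    entities = []
    for item in result.extractions:
        interval = getattr(item, "char_interval", None)  # may be missing or None
        entities.append(
            ExtractedEntity(
                extraction_class=item.extraction_class,
                extraction_text=item.extraction_text,
                attributes=dict(item.attributes or {}),
                start=getattr(interval, "start_pos", None),
                end=getattr(interval, "end_pos", None),
            )
        )
    return entities
```

Error handling (timeouts, upstream rate limits) is left out on purpose; it would wrap the `extract_fn` call and is covered by item 1.1's last bullet.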

Fastify routes

1.5 Implement src/modules/extract/types.ts:
- ExtractRequestSchema (Zod) — task definition + input text + options
- ExtractResponseSchema (Zod) — array of extractions + metadata
- BatchExtractRequestSchema — array of inputs + shared task
1.6 Implement src/modules/extract/routes.ts:
- POST /api/extract — auth required, validates input, proxies to Python sidecar
- POST /api/extract/batch — auth required, accepts multiple inputs
- GET /api/extract/models — list available model providers
1.7 Implement src/lib/python-bridge.ts:
- HTTP client to Python sidecar (fetch with timeout, retry, error mapping)
- Health check polling on startup (wait for sidecar readiness)
- Request ID forwarding (x-request-id)
1.8 Add rate limiting to extraction endpoints (configurable per-user limit)

Tests

1.9 Write unit tests for Zod schemas (types.test.ts)
1.10 Write integration tests for extract routes (mock Python sidecar responses)
1.11 Write Python unit tests for extractor.py (mock lx.extract)
1.12 Verify: pnpm test passes, pytest passes

Phase 2 — Predefined Task Library

Goal: Ship a curated set of extraction task definitions that LysnrAI and MindLyst can use out-of-the-box.

Task definitions

2.1 Define transcript-extraction task:
- Classes: action_item, decision, question, deadline, person, topic
- 3–5 few-shot examples from realistic meeting transcripts
- Default model: gemini-2.5-flash
2.2 Define triage task (MindLyst):
- Classes: topic, entity, action, emotion, date_reference, brain_signal
- brain_signal attributes: { brain: "work|home|money|health|global", confidence: float }
- 3–5 few-shot examples per brain type
2.3 Define memory-insight task (MindLyst):
- Classes: pattern, recurring_theme, relationship, milestone
- Examples from accumulated brain memories
2.4 Define reflection-enrichment task (MindLyst):
- Classes: emotional_state, accomplishment, concern, goal_progress
- Examples from journal-style text
2.5 Define bug-report-extraction task (Tracker):
- Classes: steps_to_reproduce, expected_behavior, actual_behavior, affected_component, severity
- Examples from real issue submissions

Task registry (Cosmos DB)

2.6 Create Cosmos container: extraction_tasks (partition key: /productId)
2.7 Implement src/modules/tasks/repository.ts — CRUD for task definitions
2.8 Implement src/modules/tasks/routes.ts:
- GET /api/tasks — list all tasks (built-in + custom)
- GET /api/tasks/:id — get task by ID
- POST /api/tasks — create custom task (admin only)
- PUT /api/tasks/:id — update task (admin only)
- DELETE /api/tasks/:id — delete custom task (admin only)
2.9 Seed built-in tasks on service startup (idempotent upsert)
2.10 Add productId to all task documents

Python task registry

2.11 Implement task_registry.py — load task definitions from Cosmos (via Fastify API) or local JSON fallback
2.12 Create python/tasks/ directory with JSON files for each built-in task
2.13 Add task validation: verify examples follow LangExtract best practices (ordered, verbatim, no overlap)
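
The validation in item 2.13 could look roughly like the sketch below. It reads "ordered, verbatim, no overlap" as: every `extraction_text` must occur literally in the example text, in order of appearance, and must start after the previous extraction ends. That reading, and the names `ExampleExtraction` and `validate_example`, are assumptions rather than LangExtract's own checks.

```python
from dataclasses import dataclass


@dataclass
class ExampleExtraction:
    extraction_class: str
    extraction_text: str


def validate_example(text: str, extractions: list[ExampleExtraction]) -> list[str]:
    """Return a list of problems; an empty list means the example passes."""
    problems = []
    cursor = 0  # end offset of the previously matched extraction
    for index, extraction in enumerate(extractions):
        needle = extraction.extraction_text
        if not needle:
            problems.append(f"#{index}: empty extraction_text")
            continue
        position = text.find(needle, cursor)
        if position == -1:
            if needle in text:
                # Present, but only before the cursor: wrong order or overlap.
                problems.append(f"#{index}: out of order or overlapping: {needle!r}")
            else:
                problems.append(f"#{index}: not verbatim: {needle!r}")
            continue
        cursor = position + len(needle)
    return problems
```

For example, with the text `"Ana will send the report by Friday."`, the extractions `Ana`, `send the report`, `Friday` pass in that order, while `send report` is reported as not verbatim. Running this over the JSON files in python/tasks/ at startup would catch malformed few-shot examples before they reach the model.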

Tests

2.14 Write tests for task CRUD routes
2.15 Write tests for task seeding logic
2.16 Verify: all tests pass

Phase 3 — Consumer Integration

Goal: Wire LysnrAI and MindLyst to call the extraction service.

`@bytelyst/extraction` package finalization

3.1 Add typed methods to createExtractionClient():
- extract(input, taskId, options?) — single extraction
- extractBatch(inputs, taskId, options?) — batch extraction
- listTasks() — get available tasks
- getTask(id) — get task details
3.2 Export all types from src/index.ts
3.3 Build the package: run pnpm build in packages/extraction/ (consumers link it via file: refs, so there is no registry publish step)

LysnrAI integration

3.4 Add @bytelyst/extraction to admin-dashboard-web/package.json (via file: ref)
3.5 Create admin-dashboard-web/src/lib/extraction-client.ts — typed client instance
3.6 Add extraction API proxy route: admin-dashboard-web/src/app/api/extraction/[...path]/route.ts
3.7 Create Python extraction client in backend/src/clients/extraction_client.py:
- HTTP client to extraction-service (port 4005)
- Methods: extract_transcript(text), extract_batch(texts)
3.8 Add post-transcription extraction to LysnrAI backend:
- New endpoint: POST /api/transcripts/{id}/extract
- Calls extraction-service with transcript-extraction task
- Stores results alongside transcript
3.9 Add extraction results display to admin dashboard (transcript detail page)
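
A possible shape for the backend client in item 3.7, using only the standard library. The endpoint paths, the `transcript-extraction` task ID, and the method names come from this roadmap; the request body fields (`taskId`, `text`, `texts`), the bearer-token header, and the 120 s timeout are assumptions, since the wire format is not fixed yet. The transport is injectable so the client can be tested without a running service.

```python
import json
import urllib.request
from typing import Any, Callable, Optional, Sequence

# (url, headers, body) -> parsed JSON response
Transport = Callable[[str, dict, dict], Any]


def urllib_transport(url: str, headers: dict, body: dict) -> Any:
    """POST `body` as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    # Assumed timeout, borrowed from the long-document default in item 1.4.
    with urllib.request.urlopen(request, timeout=120.0) as response:
        return json.loads(response.read().decode("utf-8"))


class ExtractionClient:
    TASK_ID = "transcript-extraction"

    def __init__(
        self,
        base_url: str = "http://localhost:4005",
        token: Optional[str] = None,
        transport: Transport = urllib_transport,
    ) -> None:
        self._base_url = base_url.rstrip("/")
        self._token = token
        self._transport = transport

    def _post(self, path: str, body: dict, request_id: Optional[str]) -> Any:
        headers = {"Content-Type": "application/json"}
        if self._token:
            headers["Authorization"] = f"Bearer {self._token}"  # assumed auth scheme
        if request_id:
            headers["x-request-id"] = request_id  # forwarded for request tracing
        return self._transport(f"{self._base_url}{path}", headers, body)

    def extract_transcript(self, text: str, request_id: Optional[str] = None) -> Any:
        body = {"taskId": self.TASK_ID, "text": text}  # hypothetical field names
        return self._post("/api/extract", body, request_id)

    def extract_batch(self, texts: Sequence[str], request_id: Optional[str] = None) -> Any:
        body = {"taskId": self.TASK_ID, "texts": list(texts)}  # hypothetical field names
        return self._post("/api/extract/batch", body, request_id)
```

In the backend, `base_url` would come from `EXTRACTION_SERVICE_URL`. Callers should treat any transport exception as "extraction unavailable" and keep serving the raw transcript.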

MindLyst integration

3.10 Add @bytelyst/extraction to mindlyst-native/web/package.json (via file: ref):

"@bytelyst/extraction": "file:../../../learning_ai_common_plat/packages/extraction"

3.11 Create mindlyst-native/web/src/lib/extraction-client.ts
3.12 Create API route: mindlyst-native/web/src/pages/api/extract.ts
- Accepts raw capture text, calls extraction-service with triage task
- Returns brain routing + extracted entities
3.13 Update triage flow on web dashboard to use extraction results for brain auto-routing
3.14 Wire brain insight generation to use memory-insight task
3.15 Wire reflection enrichment to use reflection-enrichment task
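
For the auto-routing in items 3.12 and 3.13, one simple option is to pick the `brain_signal` extraction with the highest confidence. The sketch below assumes extractions arrive as plain dicts with `extraction_class` and `attributes` keys; the `min_confidence` threshold and the fallback to `"global"` are illustrative choices, not decisions recorded in this roadmap.

```python
from typing import Any, Iterable

# Brain names from the brain_signal attribute definition in item 2.2.
BRAINS = ("work", "home", "money", "health", "global")


def route_brain(
    extractions: Iterable[dict[str, Any]],
    min_confidence: float = 0.5,
    fallback: str = "global",
) -> str:
    """Return the brain with the strongest signal, or `fallback` if none qualifies."""
    best_brain = fallback
    best_confidence = -1.0
    for extraction in extractions:
        if extraction.get("extraction_class") != "brain_signal":
            continue
        attributes = extraction.get("attributes") or {}
        brain = attributes.get("brain")
        try:
            confidence = float(attributes.get("confidence", 0.0))
        except (TypeError, ValueError):
            continue  # skip signals with a malformed confidence value
        if brain in BRAINS and confidence >= min_confidence and confidence > best_confidence:
            best_brain, best_confidence = brain, confidence
    return best_brain
```

Unknown brain names and low-confidence signals are ignored rather than rejected, so a noisy model response still yields a usable route.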

Tests

3.16 Add integration tests for LysnrAI extraction endpoint
3.17 Add integration tests for MindLyst triage-via-extraction flow
3.18 Verify: npx tsc --noEmit passes in all 3 dashboards + MindLyst web

Phase 4 — Docker & DevOps

Goal: Containerize, add to docker-compose, update run scripts.

Dockerfile

4.1 Create multi-stage Dockerfile for extraction-service:
- Stage 1: Node.js build (Fastify TS → JS)
- Stage 2: Python setup (install langextract + deps)
- Stage 3: Runtime (Node.js + Python, supervisord to run both processes)
4.2 Create supervisord.conf to manage Fastify (port 4005) + Python sidecar (port 4006)
4.3 Verify: docker build succeeds

Docker Compose

4.4 Add extraction-service to docker-compose.yml:

extraction-service:
  build:
    context: .
    dockerfile: services/extraction-service/Dockerfile
  ports:
    - '4005:4005'
  env_file:
    - .env
  environment:
    - PORT=4005
    - PYTHON_SIDECAR_URL=http://localhost:4006
  labels:
    - 'traefik.enable=true'
    - 'traefik.http.routers.extraction.rule=PathPrefix(`/api/extract`) || PathPrefix(`/api/tasks`)'
    - 'traefik.http.services.extraction.loadbalancer.server.port=4005'
  logging:
    driver: loki
    options:
      loki-url: 'http://host.docker.internal:3100/loki/api/v1/push'
      loki-retries: '3'
  restart: unless-stopped
  healthcheck:
    test: ['CMD', 'wget', '--no-verbose', '--tries=1', '--spider', 'http://localhost:4005/health']
    interval: 30s
    timeout: 10s
    retries: 3

4.5 Add to LysnrAI docker-compose.yml (references ../learning_ai_common_plat/services/extraction-service/)

Run scripts

4.6 Add extraction-service to run-local-all-services.sh in LysnrAI repo
4.7 Add extraction-service to .windsurf/workflows/start-all-services.md
4.8 Add .env.example entries to LysnrAI repo root (EXTRACTION_SERVICE_URL=http://localhost:4005)
4.9 Add .env.example entries to MindLyst web (same)

CI

4.10 Create .github/workflows/ci-extraction-service.yml:
- Trigger: push to services/extraction-service/** or packages/extraction/**
- Steps: pnpm install, pnpm build, pnpm test (TS), pip install + pytest (Python)
4.11 Verify: CI workflow passes

Phase 5 — Production Hardening

Goal: Rate limiting, caching, observability, cost controls.

Caching

5.1 Add result caching in Python sidecar:
- Cache key: hash(task_id + input_text + model_id)
- TTL: configurable (default 24h)
- Storage: in-memory LRU (dev) or Redis (prod)
5.2 Add cache hit/miss headers to Fastify response (X-Extraction-Cache: HIT/MISS)
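
A sketch of the in-memory (dev) variant of item 5.1. Plain string concatenation before hashing can make different inputs collide (`"ab" + "c"` equals `"a" + "bc"`), so this version JSON-encodes the three fields as a list first; that encoding choice is an assumption, as are the names `cache_key` and `TTLCache`. The 24 h default TTL comes from the roadmap.

```python
import hashlib
import json
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


def cache_key(task_id: str, input_text: str, model_id: str) -> str:
    """Hash the three fields with unambiguous boundaries between them."""
    payload = json.dumps([task_id, model_id, input_text], ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Small LRU cache whose entries also expire after `ttl_s` seconds."""

    def __init__(
        self,
        max_entries: int = 1024,
        ttl_s: float = 86400.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_entries = max_entries
        self._ttl_s = ttl_s
        self._clock = clock
        self._data: "OrderedDict[str, tuple[float, Any]]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl_s, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)  # evict the least recently used entry
```

The result of `get()` maps directly onto the `X-Extraction-Cache: HIT/MISS` header from item 5.2. A Redis-backed variant could keep the same `get`/`set` interface and let Redis handle expiry.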

Cost controls

5.3 Add per-user daily extraction quota (configurable per plan tier):
- Free: 10 extractions/day
- Pro: 100 extractions/day
- Enterprise: unlimited
5.4 Track usage in Cosmos extraction_usage container (partition: /userId)
5.5 Return 429 Too Many Requests with quota info when exceeded
5.6 Add usage reporting endpoint: GET /api/extract/usage (admin)
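
The quota decision in items 5.3 to 5.5 reduces to a small pure function once today's usage count has been read from the `extraction_usage` container. The tier limits below are the ones listed above; the lowercase tier keys, the `QuotaDecision` shape, and the "remaining after this request" convention are assumptions. The sketch is written in Python for consistency with the other examples, although in this design the check would more likely live in the Fastify layer.

```python
from dataclasses import dataclass
from typing import Optional

# Daily limits per plan tier; None means unlimited.
DAILY_QUOTA: dict[str, Optional[int]] = {"free": 10, "pro": 100, "enterprise": None}


@dataclass(frozen=True)
class QuotaDecision:
    allowed: bool
    status: int                # HTTP status the route would return
    limit: Optional[int]       # None for unlimited tiers
    remaining: Optional[int]   # requests left after this one; None if unlimited


def check_quota(tier: str, used_today: int) -> QuotaDecision:
    """Decide whether one more extraction is allowed for this user today."""
    if tier not in DAILY_QUOTA:
        raise ValueError(f"unknown plan tier: {tier!r}")
    limit = DAILY_QUOTA[tier]
    if limit is None:
        return QuotaDecision(True, 200, None, None)
    if used_today >= limit:
        return QuotaDecision(False, 429, limit, 0)
    return QuotaDecision(True, 200, limit, limit - used_today - 1)
```

The `limit` and `remaining` fields are the "quota info" that item 5.5 asks to include in the 429 response.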

Observability

5.7 Add structured logging for every extraction:
- Request: task_id, input_length, model_id, user_id, product_id
- Response: entity_count, duration_ms, token_count, cache_hit
5.8 Add Prometheus metrics (via fastify-metrics):
- extraction_requests_total (labels: task_id, model_id, product_id, status)
- extraction_duration_seconds (histogram)
- extraction_entities_extracted (histogram)
- extraction_cache_hit_total
5.9 Add Grafana dashboard for extraction service (in services/monitoring/grafana/dashboards/)

Error handling

5.10 Map LangExtract errors to @bytelyst/errors:
- Model timeout → 408 Request Timeout
- Rate limit (upstream LLM) → 429 Too Many Requests with retry-after
- Invalid task definition → 400 Bad Request
- Model unavailable → 503 Service Unavailable
5.11 Add circuit breaker for Python sidecar (fail fast if sidecar is down)
5.12 Add graceful degradation: return partial results if some chunks fail
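
Item 5.11 could be implemented with a conventional three-state circuit breaker (closed, open, half-open). The sketch below is generic rather than tied to any library; the class names, the failure threshold, and the cool-down period are assumptions. It is shown in Python for consistency with the other examples, while the real breaker would sit in the Fastify bridge.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                raise CircuitOpenError("sidecar circuit is open")
            # Cool-down elapsed: half-open, so let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call re-opens immediately; otherwise open at the threshold.
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail in microseconds instead of waiting for a sidecar timeout. The route handler could translate `CircuitOpenError` into the 503 response listed in item 5.10.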

Phase 6 — Advanced Features (Future)

Goal: Power-user features, visualization, and batch processing.

Visualization

6.1 Expose LangExtract's HTML visualization:
- GET /api/extract/:requestId/visualization — returns interactive HTML
- Embed in admin dashboard for extraction quality review
6.2 Store visualization artifacts in Azure Blob Storage (extractions container)

Batch & async processing

6.3 Add async extraction endpoint:
- POST /api/extract/async — returns job ID immediately
- GET /api/extract/jobs/:id — poll for status + results
- Webhook callback when complete
6.4 Add Vertex AI batch processing support (for high-volume MindLyst triage)
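
The submit-then-poll flow in item 6.3 can be prototyped with an in-process job store before any queue infrastructure exists. This is a rough sketch: `JobStore`, its method names, and the status strings are hypothetical, jobs are lost on restart, and the webhook callback is left out.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class JobStore:
    """Runs jobs on a thread pool and reports their status by job ID."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Start `fn(*args)` in the background and return a job ID immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> dict[str, Any]:
        """Return the job state; includes the result or error once finished."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "not_found"}
        if not future.done():
            return {"status": "running"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}

    def shutdown(self) -> None:
        """Wait for in-flight jobs and release the worker threads."""
        self._pool.shutdown(wait=True)
```

`submit()` corresponds to `POST /api/extract/async` and `status()` to `GET /api/extract/jobs/:id`. A production version would persist jobs outside the process so that a restart does not drop them.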

Custom model support

6.5 Add Ollama provider for local/air-gapped deployments
6.6 Add model benchmarking endpoint: run same task across models, compare quality + cost

Multi-language extraction

6.7 Test and validate extraction across languages (LangExtract supports multi-language via LLM)
6.8 Add language detection to extraction pipeline (auto-detect input language)

Env Vars Summary

Variable	Service	Default	Description
`PORT`	extraction-service	`4005`	Fastify listen port
`HOST`	extraction-service	`0.0.0.0`	Fastify listen host
`CORS_ORIGIN`	extraction-service	`*`	Allowed origins
`PYTHON_SIDECAR_URL`	extraction-service	`http://localhost:4006`	Python sidecar URL
`DEFAULT_MODEL_ID`	extraction-service	`gemini-2.5-flash`	Default LLM model
`GEMINI_API_KEY`	python sidecar	—	Google Gemini API key
`AZURE_OPENAI_API_KEY`	python sidecar	—	Azure OpenAI key (alternative)
`AZURE_OPENAI_ENDPOINT`	python sidecar	—	Azure OpenAI endpoint (alternative)
`MAX_WORKERS`	python sidecar	`10`	Parallel extraction workers
`MAX_CHAR_BUFFER`	python sidecar	`2000`	Chunk size for long docs
`EXTRACTION_CACHE_TTL`	python sidecar	`86400`	Cache TTL in seconds
`COSMOS_ENDPOINT`	extraction-service	—	Azure Cosmos DB endpoint
`COSMOS_KEY`	extraction-service	—	Azure Cosmos DB key
`COSMOS_DATABASE`	extraction-service	`lysnrai`	Database name
`JWT_SECRET`	extraction-service	—	JWT validation secret
`EXTRACTION_SERVICE_URL`	consumers	`http://localhost:4005`	Used by dashboards/backends
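
The sidecar rows of the table above could be loaded at startup roughly as follows. requirements.txt lists pydantic-settings, which would be the natural tool; this sketch uses only the standard library to stay dependency-free. `SidecarSettings` and `load_settings` are hypothetical names, and the only validation applied is rejecting a half-configured Azure pair, which is an assumption about what counts as a misconfiguration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SidecarSettings:
    gemini_api_key: Optional[str]
    azure_openai_api_key: Optional[str]
    azure_openai_endpoint: Optional[str]
    max_workers: int
    max_char_buffer: int
    cache_ttl_s: int


def load_settings(env: Mapping[str, str] = os.environ) -> SidecarSettings:
    """Read sidecar settings from `env`, applying the defaults from the table."""
    settings = SidecarSettings(
        gemini_api_key=env.get("GEMINI_API_KEY") or None,  # empty string counts as unset
        azure_openai_api_key=env.get("AZURE_OPENAI_API_KEY") or None,
        azure_openai_endpoint=env.get("AZURE_OPENAI_ENDPOINT") or None,
        max_workers=int(env.get("MAX_WORKERS", "10")),
        max_char_buffer=int(env.get("MAX_CHAR_BUFFER", "2000")),
        cache_ttl_s=int(env.get("EXTRACTION_CACHE_TTL", "86400")),
    )
    # Azure needs both values; one without the other is almost certainly a mistake.
    if bool(settings.azure_openai_api_key) != bool(settings.azure_openai_endpoint):
        raise ValueError("AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT must be set together")
    return settings
```

Passing the environment as a mapping keeps the loader testable: `load_settings({"GEMINI_API_KEY": "..."})` works without touching the real process environment.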

Port Allocation

Service	Port
growth-service	4001
billing-service	4002
platform-service	4003
tracker-service	4004
extraction-service	4005
extraction-service python sidecar (internal)	4006

Dependency Graph

@bytelyst/extraction (package)
  └── @bytelyst/api-client (peer dep)

@lysnrai/extraction-service (service)
  ├── @bytelyst/fastify-core
  ├── @bytelyst/auth
  ├── @bytelyst/config
  ├── @bytelyst/cosmos
  ├── @bytelyst/errors
  ├── fastify, zod, jose (direct deps)
  └── python sidecar
      └── langextract, fastapi, uvicorn, structlog

Estimated Effort

Phase	Effort	Dependencies
Phase 0 — Foundation	2–3 days	None
Phase 1 — Core API	2–3 days	Phase 0
Phase 2 — Task Library	2 days	Phase 1
Phase 3 — Consumer Integration	3–4 days	Phase 2
Phase 4 — Docker & DevOps	1–2 days	Phase 1
Phase 5 — Production Hardening	2–3 days	Phase 3
Phase 6 — Advanced (future)	Ongoing	Phase 5

Total MVP (Phases 0–4): ~10–14 days

Rollback Strategy

- The extraction-service is additive: no existing code is modified until Phase 3.
- Phase 3 consumer integration uses new endpoints/routes, so existing triage/transcript flows remain untouched.
- If extraction-service is down, consumers fall back to their existing behavior (MindLyst mock triage, LysnrAI raw transcripts).
- The @bytelyst/extraction package is optional: dashboards only import it for new extraction features.
