Extraction Service — Roadmap & Task Checklist
Service: `@lysnrai/extraction-service` (port 4005)
Package: `@bytelyst/extraction` (shared types + client)
Core dependency: google/langextract (Python)
Companion docs: ECOSYSTEM_ARCHITECTURE.md · ROADMAP.md
Overview
A shared extraction microservice that uses Google's LangExtract library to extract structured information from unstructured text. Both LysnrAI and MindLyst consume this service for their respective extraction needs.
Architecture: Fastify (routing, auth, validation, request tracing) + Python sidecar (LangExtract). The Fastify layer keeps the service consistent with the other 4 services. The Python process handles the actual LLM-powered extraction.
┌──────────────────────────────────────────────────────────┐
│                    extraction-service                    │
│                       (port 4005)                        │
│                                                          │
│  ┌─────────────────────┐   ┌──────────────────────────┐  │
│  │  Fastify (TS)       │   │  Python Sidecar          │  │
│  │                     │   │                          │  │
│  │  - Auth middleware  │──►│  - LangExtract wrapper   │  │
│  │  - Zod validation   │◄──│  - Task registry         │  │
│  │  - x-request-id     │   │  - Model provider config │  │
│  │  - Rate limiting    │   │  - Result caching        │  │
│  │  - /health          │   │                          │  │
│  └─────────────────────┘   └──────────────────────────┘  │
└──────────────────────────────────────────────────────────┘
              ▲                            ▲
              │                            │
          REST API             FastAPI (internal :4006)
         (external)               or subprocess stdio
Consumers
| Product | Use Case | Entry Point |
|---|---|---|
| LysnrAI — Desktop/Backend | Post-transcription extraction (action items, decisions, dates, people) | backend/src/clients/extraction_client.py |
| LysnrAI — Admin Dashboard | Transcript analytics, entity review | admin-dashboard-web/src/lib/extraction-client.ts |
| MindLyst — KMP/Web | Triage pipeline (brain routing, entity extraction, topic classification) | mindlyst-native/web/src/pages/api/triage.ts |
| MindLyst — Web Dashboard | Brain insight generation, reflection enrichment | Direct API calls via @bytelyst/api-client |
Phase 0 — Foundation & Scaffolding
Goal: Set up the service skeleton, Python environment, and build pipeline.
Service scaffold (Fastify)
- 0.1 Create `services/extraction-service/` directory structure:

      services/extraction-service/
        src/
          lib/
            config.ts          # Zod config schema (PORT, HOST, CORS, PYTHON_SIDECAR_URL, etc.)
            errors.ts          # Re-export from @bytelyst/errors
            cosmos.ts          # Re-export from @bytelyst/cosmos (for task registry persistence)
            product-config.ts  # Re-export from @bytelyst/config
            python-bridge.ts   # HTTP client to Python sidecar
          modules/
            extract/
              types.ts         # Zod schemas: ExtractionTask, ExtractionExample, ExtractionResult
              routes.ts        # POST /api/extract, POST /api/extract/batch, GET /api/tasks
            tasks/
              types.ts         # Predefined task definitions (triage, transcript, etc.)
              repository.ts    # Cosmos CRUD for custom task definitions
              routes.ts        # CRUD endpoints for task management
          server.ts            # createServiceApp + route registration
        package.json
        tsconfig.json
        Dockerfile
- 0.2 Create `package.json` (`@lysnrai/extraction-service`, port 4005) matching existing service conventions
- 0.3 Create `tsconfig.json` extending `../../tsconfig.base.json`
- 0.4 Create `src/lib/config.ts` with Zod schema:
  - `PORT` (default 4005), `HOST`, `CORS_ORIGIN`
  - `PYTHON_SIDECAR_URL` (default `http://localhost:4006`)
  - `DEFAULT_MODEL_ID` (default `gemini-2.5-flash`)
  - `GEMINI_API_KEY` or `AZURE_OPENAI_API_KEY` / `AZURE_OPENAI_ENDPOINT`
  - `MAX_WORKERS` (default 10), `MAX_CHAR_BUFFER` (default 2000)
  - `COSMOS_ENDPOINT`, `COSMOS_KEY`, `COSMOS_DATABASE`, `JWT_SECRET`
- 0.5 Create `src/server.ts` using `createServiceApp()` + `startService()` from `@bytelyst/fastify-core`
- 0.6 Add `.env.example` with all required env vars
- 0.7 Verify: `pnpm build` passes for the new service
Python sidecar scaffold
- 0.8 Create `services/extraction-service/python/` directory:

      python/
        src/
          __init__.py
          app.py            # FastAPI app (internal, port 4006)
          extractor.py      # LangExtract wrapper
          task_registry.py  # Built-in task definitions
          models.py         # Pydantic models matching TS Zod schemas
        requirements.txt    # langextract, fastapi, uvicorn, pydantic
        Dockerfile          # Python 3.12 slim
- 0.9 Create `python/requirements.txt`:

      langextract>=0.3.0
      fastapi>=0.115.0
      uvicorn>=0.34.0
      pydantic>=2.10.0
      pydantic-settings>=2.7.0
      structlog>=24.4.0
- 0.10 Create `python/src/app.py`, a FastAPI app with endpoints:
  - `POST /extract`: single document extraction
  - `POST /extract/batch`: batch extraction
  - `GET /health`: sidecar health check
- 0.11 Create `python/src/extractor.py`, a wrapper around `lx.extract()` with configurable model provider
- 0.12 Verify: Python sidecar starts and `/health` returns OK
Package scaffold (@bytelyst/extraction)
- 0.13 Create `packages/extraction/` directory:

      packages/extraction/
        src/
          index.ts   # Public API
          types.ts   # Shared TypeScript types
          client.ts  # createExtractionClient() factory
        package.json
        tsconfig.json
- 0.14 Create `package.json` (`@bytelyst/extraction`) with `@bytelyst/api-client` as peer dep
- 0.15 Define TypeScript types matching the extraction API:
  - `ExtractionTask`: prompt description + examples + model config
  - `ExtractionExample`: text + extractions (class, text, attributes)
  - `ExtractionResult`: extracted entities with source grounding
  - `ExtractionRequest`: task + input text/URL
  - `ExtractionResponse`: results + metadata (model, duration, token count)
- 0.16 Create `createExtractionClient()` factory using `createApiClient()` pattern
- 0.17 Verify: `pnpm build` passes for the new package
Workspace wiring
- 0.18 Add `extraction-service` and `extraction` to `pnpm-workspace.yaml` (already covered by `packages/*` + `services/*` globs; verify)
- 0.19 Run `pnpm install` from repo root and verify workspace resolution
- 0.20 Verify: `pnpm build` and `pnpm typecheck` pass across entire repo
Phase 1 — Core Extraction API
Goal: Working extraction endpoint that accepts text + task definition and returns structured results via LangExtract.
Python extractor implementation
- 1.1 Implement `extractor.py`:
  - Accept task definition (prompt, examples, model config)
  - Accept input text (string or URL)
  - Call `lx.extract()` with configurable parameters (`model_id`, `extraction_passes`, `max_workers`, `max_char_buffer`)
  - Return structured results with source grounding (`extraction_class`, `extraction_text`, attributes, char offsets)
  - Handle errors gracefully (model timeout, rate limit, invalid input)
- 1.2 Implement model provider configuration:
  - Gemini (default): API key from env
  - Azure OpenAI: endpoint + key from env
  - Ollama (local dev): configurable base URL
- 1.3 Add request/response logging via `structlog` (never `print()`)
- 1.4 Add request timeout configuration (default 120s for long documents)
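A minimal sketch of the wrapper described in 1.1, written so it can be exercised without LangExtract installed: the extract function is injected, and in the sidecar it would be `lx.extract`. The keyword names `text_or_documents`, `prompt_description`, and `examples`, and the result attributes `extractions`, `char_interval.start_pos`, and `char_interval.end_pos`, are assumptions about LangExtract's interface that should be checked against the installed version. `ExtractionOptions`, `ExtractionError`, and `run_extraction` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ExtractionOptions:
    """Tunable parameters from item 1.1; defaults follow this roadmap."""
    model_id: str = "gemini-2.5-flash"
    extraction_passes: int = 1
    max_workers: int = 10
    max_char_buffer: int = 2000


class ExtractionError(Exception):
    """Hypothetical error type raised when the extract call fails."""


def run_extraction(
    extract_fn: Callable[..., Any],
    text: str,
    prompt: str,
    examples: list,
    options: Optional[ExtractionOptions] = None,
) -> list[dict]:
    """Call the injected extract function and flatten its result into dicts."""
    if not text.strip():
        raise ExtractionError("input text is empty")
    opts = options or ExtractionOptions()
    try:
        document = extract_fn(
            text_or_documents=text,        # assumed keyword name
            prompt_description=prompt,     # assumed keyword name
            examples=examples,             # assumed keyword name
            model_id=opts.model_id,
            extraction_passes=opts.extraction_passes,
            max_workers=opts.max_workers,
            max_char_buffer=opts.max_char_buffer,
        )
    except Exception as exc:  # model timeout, rate limit, provider failure
        raise ExtractionError(f"extraction failed: {exc}") from exc

    results = []
    for item in getattr(document, "extractions", None) or []:
        interval = getattr(item, "char_interval", None)  # assumed attribute
        results.append({
            "extraction_class": item.extraction_class,
            "extraction_text": item.extraction_text,
            "attributes": dict(getattr(item, "attributes", None) or {}),
            "start": getattr(interval, "start_pos", None),
            "end": getattr(interval, "end_pos", None),
        })
    return results
```

Injecting the callable also covers item 1.11: a unit test can pass a fake in place of `lx.extract` and assert on the flattened output.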
Fastify routes
- 1.5 Implement `src/modules/extract/types.ts`:
  - `ExtractRequestSchema` (Zod): task definition + input text + options
  - `ExtractResponseSchema` (Zod): array of extractions + metadata
  - `BatchExtractRequestSchema`: array of inputs + shared task
- 1.6 Implement `src/modules/extract/routes.ts`:
  - `POST /api/extract`: auth required, validates input, proxies to Python sidecar
  - `POST /api/extract/batch`: auth required, accepts multiple inputs
  - `GET /api/extract/models`: list available model providers
- 1.7 Implement `src/lib/python-bridge.ts`:
  - HTTP client to Python sidecar (fetch with timeout, retry, error mapping)
  - Health check polling on startup (wait for sidecar readiness)
  - Request ID forwarding (`x-request-id`)
- 1.8 Add rate limiting to extraction endpoints (configurable per-user limit)
Tests
- 1.9 Write unit tests for Zod schemas (`types.test.ts`)
- 1.10 Write integration tests for extract routes (mock Python sidecar responses)
- 1.11 Write Python unit tests for `extractor.py` (mock `lx.extract`)
- 1.12 Verify: `pnpm test` passes, `pytest` passes
Phase 2 — Predefined Task Library
Goal: Ship a curated set of extraction task definitions that LysnrAI and MindLyst can use out-of-the-box.
Task definitions
- 2.1 Define `transcript-extraction` task:
  - Classes: `action_item`, `decision`, `question`, `deadline`, `person`, `topic`
  - 3–5 few-shot examples from realistic meeting transcripts
  - Default model: `gemini-2.5-flash`
- 2.2 Define `triage` task (MindLyst):
  - Classes: `topic`, `entity`, `action`, `emotion`, `date_reference`, `brain_signal`
  - `brain_signal` attributes: `{ brain: "work|home|money|health|global", confidence: float }`
  - 3–5 few-shot examples per brain type
- 2.3 Define `memory-insight` task (MindLyst):
  - Classes: `pattern`, `recurring_theme`, `relationship`, `milestone`
  - Examples from accumulated brain memories
- 2.4 Define `reflection-enrichment` task (MindLyst):
  - Classes: `emotional_state`, `accomplishment`, `concern`, `goal_progress`
  - Examples from journal-style text
- 2.5 Define `bug-report-extraction` task (Tracker):
  - Classes: `steps_to_reproduce`, `expected_behavior`, `actual_behavior`, `affected_component`, `severity`
  - Examples from real issue submissions
Task registry (Cosmos DB)
- 2.6 Create Cosmos container: `extraction_tasks` (partition key: `/productId`)
- 2.7 Implement `src/modules/tasks/repository.ts`: CRUD for task definitions
- 2.8 Implement `src/modules/tasks/routes.ts`:
  - `GET /api/tasks`: list all tasks (built-in + custom)
  - `GET /api/tasks/:id`: get task by ID
  - `POST /api/tasks`: create custom task (admin only)
  - `PUT /api/tasks/:id`: update task (admin only)
  - `DELETE /api/tasks/:id`: delete custom task (admin only)
- 2.9 Seed built-in tasks on service startup (idempotent upsert)
- 2.10 Add `productId` to all task documents
Python task registry
- 2.11 Implement `task_registry.py`: load task definitions from Cosmos (via Fastify API) or local JSON fallback
- 2.12 Create `python/tasks/` directory with JSON files for each built-in task
- 2.13 Add task validation: verify examples follow LangExtract best practices (ordered, verbatim, no overlap)
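The example check in 2.13 could look roughly like the sketch below. It reads "ordered, verbatim, no overlap" as: every extraction text must occur in the example text exactly as written, in order of appearance, and must start after the previous extraction ends. That reading, and the name `validate_example`, are assumptions for illustration.

```python
def validate_example(text: str, extraction_texts: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the example passes."""
    problems = []
    cursor = 0  # end offset of the previously matched extraction
    for index, snippet in enumerate(extraction_texts):
        if not snippet:
            problems.append(f"extraction {index} is empty")
            continue
        if snippet not in text:
            problems.append(f"extraction {index} is not verbatim: {snippet!r}")
            continue
        # Searching from the cursor enforces both ordering and non-overlap.
        start = text.find(snippet, cursor)
        if start == -1:
            problems.append(
                f"extraction {index} is out of order or overlaps: {snippet!r}"
            )
            continue
        cursor = start + len(snippet)
    return problems
```

Returning a list of problems instead of raising on the first one lets the task CRUD routes report every issue in a rejected task definition at once.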
Tests
- 2.14 Write tests for task CRUD routes
- 2.15 Write tests for task seeding logic
- 2.16 Verify: all tests pass
Phase 3 — Consumer Integration
Goal: Wire LysnrAI and MindLyst to call the extraction service.
@bytelyst/extraction package finalization
- 3.1 Add typed methods to `createExtractionClient()`:
  - `extract(input, taskId, options?)`: single extraction
  - `extractBatch(inputs, taskId, options?)`: batch extraction
  - `listTasks()`: get available tasks
  - `getTask(id)`: get task details
- 3.2 Export all types from `src/index.ts`
- 3.3 Publish: `pnpm build` in `packages/extraction/`
LysnrAI integration
- 3.4 Add `@bytelyst/extraction` to `admin-dashboard-web/package.json` (via `file:` ref)
- 3.5 Create `admin-dashboard-web/src/lib/extraction-client.ts`: typed client instance
- 3.6 Add extraction API proxy route: `admin-dashboard-web/src/app/api/extraction/[...path]/route.ts`
- 3.7 Create Python extraction client in `backend/src/clients/extraction_client.py`:
  - HTTP client to extraction-service (port 4005)
  - Methods: `extract_transcript(text)`, `extract_batch(texts)`
- 3.8 Add post-transcription extraction to LysnrAI backend:
  - New endpoint: `POST /api/transcripts/{id}/extract`
  - Calls extraction-service with `transcript-extraction` task
  - Stores results alongside transcript
- 3.9 Add extraction results display to admin dashboard (transcript detail page)
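One possible shape for the Python client in 3.7, using only the standard library. The request body fields (`taskId`, `text`, `texts`) and the Bearer token header are assumptions, since the roadmap does not fix the wire format. The transport is injectable so the client can be tested without a running service.

```python
import json
import urllib.request
import uuid
from typing import Callable, Optional

# (url, headers, body, timeout) -> decoded JSON response
Transport = Callable[[str, dict, bytes, float], dict]


def http_post_json(url: str, headers: dict, body: bytes, timeout: float) -> dict:
    """Default transport: POST the body and decode the JSON response."""
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


class ExtractionClient:
    """Thin client for extraction-service (port 4005 by default)."""

    def __init__(
        self,
        base_url: str = "http://localhost:4005",
        token: str = "",
        timeout: float = 120.0,
        transport: Optional[Transport] = None,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self.token = token
        self.timeout = timeout
        self.transport = transport or http_post_json

    def _post(self, path: str, payload: dict) -> dict:
        headers = {
            "Content-Type": "application/json",
            "x-request-id": str(uuid.uuid4()),  # forwarded for request tracing
        }
        if self.token:
            headers["Authorization"] = f"Bearer {self.token}"  # assumed auth scheme
        body = json.dumps(payload).encode("utf-8")
        return self.transport(self.base_url + path, headers, body, self.timeout)

    def extract_transcript(self, text: str) -> dict:
        # Payload field names are hypothetical.
        return self._post(
            "/api/extract", {"taskId": "transcript-extraction", "text": text}
        )

    def extract_batch(self, texts: list[str]) -> dict:
        return self._post(
            "/api/extract/batch",
            {"taskId": "transcript-extraction", "texts": list(texts)},
        )
```

In the backend, `base_url` would come from `EXTRACTION_SERVICE_URL`, and a failed call would fall through to the existing raw-transcript behavior described under Rollback Strategy.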
MindLyst integration
- 3.10 Add `@bytelyst/extraction` to `mindlyst-native/web/package.json` (via `file:` ref):

      "@bytelyst/extraction": "file:../../../learning_ai_common_plat/packages/extraction"
- 3.11 Create `mindlyst-native/web/src/lib/extraction-client.ts`
- 3.12 Create API route: `mindlyst-native/web/src/pages/api/extract.ts`
  - Accepts raw capture text, calls extraction-service with `triage` task
  - Returns brain routing + extracted entities
- 3.13 Update triage flow on web dashboard to use extraction results for brain auto-routing
- 3.14 Wire brain insight generation to use `memory-insight` task
- 3.15 Wire reflection enrichment to use `reflection-enrichment` task
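To make the brain auto-routing in 3.12 and 3.13 concrete, here is a small sketch that picks a brain from the `brain_signal` extractions returned by the `triage` task. The minimum confidence of 0.5 and the fallback to `global` are assumptions, not decisions recorded in this roadmap. The real route would live in the TypeScript API handler; Python is used here only to show the selection rule.

```python
VALID_BRAINS = {"work", "home", "money", "health", "global"}


def route_brain(
    extractions: list[dict],
    min_confidence: float = 0.5,   # assumed threshold
    fallback: str = "global",      # assumed fallback brain
) -> str:
    """Return the brain named by the most confident brain_signal extraction."""
    candidates = []
    for item in extractions:
        if item.get("extraction_class") != "brain_signal":
            continue
        attributes = item.get("attributes") or {}
        brain = attributes.get("brain")
        try:
            confidence = float(attributes.get("confidence", 0.0))
        except (TypeError, ValueError):
            continue  # skip malformed confidence values
        if brain in VALID_BRAINS and confidence >= min_confidence:
            candidates.append((confidence, brain))
    if not candidates:
        return fallback
    return max(candidates)[1]
```

Ignoring unknown brain names and malformed confidences keeps a noisy model response from routing a capture to a brain that does not exist.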
Tests
- 3.16 Add integration tests for LysnrAI extraction endpoint
- 3.17 Add integration tests for MindLyst triage-via-extraction flow
- 3.18 Verify: `npx tsc --noEmit` passes in all 3 dashboards + MindLyst web
Phase 4 — Docker & DevOps
Goal: Containerize, add to docker-compose, update run scripts.
Dockerfile
- 4.1 Create multi-stage `Dockerfile` for extraction-service:
  - Stage 1: Node.js build (Fastify TS → JS)
  - Stage 2: Python setup (install langextract + deps)
  - Stage 3: Runtime (Node.js + Python, supervisord to run both processes)
- 4.2 Create `supervisord.conf` to manage Fastify (port 4005) + Python sidecar (port 4006)
- 4.3 Verify: `docker build` succeeds
Docker Compose
- 4.4 Add `extraction-service` to `docker-compose.yml`:

      extraction-service:
        build:
          context: .
          dockerfile: services/extraction-service/Dockerfile
        ports:
          - '4005:4005'
        env_file:
          - .env
        environment:
          - PORT=4005
          - PYTHON_SIDECAR_URL=http://localhost:4006
        labels:
          - 'traefik.enable=true'
          - 'traefik.http.routers.extraction.rule=PathPrefix(`/api/extract`) || PathPrefix(`/api/tasks`)'
          - 'traefik.http.services.extraction.loadbalancer.server.port=4005'
        logging:
          driver: loki
          options:
            loki-url: 'http://host.docker.internal:3100/loki/api/v1/push'
            loki-retries: '3'
        restart: unless-stopped
        healthcheck:
          test: ['CMD', 'wget', '--no-verbose', '--tries=1', '--spider', 'http://localhost:4005/health']
          interval: 30s
          timeout: 10s
          retries: 3
- 4.5 Add to LysnrAI `docker-compose.yml` (references `../learning_ai_common_plat/services/extraction-service/`)
Run scripts
- 4.6 Add extraction-service to `run-local-all-services.sh` in LysnrAI repo
- 4.7 Add extraction-service to `.windsurf/workflows/start-all-services.md`
- 4.8 Add `.env.example` entries to LysnrAI repo root (`EXTRACTION_SERVICE_URL=http://localhost:4005`)
- 4.9 Add `.env.example` entries to MindLyst web (same)
CI
- 4.10 Create `.github/workflows/ci-extraction-service.yml`:
  - Trigger: push to `services/extraction-service/**` or `packages/extraction/**`
  - Steps: pnpm install, pnpm build, pnpm test (TS), pip install + pytest (Python)
- 4.11 Verify: CI workflow passes
Phase 5 — Production Hardening
Goal: Rate limiting, caching, observability, cost controls.
Caching
- 5.1 Add result caching in Python sidecar:
  - Cache key: hash(task_id + input_text + model_id)
  - TTL: configurable (default 24h)
  - Storage: in-memory LRU (dev) or Redis (prod)
- 5.2 Add cache hit/miss headers to Fastify response (`X-Extraction-Cache: HIT/MISS`)
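As an illustration of 5.1, the sketch below builds the cache key and an in-memory LRU cache with a TTL (the dev storage option). SHA-256 and the length-prefixed encoding are choices made here, not requirements from this roadmap; `cache_key` and `TTLCache` are hypothetical names.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


def cache_key(task_id: str, input_text: str, model_id: str) -> str:
    """Hash the three key parts; length prefixes keep ("ab", "c") != ("a", "bc")."""
    digest = hashlib.sha256()
    for part in (task_id, input_text, model_id):
        data = part.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


class TTLCache:
    """In-memory LRU cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        max_entries: int = 1024,
        ttl_seconds: float = 86400.0,  # 24h default from item 5.1
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._entries: OrderedDict = OrderedDict()
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None  # miss
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries count as misses
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

A `get` that returns `None` corresponds to `X-Extraction-Cache: MISS`; any other value corresponds to `HIT`. Plain string concatenation of the key parts would let different inputs collide, which is why each part is length-prefixed before hashing.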
Cost controls
- 5.3 Add per-user daily extraction quota (configurable per plan tier):
  - Free: 10 extractions/day
  - Pro: 100 extractions/day
  - Enterprise: unlimited
- 5.4 Track usage in Cosmos `extraction_usage` container (partition: `/userId`)
- 5.5 Return `429 Too Many Requests` with quota info when exceeded
- 5.6 Add usage reporting endpoint: `GET /api/extract/usage` (admin)
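The quota rule in 5.3 to 5.5 can be sketched as follows. An in-memory dict stands in for the Cosmos `extraction_usage` container, and `DailyQuota` and `QuotaExceeded` are hypothetical names; the tier limits are the ones listed above.

```python
from datetime import date
from typing import Optional

# Daily limits per plan tier; None means unlimited.
DAILY_LIMITS: dict[str, Optional[int]] = {"free": 10, "pro": 100, "enterprise": None}


class QuotaExceeded(Exception):
    """Carries the quota info a 429 response would report."""

    def __init__(self, limit: int, used: int) -> None:
        super().__init__(f"daily quota exceeded: {used}/{limit}")
        self.limit = limit
        self.used = used


class DailyQuota:
    """Counts extractions per (user, day); stand-in for persistent storage."""

    def __init__(self) -> None:
        self._usage: dict[tuple[str, date], int] = {}

    def consume(self, user_id: str, plan: str, today: date) -> Optional[int]:
        """Record one extraction and return the remaining quota (None = unlimited)."""
        limit = DAILY_LIMITS[plan]
        key = (user_id, today)
        used = self._usage.get(key, 0)
        if limit is not None and used >= limit:
            raise QuotaExceeded(limit, used)
        self._usage[key] = used + 1
        return None if limit is None else limit - used - 1
```

Keying usage on the calendar date means the counter resets implicitly at the day boundary, with no scheduled cleanup needed for correctness.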
Observability
- 5.7 Add structured logging for every extraction:
  - Request: task_id, input_length, model_id, user_id, product_id
  - Response: entity_count, duration_ms, token_count, cache_hit
- 5.8 Add Prometheus metrics (via `fastify-metrics`):
  - `extraction_requests_total` (labels: task_id, model_id, product_id, status)
  - `extraction_duration_seconds` (histogram)
  - `extraction_entities_extracted` (histogram)
  - `extraction_cache_hit_total`
- 5.9 Add Grafana dashboard for extraction service (in `services/monitoring/grafana/dashboards/`)
Error handling
- 5.10 Map LangExtract errors to `@bytelyst/errors`:
  - Model timeout → `408 Request Timeout`
  - Rate limit (upstream LLM) → `429 Too Many Requests` with retry-after
  - Invalid task definition → `400 Bad Request`
  - Model unavailable → `503 Service Unavailable`
- 5.11 Add circuit breaker for Python sidecar (fail fast if sidecar is down)
- 5.12 Add graceful degradation: return partial results if some chunks fail
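The circuit breaker in 5.11 would sit in the Fastify bridge, but the state machine is the same in any language. The Python sketch below shows it under assumed settings: open after 3 consecutive failures, allow one trial call after 30 seconds. `CircuitBreaker` and `CircuitOpenError` are hypothetical names.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised without calling the sidecar while the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,    # assumed value
        reset_after: float = 30.0,     # assumed cool-down in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("sidecar circuit is open")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call reopens the circuit immediately.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, callers fail in microseconds instead of waiting out the 120s request timeout, which is the fail-fast behavior 5.11 asks for.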
Phase 6 — Advanced Features (Future)
Goal: Power-user features, visualization, and batch processing.
Visualization
- 6.1 Expose LangExtract's HTML visualization:
  - `GET /api/extract/:requestId/visualization`: returns interactive HTML
  - Embed in admin dashboard for extraction quality review
- 6.2 Store visualization artifacts in Azure Blob Storage (`extractions` container)
Batch & async processing
- 6.3 Add async extraction endpoint:
  - `POST /api/extract/async`: returns job ID immediately
  - `GET /api/extract/jobs/:id`: poll for status + results
  - Webhook callback when complete
- 6.4 Add Vertex AI batch processing support (for high-volume MindLyst triage)
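The submit-then-poll flow in 6.3 can be sketched with a thread pool. The status names (`pending`, `completed`, `failed`, `not_found`) and the `JobStore` class are assumptions; the webhook callback and durable job storage are left out.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


class JobStore:
    """Runs extraction jobs in the background and reports their status."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Start the job and return its ID immediately (POST /api/extract/async)."""
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> dict:
        """Report job state (GET /api/extract/jobs/:id)."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "not_found"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "completed", "result": future.result()}

    def close(self) -> None:
        """Wait for running jobs to finish and release the worker threads."""
        self._pool.shutdown(wait=True)
```

An in-process store like this loses jobs on restart, so a production version would need to persist job state outside the process.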
Custom model support
- 6.5 Add Ollama provider for local/air-gapped deployments
- 6.6 Add model benchmarking endpoint: run same task across models, compare quality + cost
Multi-language extraction
- 6.7 Test and validate extraction across languages (LangExtract supports multi-language via LLM)
- 6.8 Add language detection to extraction pipeline (auto-detect input language)
Env Vars Summary
| Variable | Service | Default | Description |
|---|---|---|---|
| `PORT` | extraction-service | `4005` | Fastify listen port |
| `HOST` | extraction-service | `0.0.0.0` | Fastify listen host |
| `CORS_ORIGIN` | extraction-service | `*` | Allowed origins |
| `PYTHON_SIDECAR_URL` | extraction-service | `http://localhost:4006` | Python sidecar URL |
| `DEFAULT_MODEL_ID` | extraction-service | `gemini-2.5-flash` | Default LLM model |
| `GEMINI_API_KEY` | python sidecar | (none) | Google Gemini API key |
| `AZURE_OPENAI_API_KEY` | python sidecar | (none) | Azure OpenAI key (alternative) |
| `AZURE_OPENAI_ENDPOINT` | python sidecar | (none) | Azure OpenAI endpoint (alternative) |
| `MAX_WORKERS` | python sidecar | `10` | Parallel extraction workers |
| `MAX_CHAR_BUFFER` | python sidecar | `2000` | Chunk size for long docs |
| `EXTRACTION_CACHE_TTL` | python sidecar | `86400` | Cache TTL in seconds |
| `COSMOS_ENDPOINT` | extraction-service | (none) | Azure Cosmos DB endpoint |
| `COSMOS_KEY` | extraction-service | (none) | Azure Cosmos DB key |
| `COSMOS_DATABASE` | extraction-service | `lysnrai` | Database name |
| `JWT_SECRET` | extraction-service | (none) | JWT validation secret |
| `EXTRACTION_SERVICE_URL` | consumers | `http://localhost:4005` | Used by dashboards/backends |
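As a rough illustration of how the sidecar might read its variables, the sketch below loads them with the defaults from the table. It uses only the standard library to stay self-contained, whereas the sidecar's requirements list `pydantic-settings` for this job. Passing `DEFAULT_MODEL_ID` through to the sidecar, and rejecting a configuration with no provider credentials, are assumptions (a local Ollama setup would need that check relaxed). `SidecarSettings` and `load_settings` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SidecarSettings:
    default_model_id: str
    max_workers: int
    max_char_buffer: int
    cache_ttl_seconds: int
    gemini_api_key: Optional[str]
    azure_openai_api_key: Optional[str]
    azure_openai_endpoint: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> SidecarSettings:
    """Read sidecar settings from the environment, applying table defaults."""
    source = os.environ if env is None else env
    settings = SidecarSettings(
        default_model_id=source.get("DEFAULT_MODEL_ID", "gemini-2.5-flash"),
        max_workers=int(source.get("MAX_WORKERS", "10")),
        max_char_buffer=int(source.get("MAX_CHAR_BUFFER", "2000")),
        cache_ttl_seconds=int(source.get("EXTRACTION_CACHE_TTL", "86400")),
        gemini_api_key=source.get("GEMINI_API_KEY") or None,
        azure_openai_api_key=source.get("AZURE_OPENAI_API_KEY") or None,
        azure_openai_endpoint=source.get("AZURE_OPENAI_ENDPOINT") or None,
    )
    has_azure = bool(settings.azure_openai_api_key and settings.azure_openai_endpoint)
    # Assumed policy: at least one hosted provider must be configured.
    if not settings.gemini_api_key and not has_azure:
        raise ValueError(
            "set GEMINI_API_KEY, or both AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT"
        )
    return settings
```

Failing at startup on a missing key surfaces misconfiguration before the first extraction request, not as a provider error midway through one.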
Port Allocation
| Service | Port |
|---|---|
| growth-service | 4001 |
| billing-service | 4002 |
| platform-service | 4003 |
| tracker-service | 4004 |
| extraction-service | 4005 |
| extraction-service python sidecar (internal) | 4006 |
Dependency Graph
@bytelyst/extraction (package)
└── @bytelyst/api-client (peer dep)
@lysnrai/extraction-service (service)
├── @bytelyst/fastify-core
├── @bytelyst/auth
├── @bytelyst/config
├── @bytelyst/cosmos
├── @bytelyst/errors
├── fastify, zod, jose (direct deps)
└── python sidecar
└── langextract, fastapi, uvicorn, structlog
Estimated Effort
| Phase | Effort | Dependencies |
|---|---|---|
| Phase 0 — Foundation | 2–3 days | None |
| Phase 1 — Core API | 2–3 days | Phase 0 |
| Phase 2 — Task Library | 2 days | Phase 1 |
| Phase 3 — Consumer Integration | 3–4 days | Phase 2 |
| Phase 4 — Docker & DevOps | 1–2 days | Phase 1 |
| Phase 5 — Production Hardening | 2–3 days | Phase 3 |
| Phase 6 — Advanced (future) | Ongoing | Phase 5 |
Total MVP (Phases 0–4): ~10–14 days
Rollback Strategy
- The extraction-service is additive — no existing code is modified until Phase 3
- Phase 3 consumer integration uses new endpoints/routes — existing triage/transcript flows remain untouched
- If extraction-service is down, consumers fall back to their existing behavior (MindLyst mock triage, LysnrAI raw transcripts)
- The `@bytelyst/extraction` package is optional: dashboards only import it for new extraction features