Domain — extraction-service (MCP + A2A Opportunities)

Why extraction-service is ideal for MCP

It already provides:

  • a single extraction entrypoint with a batch variant (POST /extract, POST /extract/batch)
  • async extraction jobs (/extract/jobs)
  • model registry
  • sidecar health monitoring and circuit breaker
  • rate limits + quotas + cache

Exposing these capabilities as MCP tools lets agents iterate on prompts and tasks safely and repeatably.

High-value MCP tool proposals

Core extraction

  • extraction.extract(text, taskId?, modelId?, productId?)
  • extraction.extractBatch(inputs, modelId?)
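
As a rough illustration of how the extract tool could map onto the existing endpoint, the sketch below builds (but does not send) a POST /extract request. The base URL and the JSON body field names are assumptions that simply mirror the tool parameters; the source does not document the request schema.

```python
import json
from typing import Optional
from urllib.request import Request

# Hypothetical base URL; the source does not give a host or port.
BASE_URL = "http://localhost:8080"


def build_extract_request(
    text: str,
    task_id: Optional[str] = None,
    model_id: Optional[str] = None,
    product_id: Optional[str] = None,
) -> Request:
    """Build, without sending, a POST /extract request for the extract tool."""
    if not text.strip():
        raise ValueError("text must be non-empty")
    # Assumption: body field names mirror the tool parameters.
    body = {"text": text}
    optional = {"taskId": task_id, "modelId": model_id, "productId": product_id}
    # Omit optional parameters the caller did not supply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return Request(
        url=f"{BASE_URL}/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Keeping request construction separate from sending makes the tool handler easy to unit test without a running service.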

Async jobs

  • extraction.submitJob(inputs, modelId?, webhookUrl?)
  • extraction.getJob(jobId)
  • extraction.listJobs()
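
An agent that calls submitJob without a webhookUrl would likely need to poll getJob until the job finishes. The following is a minimal sketch of such a loop. The status values ("completed", "failed") and the shape of the job record are hypothetical, and get_job stands in for whatever client function ends up calling the getJob tool.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status values; the real job schema is not given in the source.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    get_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    poll_interval_s: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_job(job_id) until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = get_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")
```

The sleep function is injectable so the loop can be tested without real delays. A bounded number of polls keeps an agent from waiting indefinitely on a stuck job.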

Observability

  • extraction.sidecarHealth()
  • extraction.metrics()
  • extraction.cacheStats() (wraps GET /extract/cache-stats)
  • extraction.sidecarMonitoringState() (wraps GET /extract/monitoring/sidecar)

Rate limits / admin utilities

  • extraction.getProductRateLimitStatus(productId?)
  • extraction.resetProductRateLimit(productId) (admin)
  • extraction.modelRegistry
  • extraction.taskCatalog
    • list task IDs used across products (triage, reflection-enrichment, memory-insight, etc.)
  • extraction.promptGuidelines

A2A opportunities

1) Task design loop

  • TaskDesignerAgent drafts:
    • task prompt
    • a small set of examples
  • EvalRunnerAgent:
    • runs extractBatch over an eval set
    • checks each output for JSON shape correctness
  • RegressionAgent checks:
    • no degradation vs previous baseline
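
The evaluation and regression steps above can be sketched as a shape check plus a pass-rate comparison. This is an illustrative sketch only: the required-keys format, the function names, and the tolerance parameter are assumptions, and a real setup might validate against a JSON Schema instead.

```python
from typing import Any, Dict, List


def has_shape(output: Any, required: Dict[str, type]) -> bool:
    """True if output is a dict where every required key holds the expected type."""
    return isinstance(output, dict) and all(
        key in output and isinstance(output[key], expected)
        for key, expected in required.items()
    )


def shape_pass_rate(outputs: List[Any], required: Dict[str, type]) -> float:
    """Fraction of outputs with the required shape (0.0 for an empty eval set)."""
    if not outputs:
        return 0.0
    return sum(has_shape(o, required) for o in outputs) / len(outputs)


def is_regression(current: float, baseline: float, tolerance: float = 0.0) -> bool:
    """True if the current pass rate fell below the baseline by more than tolerance."""
    return current < baseline - tolerance
```

For example, EvalRunnerAgent could compute shape_pass_rate over the extractBatch outputs, and RegressionAgent could call is_regression with the stored baseline rate to decide whether a new prompt is acceptable.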

2) Extraction incident response

  • If extraction errors spike:
    • check sidecar health and circuit breaker state
    • reduce per-product rate limits
    • switch modelId (if supported)
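
One way an incident-response agent might turn the checklist above into a decision function is sketched below. The snapshot fields, the 5% error threshold, and the conditions that gate each action are all assumptions made for illustration; the source only lists the steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IncidentSnapshot:
    """Hypothetical summary of what the observability tools would report."""
    error_rate: float
    sidecar_healthy: bool
    circuit_open: bool
    alternate_model_available: bool


def plan_mitigations(s: IncidentSnapshot, error_threshold: float = 0.05) -> List[str]:
    """Return the ordered mitigation steps for the given snapshot."""
    if s.error_rate < error_threshold:
        return []  # no spike, nothing to do
    actions = ["check sidecar health and circuit breaker state"]
    # Assumption: throttle only when the sidecar is degraded or the breaker is open.
    if not s.sidecar_healthy or s.circuit_open:
        actions.append("reduce per-product rate limits")
    # Assumption: switching models is offered only when an alternate model exists.
    if s.alternate_model_available:
        actions.append("switch modelId")
    return actions
```

Returning a plan rather than executing it leaves room for a human or a supervising agent to approve admin actions such as rate limit changes.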

Product integration hotspots

  • MindLyst web API routes proxy to extraction-service (/api/extract and triage routes).
  • Future: other products can standardize on the same tasks and use a shared task registry.