learning_ai_common_plat/docs/CLOUD/CLOUD_PROVIDER_MIGRATION_ANALYSIS.md

36 KiB
Raw Blame History

Cloud Provider Migration Analysis — ByteLyst Ecosystem

Author: AI Analysis (Cascade) Date: 2026-03-01 Scope: All 7 repos — LysnrAI, MindLyst, ChronoMind, NomGap, PeakPulse, Common Platform, JarvisJr Purpose: Evaluate current Azure investment, assess migration feasibility to AWS / GCP / MongoDB Atlas / multi-cloud, and provide actionable recommendations.


Table of Contents

  1. Executive Summary
  2. Current Azure Investment Inventory
  3. Dependency Depth Analysis
  4. Migration Target Comparison
  5. Per-Service Migration Analysis
  6. Migration Scenario Scoring
  7. Cost Comparison
  8. Abstraction Layer Assessment
  9. Risk Analysis
  10. Recommendations
  11. Migration Playbook (If Chosen)
  12. Appendix A: File-Level Azure Dependency Map
  13. Appendix B: SDK & Package Inventory

1. Executive Summary

The ByteLyst ecosystem is moderately coupled to Azure. The coupling is concentrated in 3 packages (@bytelyst/cosmos, @bytelyst/blob, @bytelyst/config) and 2 Python modules (azure_stt.py, cosmos_client.py). The architecture already uses an internal abstraction layer — most application code never imports Azure SDKs directly.

Key Findings

Dimension Assessment
Overall Azure lock-in Medium — concentrated in ~15 files, but those files are foundational
Easiest to migrate Blob Storage, Key Vault, OpenAI, Application Insights
Hardest to migrate Cosmos DB (SQL API queries in 56+ repository files), Azure Speech SDK
Best alternative DB MongoDB Atlas (closest query model to Cosmos SQL API)
Best alternative cloud AWS (broadest service parity, mature SDK ecosystem)
Estimated migration effort 48 weeks for full cloud swap (Cosmos DB is the long pole)
Recommendation Stay on Azure for now, but invest in abstraction layers to reduce future switching cost

Azure Services Used (8 total)

# Azure Service Monthly Cost Lock-in Risk Files Affected
1 Cosmos DB (SQL/NoSQL API) ~$410 HIGH 56+ repository files, 3 databases, ~45 containers
2 Blob Storage ~$0.20 LOW 2 packages + 1 Python module
3 Azure OpenAI ~$510 LOW 3 files (already supports OpenAI fallback)
4 Speech Services $0 (F0) HIGH 2 files (deep SDK integration, streaming)
5 Key Vault ~$0.06 LOW 2 files (1 TS, 1 Python)
6 Notification Hubs $0 (Free) MEDIUM Planned, not yet deeply integrated
7 Application Insights $0 (5GB free) LOW 1 file (custom telemetry already built)
8 Azure Identity (DefaultAzureCredential) $0 LOW Used by Key Vault + Secrets Manager

2. Current Azure Investment Inventory

2.1 Azure Resources (from Azure Portal)

Resource Azure Name Region SKU Status
Resource Group rg-mywisprai East US Active
Cosmos DB cosmos-mywisprai West US 2 Serverless Active — 3 DBs, ~45 containers
Blob Storage bytelystblobs West US 2 StorageV2, RAGRS Active — 9+ containers
Azure OpenAI mywisprai-openai-sweden Sweden Central S0 Active — gpt-4o-mini deployment
Speech Service mywisprai-speech East US F0 (Free) Active
Key Vault kv-mywisprai East US Standard Active — ~25 secrets
Notification Hubs lysnnai namespace East US Free Active — 2 hubs
App Insights bytelyst-appinsights East US Classic Active

2.2 Cosmos DB Databases & Containers

Database Containers Products Using
lysnrai ~27 containers (users, subscriptions, feature_flags, audit_log, tracker_items, telemetry_events, etc.) LysnrAI, platform-service (all products)
mindlyst ~20 containers (brains, memory_items, streaks, reflections, etc.) MindLyst
mywisprai 10 containers (legacy, pre-rebrand) Legacy / migration target

Total: ~57 containers across 3 databases, all using Cosmos SQL (NoSQL) API with SQL-like queries (SELECT, WHERE, ORDER BY, OFFSET/LIMIT, aggregate functions).

2.3 Code Investment by Language

Language Azure SDK Packages Files Using Azure Lines of Azure-Specific Code
TypeScript @azure/cosmos, @azure/storage-blob, @azure/identity, @azure/keyvault-secrets ~65 files ~500 lines
Python azure-cognitiveservices-speech, azure-cosmos, azure-storage-blob, azure-identity, azure-keyvault-secrets, openai (AzureOpenAI) ~8 files ~400 lines
Swift MicrosoftCognitiveServicesSpeech (SPX framework) ~3 files ~150 lines
Kotlin None directly (uses platform-service REST API) 0 files 0 lines

3. Dependency Depth Analysis

3.1 Cosmos DB — DEEP (56+ files)

This is the most deeply embedded Azure dependency. Every repository module follows the pattern:

types.ts → repository.ts → routes.ts
                ↑
         Uses @azure/cosmos SDK
         SQL queries: SELECT c.id, c.name FROM c WHERE c.productId = @pid

Touchpoints:

  • packages/cosmos/ — shared client singleton (@azure/cosmos peer dep)
  • services/platform-service/src/modules/*/repository.ts56 repository files with Cosmos SQL queries
  • services/extraction-service/src/modules/*/repository.ts — 2 repository files
  • dashboards/admin-web/src/lib/cosmos.ts — direct @azure/cosmos import
  • dashboards/admin-web/src/lib/repositories/*.ts — 4 repository files
  • mindlyst-native/web/src/lib/cosmos.ts — direct @azure/cosmos import
  • learning_voice_ai_agent/src/cloud/cosmos_client.py — Python Cosmos client
  • learning_voice_ai_agent/backend/src/cloud/cosmos.py — Python backend Cosmos client

Query patterns used:

  • container.items.query() with parameterized SQL
  • container.items.create(), .replace(), .delete(), .read()
  • container.items.upsert()
  • Partition key routing (/userId, /productId, /id)
  • Cross-partition queries (admin/analytics)
  • SELECT VALUE COUNT(1) aggregates
  • OFFSET ... LIMIT pagination
  • ORDER BY sorting
  • ARRAY_CONTAINS() for array queries

3.2 Azure Speech SDK — DEEP (3 files, streaming integration)

The Speech SDK is used for real-time streaming speech-to-text with features that are tightly coupled to the Azure SDK's event-driven architecture:

  • src/audio/azure_stt.py — 248 lines. Uses PushAudioInputStream, SpeechRecognizer, continuous recognition with recognizing/recognized/canceled/session_stopped event callbacks, PhraseListGrammar, auto-language detection (10 languages), auto-reconnect
  • src/ui/settings.py + src/ui/unified_window.py — connection testing
  • mindlyst-native/iosApp/Services/AzureSpeechTranscriber.swift — iOS Swift SPX framework
  • mobile_app/ios/LysnrAI/ — iOS keyboard extension uses SPX framework

3.3 Blob Storage — SHALLOW (3 files)

  • packages/blob/src/blob.ts — 162 lines, singleton client, SAS URL generation
  • src/cloud/blob_client.py — 190 lines, Python equivalent
  • services/platform-service/src/modules/blob/ — REST API wrapper

3.4 Azure OpenAI — SHALLOW (3 files, already abstracted)

  • src/llm/text_cleaner.py — uses openai.AzureOpenAI (OpenAI SDK with Azure endpoint)
  • backend/src/clients/openai_client.py — uses openai.AsyncAzureOpenAI
  • mindlyst-native/web/src/lib/llm.tsalready has OpenAI fallback (resolves provider dynamically)

The openai Python/JS SDK supports both Azure and OpenAI endpoints with minimal config change. MindLyst web already handles this automatically.

3.5 Key Vault — SHALLOW (2 files)

  • packages/config/src/keyvault.ts — 90 lines, resolveKeyVaultSecrets() with graceful fallback
  • src/secrets/keyvault.py — 69 lines, SecretResolver class with env var fallback

Both implementations already fall back to environment variables when Key Vault is unavailable. Migration = just stop using Key Vault and use the env var path.

3.6 Notification Hubs — NOT YET INTEGRATED

Planned but not deeply wired. Only namespace/hub exists in Azure. Mobile apps use BLPlatformClient (REST) to talk to platform-service, which would route push notifications.

3.7 Application Insights — SHALLOW (1 file)

  • opencensus-ext-azure in Python requirements (optional telemetry)
  • Custom telemetry system already built (@bytelyst/telemetry-client, platform-service telemetry module with Cosmos storage)

The custom telemetry system means App Insights is supplementary, not critical.


4. Migration Target Comparison

4.1 Database: Cosmos DB → Alternatives

Feature Azure Cosmos DB (current) MongoDB Atlas AWS DynamoDB Google Firestore PostgreSQL (Supabase/Neon)
Data model Document (JSON) Document (JSON) Key-Value + Document Document (JSON) Relational + JSONB
Query language SQL-like MQL (MongoDB Query) PartiQL / API GQL-like API SQL
Partition keys Required Shard keys (optional) Required Collection groups Not applicable
Serverless Yes Yes (Atlas Serverless) Yes Yes Yes (Neon)
SQL queries SELECT c.id FROM c WHERE c.x = @y db.collection.find({x: y}) SELECT id FROM table WHERE x = ? Client SDK queries Standard SQL
Aggregates Basic (COUNT, SUM, AVG) Full ($group, $match, $lookup) Limited Limited Full SQL
Cross-partition Yes (expensive) Yes (scatter-gather) Scan (expensive) Yes N/A
Change feed Yes Change Streams DynamoDB Streams Real-time listeners Logical replication
Global distribution Built-in multi-region Atlas Global Clusters Global Tables Multi-region Manual / Citus
Max doc size 2 MB 16 MB 400 KB 1 MB Unlimited (JSONB)
Free tier 1000 RU/s + 25 GB 512 MB 25 GB + 25 WCU/RCU 1 GiB + 50K reads/day 0.5 GB (Neon)
Migration effort Medium (query rewrite) Hard (paradigm shift) Hard (no SQL) Hard (schema design)

4.2 Object Storage: Blob → Alternatives

Feature Azure Blob (current) AWS S3 GCP Cloud Storage Cloudflare R2 MinIO (self-hosted)
API compatibility Azure Blob API S3 API GCS API / S3-compat S3-compatible S3-compatible
SAS tokens Yes (Azure SAS) Pre-signed URLs Signed URLs Pre-signed URLs Pre-signed URLs
CDN integration Azure CDN CloudFront Cloud CDN Built-in Manual
Cost (per GB) $0.018 (Cool) $0.023 (Standard) $0.020 $0.015 (no egress) Self-hosted
Migration effort Easy Easy Easy Easy

4.3 Speech-to-Text: Azure Speech → Alternatives

Feature Azure Speech (current) AWS Transcribe Google Speech-to-Text Deepgram Whisper (local)
Streaming STT Yes (push stream) Yes (WebSocket) Yes (streaming) Yes (WebSocket) No (batch only)
Languages 100+ 100+ 125+ 36+ 99+
Auto-detect lang Up to 10 at-once Yes Yes Yes Yes
Custom vocabulary PhraseListGrammar Custom vocabulary Speech adaptation Keywords No
Native SDK Python, Swift (SPX), JS Python, no iOS SDK Python, iOS, JS REST/WebSocket Python only
iOS native SDK SPX framework (ObjC) No native SDK Yes (gRPC) No native SDK No
Free tier 5 hrs/month (F0) 60 min/month 60 min/month None Free (local GPU)
Latency ~200ms ~300ms ~200ms ~100ms ~500ms+ (local)
Migration effort Hard (no iOS SDK) Medium (has iOS SDK) Medium (REST only) Hard (no streaming)

4.4 LLM / AI: Azure OpenAI → Alternatives

Feature Azure OpenAI (current) OpenAI API (direct) Google Gemini AWS Bedrock Anthropic Claude
Models GPT-4o, GPT-4o-mini Same models Gemini 2.5 Claude, Llama, Titan Claude 3.5/4
API compatibility OpenAI SDK (azure mode) OpenAI SDK (native) Google SDK AWS SDK Anthropic SDK
Data residency Azure regions US only Google regions AWS regions US/EU
Cost (GPT-4o-mini) $0.15/$0.60 per M tokens $0.15/$0.60 per M tokens ~$0.10/$0.40 (Flash) Varies ~$0.25/$1.25 (Haiku)
Migration effort Trivial (change endpoint) Easy (SDK swap) Medium Easy (SDK swap)

4.5 Secrets Management: Key Vault → Alternatives

Feature Azure Key Vault (current) AWS Secrets Manager GCP Secret Manager HashiCorp Vault Doppler / Infisical
Cost $0.03/10K ops $0.40/secret/month $0.06/10K ops Free (OSS) Free tier
SDK @azure/keyvault-secrets @aws-sdk/client-secrets-manager @google-cloud/secret-manager HTTP API SDK / CLI
Migration effort Easy Easy Medium Easy

Note: The codebase already falls back to env vars when Key Vault is unavailable. This means Key Vault can be replaced by any secrets manager or simply .env files without code changes to application logic.

4.6 Push Notifications: Notification Hubs → Alternatives

Feature Azure NH (current) AWS SNS Firebase Cloud Messaging OneSignal Expo Push
APNs + FCM Yes Yes FCM only (APNs via FCM) Yes Yes
Free tier 1M pushes/month 1M publishes Unlimited 10K subscribers Unlimited
Migration effort Easy Easy Easy Easy (NomGap uses Expo)

5. Per-Service Migration Analysis

5.1 Cosmos DB → MongoDB Atlas

Difficulty: MEDIUM-HIGH | Effort: 35 weeks | Risk: MEDIUM

This is the single largest migration task. Here's why:

What needs to change

Layer Current (Cosmos SQL API) Target (MongoDB) Files
Client package @azure/cosmosCosmosClient mongodbMongoClient packages/cosmos/src/client.ts
Container registry getContainer(name) db.collection(name) packages/cosmos/src/containers.ts
All repository files container.items.query('SELECT...') collection.find({...}) 56+ files in platform-service
Dashboard Cosmos clients @azure/cosmos direct mongodb direct 2 files (admin, MindLyst)
Python clients azure.cosmos.CosmosClient pymongo.MongoClient 2 files
Query syntax SQL-like (SELECT c.id FROM c WHERE c.productId = @pid AND c.userId = @uid ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20) MQL (collection.find({productId: pid, userId: uid}).sort({createdAt: -1}).skip(0).limit(20)) All repository files
Partition keys Explicit partition key in every query Shard key (auto-routed) All repository files
Upsert container.items.upsert(doc) collection.updateOne({_id: id}, {$set: doc}, {upsert: true}) ~20 files
Read by ID container.item(id, partitionKey).read() collection.findOne({_id: id}) All repository files

What stays the same

  • Document structure (JSON documents with id, productId, partition keys)
  • Data model (no schema changes needed — MongoDB is also schemaless)
  • Partition key concept maps to shard key
  • Serverless pricing model available on both

Key migration steps

  1. Update @bytelyst/cosmos package to export MongoDB-compatible API
  2. Rewrite all SQL queries to MQL (56+ files)
  3. Replace container.items.query()collection.find()
  4. Replace container.item(id, pk).read()collection.findOne({_id: id})
  5. Replace container.items.create()collection.insertOne()
  6. Replace container.items.replace()collection.replaceOne()
  7. Replace container.items.upsert()collection.updateOne({upsert: true})
  8. Update Python clients similarly
  9. Migrate data (use Azure Data Factory or custom script)
  10. Update all test mocks

Why MongoDB Atlas is the best DB alternative

  • Closest query model to Cosmos SQL API (both are document DBs)
  • MongoDB has a Cosmos DB compatibility mode (but going native is better)
  • Cosmos DB was originally inspired by MongoDB's document model
  • MongoDB's find() queries map closely to Cosmos SQL SELECT queries
  • Both support partition/shard keys, TTL indexes, change streams
  • MongoDB Atlas Serverless pricing is competitive
  • MongoDB has excellent TypeScript and Python SDKs

5.2 Azure Speech → Google Cloud Speech-to-Text

Difficulty: HIGH | Effort: 23 weeks | Risk: HIGH

Why this is hard

  • The Azure Speech SDK uses a push-stream architecture (PushAudioInputStream) that is deeply integrated into the audio pipeline
  • The SpeechRecognizer has event-driven callbacks (recognizing, recognized, canceled, session_stopped) that the code relies on for real-time partial/final transcript delivery
  • Custom vocabulary via PhraseListGrammar is Azure-specific
  • Auto-language detection config is Azure-specific
  • The iOS SPX framework (Objective-C) is used in LysnrAI keyboard extension and MindLyst — there's no direct equivalent for most alternatives

Best alternative: Google Cloud Speech-to-Text

  • Has streaming recognition with similar event model
  • Has an iOS SDK (gRPC-based)
  • Supports custom vocabulary (speech adaptation)
  • Supports auto-language detection
  • Similar pricing and free tier

What needs to change

  • src/audio/azure_stt.py — complete rewrite (~248 lines)
  • iosApp/Services/AzureSpeechTranscriber.swift — complete rewrite
  • LysnrAI/LysnrKeyboard/ — keyboard extension STT integration
  • Audio format handling (may differ between providers)
  • Connection test code in settings UI

5.3 Blob Storage → AWS S3 or Cloudflare R2

Difficulty: LOW | Effort: 23 days | Risk: LOW

Why this is easy

  • @bytelyst/blob package is a thin wrapper (162 lines)
  • Only 3 files need changes
  • S3 API is the de facto standard — R2, MinIO, GCS all support S3-compatible API
  • SAS tokens → Pre-signed URLs (same concept, different implementation)

What needs to change

  • packages/blob/src/blob.ts — swap @azure/storage-blob@aws-sdk/client-s3 + @aws-sdk/s3-request-presigner
  • src/cloud/blob_client.py — swap azure.storage.blobboto3
  • services/platform-service/src/modules/blob/ — update routes for pre-signed URL format
  • Environment variables: AZURE_BLOB_*AWS_S3_* or S3_*

5.4 Azure OpenAI → OpenAI API (direct) or Gemini

Difficulty: TRIVIAL | Effort: < 1 day | Risk: VERY LOW

Why this is trivial

  • The openai Python SDK supports both Azure and OpenAI endpoints — just change config
  • MindLyst web llm.ts already auto-detects Azure vs OpenAI and builds the correct URL
  • LysnrAI desktop uses AzureOpenAI class from openai SDK — switch to OpenAI class
  • Same models, same API shape, same pricing

What needs to change

  • Set OPENAI_API_KEY instead of AZURE_OPENAI_* env vars
  • Change AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)OpenAI(api_key=...)
  • Change AsyncAzureOpenAI(...)AsyncOpenAI(...)
  • Remove api_version parameter
  • That's it. The openai SDK handles the rest.

5.5 Key Vault → Environment Variables / Any Secrets Manager

Difficulty: TRIVIAL | Effort: < 1 day | Risk: VERY LOW

Both keyvault.ts and keyvault.py already implement graceful fallback:

  • If AZURE_KEYVAULT_URL is not set → uses env vars directly
  • If Key Vault is unreachable → falls back to env vars

To migrate: Simply stop setting AZURE_KEYVAULT_URL. Everything works via env vars. Then optionally adopt any other secrets manager (AWS Secrets Manager, Doppler, Infisical, etc.).

5.6 Notification Hubs → Firebase Cloud Messaging

Difficulty: LOW | Effort: 12 days | Risk: LOW

Not yet deeply integrated. The platform-service notification module sends via REST API. Swap the push provider client.

5.7 Application Insights → Self-hosted / Grafana

Difficulty: TRIVIAL | Effort: Already done | Risk: NONE

The ecosystem already has:

  • Custom telemetry system (@bytelyst/telemetry-client → platform-service → Cosmos)
  • Loki + Grafana in services/monitoring/
  • App Insights is supplementary, can be dropped with zero code changes

6. Migration Scenario Scoring

Scenario A: Stay on Azure (Status Quo)

Dimension Score (1-5) Notes
Migration effort 5 (none) No work needed
Cost 4 ~$15/month at current scale, competitive
Vendor diversity 1 Single cloud vendor
Feature parity 5 Everything works today
Total 15/20

Scenario B: Full Migration to AWS

Dimension Score (1-5) Notes
Migration effort 2 68 weeks, Cosmos→DynamoDB is painful
Cost 3 Similar or slightly higher at small scale
Vendor diversity 1 Still single cloud, just different
Feature parity 3 No native iOS Speech SDK, DynamoDB query model is very different
Total 9/20

Scenario C: Multi-Cloud (MongoDB Atlas + OpenAI + R2 + Google STT)

Dimension Score (1-5) Notes
Migration effort 2 57 weeks, Cosmos→MongoDB is medium
Cost 4 MongoDB Atlas free tier, R2 no egress fees
Vendor diversity 5 No single-vendor dependency
Feature parity 4 MongoDB is a better document DB than Cosmos in many ways
Total 15/20

Scenario D: Stay Azure + Add Abstraction Layers

Dimension Score (1-5) Notes
Migration effort 4 12 weeks to add repository interface pattern
Cost 4 No change
Vendor diversity 3 Ready to switch, but still on Azure
Feature parity 5 Everything works today
Total 16/20 Winner

Scenario E: Migrate DB Only (Cosmos → MongoDB Atlas, keep rest on Azure)

Dimension Score (1-5) Notes
Migration effort 3 35 weeks for DB migration
Cost 4 MongoDB Atlas Serverless may be cheaper
Vendor diversity 3 DB is independent, other services still Azure
Feature parity 5 MongoDB is very capable
Total 15/20

7. Cost Comparison

Current Azure Costs (MVP / Low Usage)

Service Monthly Cost Notes
Cosmos DB (Serverless) ~$410 3 databases, ~45 containers
Blob Storage (Cool, RAGRS) ~$0.20 9+ containers
Azure OpenAI (GPT-4o-mini) ~$510 Pay per token
Speech (F0) $0 5 hrs/month free
Key Vault ~$0.06 ~25 secrets
Notification Hubs (Free) $0 1M pushes/month
App Insights $0 5 GB/month free
Total ~$1020/month

Equivalent AWS Costs

Service AWS Equivalent Monthly Cost
Cosmos DB → DynamoDB (On-Demand) DynamoDB ~$515
Blob → S3 Standard S3 ~$0.25
Azure OpenAI → OpenAI API Same pricing ~$510
Speech → Transcribe Transcribe ~$13
Key Vault → Secrets Manager Secrets Manager ~$10 (per-secret pricing)
Notification Hubs → SNS SNS ~$0.50
App Insights → CloudWatch CloudWatch ~$3
Total ~$2542/month

Equivalent Multi-Cloud Costs

Service Provider Monthly Cost
Cosmos DB → MongoDB Atlas Serverless MongoDB ~$38
Blob → Cloudflare R2 Cloudflare ~$0.15 (no egress)
Azure OpenAI → OpenAI API (direct) OpenAI ~$510
Speech → Google STT Google Cloud ~$13
Key Vault → Doppler (free tier) Doppler $0
Push → Firebase FCM Google $0
Monitoring → Grafana Cloud (free) Grafana $0
Total ~$1022/month

Cost Summary

Scenario Monthly Cost vs Current
Azure (current) ~$1020 Baseline
Full AWS ~$2542 +50110%
Multi-cloud ~$1022 ~Same
MongoDB Atlas + Azure rest ~$1018 ~Same

Verdict: At current scale, cost is not a compelling reason to migrate. All options are under $50/month. Cost becomes more significant at scale (10K+ users), where MongoDB Atlas and R2 would likely be cheaper due to no egress fees and better serverless pricing.


8. Abstraction Layer Assessment

Current State: Partially Abstracted

The codebase already has meaningful abstraction:

Layer Abstraction Level Notes
Cosmos DB Partial@bytelyst/cosmos package Application code still writes raw SQL queries and uses @azure/cosmos types
Blob Storage Good@bytelyst/blob package Thin wrapper, easy to swap internals
OpenAI/LLM Good — MindLyst has provider auto-detection LysnrAI desktop/backend hardcodes AzureOpenAI
Key Vault Excellent — graceful fallback to env vars Already cloud-agnostic in practice
Speech None — raw SDK usage Deep Azure SDK coupling in 3 files
Auth (JWT) Excellent — uses jose library No cloud dependency
Push notifications Good — platform-service abstraction Swap provider client only

What's Missing: Repository Interface Pattern

The biggest gap is that repository files directly use @azure/cosmos types and SQL query syntax. To make the DB layer swappable, you'd need:

// Proposed: packages/cosmos/src/repository.ts
export interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}

This would allow swapping Cosmos → MongoDB → PostgreSQL behind the interface without touching 56+ repository files.

Effort to add: 12 weeks. This is the highest-ROI investment regardless of migration decision.


9. Risk Analysis

9.1 Risks of Staying on Azure

Risk Likelihood Impact Mitigation
Azure pricing increases Low Medium Add abstraction layer for future portability
Azure outage Low High Multi-region already possible (Cosmos global distribution)
Feature stagnation Very Low Low Azure is investing heavily in AI services
Vendor lock-in deepens over time Medium Medium Add abstraction layers proactively

9.2 Risks of Migrating

Risk Likelihood Impact Mitigation
Data loss during migration Low Critical Test migration on staging first, keep Azure as backup
Query performance differences Medium Medium Benchmark before committing
Feature gaps in new provider Medium Medium Prototype critical features first
Wasted engineering time Medium High Only migrate if there's a clear business driver
Regression bugs in 56+ repository files High Medium Comprehensive test suite (1,029 tests) catches most issues
Speech quality degradation Medium High A/B test both providers before committing

9.3 Azure-Specific Lock-in Risks (ranked)

# Component Lock-in Level Escape Hatch
1 Cosmos DB SQL API High Rewrite queries to MongoDB MQL or add repository interface
2 Azure Speech SDK (streaming) High Google STT has comparable streaming API
3 Azure Identity (DefaultAzureCredential) Medium Only used by Key Vault, which is already optional
4 Blob Storage SAS tokens Low Pre-signed URLs are equivalent across all providers
5 Azure OpenAI Very Low OpenAI SDK works with both — 1-line config change
6 Key Vault Very Low Already has env var fallback
7 Notification Hubs Very Low Not deeply integrated yet
8 Application Insights None Custom telemetry already built

10. Recommendations

This is the highest-scoring approach. Here's the prioritized action plan:

Phase 1: Add Repository Interface (12 weeks)

  • Create DocumentRepository<T> interface in @bytelyst/cosmos
  • Implement CosmosDocumentRepository<T> that wraps current @azure/cosmos calls
  • Gradually migrate the 56 repository files to use the interface
  • This makes future DB migration a matter of implementing MongoDocumentRepository<T> — no application code changes needed

Phase 2: Normalize LLM Abstraction (23 days)

  • Move LysnrAI desktop/backend from AzureOpenAI → auto-detecting provider pattern (like MindLyst web already does)
  • Support OPENAI_PROVIDER=azure|openai|gemini across all repos
  • This makes LLM provider swappable via config

Phase 3: Speech Abstraction Layer (1 week, optional)

  • Create SpeechTranscriber protocol/interface
  • Implement AzureSpeechTranscriber (current code, extracted)
  • Prepare GoogleSpeechTranscriber stub for future use
  • This is lower priority since Azure Speech F0 tier is free

Phase 4: Document Decision Criteria for Future Migration

  • Define triggers that would justify migration (e.g., cost > $X/month, Azure outage > Y hours, need for feature Z)
  • Review annually

Why NOT Migrate Now

  1. Cost is negligible — ~$1020/month doesn't justify weeks of engineering
  2. No business driver — Azure isn't blocking any feature development
  3. Risk/reward is unfavorable — 48 weeks of migration work for ~$0 cost savings
  4. Test coverage is good but not perfect — 1,029 tests cover most paths, but query-level changes in 56 files still risk regressions
  5. Azure free tiers are generous — Speech F0, Notification Hubs Free, App Insights free tier

When Migration WOULD Make Sense

  • Cosmos DB costs exceed $100/month → Consider MongoDB Atlas Serverless
  • Azure Speech quality is insufficient → Evaluate Google STT or Deepgram
  • Enterprise customer requires specific cloud → Build the repository interface, then implement their cloud backend
  • Azure has extended outage affecting your region → Multi-region or multi-cloud
  • You want to go fully open-source → PostgreSQL (Supabase) + Whisper + MinIO (significant rewrite)

11. Migration Playbook (If Chosen)

If you decide to migrate in the future, here's the execution order (shortest critical path):

Week 12: Database Abstraction

  1. Create DocumentRepository<T> interface
  2. Implement CosmosDocumentRepository<T> (wraps current code)
  3. Migrate all 56 repository files to use interface
  4. Verify all 1,029 tests pass

Week 34: Database Migration (Cosmos → MongoDB)

  1. Implement MongoDocumentRepository<T>
  2. Set up MongoDB Atlas Serverless cluster
  3. Write data migration script (Cosmos → MongoDB)
  4. Run migration on staging, verify data integrity
  5. Switch repository implementation via config flag
  6. Run full test suite against MongoDB

Week 5: Storage + Secrets

  1. Swap @bytelyst/blob internals to S3-compatible client
  2. Migrate blobs (azcopy → aws s3 sync or similar)
  3. Replace Key Vault with new secrets manager (or just env vars)
  4. Update all environment variable names

Week 6: LLM + Speech (if needed)

  1. Switch OpenAI from Azure endpoint to direct (config change only)
  2. If migrating Speech: rewrite azure_stt.py and Swift AzureSpeechTranscriber
  3. A/B test new speech provider against Azure

Week 78: Cleanup + Verification

  1. Remove all @azure/* npm packages
  2. Remove all azure-* pip packages
  3. Update Docker configs, CI/CD
  4. Update documentation
  5. Monitor production for 2 weeks

Appendix A: File-Level Azure Dependency Map

TypeScript — @azure/cosmos (CRITICAL)

File Repo Direct Import
packages/cosmos/src/client.ts common-plat @azure/cosmos
packages/cosmos/src/containers.ts common-plat @azure/cosmos
services/platform-service/src/modules/*/repository.ts (56 files) common-plat Via @bytelyst/cosmos
services/extraction-service/src/modules/*/repository.ts (2 files) common-plat Via @bytelyst/cosmos
dashboards/admin-web/src/lib/cosmos.ts common-plat @azure/cosmos
dashboards/admin-web/src/lib/repositories/*.ts (4 files) common-plat Via cosmos.ts
mindlyst-native/web/src/lib/cosmos.ts MindLyst @azure/cosmos

TypeScript — @azure/storage-blob

File Repo Direct Import
packages/blob/src/blob.ts common-plat @azure/storage-blob

TypeScript — @azure/identity + @azure/keyvault-secrets

File Repo Direct Import
packages/config/src/keyvault.ts common-plat Dynamic import (both)
dashboards/admin-web/src/app/api/ops/secrets/route.ts common-plat Both (Secrets Manager UI)

Python — Azure SDKs

File Repo SDK
src/audio/azure_stt.py LysnrAI azure.cognitiveservices.speech
src/cloud/cosmos_client.py LysnrAI azure.cosmos
src/cloud/blob_client.py LysnrAI azure.storage.blob
src/secrets/keyvault.py LysnrAI azure.identity, azure.keyvault.secrets
backend/src/secrets/keyvault.py LysnrAI azure.identity, azure.keyvault.secrets
backend/src/cloud/cosmos.py LysnrAI azure.cosmos
src/llm/text_cleaner.py LysnrAI openai.AzureOpenAI
backend/src/clients/openai_client.py LysnrAI openai.AsyncAzureOpenAI

Swift — Azure Speech SDK

File Repo SDK
iosApp/Services/AzureSpeechTranscriber.swift MindLyst MicrosoftCognitiveServicesSpeech
LysnrAI/LysnrKeyboard/KeyboardViewController.swift LysnrAI SPX framework (via CocoaPods)

Appendix B: SDK & Package Inventory

npm packages (TypeScript)

Package Version Used By Swappable
@azure/cosmos ≥4.0.0 @bytelyst/cosmos, admin-web, MindLyst web Medium (query rewrite)
@azure/storage-blob ≥12.0.0 @bytelyst/blob Easy (S3 compat)
@azure/identity latest @bytelyst/config, admin-web secrets Easy (remove)
@azure/keyvault-secrets latest @bytelyst/config, admin-web secrets Easy (remove)

pip packages (Python)

Package Version Used By Swappable
azure-cognitiveservices-speech ≥1.42.0 Desktop STT Hard (deep SDK integration)
azure-cosmos latest Desktop + backend Cosmos client Medium (pymongo swap)
azure-storage-blob ≥12.24.0 Desktop blob client Easy (boto3 swap)
azure-identity ≥1.19.0 Key Vault auth Easy (remove)
azure-keyvault-secrets ≥4.9.0 Secrets resolver Easy (remove)
openai ≥1.60.0 AzureOpenAI / AsyncAzureOpenAI Trivial (change class name)
opencensus-ext-azure ≥1.1.0 Optional telemetry Trivial (remove)

Swift packages / CocoaPods

Package Used By Swappable
MicrosoftCognitiveServicesSpeech (SPX) LysnrAI iOS, MindLyst iOS Hard (need alternative streaming STT)

Document generated by automated codebase analysis. Numbers are accurate as of 2026-03-01. Update as the codebase evolves.