Cloud Provider Migration Analysis — ByteLyst Ecosystem
Author: AI Analysis (Cascade)
Date: 2026-03-01
Scope: All 7 repos — LysnrAI, MindLyst, ChronoMind, NomGap, PeakPulse, Common Platform, JarvisJr
Purpose: Evaluate current Azure investment, assess migration feasibility to AWS / GCP / MongoDB Atlas / multi-cloud, and provide actionable recommendations.
Table of Contents
- Executive Summary
- Current Azure Investment Inventory
- Dependency Depth Analysis
- Migration Target Comparison
- Per-Service Migration Analysis
- Migration Scenario Scoring
- Cost Comparison
- Abstraction Layer Assessment
- Risk Analysis
- Recommendations
- Migration Playbook (If Chosen)
- Appendix A: File-Level Azure Dependency Map
- Appendix B: SDK & Package Inventory
1. Executive Summary
The ByteLyst ecosystem is moderately coupled to Azure. The coupling is concentrated in 3 packages (@bytelyst/cosmos, @bytelyst/blob, @bytelyst/config) and 2 Python modules (azure_stt.py, cosmos_client.py). The architecture already uses an internal abstraction layer — most application code never imports Azure SDKs directly.
Key Findings
| Dimension | Assessment |
|---|---|
| Overall Azure lock-in | Medium: concentrated in ~15 files, but those files are foundational |
| Easiest to migrate | Blob Storage, Key Vault, OpenAI, Application Insights |
| Hardest to migrate | Cosmos DB (SQL API queries in 56+ repository files), Azure Speech SDK |
| Best alternative DB | MongoDB Atlas (closest query model to Cosmos SQL API) |
| Best alternative cloud | AWS (broadest service parity, mature SDK ecosystem) |
| Estimated migration effort | 4–8 weeks for full cloud swap (Cosmos DB is the long pole) |
| Recommendation | Stay on Azure for now, but invest in abstraction layers to reduce future switching cost |
Azure Services Used (8 total)
| # | Azure Service | Monthly Cost | Lock-in Risk | Files Affected |
|---|---|---|---|---|
| 1 | Cosmos DB (SQL/NoSQL API) | ~$4–10 | HIGH | 56+ repository files, 3 databases, ~45 containers |
| 2 | Blob Storage | ~$0.20 | LOW | 2 packages + 1 Python module |
| 3 | Azure OpenAI | ~$5–10 | LOW | 3 files (already supports OpenAI fallback) |
| 4 | Speech Services | $0 (F0) | HIGH | 2 files (deep SDK integration, streaming) |
| 5 | Key Vault | ~$0.06 | LOW | 2 files (1 TS, 1 Python) |
| 6 | Notification Hubs | $0 (Free) | MEDIUM | Planned, not yet deeply integrated |
| 7 | Application Insights | $0 (5GB free) | LOW | 1 file (custom telemetry already built) |
| 8 | Azure Identity (DefaultAzureCredential) | $0 | LOW | Used by Key Vault + Secrets Manager |
2. Current Azure Investment Inventory
2.1 Azure Resources (from Azure Portal)
| Resource | Azure Name | Region | SKU | Status |
|---|---|---|---|---|
| Resource Group | rg-mywisprai | East US | n/a | Active |
| Cosmos DB | cosmos-mywisprai | West US 2 | Serverless | Active: 3 DBs, ~45 containers |
| Blob Storage | bytelystblobs | West US 2 | StorageV2, RAGRS | Active: 9+ containers |
| Azure OpenAI | mywisprai-openai-sweden | Sweden Central | S0 | Active: gpt-4o-mini deployment |
| Speech Service | mywisprai-speech | East US | F0 (Free) | Active |
| Key Vault | kv-mywisprai | East US | Standard | Active: ~25 secrets |
| Notification Hubs | lysnnai namespace | East US | Free | Active: 2 hubs |
| App Insights | bytelyst-appinsights | East US | Classic | Active |
2.2 Cosmos DB Databases & Containers
| Database | Containers | Products Using |
|---|---|---|
| lysnrai | ~27 containers (users, subscriptions, feature_flags, audit_log, tracker_items, telemetry_events, etc.) | LysnrAI, platform-service (all products) |
| mindlyst | ~20 containers (brains, memory_items, streaks, reflections, etc.) | MindLyst |
| mywisprai | 10 containers (legacy, pre-rebrand) | Legacy / migration target |
Total: ~57 containers across 3 databases, all using Cosmos SQL (NoSQL) API with SQL-like queries (SELECT, WHERE, ORDER BY, OFFSET/LIMIT, aggregate functions).
2.3 Code Investment by Language
| Language | Azure SDK Packages | Files Using Azure | Lines of Azure-Specific Code |
|---|---|---|---|
| TypeScript | @azure/cosmos, @azure/storage-blob, @azure/identity, @azure/keyvault-secrets | ~65 files | ~500 lines |
| Python | azure-cognitiveservices-speech, azure-cosmos, azure-storage-blob, azure-identity, azure-keyvault-secrets, openai (AzureOpenAI) | ~8 files | ~400 lines |
| Swift | MicrosoftCognitiveServicesSpeech (SPX framework) | ~3 files | ~150 lines |
| Kotlin | None directly (uses platform-service REST API) | 0 files | 0 lines |
3. Dependency Depth Analysis
3.1 Cosmos DB — DEEP (56+ files)
This is the most deeply embedded Azure dependency. Every repository module follows the pattern types.ts → repository.ts → routes.ts, where repository.ts uses the @azure/cosmos SDK and issues SQL queries such as SELECT c.id, c.name FROM c WHERE c.productId = @pid.
Touchpoints:
- packages/cosmos/: shared client singleton (@azure/cosmos peer dep)
- services/platform-service/src/modules/*/repository.ts: 56 repository files with Cosmos SQL queries
- services/extraction-service/src/modules/*/repository.ts: 2 repository files
- dashboards/admin-web/src/lib/cosmos.ts: direct @azure/cosmos import
- dashboards/admin-web/src/lib/repositories/*.ts: 4 repository files
- mindlyst-native/web/src/lib/cosmos.ts: direct @azure/cosmos import
- learning_voice_ai_agent/src/cloud/cosmos_client.py: Python Cosmos client
- learning_voice_ai_agent/backend/src/cloud/cosmos.py: Python backend Cosmos client
Query patterns used:
- container.items.query() with parameterized SQL
- container.items.create(), plus container.item(id, pk).replace(), .delete(), .read()
- container.items.upsert()
- Partition key routing (/userId, /productId, /id)
- Cross-partition queries (admin/analytics)
- SELECT VALUE COUNT(1) aggregates
- OFFSET ... LIMIT pagination
- ORDER BY sorting
- ARRAY_CONTAINS() for array queries
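As an illustration of the parameterized pattern listed above, the sketch below builds the kind of query spec object that container.items.query() accepts. The helper name buildListQuery and the selected fields are assumptions made for this example, not code from the repositories; the interfaces are redeclared locally so the snippet runs without the SDK.

```typescript
// Same shape as the SqlQuerySpec object that container.items.query() accepts
// in @azure/cosmos; redeclared locally so this sketch has no dependencies.
interface QueryParam {
  name: string;
  value: string | number | boolean | null;
}

interface QuerySpec {
  query: string;
  parameters: QueryParam[];
}

// Hypothetical helper (not from the codebase): one page of a user's
// documents for a product, newest first.
function buildListQuery(
  productId: string,
  userId: string,
  page: number,
  pageSize: number,
): QuerySpec {
  return {
    query:
      "SELECT * FROM c WHERE c.productId = @pid AND c.userId = @uid " +
      "ORDER BY c.createdAt DESC OFFSET @offset LIMIT @limit",
    // Values travel as parameters instead of being concatenated into the SQL.
    parameters: [
      { name: "@pid", value: productId },
      { name: "@uid", value: userId },
      { name: "@offset", value: page * pageSize },
      { name: "@limit", value: pageSize },
    ],
  };
}

const listSpec = buildListQuery("lysnrai", "user-1", 2, 20);
console.log(listSpec.query);
console.log(listSpec.parameters);
```

Every clause in this one statement (WHERE, ORDER BY, OFFSET/LIMIT) would need a counterpart in any replacement database, which is why the query layer dominates the migration estimate.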
3.2 Azure Speech SDK — DEEP (3 files, streaming integration)
The Speech SDK is used for real-time streaming speech-to-text with features that are tightly coupled to the Azure SDK's event-driven architecture:
- src/audio/azure_stt.py: 248 lines. Uses PushAudioInputStream, SpeechRecognizer, continuous recognition with recognizing/recognized/canceled/session_stopped event callbacks, PhraseListGrammar, auto-language detection (10 languages), auto-reconnect
- src/ui/settings.py + src/ui/unified_window.py: connection testing
- mindlyst-native/iosApp/Services/AzureSpeechTranscriber.swift: iOS Swift SPX framework
- mobile_app/ios/LysnrAI/: iOS keyboard extension uses SPX framework
3.3 Blob Storage — SHALLOW (3 files)
- packages/blob/src/blob.ts: 162 lines, singleton client, SAS URL generation
- src/cloud/blob_client.py: 190 lines, Python equivalent
- services/platform-service/src/modules/blob/: REST API wrapper
3.4 Azure OpenAI — SHALLOW (3 files, already abstracted)
- src/llm/text_cleaner.py: uses openai.AzureOpenAI (OpenAI SDK with Azure endpoint)
- backend/src/clients/openai_client.py: uses openai.AsyncAzureOpenAI
- mindlyst-native/web/src/lib/llm.ts: already has OpenAI fallback (resolves provider dynamically)
The openai Python/JS SDK supports both Azure and OpenAI endpoints with minimal config change. MindLyst web already handles this automatically.
3.5 Key Vault — SHALLOW (2 files)
- packages/config/src/keyvault.ts: 90 lines, resolveKeyVaultSecrets() with graceful fallback
- src/secrets/keyvault.py: 69 lines, SecretResolver class with env var fallback
Both implementations already fall back to environment variables when Key Vault is unavailable, so migrating only requires disabling Key Vault and relying on the env var path.
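The fallback behaviour described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the actual resolveKeyVaultSecrets() implementation: the names resolveSecret and VaultLookup are hypothetical, and a real Key Vault lookup would be asynchronous.

```typescript
// Hypothetical stand-in for a vault client: returns the secret, returns
// undefined when it is missing, or throws when the vault is unreachable.
type VaultLookup = (name: string) => string | undefined;

function resolveSecret(
  name: string,
  env: Record<string, string | undefined>,
  vault?: VaultLookup,
): string | undefined {
  if (vault) {
    try {
      const fromVault = vault(name);
      if (fromVault !== undefined) return fromVault;
    } catch {
      // Vault unreachable: fall through to the environment variable.
    }
  }
  return env[name];
}

const demoEnv = { API_KEY: "from-env" };
console.log(resolveSecret("API_KEY", demoEnv));                     // no vault configured
console.log(resolveSecret("API_KEY", demoEnv, () => "from-vault")); // vault answers first
```

Because the vault is an optional argument, dropping it (or swapping in another secrets manager behind the same function type) does not change any caller.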
3.6 Notification Hubs — NOT YET INTEGRATED
Planned but not deeply wired. Only namespace/hub exists in Azure. Mobile apps use BLPlatformClient (REST) to talk to platform-service, which would route push notifications.
3.7 Application Insights — SHALLOW (1 file)
- opencensus-ext-azure in Python requirements (optional telemetry)
- Custom telemetry system already built (@bytelyst/telemetry-client, platform-service telemetry module with Cosmos storage)
The custom telemetry system means App Insights is supplementary, not critical.
4. Migration Target Comparison
4.1 Database: Cosmos DB → Alternatives
| Feature | Azure Cosmos DB (current) | MongoDB Atlas | AWS DynamoDB | Google Firestore | PostgreSQL (Supabase/Neon) |
|---|---|---|---|---|---|
| Data model | Document (JSON) | Document (JSON) | Key-Value + Document | Document (JSON) | Relational + JSONB |
| Query language | SQL-like | MQL (MongoDB Query) | PartiQL / API | GQL-like API | SQL |
| Partition keys | Required | Shard keys (optional) | Required | Collection groups | Not applicable |
| Serverless | Yes | Yes (Atlas Serverless) | Yes | Yes | Yes (Neon) |
| SQL queries | SELECT c.id FROM c WHERE c.x = @y | db.collection.find({x: y}) | SELECT id FROM table WHERE x = ? | Client SDK queries | Standard SQL |
| Aggregates | Basic (COUNT, SUM, AVG) | Full ($group, $match, $lookup) | Limited | Limited | Full SQL |
| Cross-partition | Yes (expensive) | Yes (scatter-gather) | Scan (expensive) | Yes | N/A |
| Change feed | Yes | Change Streams | DynamoDB Streams | Real-time listeners | Logical replication |
| Global distribution | Built-in multi-region | Atlas Global Clusters | Global Tables | Multi-region | Manual / Citus |
| Max doc size | 2 MB | 16 MB | 400 KB | 1 MB | Unlimited (JSONB) |
| Free tier | 1000 RU/s + 25 GB | 512 MB | 25 GB + 25 WCU/RCU | 1 GiB + 50K reads/day | 0.5 GB (Neon) |
| Migration effort | n/a | Medium (query rewrite) | Hard (paradigm shift) | Hard (no SQL) | Hard (schema design) |
4.2 Object Storage: Blob → Alternatives
| Feature | Azure Blob (current) | AWS S3 | GCP Cloud Storage | Cloudflare R2 | MinIO (self-hosted) |
|---|---|---|---|---|---|
| API compatibility | Azure Blob API | S3 API | GCS API / S3-compat | S3-compatible | S3-compatible |
| SAS tokens | Yes (Azure SAS) | Pre-signed URLs | Signed URLs | Pre-signed URLs | Pre-signed URLs |
| CDN integration | Azure CDN | CloudFront | Cloud CDN | Built-in | Manual |
| Cost (per GB) | $0.018 (Cool) | $0.023 (Standard) | $0.020 | $0.015 (no egress) | Self-hosted |
| Migration effort | n/a | Easy | Easy | Easy | Easy |
4.3 Speech-to-Text: Azure Speech → Alternatives
| Feature | Azure Speech (current) | AWS Transcribe | Google Speech-to-Text | Deepgram | Whisper (local) |
|---|---|---|---|---|---|
| Streaming STT | Yes (push stream) | Yes (WebSocket) | Yes (streaming) | Yes (WebSocket) | No (batch only) |
| Languages | 100+ | 100+ | 125+ | 36+ | 99+ |
| Auto-detect lang | Up to 10 at once | Yes | Yes | Yes | Yes |
| Custom vocabulary | PhraseListGrammar | Custom vocabulary | Speech adaptation | Keywords | No |
| Native SDK | Python, Swift (SPX), JS | Python, no iOS SDK | Python, iOS, JS | REST/WebSocket | Python only |
| iOS native SDK | SPX framework (ObjC) | No native SDK | Yes (gRPC) | No native SDK | No |
| Free tier | 5 hrs/month (F0) | 60 min/month | 60 min/month | None | Free (local GPU) |
| Latency | ~200ms | ~300ms | ~200ms | ~100ms | ~500ms+ (local) |
| Migration effort | n/a | Hard (no iOS SDK) | Medium (has iOS SDK) | Medium (REST only) | Hard (no streaming) |
4.4 LLM / AI: Azure OpenAI → Alternatives
| Feature | Azure OpenAI (current) | OpenAI API (direct) | Google Gemini | AWS Bedrock | Anthropic Claude |
|---|---|---|---|---|---|
| Models | GPT-4o, GPT-4o-mini | Same models | Gemini 2.5 | Claude, Llama, Titan | Claude 3.5/4 |
| API compatibility | OpenAI SDK (azure mode) | OpenAI SDK (native) | Google SDK | AWS SDK | Anthropic SDK |
| Data residency | Azure regions | US only | Google regions | AWS regions | US/EU |
| Cost (GPT-4o-mini) | $0.15/$0.60 per M tokens | $0.15/$0.60 per M tokens | ~$0.10/$0.40 (Flash) | Varies | ~$0.25/$1.25 (Haiku) |
| Migration effort | n/a | Trivial (change endpoint) | Easy (SDK swap) | Medium | Easy (SDK swap) |
4.5 Secrets Management: Key Vault → Alternatives
| Feature | Azure Key Vault (current) | AWS Secrets Manager | GCP Secret Manager | HashiCorp Vault | Doppler / Infisical |
|---|---|---|---|---|---|
| Cost | $0.03/10K ops | $0.40/secret/month | $0.06/10K ops | Free (OSS) | Free tier |
| SDK | @azure/keyvault-secrets | @aws-sdk/client-secrets-manager | @google-cloud/secret-manager | HTTP API | SDK / CLI |
| Migration effort | n/a | Easy | Easy | Medium | Easy |
Note: The codebase already falls back to env vars when Key Vault is unavailable. This means Key Vault can be replaced by any secrets manager or simply .env files without code changes to application logic.
4.6 Push Notifications: Notification Hubs → Alternatives
| Feature | Azure NH (current) | AWS SNS | Firebase Cloud Messaging | OneSignal | Expo Push |
|---|---|---|---|---|---|
| APNs + FCM | Yes | Yes | FCM only (APNs via FCM) | Yes | Yes |
| Free tier | 1M pushes/month | 1M publishes | Unlimited | 10K subscribers | Unlimited |
| Migration effort | n/a | Easy | Easy | Easy | Easy (NomGap uses Expo) |
5. Per-Service Migration Analysis
5.1 Cosmos DB → MongoDB Atlas
Difficulty: MEDIUM-HIGH | Effort: 3–5 weeks | Risk: MEDIUM
This is the single largest migration task. Here's why:
What needs to change
| Layer | Current (Cosmos SQL API) | Target (MongoDB) | Files |
|---|---|---|---|
| Client package | @azure/cosmos → CosmosClient | mongodb → MongoClient | packages/cosmos/src/client.ts |
| Container registry | getContainer(name) | db.collection(name) | packages/cosmos/src/containers.ts |
| All repository files | container.items.query('SELECT...') | collection.find({...}) | 56+ files in platform-service |
| Dashboard Cosmos clients | @azure/cosmos direct | mongodb direct | 2 files (admin, MindLyst) |
| Python clients | azure.cosmos.CosmosClient | pymongo.MongoClient | 2 files |
| Query syntax | SQL-like (SELECT c.id FROM c WHERE c.productId = @pid AND c.userId = @uid ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20) | MQL (collection.find({productId: pid, userId: uid}).sort({createdAt: -1}).skip(0).limit(20)) | All repository files |
| Partition keys | Explicit partition key in every query | Shard key (auto-routed) | All repository files |
| Upsert | container.items.upsert(doc) | collection.replaceOne({_id: id}, doc, {upsert: true}), which replaces the whole document as Cosmos does; updateOne with $set would merge fields instead | ~20 files |
| Read by ID | container.item(id, partitionKey).read() | collection.findOne({_id: id}) | All repository files |
What stays the same
- Document structure (JSON documents with id, productId, partition keys)
- Data model (no schema changes needed — MongoDB is also schemaless)
- Partition key concept maps to shard key
- Serverless pricing model available on both
Key migration steps
- Update @bytelyst/cosmos package to export MongoDB-compatible API
- Rewrite all SQL queries to MQL (56+ files)
- Replace container.items.query() → collection.find()
- Replace container.item(id, pk).read() → collection.findOne({_id: id})
- Replace container.items.create() → collection.insertOne()
- Replace container.item(id, pk).replace() → collection.replaceOne()
- Replace container.items.upsert() → collection.replaceOne(filter, doc, {upsert: true})
- Update Python clients similarly
- Migrate data (use Azure Data Factory or custom script)
- Update all test mocks
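To make the query rewrite concrete, the sketch below renders one backend-neutral query description both as a Cosmos SQL query spec and as the filter/options pair that the MongoDB Node.js driver's collection.find(filter, options) accepts. The ListQuery type and both function names are hypothetical, and only equality filters are handled; real repository queries (ARRAY_CONTAINS, aggregates) would need more cases.

```typescript
// Hypothetical backend-neutral description of a paged list query.
interface ListQuery {
  filter: Record<string, string | number | boolean>;
  sortBy: string;
  descending: boolean;
  offset: number;
  limit: number;
}

// Cosmos SQL form: equality filters become parameters (@p0, @p1, ...).
// Field names are interpolated, so they must come from code, never user input.
function toCosmosSql(q: ListQuery) {
  const fields = Object.keys(q.filter);
  const where = fields.map((field, i) => `c.${field} = @p${i}`).join(" AND ");
  const parameters = fields.map((field, i) => ({ name: `@p${i}`, value: q.filter[field] }));
  const direction = q.descending ? "DESC" : "ASC";
  const query =
    `SELECT * FROM c${where ? ` WHERE ${where}` : ""} ` +
    `ORDER BY c.${q.sortBy} ${direction} OFFSET ${q.offset} LIMIT ${q.limit}`;
  return { query, parameters };
}

// MongoDB form: the same query as a filter document plus find() options.
function toMongoFind(q: ListQuery) {
  return {
    filter: { ...q.filter },
    options: {
      sort: { [q.sortBy]: q.descending ? -1 : 1 },
      skip: q.offset,
      limit: q.limit,
    },
  };
}

const demoQuery: ListQuery = {
  filter: { productId: "lysnrai", userId: "user-1" },
  sortBy: "createdAt",
  descending: true,
  offset: 0,
  limit: 20,
};
console.log(toCosmosSql(demoQuery).query);
console.log(JSON.stringify(toMongoFind(demoQuery)));
```

Simple equality-plus-pagination queries translate mechanically like this; the manual effort in the 56+ files is concentrated in the queries that do not fit this shape.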
Why MongoDB Atlas is the best DB alternative
- Closest query model to Cosmos SQL API (both are document DBs)
- Cosmos DB offers a MongoDB-compatible API (Azure Cosmos DB for MongoDB) as an intermediate step, but going native on MongoDB Atlas is better
- Cosmos DB was originally inspired by MongoDB's document model
- MongoDB's find() queries map closely to Cosmos SQL SELECT queries
- Both support partition/shard keys, TTL indexes, change streams
- MongoDB Atlas Serverless pricing is competitive
- MongoDB has excellent TypeScript and Python SDKs
5.2 Azure Speech → Google Cloud Speech-to-Text
Difficulty: HIGH | Effort: 2–3 weeks | Risk: HIGH
Why this is hard
- The Azure Speech SDK uses a push-stream architecture (PushAudioInputStream) that is deeply integrated into the audio pipeline
- The SpeechRecognizer has event-driven callbacks (recognizing, recognized, canceled, session_stopped) that the code relies on for real-time partial/final transcript delivery
- Custom vocabulary via PhraseListGrammar is Azure-specific
- Auto-language detection config is Azure-specific
- The iOS SPX framework (Objective-C) is used in LysnrAI keyboard extension and MindLyst; most alternatives have no direct equivalent
Best alternative: Google Cloud Speech-to-Text
- Has streaming recognition with similar event model
- Has an iOS SDK (gRPC-based)
- Supports custom vocabulary (speech adaptation)
- Supports auto-language detection
- Similar pricing and free tier
What needs to change
- src/audio/azure_stt.py: complete rewrite (~248 lines)
- iosApp/Services/AzureSpeechTranscriber.swift: complete rewrite
- LysnrAI/LysnrKeyboard/: keyboard extension STT integration
- Audio format handling (may differ between providers)
- Connection test code in settings UI
5.3 Blob Storage → AWS S3 or Cloudflare R2
Difficulty: LOW | Effort: 2–3 days | Risk: LOW
Why this is easy
- @bytelyst/blob package is a thin wrapper (162 lines)
- Only 3 files need changes
- S3 API is the de facto standard — R2, MinIO, GCS all support S3-compatible API
- SAS tokens → Pre-signed URLs (same concept, different implementation)
What needs to change
- packages/blob/src/blob.ts: swap @azure/storage-blob → @aws-sdk/client-s3 + @aws-sdk/s3-request-presigner
- src/cloud/blob_client.py: swap azure.storage.blob → boto3
- services/platform-service/src/modules/blob/: update routes for pre-signed URL format
- Environment variables: AZURE_BLOB_* → AWS_S3_* or S3_*
5.4 Azure OpenAI → OpenAI API (direct) or Gemini
Difficulty: TRIVIAL | Effort: < 1 day | Risk: VERY LOW
Why this is trivial
- The openai Python SDK supports both Azure and OpenAI endpoints, so only the config changes
- MindLyst web llm.ts already auto-detects Azure vs OpenAI and builds the correct URL
- LysnrAI desktop uses the AzureOpenAI class from the openai SDK; switch to the OpenAI class
- Same models, same API shape, same pricing
What needs to change
- Set OPENAI_API_KEY instead of AZURE_OPENAI_* env vars
- Change AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...) → OpenAI(api_key=...)
- Change AsyncAzureOpenAI(...) → AsyncOpenAI(...)
- Remove api_version parameter
- That's it. The openai SDK handles the rest.
5.5 Key Vault → Environment Variables / Any Secrets Manager
Difficulty: TRIVIAL | Effort: < 1 day | Risk: VERY LOW
Both keyvault.ts and keyvault.py already implement graceful fallback:
- If AZURE_KEYVAULT_URL is not set → uses env vars directly
- If Key Vault is unreachable → falls back to env vars
To migrate: Simply stop setting AZURE_KEYVAULT_URL. Everything works via env vars. Then optionally adopt any other secrets manager (AWS Secrets Manager, Doppler, Infisical, etc.).
5.6 Notification Hubs → Firebase Cloud Messaging
Difficulty: LOW | Effort: 1–2 days | Risk: LOW
Not yet deeply integrated. The platform-service notification module sends via REST API. Swap the push provider client.
5.7 Application Insights → Self-hosted / Grafana
Difficulty: TRIVIAL | Effort: Already done | Risk: NONE
The ecosystem already has:
- Custom telemetry system (@bytelyst/telemetry-client → platform-service → Cosmos)
- Loki + Grafana in services/monitoring/
- App Insights is supplementary, can be dropped with zero code changes
6. Migration Scenario Scoring
Scenario A: Stay on Azure (Status Quo)
| Dimension | Score (1-5) | Notes |
|---|---|---|
| Migration effort | 5 (none) | No work needed |
| Cost | 4 | ~$15/month at current scale, competitive |
| Vendor diversity | 1 | Single cloud vendor |
| Feature parity | 5 | Everything works today |
| Total | 15/20 | |
Scenario B: Full Migration to AWS
| Dimension | Score (1-5) | Notes |
|---|---|---|
| Migration effort | 2 | 6–8 weeks, Cosmos→DynamoDB is painful |
| Cost | 3 | Similar or slightly higher at small scale |
| Vendor diversity | 1 | Still single cloud, just different |
| Feature parity | 3 | No native iOS Speech SDK, DynamoDB query model is very different |
| Total | 9/20 | |
Scenario C: Multi-Cloud (MongoDB Atlas + OpenAI + R2 + Google STT)
| Dimension | Score (1-5) | Notes |
|---|---|---|
| Migration effort | 2 | 5–7 weeks, Cosmos→MongoDB is medium |
| Cost | 4 | MongoDB Atlas free tier, R2 no egress fees |
| Vendor diversity | 5 | No single-vendor dependency |
| Feature parity | 4 | MongoDB is a better document DB than Cosmos in many ways |
| Total | 15/20 | |
Scenario D: Stay Azure + Add Abstraction Layers
| Dimension | Score (1-5) | Notes |
|---|---|---|
| Migration effort | 4 | 1–2 weeks to add repository interface pattern |
| Cost | 4 | No change |
| Vendor diversity | 3 | Ready to switch, but still on Azure |
| Feature parity | 5 | Everything works today |
| Total | 16/20 | Winner |
Scenario E: Migrate DB Only (Cosmos → MongoDB Atlas, keep rest on Azure)
| Dimension | Score (1-5) | Notes |
|---|---|---|
| Migration effort | 3 | 3–5 weeks for DB migration |
| Cost | 4 | MongoDB Atlas Serverless may be cheaper |
| Vendor diversity | 3 | DB is independent, other services still Azure |
| Feature parity | 5 | MongoDB is very capable |
| Total | 15/20 | |
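The totals above are plain sums of the four dimension scores, so every dimension carries equal weight. The short sketch below reproduces that arithmetic and ranks the scenarios; the shortened scenario labels and the Scores type are just names chosen for this example.

```typescript
type Scores = { effort: number; cost: number; diversity: number; parity: number };

// Scores copied from the five scenario tables above.
const scenarios: Record<string, Scores> = {
  "A: Stay on Azure": { effort: 5, cost: 4, diversity: 1, parity: 5 },
  "B: Full AWS": { effort: 2, cost: 3, diversity: 1, parity: 3 },
  "C: Multi-cloud": { effort: 2, cost: 4, diversity: 5, parity: 4 },
  "D: Azure + abstraction": { effort: 4, cost: 4, diversity: 3, parity: 5 },
  "E: DB only": { effort: 3, cost: 4, diversity: 3, parity: 5 },
};

// Unweighted sum out of 20.
function totalScore(s: Scores): number {
  return s.effort + s.cost + s.diversity + s.parity;
}

const ranked = Object.entries(scenarios)
  .map(([name, s]) => ({ name, total: totalScore(s) }))
  .sort((a, b) => b.total - a.total);

for (const r of ranked) console.log(`${r.name}: ${r.total}/20`);
```

Scenario D leads by a single point over A, C, and E, so the ranking is sensitive to weighting: a team that values vendor diversity more heavily than migration effort would see C move ahead.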
7. Cost Comparison
Current Azure Costs (MVP / Low Usage)
| Service | Monthly Cost | Notes |
|---|---|---|
| Cosmos DB (Serverless) | ~$4–10 | 3 databases, ~45 containers |
| Blob Storage (Cool, RAGRS) | ~$0.20 | 9+ containers |
| Azure OpenAI (GPT-4o-mini) | ~$5–10 | Pay per token |
| Speech (F0) | $0 | 5 hrs/month free |
| Key Vault | ~$0.06 | ~25 secrets |
| Notification Hubs (Free) | $0 | 1M pushes/month |
| App Insights | $0 | 5 GB/month free |
| Total | ~$10–20/month | |
Equivalent AWS Costs
| Service | AWS Equivalent | Monthly Cost |
|---|---|---|
| Cosmos DB → DynamoDB (On-Demand) | DynamoDB | ~$5–15 |
| Blob → S3 Standard | S3 | ~$0.25 |
| Azure OpenAI → OpenAI API | Same pricing | ~$5–10 |
| Speech → Transcribe | Transcribe | ~$1–3 |
| Key Vault → Secrets Manager | Secrets Manager | ~$10 (per-secret pricing) |
| Notification Hubs → SNS | SNS | ~$0.50 |
| App Insights → CloudWatch | CloudWatch | ~$3 |
| Total | | ~$25–42/month |
Equivalent Multi-Cloud Costs
| Service | Provider | Monthly Cost |
|---|---|---|
| Cosmos DB → MongoDB Atlas Serverless | MongoDB | ~$3–8 |
| Blob → Cloudflare R2 | Cloudflare | ~$0.15 (no egress) |
| Azure OpenAI → OpenAI API (direct) | OpenAI | ~$5–10 |
| Speech → Google STT | Google Cloud | ~$1–3 |
| Key Vault → Doppler (free tier) | Doppler | $0 |
| Push → Firebase FCM | Google | $0 |
| Monitoring → Grafana Cloud (free) | Grafana | $0 |
| Total | | ~$10–22/month |
Cost Summary
| Scenario | Monthly Cost | vs Current |
|---|---|---|
| Azure (current) | ~$10–20 | Baseline |
| Full AWS | ~$25–42 | +110–150% (high vs high, low vs low) |
| Multi-cloud | ~$10–22 | ~Same |
| MongoDB Atlas + Azure rest | ~$10–18 | ~Same |
Verdict: At current scale, cost is not a compelling reason to migrate. All options are under $50/month. Cost becomes more significant at scale (10K+ users), where MongoDB Atlas and R2 would likely be cheaper due to no egress fees and better serverless pricing.
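For reference, the AWS total and its distance from the Azure baseline can be recomputed from the per-service figures. The sketch below sums the low and high ends of the AWS estimates and then compares the rounded totals end to end (low against low, high against high); that pairing is one reasonable convention chosen here, not the only one.

```typescript
type CostRange = [number, number]; // [low, high] in USD per month

// Per-service AWS estimates from the "Equivalent AWS Costs" table.
const awsCosts: CostRange[] = [
  [5, 15],      // DynamoDB (On-Demand)
  [0.25, 0.25], // S3 Standard
  [5, 10],      // OpenAI API
  [1, 3],       // Transcribe
  [10, 10],     // Secrets Manager (per-secret pricing)
  [0.5, 0.5],   // SNS
  [3, 3],       // CloudWatch
];

function sumRanges(items: CostRange[]): CostRange {
  let low = 0;
  let high = 0;
  for (const [lo, hi] of items) {
    low += lo;
    high += hi;
  }
  return [low, high];
}

// Percentage change relative to a baseline, rounded to a whole percent.
function percentChange(baseline: number, value: number): number {
  return Math.round(((value - baseline) / baseline) * 100);
}

const awsTotal = sumRanges(awsCosts);
console.log(awsTotal);                 // [24.75, 41.75], i.e. ~$25–42
console.log(percentChange(10, 25));    // low end vs Azure's ~$10: 150
console.log(percentChange(20, 42));    // high end vs Azure's ~$20: 110
```

In absolute terms the gap is roughly $15–22 per month, most of it from Secrets Manager's per-secret pricing, which supports the verdict that cost alone does not decide the question at this scale.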
8. Abstraction Layer Assessment
Current State: Partially Abstracted
The codebase already has meaningful abstraction:
| Layer | Abstraction Level | Notes |
|---|---|---|
| Cosmos DB | Partial: @bytelyst/cosmos package | Application code still writes raw SQL queries and uses @azure/cosmos types |
| Blob Storage | Good: @bytelyst/blob package | Thin wrapper, easy to swap internals |
| OpenAI/LLM | Good: MindLyst has provider auto-detection | LysnrAI desktop/backend hardcodes AzureOpenAI |
| Key Vault | Excellent: graceful fallback to env vars | Already cloud-agnostic in practice |
| Speech | None: raw SDK usage | Deep Azure SDK coupling in 3 files |
| Auth (JWT) | Excellent: uses jose library | No cloud dependency |
| Push notifications | Good: platform-service abstraction | Swap provider client only |
What's Missing: Repository Interface Pattern
The biggest gap is that repository files directly use @azure/cosmos types and SQL query syntax. To make the DB layer swappable, you'd need:
```typescript
// Proposed: packages/cosmos/src/repository.ts

// Pagination options. The proposal references QueryOptions without defining
// it, so these two fields are an assumption.
export interface QueryOptions {
  offset?: number;
  limit?: number;
}

export interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}
```
This would allow swapping Cosmos → MongoDB → PostgreSQL behind the interface. Once the 56+ repository files are written against it, a backend change no longer touches them.
Effort to add: 1–2 weeks. This is the highest-ROI investment regardless of migration decision.
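One way to see what the interface buys is to implement it against a trivial backend. The sketch below is a hypothetical in-memory implementation (the class name, the partitionKeyOf callback, the Note type, and the QueryOptions fields are all assumptions for illustration); a CosmosDocumentRepository or MongoDocumentRepository would implement the same seven methods against a real database, and an implementation like this one could double as a test fake.

```typescript
interface QueryOptions { offset?: number; limit?: number; } // assumed fields

interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}

// Hypothetical in-memory backend; documents are keyed by partition key + id.
class InMemoryDocumentRepository<T extends { id: string }> implements DocumentRepository<T> {
  private readonly docs = new Map<string, T>();
  private readonly partitionKeyOf: (doc: T) => string;

  constructor(partitionKeyOf: (doc: T) => string) {
    this.partitionKeyOf = partitionKeyOf;
  }

  private key(id: string, partitionKey: string): string {
    return `${partitionKey}/${id}`;
  }

  async findById(id: string, partitionKey: string): Promise<T | null> {
    return this.docs.get(this.key(id, partitionKey)) ?? null;
  }

  // Equality-only filtering, which is all this sketch supports.
  async findMany(filter: Record<string, unknown>, opts: QueryOptions = {}): Promise<T[]> {
    const matches = Array.from(this.docs.values()).filter((doc) =>
      Object.entries(filter).every(
        ([field, value]) => (doc as unknown as Record<string, unknown>)[field] === value,
      ),
    );
    const start = opts.offset ?? 0;
    const end = opts.limit === undefined ? undefined : start + opts.limit;
    return matches.slice(start, end);
  }

  async create(doc: T): Promise<T> {
    const key = this.key(doc.id, this.partitionKeyOf(doc));
    if (this.docs.has(key)) throw new Error(`Conflict: ${key} already exists`);
    this.docs.set(key, doc);
    return doc;
  }

  async replace(id: string, doc: T, partitionKey: string): Promise<T> {
    const key = this.key(id, partitionKey);
    if (!this.docs.has(key)) throw new Error(`Not found: ${key}`);
    this.docs.set(key, doc);
    return doc;
  }

  async upsert(doc: T): Promise<T> {
    this.docs.set(this.key(doc.id, this.partitionKeyOf(doc)), doc);
    return doc;
  }

  async delete(id: string, partitionKey: string): Promise<void> {
    this.docs.delete(this.key(id, partitionKey));
  }

  async count(filter: Record<string, unknown>): Promise<number> {
    return (await this.findMany(filter)).length;
  }
}

type Note = { id: string; userId: string; text: string };

// Application code depends only on the interface type.
async function demo(): Promise<number> {
  const notes: DocumentRepository<Note> = new InMemoryDocumentRepository<Note>((n) => n.userId);
  await notes.create({ id: "n1", userId: "u1", text: "first" });
  await notes.upsert({ id: "n2", userId: "u1", text: "second" });
  return notes.count({ userId: "u1" });
}

demo().then((n) => console.log(n)); // prints 2
```

The main design limit is the filter argument: a plain equality map is easy to implement on every backend, but queries that rely on ARRAY_CONTAINS or aggregates would need either richer filter types or backend-specific escape hatches.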
9. Risk Analysis
9.1 Risks of Staying on Azure
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Azure pricing increases | Low | Medium | Add abstraction layer for future portability |
| Azure outage | Low | High | Multi-region already possible (Cosmos global distribution) |
| Feature stagnation | Very Low | Low | Azure is investing heavily in AI services |
| Vendor lock-in deepens over time | Medium | Medium | Add abstraction layers proactively |
9.2 Risks of Migrating
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Data loss during migration | Low | Critical | Test migration on staging first, keep Azure as backup |
| Query performance differences | Medium | Medium | Benchmark before committing |
| Feature gaps in new provider | Medium | Medium | Prototype critical features first |
| Wasted engineering time | Medium | High | Only migrate if there's a clear business driver |
| Regression bugs in 56+ repository files | High | Medium | Comprehensive test suite (1,029 tests) catches most issues |
| Speech quality degradation | Medium | High | A/B test both providers before committing |
9.3 Azure-Specific Lock-in Risks (ranked)
| # | Component | Lock-in Level | Escape Hatch |
|---|---|---|---|
| 1 | Cosmos DB SQL API | High | Rewrite queries to MongoDB MQL or add repository interface |
| 2 | Azure Speech SDK (streaming) | High | Google STT has comparable streaming API |
| 3 | Azure Identity (DefaultAzureCredential) | Medium | Only used by Key Vault, which is already optional |
| 4 | Blob Storage SAS tokens | Low | Pre-signed URLs are equivalent across all providers |
| 5 | Azure OpenAI | Very Low | OpenAI SDK works with both; 1-line config change |
| 6 | Key Vault | Very Low | Already has env var fallback |
| 7 | Notification Hubs | Very Low | Not deeply integrated yet |
| 8 | Application Insights | None | Custom telemetry already built |
10. Recommendations
Recommended Strategy: Stay on Azure + Invest in Abstraction (Scenario D)
This is the highest-scoring approach. Here's the prioritized action plan:
Phase 1: Add Repository Interface (1–2 weeks)
- Create DocumentRepository<T> interface in @bytelyst/cosmos
- Implement CosmosDocumentRepository<T> that wraps current @azure/cosmos calls
- Gradually migrate the 56 repository files to use the interface
- This makes future DB migration a matter of implementing MongoDocumentRepository<T>, with no application code changes needed
Phase 2: Normalize LLM Abstraction (2–3 days)
- Move LysnrAI desktop/backend from AzureOpenAI → auto-detecting provider pattern (like MindLyst web already does)
- Support OPENAI_PROVIDER=azure|openai|gemini across all repos
- This makes LLM provider swappable via config
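A provider switch of this kind could look roughly like the sketch below. OPENAI_PROVIDER and OPENAI_API_KEY come from this document; AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are assumed stand-ins for the AZURE_OPENAI_* variables, GEMINI_API_KEY is hypothetical, and the auto-detection rule (Azure endpoint present means Azure) is a guess at what "auto-detecting" would mean, not a description of llm.ts.

```typescript
type LlmProvider = "azure" | "openai" | "gemini";

interface LlmConfig {
  provider: LlmProvider;
  apiKey: string;
  endpoint?: string; // only needed for Azure OpenAI
}

type Env = Record<string, string | undefined>;

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`Missing environment variable ${name}`);
  return value;
}

// Explicit OPENAI_PROVIDER wins; otherwise infer Azure when an Azure
// endpoint is configured, and default to the direct OpenAI API.
function resolveLlmConfig(env: Env): LlmConfig {
  const provider =
    env["OPENAI_PROVIDER"] ?? (env["AZURE_OPENAI_ENDPOINT"] ? "azure" : "openai");
  switch (provider) {
    case "azure":
      return {
        provider: "azure",
        apiKey: requireVar(env, "AZURE_OPENAI_API_KEY"),     // assumed name
        endpoint: requireVar(env, "AZURE_OPENAI_ENDPOINT"),  // assumed name
      };
    case "openai":
      return { provider: "openai", apiKey: requireVar(env, "OPENAI_API_KEY") };
    case "gemini":
      return { provider: "gemini", apiKey: requireVar(env, "GEMINI_API_KEY") }; // hypothetical name
    default:
      throw new Error(`Unsupported OPENAI_PROVIDER: ${provider}`);
  }
}

console.log(resolveLlmConfig({ OPENAI_API_KEY: "sk-example" }).provider);
```

Failing fast on a missing key or an unknown provider keeps misconfiguration visible at startup rather than at the first LLM call.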
Phase 3: Speech Abstraction Layer (1 week, optional)
- Create SpeechTranscriber protocol/interface
- Implement AzureSpeechTranscriber (current code, extracted)
- Prepare GoogleSpeechTranscriber stub for future use
- This is lower priority since Azure Speech F0 tier is free
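A possible shape for that interface is sketched below, in TypeScript for consistency with the other examples even though the real implementations would live in Python and Swift. The method names (start, pushAudio, stop) and the TranscriptEvent type are assumptions; the partial/final split mirrors the recognizing and recognized callbacks the current Azure code relies on. The fake transcriber exists only to make the sketch runnable.

```typescript
interface TranscriptEvent {
  text: string;
  isFinal: boolean; // false ~ Azure "recognizing", true ~ Azure "recognized"
}

type TranscriptListener = (event: TranscriptEvent) => void;

// Hypothetical provider-neutral streaming interface.
interface SpeechTranscriber {
  start(onTranscript: TranscriptListener): void;
  pushAudio(chunk: Uint8Array): void;
  stop(): void;
}

// Stand-in implementation: emits one partial per chunk and a final on stop.
class FakeSpeechTranscriber implements SpeechTranscriber {
  private listener: TranscriptListener | undefined;
  private chunkCount = 0;

  start(onTranscript: TranscriptListener): void {
    this.listener = onTranscript;
    this.chunkCount = 0;
  }

  pushAudio(chunk: Uint8Array): void {
    if (!this.listener) throw new Error("pushAudio called before start");
    this.chunkCount += 1;
    this.listener({ text: `partial ${this.chunkCount} (${chunk.length} bytes)`, isFinal: false });
  }

  stop(): void {
    if (!this.listener) return;
    this.listener({ text: `final after ${this.chunkCount} chunks`, isFinal: true });
    this.listener = undefined;
  }
}

// Caller code depends only on the interface, not on a vendor SDK.
function transcribeAll(transcriber: SpeechTranscriber, chunks: Uint8Array[]): TranscriptEvent[] {
  const events: TranscriptEvent[] = [];
  transcriber.start((event) => events.push(event));
  for (const chunk of chunks) transcriber.pushAudio(chunk);
  transcriber.stop();
  return events;
}

const demoEvents = transcribeAll(new FakeSpeechTranscriber(), [
  new Uint8Array(320),
  new Uint8Array(320),
]);
console.log(demoEvents);
```

Features such as phrase lists, language auto-detection, and cancellation or reconnect handling are left out here; they are the parts most likely to differ between providers and would need explicit treatment in a real interface.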
Phase 4: Document Decision Criteria for Future Migration
- Define triggers that would justify migration (e.g., cost > $X/month, Azure outage > Y hours, need for feature Z)
- Review annually
Why NOT Migrate Now
- Cost is negligible — ~$10–20/month doesn't justify weeks of engineering
- No business driver — Azure isn't blocking any feature development
- Risk/reward is unfavorable — 4–8 weeks of migration work for ~$0 cost savings
- Test coverage is good but not perfect — 1,029 tests cover most paths, but query-level changes in 56 files still risk regressions
- Azure free tiers are generous — Speech F0, Notification Hubs Free, App Insights free tier
When Migration WOULD Make Sense
- Cosmos DB costs exceed $100/month → Consider MongoDB Atlas Serverless
- Azure Speech quality is insufficient → Evaluate Google STT or Deepgram
- Enterprise customer requires specific cloud → Build the repository interface, then implement their cloud backend
- Azure has extended outage affecting your region → Multi-region or multi-cloud
- You want to go fully open-source → PostgreSQL (Supabase) + Whisper + MinIO (significant rewrite)
11. Migration Playbook (If Chosen)
If you decide to migrate in the future, here's the execution order (shortest critical path):
Week 1–2: Database Abstraction
- Create DocumentRepository<T> interface
- Implement CosmosDocumentRepository<T> (wraps current code)
- Migrate all 56 repository files to use interface
- Verify all 1,029 tests pass
Week 3–4: Database Migration (Cosmos → MongoDB)
- Implement MongoDocumentRepository<T>
- Set up MongoDB Atlas Serverless cluster
- Write data migration script (Cosmos → MongoDB)
- Run migration on staging, verify data integrity
- Switch repository implementation via config flag
- Run full test suite against MongoDB
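The "switch repository implementation via config flag" step could be as small as the sketch below. The variable name REPOSITORY_BACKEND and both function names are hypothetical; the factories here return strings only to keep the example self-contained, where real code would construct CosmosDocumentRepository<T> or MongoDocumentRepository<T>.

```typescript
type RepositoryBackend = "cosmos" | "mongo";

// Reads the (hypothetical) REPOSITORY_BACKEND flag; defaults to Cosmos so
// that an unset flag keeps current behaviour.
function selectBackend(env: Record<string, string | undefined>): RepositoryBackend {
  const raw = env["REPOSITORY_BACKEND"] ?? "cosmos";
  if (raw === "cosmos") return "cosmos";
  if (raw === "mongo") return "mongo";
  throw new Error(`Unknown REPOSITORY_BACKEND: ${raw}`);
}

// Factories are lazy so only the selected backend gets constructed.
function createRepository<R>(
  env: Record<string, string | undefined>,
  factories: Record<RepositoryBackend, () => R>,
): R {
  return factories[selectBackend(env)]();
}

const chosen = createRepository(
  { REPOSITORY_BACKEND: "mongo" },
  {
    cosmos: () => "CosmosDocumentRepository",
    mongo: () => "MongoDocumentRepository",
  },
);
console.log(chosen);
```

Keeping the default on Cosmos means the flag can ship before the cutover, and rolling back is a config change rather than a deploy.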
Week 5: Storage + Secrets
- Swap @bytelyst/blob internals to S3-compatible client
- Migrate blobs (azcopy → aws s3 sync or similar)
- Replace Key Vault with new secrets manager (or just env vars)
- Update all environment variable names
Week 6: LLM + Speech (if needed)
- Switch OpenAI from Azure endpoint to direct (config change only)
- If migrating Speech: rewrite azure_stt.py and Swift AzureSpeechTranscriber
- A/B test new speech provider against Azure
Week 7–8: Cleanup + Verification
- Remove all @azure/* npm packages
- Remove all azure-* pip packages
- Update Docker configs, CI/CD
- Update documentation
- Monitor production for 2 weeks
Appendix A: File-Level Azure Dependency Map
TypeScript — @azure/cosmos (CRITICAL)
| File | Repo | Direct Import |
|---|---|---|
| packages/cosmos/src/client.ts | common-plat | @azure/cosmos |
| packages/cosmos/src/containers.ts | common-plat | @azure/cosmos |
| services/platform-service/src/modules/*/repository.ts (56 files) | common-plat | Via @bytelyst/cosmos |
| services/extraction-service/src/modules/*/repository.ts (2 files) | common-plat | Via @bytelyst/cosmos |
| dashboards/admin-web/src/lib/cosmos.ts | common-plat | @azure/cosmos |
| dashboards/admin-web/src/lib/repositories/*.ts (4 files) | common-plat | Via cosmos.ts |
| mindlyst-native/web/src/lib/cosmos.ts | MindLyst | @azure/cosmos |
TypeScript — @azure/storage-blob
| File | Repo | Direct Import |
|---|---|---|
| packages/blob/src/blob.ts | common-plat | @azure/storage-blob |
TypeScript — @azure/identity + @azure/keyvault-secrets
| File | Repo | Direct Import |
|---|---|---|
| packages/config/src/keyvault.ts | common-plat | Dynamic import (both) |
| dashboards/admin-web/src/app/api/ops/secrets/route.ts | common-plat | Both (Secrets Manager UI) |
Python — Azure SDKs
| File | Repo | SDK |
|---|---|---|
| src/audio/azure_stt.py | LysnrAI | azure.cognitiveservices.speech |
| src/cloud/cosmos_client.py | LysnrAI | azure.cosmos |
| src/cloud/blob_client.py | LysnrAI | azure.storage.blob |
| src/secrets/keyvault.py | LysnrAI | azure.identity, azure.keyvault.secrets |
| backend/src/secrets/keyvault.py | LysnrAI | azure.identity, azure.keyvault.secrets |
| backend/src/cloud/cosmos.py | LysnrAI | azure.cosmos |
| src/llm/text_cleaner.py | LysnrAI | openai.AzureOpenAI |
| backend/src/clients/openai_client.py | LysnrAI | openai.AsyncAzureOpenAI |
Swift — Azure Speech SDK
| File | Repo | SDK |
|---|---|---|
| iosApp/Services/AzureSpeechTranscriber.swift | MindLyst | MicrosoftCognitiveServicesSpeech |
| LysnrAI/LysnrKeyboard/KeyboardViewController.swift | LysnrAI | SPX framework (via CocoaPods) |
Appendix B: SDK & Package Inventory
npm packages (TypeScript)
| Package | Version | Used By | Swappable |
|---|---|---|---|
| @azure/cosmos | ≥4.0.0 | @bytelyst/cosmos, admin-web, MindLyst web | Medium (query rewrite) |
| @azure/storage-blob | ≥12.0.0 | @bytelyst/blob | Easy (S3 compat) |
| @azure/identity | latest | @bytelyst/config, admin-web secrets | Easy (remove) |
| @azure/keyvault-secrets | latest | @bytelyst/config, admin-web secrets | Easy (remove) |
pip packages (Python)
| Package | Version | Used By | Swappable |
|---|---|---|---|
| azure-cognitiveservices-speech | ≥1.42.0 | Desktop STT | Hard (deep SDK integration) |
| azure-cosmos | latest | Desktop + backend Cosmos client | Medium (pymongo swap) |
| azure-storage-blob | ≥12.24.0 | Desktop blob client | Easy (boto3 swap) |
| azure-identity | ≥1.19.0 | Key Vault auth | Easy (remove) |
| azure-keyvault-secrets | ≥4.9.0 | Secrets resolver | Easy (remove) |
| openai | ≥1.60.0 | AzureOpenAI / AsyncAzureOpenAI | Trivial (change class name) |
| opencensus-ext-azure | ≥1.1.0 | Optional telemetry | Trivial (remove) |
Swift packages / CocoaPods
| Package | Used By | Swappable |
|---|---|---|
| MicrosoftCognitiveServicesSpeech (SPX) | LysnrAI iOS, MindLyst iOS | Hard (need alternative streaming STT) |
Document generated by automated codebase analysis. Numbers are accurate as of 2026-03-01. Update as the codebase evolves.