docs: reorganize docs/ into category folders with roadmaps/{completed,partial,not-started}
parent 7742ebd58f
commit dd4410548e
@@ -1,726 +0,0 @@

# Cloud Provider Migration Analysis — ByteLyst Ecosystem

> **Author:** AI Analysis (Cascade)
> **Date:** 2026-03-01
> **Scope:** All 7 repos — LysnrAI, MindLyst, ChronoMind, NomGap, PeakPulse, Common Platform, JarvisJr
> **Purpose:** Evaluate current Azure investment, assess migration feasibility to AWS / GCP / MongoDB Atlas / multi-cloud, and provide actionable recommendations.

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Current Azure Investment Inventory](#2-current-azure-investment-inventory)
3. [Dependency Depth Analysis](#3-dependency-depth-analysis)
4. [Migration Target Comparison](#4-migration-target-comparison)
5. [Per-Service Migration Analysis](#5-per-service-migration-analysis)
6. [Migration Scenario Scoring](#6-migration-scenario-scoring)
7. [Cost Comparison](#7-cost-comparison)
8. [Abstraction Layer Assessment](#8-abstraction-layer-assessment)
9. [Risk Analysis](#9-risk-analysis)
10. [Recommendations](#10-recommendations)
11. [Migration Playbook (If Chosen)](#11-migration-playbook-if-chosen)
12. [Appendix A: File-Level Azure Dependency Map](#appendix-a-file-level-azure-dependency-map)
13. [Appendix B: SDK & Package Inventory](#appendix-b-sdk--package-inventory)

---

## 1. Executive Summary

The ByteLyst ecosystem is **moderately coupled** to Azure. The coupling is concentrated in **3 packages** (`@bytelyst/cosmos`, `@bytelyst/blob`, `@bytelyst/config`) and **2 Python modules** (`azure_stt.py`, `cosmos_client.py`). The architecture already uses an internal abstraction layer — most application code never imports Azure SDKs directly.

### Key Findings

| Dimension | Assessment |
|-----------|-----------|
| **Overall Azure lock-in** | **Medium** — concentrated in ~15 files, but those files are foundational |
| **Easiest to migrate** | Blob Storage, Key Vault, OpenAI, Application Insights |
| **Hardest to migrate** | Cosmos DB (SQL API queries in 56+ repository files), Azure Speech SDK |
| **Best alternative DB** | MongoDB Atlas (closest query model to Cosmos SQL API) |
| **Best alternative cloud** | AWS (broadest service parity, mature SDK ecosystem) |
| **Estimated migration effort** | 4–8 weeks for full cloud swap (Cosmos DB is the long pole) |
| **Recommendation** | **Stay on Azure** for now, but invest in abstraction layers to reduce future switching cost |

### Azure Services Used (8 total)

| # | Azure Service | Monthly Cost | Lock-in Risk | Files Affected |
|---|--------------|-------------|-------------|----------------|
| 1 | **Cosmos DB** (SQL/NoSQL API) | ~$4–10 | **HIGH** | 56+ repository files, 3 databases, ~45 containers |
| 2 | **Blob Storage** | ~$0.20 | LOW | 2 packages + 1 Python module |
| 3 | **Azure OpenAI** | ~$5–10 | LOW | 3 files (already supports OpenAI fallback) |
| 4 | **Speech Services** | $0 (F0) | **HIGH** | 2 files (deep SDK integration, streaming) |
| 5 | **Key Vault** | ~$0.06 | LOW | 2 files (1 TS, 1 Python) |
| 6 | **Notification Hubs** | $0 (Free) | MEDIUM | Planned, not yet deeply integrated |
| 7 | **Application Insights** | $0 (5GB free) | LOW | 1 file (custom telemetry already built) |
| 8 | **Azure Identity** (DefaultAzureCredential) | $0 | LOW | Used by Key Vault + Secrets Manager |

---

## 2. Current Azure Investment Inventory

### 2.1 Azure Resources (from Azure Portal)

| Resource | Azure Name | Region | SKU | Status |
|----------|-----------|--------|-----|--------|
| Resource Group | `rg-mywisprai` | East US | — | Active |
| Cosmos DB | `cosmos-mywisprai` | West US 2 | Serverless | Active — 3 DBs, ~45 containers |
| Blob Storage | `bytelystblobs` | West US 2 | StorageV2, RAGRS | Active — 9+ containers |
| Azure OpenAI | `mywisprai-openai-sweden` | Sweden Central | S0 | Active — gpt-4o-mini deployment |
| Speech Service | `mywisprai-speech` | East US | F0 (Free) | Active |
| Key Vault | `kv-mywisprai` | East US | Standard | Active — ~25 secrets |
| Notification Hubs | `lysnnai` namespace | East US | Free | Active — 2 hubs |
| App Insights | `bytelyst-appinsights` | East US | Classic | Active |

### 2.2 Cosmos DB Databases & Containers

| Database | Containers | Products Using |
|----------|-----------|----------------|
| `lysnrai` | ~27 containers (users, subscriptions, feature_flags, audit_log, tracker_items, telemetry_events, etc.) | LysnrAI, platform-service (all products) |
| `mindlyst` | ~20 containers (brains, memory_items, streaks, reflections, etc.) | MindLyst |
| `mywisprai` | 10 containers (legacy, pre-rebrand) | Legacy / migration target |

**Total: ~57 containers across 3 databases**, all using Cosmos SQL (NoSQL) API with SQL-like queries (`SELECT`, `WHERE`, `ORDER BY`, `OFFSET/LIMIT`, aggregate functions).

### 2.3 Code Investment by Language

| Language | Azure SDK Packages | Files Using Azure | Lines of Azure-Specific Code |
|----------|-------------------|-------------------|------------------------------|
| **TypeScript** | `@azure/cosmos`, `@azure/storage-blob`, `@azure/identity`, `@azure/keyvault-secrets` | ~65 files | ~500 lines |
| **Python** | `azure-cognitiveservices-speech`, `azure-cosmos`, `azure-storage-blob`, `azure-identity`, `azure-keyvault-secrets`, `openai` (AzureOpenAI) | ~8 files | ~400 lines |
| **Swift** | `MicrosoftCognitiveServicesSpeech` (SPX framework) | ~3 files | ~150 lines |
| **Kotlin** | None directly (uses platform-service REST API) | 0 files | 0 lines |

---

## 3. Dependency Depth Analysis

### 3.1 Cosmos DB — DEEP (56+ files)

This is the **most deeply embedded** Azure dependency. Every repository module follows the pattern:

```
types.ts  →  repository.ts  →  routes.ts
                  ↑
          Uses @azure/cosmos SDK
          SQL queries: SELECT c.id, c.name FROM c WHERE c.productId = @pid
```

**Touchpoints:**
- `packages/cosmos/` — shared client singleton (`@azure/cosmos` peer dep)
- `services/platform-service/src/modules/*/repository.ts` — **56 repository files** with Cosmos SQL queries
- `services/extraction-service/src/modules/*/repository.ts` — 2 repository files
- `dashboards/admin-web/src/lib/cosmos.ts` — direct `@azure/cosmos` import
- `dashboards/admin-web/src/lib/repositories/*.ts` — 4 repository files
- `mindlyst-native/web/src/lib/cosmos.ts` — direct `@azure/cosmos` import
- `learning_voice_ai_agent/src/cloud/cosmos_client.py` — Python Cosmos client
- `learning_voice_ai_agent/backend/src/cloud/cosmos.py` — Python backend Cosmos client

**Query patterns used:**
- `container.items.query()` with parameterized SQL
- `container.items.create()`, plus `container.item(id, pk).replace()`, `.delete()`, `.read()` for single-item operations
- `container.items.upsert()`
- Partition key routing (`/userId`, `/productId`, `/id`)
- Cross-partition queries (admin/analytics)
- `SELECT VALUE COUNT(1)` aggregates
- `OFFSET ... LIMIT` pagination
- `ORDER BY` sorting
- `ARRAY_CONTAINS()` for array queries
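
The parameterized-SQL, `ORDER BY`, and `OFFSET ... LIMIT` patterns listed above can be illustrated with a small sketch. It builds an object shaped like the `{ query, parameters }` spec that `container.items.query()` accepts. The `ListOptions` type, the helper names, and the field names are hypothetical and are not taken from the ByteLyst repositories.

```typescript
// Hypothetical sketch: the QuerySpec type below mirrors the
// { query, parameters } object accepted by container.items.query().
type Scalar = string | number | boolean;

interface QuerySpec {
  query: string;
  parameters: { name: string; value: Scalar }[];
}

interface ListOptions {
  filters: Record<string, Scalar>; // equality filters, e.g. { productId: "x" }
  orderBy?: { field: string; descending?: boolean };
  offset?: number;
  limit?: number;
}

// Field names are spliced into the SQL text, so only plain identifiers are
// allowed; filter values always travel as parameters.
function checkedField(field: string): string {
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
    throw new Error("Invalid field name: " + field);
  }
  return field;
}

function checkedCount(label: string, n: number): number {
  if (n < 0 || Math.floor(n) !== n) {
    throw new Error(label + " must be a non-negative integer");
  }
  return n;
}

function buildListQuery(opts: ListOptions): QuerySpec {
  const parameters: { name: string; value: Scalar }[] = [];
  const clauses: string[] = [];
  for (const field of Object.keys(opts.filters)) {
    const name = "@" + checkedField(field);
    parameters.push({ name, value: opts.filters[field] });
    clauses.push("c." + field + " = " + name);
  }

  let query = "SELECT * FROM c";
  if (clauses.length > 0) {
    query += " WHERE " + clauses.join(" AND ");
  }
  if (opts.orderBy) {
    const direction = opts.orderBy.descending ? "DESC" : "ASC";
    query += " ORDER BY c." + checkedField(opts.orderBy.field) + " " + direction;
  }
  if (opts.limit !== undefined) {
    // Validated integers are written inline, matching the literal
    // OFFSET/LIMIT style used in this document's examples.
    const offset = checkedCount("offset", opts.offset ?? 0);
    const limit = checkedCount("limit", opts.limit);
    query += " OFFSET " + offset + " LIMIT " + limit;
  }
  return { query, parameters };
}
```

Under these assumptions, a repository method would pass the returned spec to `container.items.query(spec)`. Keeping query construction in one helper like this also narrows the surface that a later database swap would have to touch.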

### 3.2 Azure Speech SDK — DEEP (3 files, streaming integration)

The Speech SDK is used for **real-time streaming speech-to-text** with features that are tightly coupled to the Azure SDK's event-driven architecture:

- `src/audio/azure_stt.py` — 248 lines. Uses `PushAudioInputStream`, `SpeechRecognizer`, continuous recognition with `recognizing`/`recognized`/`canceled`/`session_stopped` event callbacks, `PhraseListGrammar`, auto-language detection (10 languages), auto-reconnect
- `src/ui/settings.py` + `src/ui/unified_window.py` — connection testing
- `mindlyst-native/iosApp/Services/AzureSpeechTranscriber.swift` — iOS Swift SPX framework
- `mobile_app/ios/LysnrAI/` — iOS keyboard extension uses SPX framework

### 3.3 Blob Storage — SHALLOW (3 files)

- `packages/blob/src/blob.ts` — 162 lines, singleton client, SAS URL generation
- `src/cloud/blob_client.py` — 190 lines, Python equivalent
- `services/platform-service/src/modules/blob/` — REST API wrapper

### 3.4 Azure OpenAI — SHALLOW (3 files, already abstracted)

- `src/llm/text_cleaner.py` — uses `openai.AzureOpenAI` (OpenAI SDK with Azure endpoint)
- `backend/src/clients/openai_client.py` — uses `openai.AsyncAzureOpenAI`
- `mindlyst-native/web/src/lib/llm.ts` — **already has OpenAI fallback** (resolves provider dynamically)

The `openai` Python/JS SDK supports both Azure and OpenAI endpoints with minimal config change. MindLyst web already handles this automatically.

### 3.5 Key Vault — SHALLOW (2 files)

- `packages/config/src/keyvault.ts` — 90 lines, `resolveKeyVaultSecrets()` with graceful fallback
- `src/secrets/keyvault.py` — 69 lines, `SecretResolver` class with env var fallback

Both implementations already fall back to environment variables when Key Vault is unavailable, so migrating away only requires leaving Key Vault unconfigured and relying on the env var path.
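
The fallback order described above can be sketched as follows. This is an assumption about the general shape only: the document does not show the body of `resolveKeyVaultSecrets()` or `SecretResolver`, so the `resolveSecrets` function, its parameters, and the `source` labels are hypothetical. The vault lookup is modelled as an already-fetched map (or `null` when Key Vault is unconfigured or unreachable) to keep the sketch synchronous.

```typescript
// Hypothetical sketch of "Key Vault first, environment variables as fallback".
type SecretSource = "keyvault" | "env" | "missing";

interface ResolvedSecret {
  name: string;
  value: string | undefined;
  source: SecretSource;
}

// Own-property lookup that treats empty strings as "not set".
function lookup(
  map: Record<string, string | undefined> | null,
  name: string
): string | undefined {
  if (map === null) return undefined;
  if (!Object.prototype.hasOwnProperty.call(map, name)) return undefined;
  const value = map[name];
  return value === "" ? undefined : value;
}

// vaultSecrets is null when Key Vault is not configured or not reachable.
function resolveSecrets(
  names: string[],
  vaultSecrets: Record<string, string | undefined> | null,
  env: Record<string, string | undefined>
): ResolvedSecret[] {
  return names.map((name): ResolvedSecret => {
    const fromVault = lookup(vaultSecrets, name);
    if (fromVault !== undefined) {
      return { name, value: fromVault, source: "keyvault" };
    }
    const fromEnv = lookup(env, name);
    if (fromEnv !== undefined) {
      return { name, value: fromEnv, source: "env" };
    }
    return { name, value: undefined, source: "missing" };
  });
}
```

With this shape, dropping Key Vault amounts to always passing `null` for the vault map; callers see the same result type either way.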

### 3.6 Notification Hubs — NOT YET INTEGRATED

Planned but not yet wired in. Only the namespace and hubs exist in Azure. Mobile apps use `BLPlatformClient` (REST) to talk to platform-service, which would route push notifications.

### 3.7 Application Insights — SHALLOW (1 file)

- `opencensus-ext-azure` in Python requirements (optional telemetry)
- Custom telemetry system already built (`@bytelyst/telemetry-client`, platform-service telemetry module with Cosmos storage)

The custom telemetry system means App Insights is supplementary, not critical.

---

## 4. Migration Target Comparison

### 4.1 Database: Cosmos DB → Alternatives

| Feature | Azure Cosmos DB (current) | MongoDB Atlas | AWS DynamoDB | Google Firestore | PostgreSQL (Supabase/Neon) |
|---------|--------------------------|---------------|-------------|-----------------|---------------------------|
| **Data model** | Document (JSON) | Document (JSON) | Key-Value + Document | Document (JSON) | Relational + JSONB |
| **Query language** | SQL-like | MQL (MongoDB Query) | PartiQL / API | GQL-like API | SQL |
| **Partition keys** | Required | Shard keys (optional) | Required | Collection groups | Not applicable |
| **Serverless** | Yes | Yes (Atlas Serverless) | Yes | Yes | Yes (Neon) |
| **SQL queries** | `SELECT c.id FROM c WHERE c.x = @y` | `db.collection.find({x: y})` | `SELECT id FROM table WHERE x = ?` | Client SDK queries | Standard SQL |
| **Aggregates** | Basic (`COUNT`, `SUM`, `AVG`) | Full (`$group`, `$match`, `$lookup`) | Limited | Limited | Full SQL |
| **Cross-partition** | Yes (expensive) | Yes (scatter-gather) | Scan (expensive) | Yes | N/A |
| **Change feed** | Yes | Change Streams | DynamoDB Streams | Real-time listeners | Logical replication |
| **Global distribution** | Built-in multi-region | Atlas Global Clusters | Global Tables | Multi-region | Manual / Citus |
| **Max doc size** | 2 MB | 16 MB | 400 KB | 1 MB | Unlimited (JSONB) |
| **Free tier** | 1000 RU/s + 25 GB | 512 MB | 25 GB + 25 WCU/RCU | 1 GiB + 50K reads/day | 0.5 GB (Neon) |
| **Migration effort** | — | **Medium** (query rewrite) | **Hard** (paradigm shift) | **Hard** (no SQL) | **Hard** (schema design) |

### 4.2 Object Storage: Blob → Alternatives

| Feature | Azure Blob (current) | AWS S3 | GCP Cloud Storage | Cloudflare R2 | MinIO (self-hosted) |
|---------|---------------------|--------|-------------------|---------------|---------------------|
| **API compatibility** | Azure Blob API | S3 API | GCS API / S3-compat | S3-compatible | S3-compatible |
| **SAS tokens** | Yes (Azure SAS) | Pre-signed URLs | Signed URLs | Pre-signed URLs | Pre-signed URLs |
| **CDN integration** | Azure CDN | CloudFront | Cloud CDN | Built-in | Manual |
| **Cost (per GB)** | $0.018 (Cool) | $0.023 (Standard) | $0.020 | $0.015 (no egress) | Self-hosted |
| **Migration effort** | — | **Easy** | **Easy** | **Easy** | **Easy** |

### 4.3 Speech-to-Text: Azure Speech → Alternatives

| Feature | Azure Speech (current) | AWS Transcribe | Google Speech-to-Text | Deepgram | Whisper (local) |
|---------|----------------------|----------------|----------------------|----------|-----------------|
| **Streaming STT** | Yes (push stream) | Yes (WebSocket) | Yes (streaming) | Yes (WebSocket) | No (batch only) |
| **Languages** | 100+ | 100+ | 125+ | 36+ | 99+ |
| **Auto-detect lang** | Up to 10 at-once | Yes | Yes | Yes | Yes |
| **Custom vocabulary** | PhraseListGrammar | Custom vocabulary | Speech adaptation | Keywords | No |
| **Native SDK** | Python, Swift (SPX), JS | Python, no iOS SDK | Python, iOS, JS | REST/WebSocket | Python only |
| **iOS native SDK** | SPX framework (ObjC) | No native SDK | Yes (gRPC) | No native SDK | No |
| **Free tier** | 5 hrs/month (F0) | 60 min/month | 60 min/month | None | Free (local GPU) |
| **Latency** | ~200ms | ~300ms | ~200ms | ~100ms | ~500ms+ (local) |
| **Migration effort** | — | **Hard** (no iOS SDK) | **Medium** (has iOS SDK) | **Medium** (REST only) | **Hard** (no streaming) |

### 4.4 LLM / AI: Azure OpenAI → Alternatives

| Feature | Azure OpenAI (current) | OpenAI API (direct) | Google Gemini | AWS Bedrock | Anthropic Claude |
|---------|----------------------|--------------------|--------------|-----------|-----------------|
| **Models** | GPT-4o, GPT-4o-mini | Same models | Gemini 2.5 | Claude, Llama, Titan | Claude 3.5/4 |
| **API compatibility** | OpenAI SDK (azure mode) | OpenAI SDK (native) | Google SDK | AWS SDK | Anthropic SDK |
| **Data residency** | Azure regions | US only | Google regions | AWS regions | US/EU |
| **Cost (GPT-4o-mini)** | $0.15/$0.60 per M tokens | $0.15/$0.60 per M tokens | ~$0.10/$0.40 (Flash) | Varies | ~$0.25/$1.25 (Haiku) |
| **Migration effort** | — | **Trivial** (change endpoint) | **Easy** (SDK swap) | **Medium** | **Easy** (SDK swap) |

### 4.5 Secrets Management: Key Vault → Alternatives

| Feature | Azure Key Vault (current) | AWS Secrets Manager | GCP Secret Manager | HashiCorp Vault | Doppler / Infisical |
|---------|--------------------------|--------------------|--------------------|-----------------|---------------------|
| **Cost** | $0.03/10K ops | $0.40/secret/month | $0.06/10K ops | Free (OSS) | Free tier |
| **SDK** | `@azure/keyvault-secrets` | `@aws-sdk/client-secrets-manager` | `@google-cloud/secret-manager` | HTTP API | SDK / CLI |
| **Migration effort** | — | **Easy** | **Easy** | **Medium** | **Easy** |

**Note:** The codebase already falls back to env vars when Key Vault is unavailable. This means Key Vault can be replaced by **any** secrets manager or simply .env files without code changes to application logic.

### 4.6 Push Notifications: Notification Hubs → Alternatives

| Feature | Azure NH (current) | AWS SNS | Firebase Cloud Messaging | OneSignal | Expo Push |
|---------|-------------------|---------|--------------------------|-----------|-----------|
| **APNs + FCM** | Yes | Yes | FCM only (APNs via FCM) | Yes | Yes |
| **Free tier** | 1M pushes/month | 1M publishes | Unlimited | 10K subscribers | Unlimited |
| **Migration effort** | — | **Easy** | **Easy** | **Easy** | **Easy** (NomGap uses Expo) |

---

## 5. Per-Service Migration Analysis

### 5.1 Cosmos DB → MongoDB Atlas

**Difficulty: MEDIUM-HIGH** | **Effort: 3–5 weeks** | **Risk: MEDIUM**

This is the **single largest migration task**. Here's why:

#### What needs to change

| Layer | Current (Cosmos SQL API) | Target (MongoDB) | Files |
|-------|--------------------------|-------------------|-------|
| Client package | `@azure/cosmos` → `CosmosClient` | `mongodb` → `MongoClient` | `packages/cosmos/src/client.ts` |
| Container registry | `getContainer(name)` | `db.collection(name)` | `packages/cosmos/src/containers.ts` |
| All repository files | `container.items.query('SELECT...')` | `collection.find({...})` | **56+ files** in platform-service |
| Dashboard Cosmos clients | `@azure/cosmos` direct | `mongodb` direct | 2 files (admin, MindLyst) |
| Python clients | `azure.cosmos.CosmosClient` | `pymongo.MongoClient` | 2 files |
| Query syntax | SQL-like (`SELECT c.id FROM c WHERE c.productId = @pid AND c.userId = @uid ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20`) | MQL (`collection.find({productId: pid, userId: uid}).sort({createdAt: -1}).skip(0).limit(20)`) | All repository files |
| Partition keys | Explicit partition key in every query | Shard key (auto-routed) | All repository files |
| Upsert | `container.items.upsert(doc)` | `collection.replaceOne({_id: id}, doc, {upsert: true})` (whole-document replace, matching Cosmos upsert; `updateOne` with `$set` would merge fields instead) | ~20 files |
| Read by ID | `container.item(id, partitionKey).read()` | `collection.findOne({_id: id})` | All repository files |

#### What stays the same
- Document structure (JSON documents with `id`, `productId`, partition keys)
- Data model (no schema changes needed — MongoDB is also schemaless)
- Partition key concept maps to shard key
- Serverless pricing model available on both

#### Key migration steps
1. Update `@bytelyst/cosmos` package to export MongoDB-compatible API
2. Rewrite all SQL queries to MQL (56+ files)
3. Replace `container.items.query()` → `collection.find()`
4. Replace `container.item(id, pk).read()` → `collection.findOne({_id: id})`
5. Replace `container.items.create()` → `collection.insertOne()`
6. Replace `container.item(id, pk).replace()` → `collection.replaceOne()`
7. Replace `container.items.upsert()` → `collection.replaceOne(filter, doc, {upsert: true})`
8. Update Python clients similarly
9. Migrate data (use Azure Data Factory or custom script)
10. Update all test mocks
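
The SQL-to-MQL rewrite in steps 2 and 3 can be sketched as a pure translation from list options to the arguments of a MongoDB `find()` call. The `ListOptions` and `MongoFindPlan` types and the `toMongoFindPlan` helper are hypothetical. The `id` to `_id` mapping is an assumption that follows the `findOne({_id: id})` examples above.

```typescript
// Hypothetical sketch: translate simple list options into the pieces of a
// MongoDB find() call (filter, sort, skip, limit).
type Scalar = string | number | boolean;

interface ListOptions {
  filters: Record<string, Scalar>; // equality filters only
  orderBy?: { field: string; descending?: boolean };
  offset?: number;
  limit?: number;
}

interface MongoFindPlan {
  filter: Record<string, Scalar>;
  sort: Record<string, 1 | -1>;
  skip: number;
  limit: number; // 0 is treated as "no limit" by MongoDB
}

// Cosmos documents carry `id`; MongoDB's primary key field is `_id`.
function mapField(field: string): string {
  return field === "id" ? "_id" : field;
}

function toMongoFindPlan(opts: ListOptions): MongoFindPlan {
  const filter: Record<string, Scalar> = {};
  for (const field of Object.keys(opts.filters)) {
    filter[mapField(field)] = opts.filters[field];
  }

  const sort: Record<string, 1 | -1> = {};
  if (opts.orderBy) {
    sort[mapField(opts.orderBy.field)] = opts.orderBy.descending ? -1 : 1;
  }

  return {
    filter,
    sort,
    skip: opts.offset ?? 0,
    limit: opts.limit ?? 0,
  };
}
```

With the Node.js driver, such a plan would be applied as `collection.find(plan.filter).sort(plan.sort).skip(plan.skip).limit(plan.limit)`. Queries that use `ARRAY_CONTAINS()` or aggregates need their own translation and are outside this sketch.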

#### Why MongoDB Atlas is the best DB alternative
- **Closest query model** to Cosmos SQL API (both are document DBs)
- **Cosmos DB offers a MongoDB-compatible API** (Azure Cosmos DB for MongoDB), but going native on Atlas is better
- Cosmos DB was originally inspired by MongoDB's document model
- MongoDB's `find()` queries map closely to Cosmos SQL `SELECT` queries
- Both support partition/shard keys, TTL indexes, change streams
- MongoDB Atlas Serverless pricing is competitive
- MongoDB has excellent TypeScript and Python SDKs

### 5.2 Azure Speech → Google Cloud Speech-to-Text

**Difficulty: HIGH** | **Effort: 2–3 weeks** | **Risk: HIGH**

#### Why this is hard
- The Azure Speech SDK uses a **push-stream architecture** (`PushAudioInputStream`) that is deeply integrated into the audio pipeline
- The `SpeechRecognizer` has event-driven callbacks (`recognizing`, `recognized`, `canceled`, `session_stopped`) that the code relies on for real-time partial/final transcript delivery
- Custom vocabulary via `PhraseListGrammar` is Azure-specific
- Auto-language detection config is Azure-specific
- The **iOS SPX framework** (Objective-C) is used in LysnrAI keyboard extension and MindLyst — there's no direct equivalent for most alternatives

#### Best alternative: Google Cloud Speech-to-Text
- Has streaming recognition with similar event model
- Has an iOS SDK (gRPC-based)
- Supports custom vocabulary (speech adaptation)
- Supports auto-language detection
- Similar pricing and free tier

#### What needs to change
- `src/audio/azure_stt.py` — complete rewrite (~248 lines)
- `iosApp/Services/AzureSpeechTranscriber.swift` — complete rewrite
- `LysnrAI/LysnrKeyboard/` — keyboard extension STT integration
- Audio format handling (may differ between providers)
- Connection test code in settings UI

### 5.3 Blob Storage → AWS S3 or Cloudflare R2

**Difficulty: LOW** | **Effort: 2–3 days** | **Risk: LOW**

#### Why this is easy
- `@bytelyst/blob` package is a thin wrapper (162 lines)
- Only 3 files need changes
- S3 API is the de facto standard — R2, MinIO, GCS all support S3-compatible API
- SAS tokens → Pre-signed URLs (same concept, different implementation)

#### What needs to change
- `packages/blob/src/blob.ts` — swap `@azure/storage-blob` → `@aws-sdk/client-s3` + `@aws-sdk/s3-request-presigner`
- `src/cloud/blob_client.py` — swap `azure.storage.blob` → `boto3`
- `services/platform-service/src/modules/blob/` — update routes for pre-signed URL format
- Environment variables: `AZURE_BLOB_*` → `AWS_S3_*` or `S3_*`

### 5.4 Azure OpenAI → OpenAI API (direct) or Gemini

**Difficulty: TRIVIAL** | **Effort: < 1 day** | **Risk: VERY LOW**

#### Why this is trivial
- The `openai` Python SDK supports both Azure and OpenAI endpoints — just change config
- MindLyst web `llm.ts` **already auto-detects** Azure vs OpenAI and builds the correct URL
- LysnrAI desktop uses `AzureOpenAI` class from `openai` SDK — switch to `OpenAI` class
- Same models, same API shape, same pricing

#### What needs to change
- Set `OPENAI_API_KEY` instead of `AZURE_OPENAI_*` env vars
- Change `AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)` → `OpenAI(api_key=...)`
- Change `AsyncAzureOpenAI(...)` → `AsyncOpenAI(...)`
- Remove `api_version` parameter
- That's it. The `openai` SDK handles the rest.
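
Provider auto-detection of the kind attributed to `llm.ts` can be sketched as a pure function that picks the request URL and auth header from configuration. The document does not show the actual `llm.ts` logic, so the function name, the `LlmEnv` shape, and the exact environment variable names below are assumptions. The URL and header formats follow the public Azure OpenAI and OpenAI REST conventions.

```typescript
// Hypothetical sketch: choose Azure OpenAI when fully configured, otherwise
// fall back to the OpenAI API.
interface LlmEnv {
  AZURE_OPENAI_ENDPOINT?: string;    // e.g. https://<resource>.openai.azure.com
  AZURE_OPENAI_API_KEY?: string;
  AZURE_OPENAI_DEPLOYMENT?: string;  // Azure deployment name
  AZURE_OPENAI_API_VERSION?: string; // required for Azure requests
  OPENAI_API_KEY?: string;
}

interface ChatTarget {
  provider: "azure" | "openai";
  url: string;
  headers: Record<string, string>;
}

function resolveChatTarget(env: LlmEnv): ChatTarget {
  const endpoint = env.AZURE_OPENAI_ENDPOINT;
  const azureKey = env.AZURE_OPENAI_API_KEY;
  const deployment = env.AZURE_OPENAI_DEPLOYMENT;
  const apiVersion = env.AZURE_OPENAI_API_VERSION;

  if (endpoint && azureKey && deployment && apiVersion) {
    const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
    return {
      provider: "azure",
      url:
        base + "/openai/deployments/" + encodeURIComponent(deployment) +
        "/chat/completions?api-version=" + encodeURIComponent(apiVersion),
      headers: { "api-key": azureKey, "Content-Type": "application/json" },
    };
  }

  if (env.OPENAI_API_KEY) {
    return {
      provider: "openai",
      url: "https://api.openai.com/v1/chat/completions",
      headers: {
        Authorization: "Bearer " + env.OPENAI_API_KEY,
        "Content-Type": "application/json",
      },
    };
  }

  throw new Error("No LLM provider configured");
}
```

One difference this sketch does not cover: Azure routes by deployment name in the URL, while the OpenAI API expects the model name in the request body, so the caller still has to set `model` when the OpenAI branch is taken.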

### 5.5 Key Vault → Environment Variables / Any Secrets Manager

**Difficulty: TRIVIAL** | **Effort: < 1 day** | **Risk: VERY LOW**

Both `keyvault.ts` and `keyvault.py` already implement graceful fallback:
- If `AZURE_KEYVAULT_URL` is not set → uses env vars directly
- If Key Vault is unreachable → falls back to env vars

**To migrate:** Simply stop setting `AZURE_KEYVAULT_URL`. Everything works via env vars. Then optionally adopt any other secrets manager (AWS Secrets Manager, Doppler, Infisical, etc.).

### 5.6 Notification Hubs → Firebase Cloud Messaging

**Difficulty: LOW** | **Effort: 1–2 days** | **Risk: LOW**

Not yet deeply integrated. The platform-service notification module sends via REST API. Swap the push provider client.

### 5.7 Application Insights → Self-hosted / Grafana

**Difficulty: TRIVIAL** | **Effort: Already done** | **Risk: NONE**

The ecosystem already has:
- Custom telemetry system (`@bytelyst/telemetry-client` → platform-service → Cosmos)
- Loki + Grafana in `services/monitoring/`
- App Insights is supplementary, can be dropped with zero code changes

---

## 6. Migration Scenario Scoring

### Scenario A: Stay on Azure (Status Quo)

| Dimension | Score (1-5) | Notes |
|-----------|-------------|-------|
| Migration effort | **5** (none) | No work needed |
| Cost | **4** | ~$15/month at current scale, competitive |
| Vendor diversity | **1** | Single cloud vendor |
| Feature parity | **5** | Everything works today |
| **Total** | **15/20** | |

### Scenario B: Full Migration to AWS

| Dimension | Score (1-5) | Notes |
|-----------|-------------|-------|
| Migration effort | **2** | 6–8 weeks, Cosmos→DynamoDB is painful |
| Cost | **3** | Similar or slightly higher at small scale |
| Vendor diversity | **1** | Still single cloud, just different |
| Feature parity | **3** | No native iOS Speech SDK, DynamoDB query model is very different |
| **Total** | **9/20** | |

### Scenario C: Multi-Cloud (MongoDB Atlas + OpenAI + R2 + Google STT)

| Dimension | Score (1-5) | Notes |
|-----------|-------------|-------|
| Migration effort | **2** | 5–7 weeks, Cosmos→MongoDB is medium |
| Cost | **4** | MongoDB Atlas free tier, R2 no egress fees |
| Vendor diversity | **5** | No single-vendor dependency |
| Feature parity | **4** | MongoDB is a better document DB than Cosmos in many ways |
| **Total** | **15/20** | |

### Scenario D: Stay Azure + Add Abstraction Layers

| Dimension | Score (1-5) | Notes |
|-----------|-------------|-------|
| Migration effort | **4** | 1–2 weeks to add repository interface pattern |
| Cost | **4** | No change |
| Vendor diversity | **3** | Ready to switch, but still on Azure |
| Feature parity | **5** | Everything works today |
| **Total** | **16/20** | **Winner** |

### Scenario E: Migrate DB Only (Cosmos → MongoDB Atlas, keep rest on Azure)

| Dimension | Score (1-5) | Notes |
|-----------|-------------|-------|
| Migration effort | **3** | 3–5 weeks for DB migration |
| Cost | **4** | MongoDB Atlas Serverless may be cheaper |
| Vendor diversity | **3** | DB is independent, other services still Azure |
| Feature parity | **5** | MongoDB is very capable |
| **Total** | **15/20** | |
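
The scenario totals above are plain sums of four equally weighted scores. A minimal sketch of that arithmetic follows, with the scores copied from the tables; the type and function names are illustrative, and equal weighting is an assumption read off the `/20` totals. It recomputes each total and ranks the scenarios.

```typescript
// Scores copied from the scenario tables; names are shortened labels.
interface ScenarioScore {
  name: string;
  effort: number;    // Migration effort (5 = least work)
  cost: number;
  diversity: number; // Vendor diversity
  parity: number;    // Feature parity
}

const scenarios: ScenarioScore[] = [
  { name: "A: Stay on Azure", effort: 5, cost: 4, diversity: 1, parity: 5 },
  { name: "B: Full AWS", effort: 2, cost: 3, diversity: 1, parity: 3 },
  { name: "C: Multi-cloud", effort: 2, cost: 4, diversity: 5, parity: 4 },
  { name: "D: Azure + abstraction layers", effort: 4, cost: 4, diversity: 3, parity: 5 },
  { name: "E: DB only", effort: 3, cost: 4, diversity: 3, parity: 5 },
];

// Equal weights: the total is the sum of the four dimensions (max 20).
function totalScore(s: ScenarioScore): number {
  return s.effort + s.cost + s.diversity + s.parity;
}

// Highest total first; the input array is left untouched.
function rankScenarios(list: ScenarioScore[]): { name: string; total: number }[] {
  return list
    .map((s) => ({ name: s.name, total: totalScore(s) }))
    .sort((a, b) => b.total - a.total);
}

const ranking = rankScenarios(scenarios);
console.log(ranking.map((r) => r.name + " = " + r.total + "/20").join("\n"));
```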

---

## 7. Cost Comparison

### Current Azure Costs (MVP / Low Usage)

| Service | Monthly Cost | Notes |
|---------|-------------|-------|
| Cosmos DB (Serverless) | ~$4–10 | 3 databases, ~45 containers |
| Blob Storage (Cool, RAGRS) | ~$0.20 | 9+ containers |
| Azure OpenAI (GPT-4o-mini) | ~$5–10 | Pay per token |
| Speech (F0) | $0 | 5 hrs/month free |
| Key Vault | ~$0.06 | ~25 secrets |
| Notification Hubs (Free) | $0 | 1M pushes/month |
| App Insights | $0 | 5 GB/month free |
| **Total** | **~$10–20/month** | |

### Equivalent AWS Costs

| Service | AWS Equivalent | Monthly Cost |
|---------|---------------|-------------|
| Cosmos DB → DynamoDB (On-Demand) | DynamoDB | ~$5–15 |
| Blob → S3 Standard | S3 | ~$0.25 |
| Azure OpenAI → OpenAI API | Same pricing | ~$5–10 |
| Speech → Transcribe | Transcribe | ~$1–3 |
| Key Vault → Secrets Manager | Secrets Manager | ~$10 (per-secret pricing) |
| Notification Hubs → SNS | SNS | ~$0.50 |
| App Insights → CloudWatch | CloudWatch | ~$3 |
| **Total** | | **~$25–42/month** |

### Equivalent Multi-Cloud Costs

| Service | Provider | Monthly Cost |
|---------|---------|-------------|
| Cosmos DB → MongoDB Atlas Serverless | MongoDB | ~$3–8 |
| Blob → Cloudflare R2 | Cloudflare | ~$0.15 (no egress) |
| Azure OpenAI → OpenAI API (direct) | OpenAI | ~$5–10 |
| Speech → Google STT | Google Cloud | ~$1–3 |
| Key Vault → Doppler (free tier) | Doppler | $0 |
| Push → Firebase FCM | Google | $0 |
| Monitoring → Grafana Cloud (free) | Grafana | $0 |
| **Total** | | **~$10–22/month** |

### Cost Summary

| Scenario | Monthly Cost | vs Current |
|----------|-------------|-----------|
| **Azure (current)** | ~$10–20 | Baseline |
| **Full AWS** | ~$25–42 | +110–150% |
| **Multi-cloud** | ~$10–22 | ~Same |
| **MongoDB Atlas + Azure rest** | ~$10–18 | ~Same |

**Verdict:** At current scale, cost is not a compelling reason to migrate. All options are under $50/month. Cost becomes more significant at scale (10K+ users), where MongoDB Atlas and R2 would likely be cheaper due to no egress fees and better serverless pricing.
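
The ranges in the cost tables can be checked mechanically. The sketch below sums the Azure line items and compares the rounded scenario ranges from the Cost Summary. It assumes ranges are compared end to end (low against low, high against high); the `CostRange` type and helper names are illustrative.

```typescript
interface CostRange {
  low: number;  // USD per month, lower bound
  high: number; // USD per month, upper bound
}

// Add up line items, keeping lower and upper bounds separate.
function sumRanges(items: CostRange[]): CostRange {
  let low = 0;
  let high = 0;
  for (const item of items) {
    low += item.low;
    high += item.high;
  }
  return { low, high };
}

// Percentage change relative to a baseline, compared end to end.
function percentChange(
  base: CostRange,
  other: CostRange
): { atLowEnd: number; atHighEnd: number } {
  const pct = (from: number, to: number): number =>
    Math.round(((to - from) / from) * 100);
  return {
    atLowEnd: pct(base.low, other.low),
    atHighEnd: pct(base.high, other.high),
  };
}

// Azure line items from "Current Azure Costs".
const azureItems: CostRange[] = [
  { low: 4, high: 10 },      // Cosmos DB
  { low: 0.2, high: 0.2 },   // Blob Storage
  { low: 5, high: 10 },      // Azure OpenAI
  { low: 0, high: 0 },       // Speech (F0)
  { low: 0.06, high: 0.06 }, // Key Vault
  { low: 0, high: 0 },       // Notification Hubs
  { low: 0, high: 0 },       // App Insights
];

// Rounded totals from the Cost Summary table.
const azureTotal: CostRange = { low: 10, high: 20 };
const awsTotal: CostRange = { low: 25, high: 42 };
const multiCloudTotal: CostRange = { low: 10, high: 22 };

console.log(sumRanges(azureItems));
console.log(percentChange(azureTotal, awsTotal));
console.log(percentChange(azureTotal, multiCloudTotal));
```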

---

## 8. Abstraction Layer Assessment

### Current State: Partially Abstracted

The codebase already has meaningful abstraction:

| Layer | Abstraction Level | Notes |
|-------|-------------------|-------|
| **Cosmos DB** | **Partial** — `@bytelyst/cosmos` package | Application code still writes raw SQL queries and uses `@azure/cosmos` types |
| **Blob Storage** | **Good** — `@bytelyst/blob` package | Thin wrapper, easy to swap internals |
| **OpenAI/LLM** | **Good** — MindLyst has provider auto-detection | LysnrAI desktop/backend hardcodes `AzureOpenAI` |
| **Key Vault** | **Excellent** — graceful fallback to env vars | Already cloud-agnostic in practice |
| **Speech** | **None** — raw SDK usage | Deep Azure SDK coupling in 3 files |
| **Auth (JWT)** | **Excellent** — uses `jose` library | No cloud dependency |
| **Push notifications** | **Good** — platform-service abstraction | Swap provider client only |

### What's Missing: Repository Interface Pattern

The biggest gap is that repository files directly use `@azure/cosmos` types and SQL query syntax. To make the DB layer swappable, you'd need:

```typescript
// Proposed: packages/cosmos/src/repository.ts

// Assumed shape: the proposal references QueryOptions without defining it.
export interface QueryOptions {
  limit?: number;
  offset?: number;
}

export interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}
```

Once the repositories depend on this interface, swapping Cosmos → MongoDB → PostgreSQL happens behind it, without touching the 56+ repository files again.

**Effort to add:** 1–2 weeks. This is the **highest-ROI investment** regardless of migration decision.
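
As a rough illustration of how a second backend could sit behind the proposed interface, the sketch below uses an in-memory store as a stand-in for MongoDB or PostgreSQL. `QueryOptions`, `BaseDoc`, and `InMemoryDocumentRepository` are illustrative names that do not exist in the codebase, and the sketch assumes every document exposes `id` and `partitionKey` fields.

```typescript
// Assumed option shape; the proposal does not define QueryOptions.
interface QueryOptions {
  limit?: number;
  offset?: number;
}

// Same method set as the proposed DocumentRepository<T>.
interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}

// Hypothetical document shape: an id plus the partition key value.
interface BaseDoc {
  id: string;
  partitionKey: string;
}

class InMemoryDocumentRepository<T extends BaseDoc> implements DocumentRepository<T> {
  private readonly docs = new Map<string, T>();

  private key(id: string, partitionKey: string): string {
    return `${partitionKey}/${id}`;
  }

  // Equality-only matching: every filter field must equal the document field.
  private matches(doc: T, filter: Record<string, unknown>): boolean {
    const record = doc as unknown as Record<string, unknown>;
    return Object.keys(filter).every((field) => record[field] === filter[field]);
  }

  async findById(id: string, partitionKey: string): Promise<T | null> {
    return this.docs.get(this.key(id, partitionKey)) ?? null;
  }

  async findMany(filter: Record<string, unknown>, opts: QueryOptions = {}): Promise<T[]> {
    const all = Array.from(this.docs.values()).filter((d) => this.matches(d, filter));
    const start = opts.offset ?? 0;
    const end = opts.limit === undefined ? undefined : start + opts.limit;
    return all.slice(start, end);
  }

  async create(doc: T): Promise<T> {
    const k = this.key(doc.id, doc.partitionKey);
    if (this.docs.has(k)) throw new Error(`Document already exists: ${k}`);
    this.docs.set(k, doc);
    return doc;
  }

  async replace(id: string, doc: T, partitionKey: string): Promise<T> {
    const k = this.key(id, partitionKey);
    if (!this.docs.has(k)) throw new Error(`Document not found: ${k}`);
    this.docs.set(k, doc);
    return doc;
  }

  async upsert(doc: T): Promise<T> {
    this.docs.set(this.key(doc.id, doc.partitionKey), doc);
    return doc;
  }

  async delete(id: string, partitionKey: string): Promise<void> {
    this.docs.delete(this.key(id, partitionKey));
  }

  async count(filter: Record<string, unknown>): Promise<number> {
    return (await this.findMany(filter)).length;
  }
}
```

An implementation along these lines could also serve as a test fake, letting route handlers be exercised without a Cosmos emulator.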

---

## 9. Risk Analysis

### 9.1 Risks of Staying on Azure

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Azure pricing increases | Low | Medium | Add abstraction layer for future portability |
| Azure outage | Low | High | Multi-region already possible (Cosmos global distribution) |
| Feature stagnation | Very Low | Low | Azure is investing heavily in AI services |
| Vendor lock-in deepens over time | Medium | Medium | Add abstraction layers proactively |

### 9.2 Risks of Migrating

| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|-----------|
| Data loss during migration | Low | Critical | Test migration on staging first, keep Azure as backup |
| Query performance differences | Medium | Medium | Benchmark before committing |
| Feature gaps in new provider | Medium | Medium | Prototype critical features first |
| Wasted engineering time | Medium | High | Only migrate if there's a clear business driver |
| Regression bugs in 56+ repository files | High | Medium | Comprehensive test suite (1,029 tests) catches most issues |
| Speech quality degradation | Medium | High | A/B test both providers before committing |

### 9.3 Azure-Specific Lock-in Risks (ranked)

| # | Component | Lock-in Level | Escape Hatch |
|---|-----------|--------------|-------------|
| 1 | **Cosmos DB SQL API** | High | Rewrite queries to MongoDB MQL or add repository interface |
| 2 | **Azure Speech SDK (streaming)** | High | Google STT has comparable streaming API |
| 3 | **Azure Identity (DefaultAzureCredential)** | Medium | Only used by Key Vault, which is already optional |
| 4 | **Blob Storage SAS tokens** | Low | Pre-signed URLs are equivalent across all providers |
| 5 | **Azure OpenAI** | Very Low | OpenAI SDK works with both — 1-line config change |
| 6 | **Key Vault** | Very Low | Already has env var fallback |
| 7 | **Notification Hubs** | Very Low | Not deeply integrated yet |
| 8 | **Application Insights** | None | Custom telemetry already built |

---

## 10. Recommendations

### Recommended Strategy: **Stay on Azure + Invest in Abstraction** (Scenario D)

This is the highest-scoring approach. Here's the prioritized action plan:

#### Phase 1: Add Repository Interface (1–2 weeks)

- Create `DocumentRepository<T>` interface in `@bytelyst/cosmos`
- Implement `CosmosDocumentRepository<T>` that wraps current `@azure/cosmos` calls
- Gradually migrate the 56 repository files to use the interface
- This makes future DB migration a matter of implementing `MongoDocumentRepository<T>` — no application code changes needed

#### Phase 2: Normalize LLM Abstraction (2–3 days)

- Move LysnrAI desktop/backend from `AzureOpenAI` → auto-detecting provider pattern (like MindLyst web already does)
- Support `OPENAI_PROVIDER=azure|openai|gemini` across all repos
- This makes LLM provider swappable via config
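
One possible shape for the auto-detecting provider pattern is sketched below. Only `OPENAI_PROVIDER` comes from the plan above; the other environment variable names (`AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`) and the function names are assumptions, and the existing MindLyst resolver may work differently.

```typescript
const PROVIDERS = ["azure", "openai", "gemini"] as const;
type LlmProvider = (typeof PROVIDERS)[number];
type Env = Record<string, string | undefined>;

interface LlmConfig {
  provider: LlmProvider;
  apiKey: string;
  endpoint?: string; // set only for Azure in this sketch
}

function isProvider(value: string): value is LlmProvider {
  return (PROVIDERS as readonly string[]).indexOf(value) !== -1;
}

// An explicit OPENAI_PROVIDER wins; otherwise infer from which credentials are present.
function detectProvider(env: Env): LlmProvider {
  const explicit = env.OPENAI_PROVIDER;
  if (explicit !== undefined && explicit !== "") {
    const normalized = explicit.toLowerCase();
    if (!isProvider(normalized)) throw new Error(`Unsupported OPENAI_PROVIDER: ${explicit}`);
    return normalized;
  }
  if (env.AZURE_OPENAI_ENDPOINT && env.AZURE_OPENAI_API_KEY) return "azure";
  if (env.OPENAI_API_KEY) return "openai";
  if (env.GEMINI_API_KEY) return "gemini";
  throw new Error("No LLM provider credentials found");
}

// Assumed variable names, one API key variable per provider.
const KEY_VARS: Record<LlmProvider, string> = {
  azure: "AZURE_OPENAI_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

function resolveLlmConfig(env: Env): LlmConfig {
  const provider = detectProvider(env);
  const keyVar = KEY_VARS[provider];
  const apiKey = env[keyVar];
  if (!apiKey) throw new Error(`${keyVar} is required for provider "${provider}"`);
  if (provider === "azure") {
    const endpoint = env.AZURE_OPENAI_ENDPOINT;
    if (!endpoint) throw new Error('AZURE_OPENAI_ENDPOINT is required for provider "azure"');
    return { provider, apiKey, endpoint };
  }
  return { provider, apiKey };
}
```

Passing the environment in as a plain object, rather than reading `process.env` inside the function, keeps the resolver easy to unit test and usable from both server and build-time code.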

#### Phase 3: Speech Abstraction Layer (1 week, optional)

- Create `SpeechTranscriber` protocol/interface
- Implement `AzureSpeechTranscriber` (current code, extracted)
- Prepare `GoogleSpeechTranscriber` stub for future use
- This is lower priority since Azure Speech F0 tier is free
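
The speech code itself is Python and Swift, but the contract can be sketched in TypeScript to show the idea. The method names, the `TranscriptEvent` shape, and `FakeSpeechTranscriber` below are hypothetical; they loosely mirror the partial and final recognition callbacks of a streaming recognizer and are not taken from the existing code.

```typescript
// Hypothetical event: partial hypotheses have isFinal = false, committed text has isFinal = true.
interface TranscriptEvent {
  text: string;
  isFinal: boolean;
}

// Hypothetical provider-neutral contract for streaming speech-to-text.
interface SpeechTranscriber {
  start(onEvent: (event: TranscriptEvent) => void): Promise<void>;
  pushAudio(chunk: Uint8Array): void;
  stop(): Promise<void>;
}

// Stand-in implementation that reports byte counts instead of recognized speech.
class FakeSpeechTranscriber implements SpeechTranscriber {
  private onEvent: ((event: TranscriptEvent) => void) | null = null;
  private bytesSeen = 0;

  async start(onEvent: (event: TranscriptEvent) => void): Promise<void> {
    this.onEvent = onEvent;
    this.bytesSeen = 0;
  }

  pushAudio(chunk: Uint8Array): void {
    if (!this.onEvent) throw new Error("Transcriber not started");
    this.bytesSeen += chunk.byteLength;
    this.onEvent({ text: `partial (${this.bytesSeen} bytes)`, isFinal: false });
  }

  async stop(): Promise<void> {
    if (!this.onEvent) return;
    this.onEvent({ text: `final (${this.bytesSeen} bytes)`, isFinal: true });
    this.onEvent = null;
  }
}

// Callers depend only on the interface, so the provider is chosen in one place.
async function transcribe(transcriber: SpeechTranscriber, chunks: Uint8Array[]): Promise<string[]> {
  const finals: string[] = [];
  await transcriber.start((event) => {
    if (event.isFinal) finals.push(event.text);
  });
  for (const chunk of chunks) transcriber.pushAudio(chunk);
  await transcriber.stop();
  return finals;
}
```

An `AzureSpeechTranscriber` and a future `GoogleSpeechTranscriber` would each implement the same three methods, which is what makes an A/B comparison between providers a configuration choice rather than a rewrite.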

#### Phase 4: Document Decision Criteria for Future Migration

- Define triggers that would justify migration (e.g., cost > $X/month, Azure outage > Y hours, need for feature Z)
- Review annually

### Why NOT Migrate Now

1. **Cost is negligible** — ~$10–20/month doesn't justify weeks of engineering
2. **No business driver** — Azure isn't blocking any feature development
3. **Risk/reward is unfavorable** — 4–8 weeks of migration work for ~$0 cost savings
4. **Test coverage is good but not perfect** — 1,029 tests cover most paths, but query-level changes in 56 files still risk regressions
5. **Azure free tiers are generous** — Speech F0, Notification Hubs Free, App Insights free tier

### When Migration WOULD Make Sense

- **Cosmos DB costs exceed $100/month** → Consider MongoDB Atlas Serverless
- **Azure Speech quality is insufficient** → Evaluate Google STT or Deepgram
- **Enterprise customer requires specific cloud** → Build the repository interface, then implement their cloud backend
- **Azure has extended outage affecting your region** → Multi-region or multi-cloud
- **You want to go fully open-source** → PostgreSQL (Supabase) + Whisper + MinIO (significant rewrite)

---

## 11. Migration Playbook (If Chosen)

If you decide to migrate in the future, here's the execution order (shortest critical path):

### Week 1–2: Database Abstraction

1. Create `DocumentRepository<T>` interface
2. Implement `CosmosDocumentRepository<T>` (wraps current code)
3. Migrate all 56 repository files to use interface
4. Verify all 1,029 tests pass

### Week 3–4: Database Migration (Cosmos → MongoDB)

1. Implement `MongoDocumentRepository<T>`
2. Set up MongoDB Atlas Serverless cluster
3. Write data migration script (Cosmos → MongoDB)
4. Run migration on staging, verify data integrity
5. Switch repository implementation via config flag
6. Run full test suite against MongoDB
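
For the "verify data integrity" step, one approach is to export both sides and compare them document by document. The sketch below is illustrative only: `compareCollections` and `IntegrityReport` are made-up names, it assumes both exports fit in memory, and it assumes `id` is unique within the container being compared. The ignored fields are the system properties Cosmos DB adds to every stored item; a MongoDB export would likely also need `_id` added to that list.

```typescript
type Doc = { id: string } & Record<string, unknown>;

// System properties that Cosmos DB attaches to stored items; they are not application data.
const COSMOS_SYSTEM_FIELDS = ["_rid", "_self", "_etag", "_attachments", "_ts"];

// Serialize with sorted object keys so that field order does not cause false mismatches.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// Copy a document without the ignored top-level fields.
function strip(doc: Doc, ignore: string[]): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(doc)) {
    if (ignore.indexOf(key) === -1) out[key] = doc[key];
  }
  return out;
}

interface IntegrityReport {
  missing: string[];    // in source, absent from target
  unexpected: string[]; // in target, absent from source
  mismatched: string[]; // present in both, content differs
}

function compareCollections(source: Doc[], target: Doc[], ignore: string[] = COSMOS_SYSTEM_FIELDS): IntegrityReport {
  const targetById = new Map<string, Doc>();
  for (const doc of target) targetById.set(doc.id, doc);

  const report: IntegrityReport = { missing: [], unexpected: [], mismatched: [] };
  const seen = new Set<string>();
  for (const doc of source) {
    seen.add(doc.id);
    const other = targetById.get(doc.id);
    if (!other) {
      report.missing.push(doc.id);
    } else if (stableStringify(strip(doc, ignore)) !== stableStringify(strip(other, ignore))) {
      report.mismatched.push(doc.id);
    }
  }
  for (const doc of target) {
    if (!seen.has(doc.id)) report.unexpected.push(doc.id);
  }
  return report;
}
```

A staging run would be considered clean when all three lists come back empty for every container; for very large containers the same comparison could be streamed per partition instead of loaded at once.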

### Week 5: Storage + Secrets

1. Swap `@bytelyst/blob` internals to S3-compatible client
2. Migrate blobs (azcopy → aws s3 sync or similar)
3. Replace Key Vault with new secrets manager (or just env vars)
4. Update all environment variable names

### Week 6: LLM + Speech (if needed)

1. Switch OpenAI from Azure endpoint to direct (config change only)
2. If migrating Speech: rewrite `azure_stt.py` and Swift `AzureSpeechTranscriber`
3. A/B test new speech provider against Azure

### Week 7–8: Cleanup + Verification

1. Remove all `@azure/*` npm packages
2. Remove all `azure-*` pip packages
3. Update Docker configs, CI/CD
4. Update documentation
5. Monitor production for 2 weeks

---

## Appendix A: File-Level Azure Dependency Map

### TypeScript — `@azure/cosmos` (CRITICAL)

| File | Repo | Direct Import |
|------|------|---------------|
| `packages/cosmos/src/client.ts` | common-plat | `@azure/cosmos` |
| `packages/cosmos/src/containers.ts` | common-plat | `@azure/cosmos` |
| `services/platform-service/src/modules/*/repository.ts` (56 files) | common-plat | Via `@bytelyst/cosmos` |
| `services/extraction-service/src/modules/*/repository.ts` (2 files) | common-plat | Via `@bytelyst/cosmos` |
| `dashboards/admin-web/src/lib/cosmos.ts` | common-plat | `@azure/cosmos` |
| `dashboards/admin-web/src/lib/repositories/*.ts` (4 files) | common-plat | Via cosmos.ts |
| `mindlyst-native/web/src/lib/cosmos.ts` | MindLyst | `@azure/cosmos` |

### TypeScript — `@azure/storage-blob`

| File | Repo | Direct Import |
|------|------|---------------|
| `packages/blob/src/blob.ts` | common-plat | `@azure/storage-blob` |

### TypeScript — `@azure/identity` + `@azure/keyvault-secrets`

| File | Repo | Direct Import |
|------|------|---------------|
| `packages/config/src/keyvault.ts` | common-plat | Dynamic import (both) |
| `dashboards/admin-web/src/app/api/ops/secrets/route.ts` | common-plat | Both (Secrets Manager UI) |

### Python — Azure SDKs

| File | Repo | SDK |
|------|------|-----|
| `src/audio/azure_stt.py` | LysnrAI | `azure.cognitiveservices.speech` |
| `src/cloud/cosmos_client.py` | LysnrAI | `azure.cosmos` |
| `src/cloud/blob_client.py` | LysnrAI | `azure.storage.blob` |
| `src/secrets/keyvault.py` | LysnrAI | `azure.identity`, `azure.keyvault.secrets` |
| `backend/src/secrets/keyvault.py` | LysnrAI | `azure.identity`, `azure.keyvault.secrets` |
| `backend/src/cloud/cosmos.py` | LysnrAI | `azure.cosmos` |
| `src/llm/text_cleaner.py` | LysnrAI | `openai.AzureOpenAI` |
| `backend/src/clients/openai_client.py` | LysnrAI | `openai.AsyncAzureOpenAI` |

### Swift — Azure Speech SDK

| File | Repo | SDK |
|------|------|-----|
| `iosApp/Services/AzureSpeechTranscriber.swift` | MindLyst | `MicrosoftCognitiveServicesSpeech` |
| `LysnrAI/LysnrKeyboard/KeyboardViewController.swift` | LysnrAI | SPX framework (via CocoaPods) |

---

## Appendix B: SDK & Package Inventory

### npm packages (TypeScript)

| Package | Version | Used By | Swappable |
|---------|---------|---------|-----------|
| `@azure/cosmos` | ≥4.0.0 | `@bytelyst/cosmos`, admin-web, MindLyst web | Medium (query rewrite) |
| `@azure/storage-blob` | ≥12.0.0 | `@bytelyst/blob` | Easy (S3 compat) |
| `@azure/identity` | latest | `@bytelyst/config`, admin-web secrets | Easy (remove) |
| `@azure/keyvault-secrets` | latest | `@bytelyst/config`, admin-web secrets | Easy (remove) |

### pip packages (Python)

| Package | Version | Used By | Swappable |
|---------|---------|---------|-----------|
| `azure-cognitiveservices-speech` | ≥1.42.0 | Desktop STT | Hard (deep SDK integration) |
| `azure-cosmos` | latest | Desktop + backend Cosmos client | Medium (pymongo swap) |
| `azure-storage-blob` | ≥12.24.0 | Desktop blob client | Easy (boto3 swap) |
| `azure-identity` | ≥1.19.0 | Key Vault auth | Easy (remove) |
| `azure-keyvault-secrets` | ≥4.9.0 | Secrets resolver | Easy (remove) |
| `openai` | ≥1.60.0 | `AzureOpenAI` / `AsyncAzureOpenAI` | Trivial (change class name) |
| `opencensus-ext-azure` | ≥1.1.0 | Optional telemetry | Trivial (remove) |

### Swift packages / CocoaPods

| Package | Used By | Swappable |
|---------|---------|-----------|
| `MicrosoftCognitiveServicesSpeech` (SPX) | LysnrAI iOS, MindLyst iOS | Hard (need alternative streaming STT) |

---

*Document generated by automated codebase analysis. Numbers are accurate as of 2026-03-01. Update as the codebase evolves.*
@ -1,181 +0,0 @@
# Azure Connection Audit — Full Workspace Report

> **Date:** 2026-02-22
> **Scope:** `learning_ai_common_plat`, `learning_voice_ai_agent`, `learning_multimodal_memory_agents`, `learning_ai_clock`, `learning_ai_fastgap`
> **Auditor:** Cascade (AI)

---

## Executive Summary

| Category | Issues Found | Fixed (session 1) | Fixed (session 2) | Remaining |
|----------|-------------|-------------------|-------------------|-----------|
| `x-request-id` missing | 12 clients | 2 (MindLyst) | **9** (root cause + feature-flags) | 0 ✅ |
| `x-product-id` missing | 6 clients | 0 | **6** (admin + user dashboards + Python) | 0 ✅ |
| Cosmos PK mismatch | 1 container | 0 (flagged) | 0 | 1 (needs migration) |
| `.env.example` gaps | 4 files | 1 (MindLyst) | **3** (ChronoMind, user-dash, admin-dash) | 0 ✅ |
| Hardcoded productId | 2 instances | 0 | **2** (telemetry.ts, platform_client.py) | 0 ✅ |
| Python client gaps | 1 file | 0 | **1** (headers + config) | 0 ✅ |

---

## 1. `x-request-id` Header — Root Cause

### Finding

**`@bytelyst/api-client` does NOT auto-inject `x-request-id`.**

The `createApiClient()` factory in `packages/api-client/src/client.ts` only sets `Content-Type`, auth token (via `getToken`), and caller-supplied `defaultHeaders`. No `x-request-id` is generated. This means **every consumer** that relies on `@bytelyst/api-client` without explicitly adding the header is missing request tracing.

### Root Cause Fix

Add `x-request-id: crypto.randomUUID()` to `buildHeaders()` in `packages/api-client/src/client.ts`. This single change propagates to all consumers automatically.
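
A sketch of what the fix could look like follows. The real `buildHeaders()` signature is not shown in this audit, so the `HeaderOptions` shape, the `Bearer` scheme, and the JSON content type are assumptions. The sketch imports `randomUUID` from `node:crypto` so it runs under Node; in browsers the global `crypto.randomUUID()` would be used instead.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical options shape; the actual client passes a getToken callback and defaultHeaders.
interface HeaderOptions {
  token?: string;
  defaultHeaders?: Record<string, string>;
}

function buildHeaders(options: HeaderOptions = {}): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // Fresh ID per request; listed before defaultHeaders so a caller can
    // forward an upstream ID by supplying its own lowercase "x-request-id".
    "x-request-id": randomUUID(),
    ...options.defaultHeaders,
  };
  if (options.token) headers["Authorization"] = `Bearer ${options.token}`;
  return headers;
}
```

Letting caller-supplied headers override the generated value is a design choice: it keeps a single trace ID across a chain of services when the caller already has one, while still guaranteeing that every request carries some ID.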

### Affected Clients (missing `x-request-id`)

| Repo | File | Client Pattern |
|------|------|---------------|
| `common_plat` | `dashboards/admin-web/src/lib/billing-client.ts` | `createApiClient` — no `x-request-id` |
| `common_plat` | `dashboards/admin-web/src/lib/growth-client.ts` | `createApiClient` — no `x-request-id` |
| `common_plat` | `dashboards/admin-web/src/lib/platform-client.ts` | `createApiClient` — no `x-request-id` |
| `common_plat` | `dashboards/tracker-web/src/lib/tracker-client.ts` | `createApiClient` — no `x-request-id` |
| `common_plat` | `packages/extraction/src/client.ts` | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/billing-client.ts` | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/growth-client.ts` | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/platform-client.ts` | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/feature-flags.ts` | Custom `fetch` — no `x-request-id` |
| `voice_ai_agent` | `backend/src/clients/platform_client.py` | `httpx` — no `x-request-id` |

### Already Fixed (previous session)

| Repo | File | Status |
|------|------|--------|
| `multimodal_memory` | `web/src/lib/billing-client.ts` | ✅ Added via `defaultHeaders` |
| `multimodal_memory` | `web/src/lib/feature-flags.ts` | ✅ Added manually |

### Already Correct

| Repo | File | Status |
|------|------|--------|
| `ai_fastgap` (NomGap) | `src/api/client.ts` | ✅ Custom client with `crypto.randomUUID()` |
| `ai_clock` (ChronoMind) | `web/src/lib/platform-sync.ts` | ✅ Custom client with `crypto.randomUUID()` |
| `voice_ai_agent` | `backend/src/main.py` | ✅ Middleware propagates/generates |
| `voice_ai_agent` | `backend/src/clients/extraction_client.py` | ✅ Passes `request_id` param |

---

## 2. `x-product-id` Header Gaps

### Clients Missing `x-product-id`

| Repo | File | Impact |
|------|------|--------|
| `common_plat` | `admin-web/src/lib/billing-client.ts` | Server can't filter by product |
| `common_plat` | `admin-web/src/lib/growth-client.ts` | Server can't filter by product |
| `voice_ai_agent` | `user-dashboard-web/src/lib/billing-client.ts` | Server can't filter by product |
| `voice_ai_agent` | `user-dashboard-web/src/lib/growth-client.ts` | Server can't filter by product |
| `voice_ai_agent` | `user-dashboard-web/src/lib/platform-client.ts` | Passes in body, not header |
| `voice_ai_agent` | `backend/src/clients/platform_client.py` | Passes in body/params, not header |

### Already Correct

| Repo | File |
|------|------|
| `ai_fastgap` (NomGap) | `src/api/client.ts` — `x-product-id: API_CONFIG.productId` |
| `ai_clock` (ChronoMind) | `web/src/lib/platform-sync.ts` — `x-product-id` header |
| `multimodal_memory` (MindLyst) | `web/src/lib/billing-client.ts` — via `defaultHeaders` |
| `multimodal_memory` (MindLyst) | `web/src/lib/feature-flags.ts` — explicit header |
| `common_plat` | `tracker-web/src/lib/tracker-client.ts` — from `localStorage` |

---

## 3. Cosmos DB Partition Key Mismatch

### `referrals` Container — 3-way Mismatch

| Location | Partition Key |
|----------|--------------|
| `platform-service/src/lib/cosmos-init.ts` | `/id` |
| MindLyst `web/src/lib/cosmos.ts` | `/userId` |
| Admin dashboard `admin-web/src/lib/cosmos.ts` | `/referrerId` |
| User dashboard `user-dashboard-web/src/lib/cosmos.ts` | `/referrerId` |

**Status:** Flagged in previous session. Cannot be fixed without data migration. Comment added to `cosmos-init.ts`.

**Risk:** Cross-partition queries still succeed, but point reads fail (item not found) and partition-scoped queries return incomplete results whenever a caller supplies a value for the wrong partition key path.

---

## 4. Missing Environment Variables in `.env.example` Files

### ChronoMind `web/.env.example`

The file currently contains only:

```
NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003/api
```

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID=chronomind` — used implicitly by `platform-sync.ts` (hardcoded there, but should be env-driven for consistency)

### LysnrAI `user-dashboard-web/.env.example`

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID=lysnrai` — referenced by `feature-flags.ts` line 10
- `NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003` — referenced by `feature-flags.ts` line 11

The file defines `PLATFORM_SERVICE_URL` (server-side) but not the `NEXT_PUBLIC_` variant (client-side).

### LysnrAI root `.env.example`

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID` — not needed at root level (desktop app), so this is informational only.

### Admin dashboard `.env.example`

**Missing:**

- `AZURE_KEYVAULT_URL` — referenced by `instrumentation.ts` but not in `.env.example`

---

## 5. Hardcoded `productId` Values

| Repo | File | Line | Value | Should Use |
|------|------|------|-------|-----------|
| `multimodal_memory` | `web/src/lib/telemetry.ts` | 19 | `productId: 'mindlyst'` | `process.env.NEXT_PUBLIC_PRODUCT_ID` |
| `voice_ai_agent` | `backend/src/clients/platform_client.py` | 86, 101 | `product_id: str = "lysnrai"` | `settings.PRODUCT_ID` or config |

---

## 6. Python Backend Client Gaps (`platform_client.py`)

The `PlatformClient` class in `backend/src/clients/platform_client.py` has several issues:

1. **No `x-request-id` header** on any request
2. **No `x-product-id` header** on any request
3. **Creates new `httpx.AsyncClient` per request** — no connection pooling
4. **Hardcoded `product_id="lysnrai"` defaults** — should use config

---

## 7. Previously Fixed (Session 1)

| Fix | Repo | File |
|-----|------|------|
| Added `x-request-id` to billing client | `multimodal_memory` | `web/src/lib/billing-client.ts` |
| Added `x-request-id` to feature flags | `multimodal_memory` | `web/src/lib/feature-flags.ts` |
| Added 13 MindLyst containers to cosmos-init | `common_plat` | `services/platform-service/src/lib/cosmos-init.ts` |
| Added Blob Storage creds to Python config | `voice_ai_agent` | `backend/src/config.py` |
| Added missing env vars to MindLyst | `multimodal_memory` | `web/.env.example` |

---

## 8. Recommended Fix Order

1. **P0 — Root cause:** Add `x-request-id` auto-generation to `@bytelyst/api-client` `buildHeaders()` → fixes 9 TS clients at once
2. **P0 — LysnrAI feature-flags:** Add `x-request-id` to the custom `fetch` call in `user-dashboard-web/src/lib/feature-flags.ts`
3. **P1 — Python backend:** Add `x-request-id` and `x-product-id` headers to `platform_client.py`
4. **P1 — Env vars:** Add missing `NEXT_PUBLIC_*` vars to ChronoMind, LysnrAI user-dashboard, admin-dashboard `.env.example` files
5. **P2 — `x-product-id`:** Add to admin/user dashboard clients via `defaultHeaders` in `createApiClient` config
6. **P2 — Hardcoded productId:** Replace in `telemetry.ts` and `platform_client.py`
7. **P3 — Referrals PK mismatch:** Requires data migration strategy (separate task)
749
docs/architecture/CLOUD_PROVIDER_MIGRATION_ANALYSIS.md
Normal file
@ -0,0 +1,749 @@
# Cloud Provider Migration Analysis — ByteLyst Ecosystem

> **Author:** AI Analysis (Cascade)
> **Date:** 2026-03-01
> **Scope:** All 7 repos — LysnrAI, MindLyst, ChronoMind, NomGap, PeakPulse, Common Platform, JarvisJr
> **Purpose:** Evaluate current Azure investment, assess migration feasibility to AWS / GCP / MongoDB Atlas / multi-cloud, and provide actionable recommendations.

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Current Azure Investment Inventory](#2-current-azure-investment-inventory)
3. [Dependency Depth Analysis](#3-dependency-depth-analysis)
4. [Migration Target Comparison](#4-migration-target-comparison)
5. [Per-Service Migration Analysis](#5-per-service-migration-analysis)
6. [Migration Scenario Scoring](#6-migration-scenario-scoring)
7. [Cost Comparison](#7-cost-comparison)
8. [Abstraction Layer Assessment](#8-abstraction-layer-assessment)
9. [Risk Analysis](#9-risk-analysis)
10. [Recommendations](#10-recommendations)
11. [Migration Playbook (If Chosen)](#11-migration-playbook-if-chosen)
12. [Appendix A: File-Level Azure Dependency Map](#appendix-a-file-level-azure-dependency-map)
13. [Appendix B: SDK & Package Inventory](#appendix-b-sdk--package-inventory)

---

## 1. Executive Summary

The ByteLyst ecosystem is **moderately coupled** to Azure. The coupling is concentrated in **3 packages** (`@bytelyst/cosmos`, `@bytelyst/blob`, `@bytelyst/config`) and **2 Python modules** (`azure_stt.py`, `cosmos_client.py`). The architecture already uses an internal abstraction layer — most application code never imports Azure SDKs directly.

### Key Findings

| Dimension                      | Assessment                                                                                  |
| ------------------------------ | ------------------------------------------------------------------------------------------- |
| **Overall Azure lock-in**      | **Medium** — concentrated in ~15 files, but those files are foundational                    |
| **Easiest to migrate**         | Blob Storage, Key Vault, OpenAI, Application Insights                                       |
| **Hardest to migrate**         | Cosmos DB (SQL API queries in 56+ repository files), Azure Speech SDK                       |
| **Best alternative DB**        | MongoDB Atlas (closest query model to Cosmos SQL API)                                       |
| **Best alternative cloud**     | AWS (broadest service parity, mature SDK ecosystem)                                         |
| **Estimated migration effort** | 4–8 weeks for full cloud swap (Cosmos DB is the long pole)                                  |
| **Recommendation**             | **Stay on Azure** for now, but invest in abstraction layers to reduce future switching cost |

### Azure Services Used (8 total)

| #   | Azure Service                               | Monthly Cost  | Lock-in Risk | Files Affected                                    |
| --- | ------------------------------------------- | ------------- | ------------ | ------------------------------------------------- |
| 1   | **Cosmos DB** (SQL/NoSQL API)               | ~$4–10        | **HIGH**     | 56+ repository files, 3 databases, ~45 containers |
| 2   | **Blob Storage**                            | ~$0.20        | LOW          | 2 packages + 1 Python module                      |
| 3   | **Azure OpenAI**                            | ~$5–10        | LOW          | 3 files (already supports OpenAI fallback)        |
| 4   | **Speech Services**                         | $0 (F0)       | **HIGH**     | 2 files (deep SDK integration, streaming)         |
| 5   | **Key Vault**                               | ~$0.06        | LOW          | 2 files (1 TS, 1 Python)                          |
| 6   | **Notification Hubs**                       | $0 (Free)     | MEDIUM       | Planned, not yet deeply integrated                |
| 7   | **Application Insights**                    | $0 (5GB free) | LOW          | 1 file (custom telemetry already built)           |
| 8   | **Azure Identity** (DefaultAzureCredential) | $0            | LOW          | Used by Key Vault + Secrets Manager               |

---

## 2. Current Azure Investment Inventory

### 2.1 Azure Resources (from Azure Portal)

| Resource          | Azure Name                | Region         | SKU              | Status                          |
| ----------------- | ------------------------- | -------------- | ---------------- | ------------------------------- |
| Resource Group    | `rg-mywisprai`            | East US        | —                | Active                          |
| Cosmos DB         | `cosmos-mywisprai`        | West US 2      | Serverless       | Active — 3 DBs, ~45 containers  |
| Blob Storage      | `bytelystblobs`           | West US 2      | StorageV2, RAGRS | Active — 9+ containers          |
| Azure OpenAI      | `mywisprai-openai-sweden` | Sweden Central | S0               | Active — gpt-4o-mini deployment |
| Speech Service    | `mywisprai-speech`        | East US        | F0 (Free)        | Active                          |
| Key Vault         | `kv-mywisprai`            | East US        | Standard         | Active — ~25 secrets            |
| Notification Hubs | `lysnnai` namespace       | East US        | Free             | Active — 2 hubs                 |
| App Insights      | `bytelyst-appinsights`    | East US        | Classic          | Active                          |

### 2.2 Cosmos DB Databases & Containers

| Database    | Containers                                                                                             | Products Using                           |
| ----------- | ------------------------------------------------------------------------------------------------------ | ---------------------------------------- |
| `lysnrai`   | ~27 containers (users, subscriptions, feature_flags, audit_log, tracker_items, telemetry_events, etc.) | LysnrAI, platform-service (all products) |
| `mindlyst`  | ~20 containers (brains, memory_items, streaks, reflections, etc.)                                      | MindLyst                                 |
| `mywisprai` | 10 containers (legacy, pre-rebrand)                                                                    | Legacy / migration target                |

**Total: ~57 containers across 3 databases**, all using Cosmos SQL (NoSQL) API with SQL-like queries (`SELECT`, `WHERE`, `ORDER BY`, `OFFSET/LIMIT`, aggregate functions).

### 2.3 Code Investment by Language

| Language       | Azure SDK Packages                                                                                                                         | Files Using Azure | Lines of Azure-Specific Code |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ----------------- | ---------------------------- |
| **TypeScript** | `@azure/cosmos`, `@azure/storage-blob`, `@azure/identity`, `@azure/keyvault-secrets`                                                       | ~65 files         | ~500 lines                   |
| **Python**     | `azure-cognitiveservices-speech`, `azure-cosmos`, `azure-storage-blob`, `azure-identity`, `azure-keyvault-secrets`, `openai` (AzureOpenAI) | ~8 files          | ~400 lines                   |
| **Swift**      | `MicrosoftCognitiveServicesSpeech` (SPX framework)                                                                                         | ~3 files          | ~150 lines                   |
| **Kotlin**     | None directly (uses platform-service REST API)                                                                                             | 0 files           | 0 lines                      |

---

## 3. Dependency Depth Analysis

### 3.1 Cosmos DB — DEEP (56+ files)

This is the **most deeply embedded** Azure dependency. Every repository module follows the pattern:

```
types.ts → repository.ts → routes.ts
                ↑
        Uses @azure/cosmos SDK
        SQL queries: SELECT c.id, c.name FROM c WHERE c.productId = @pid
```

**Touchpoints:**

- `packages/cosmos/` — shared client singleton (`@azure/cosmos` peer dep)
- `services/platform-service/src/modules/*/repository.ts` — **56 repository files** with Cosmos SQL queries
- `services/extraction-service/src/modules/*/repository.ts` — 2 repository files
- `dashboards/admin-web/src/lib/cosmos.ts` — direct `@azure/cosmos` import
- `dashboards/admin-web/src/lib/repositories/*.ts` — 4 repository files
- `mindlyst-native/web/src/lib/cosmos.ts` — direct `@azure/cosmos` import
- `learning_voice_ai_agent/src/cloud/cosmos_client.py` — Python Cosmos client
- `learning_voice_ai_agent/backend/src/cloud/cosmos.py` — Python backend Cosmos client

**Query patterns used:**

- `container.items.query()` with parameterized SQL
- `container.items.create()`, `.replace()`, `.delete()`, `.read()`
- `container.items.upsert()`
- Partition key routing (`/userId`, `/productId`, `/id`)
- Cross-partition queries (admin/analytics)
- `SELECT VALUE COUNT(1)` aggregates
- `OFFSET ... LIMIT` pagination
- `ORDER BY` sorting
- `ARRAY_CONTAINS()` for array queries
|
||||||
|
|
||||||
|
### 3.2 Azure Speech SDK — DEEP (3 files, streaming integration)

The Speech SDK is used for **real-time streaming speech-to-text** with features that are tightly coupled to the Azure SDK's event-driven architecture:

- `src/audio/azure_stt.py` — 248 lines. Uses `PushAudioInputStream`, `SpeechRecognizer`, continuous recognition with `recognizing`/`recognized`/`canceled`/`session_stopped` event callbacks, `PhraseListGrammar`, auto-language detection (10 languages), auto-reconnect
- `src/ui/settings.py` + `src/ui/unified_window.py` — connection testing
- `mindlyst-native/iosApp/Services/AzureSpeechTranscriber.swift` — iOS Swift SPX framework
- `mobile_app/ios/LysnrAI/` — iOS keyboard extension uses SPX framework

### 3.3 Blob Storage — SHALLOW (3 files)

- `packages/blob/src/blob.ts` — 162 lines, singleton client, SAS URL generation
- `src/cloud/blob_client.py` — 190 lines, Python equivalent
- `services/platform-service/src/modules/blob/` — REST API wrapper

### 3.4 Azure OpenAI — SHALLOW (3 files, already abstracted)

- `src/llm/text_cleaner.py` — uses `openai.AzureOpenAI` (OpenAI SDK with Azure endpoint)
- `backend/src/clients/openai_client.py` — uses `openai.AsyncAzureOpenAI`
- `mindlyst-native/web/src/lib/llm.ts` — **already has OpenAI fallback** (resolves provider dynamically)

The `openai` Python/JS SDK supports both Azure and OpenAI endpoints with minimal config change. MindLyst web already handles this automatically.

### 3.5 Key Vault — SHALLOW (2 files)

- `packages/config/src/keyvault.ts` — 90 lines, `resolveKeyVaultSecrets()` with graceful fallback
- `src/secrets/keyvault.py` — 69 lines, `SecretResolver` class with env var fallback

Both implementations already fall back to environment variables when Key Vault is unavailable. Migration = just stop using Key Vault and use the env var path.

### 3.6 Notification Hubs — NOT YET INTEGRATED

Planned but not deeply wired. Only namespace/hub exists in Azure. Mobile apps use `BLPlatformClient` (REST) to talk to platform-service, which would route push notifications.

### 3.7 Application Insights — SHALLOW (1 file)

- `opencensus-ext-azure` in Python requirements (optional telemetry)
- Custom telemetry system already built (`@bytelyst/telemetry-client`, platform-service telemetry module with Cosmos storage)

The custom telemetry system means App Insights is supplementary, not critical.

---

## 4. Migration Target Comparison

### 4.1 Database: Cosmos DB → Alternatives

| Feature | Azure Cosmos DB (current) | MongoDB Atlas | AWS DynamoDB | Google Firestore | PostgreSQL (Supabase/Neon) |
| --- | --- | --- | --- | --- | --- |
| **Data model** | Document (JSON) | Document (JSON) | Key-Value + Document | Document (JSON) | Relational + JSONB |
| **Query language** | SQL-like | MQL (MongoDB Query Language) | PartiQL / API | GQL-like API | SQL |
| **Partition keys** | Required | Shard keys (optional) | Required | Collection groups | Not applicable |
| **Serverless** | Yes | Yes (Atlas Serverless) | Yes | Yes | Yes (Neon) |
| **Example query** | `SELECT c.id FROM c WHERE c.x = @y` | `db.collection.find({x: y})` | `SELECT id FROM table WHERE x = ?` | Client SDK queries | Standard SQL |
| **Aggregates** | Basic (`COUNT`, `SUM`, `AVG`) | Full (`$group`, `$match`, `$lookup`) | Limited | Limited | Full SQL |
| **Cross-partition** | Yes (expensive) | Yes (scatter-gather) | Scan (expensive) | Yes | N/A |
| **Change feed** | Yes | Change Streams | DynamoDB Streams | Real-time listeners | Logical replication |
| **Global distribution** | Built-in multi-region | Atlas Global Clusters | Global Tables | Multi-region | Manual / Citus |
| **Max doc size** | 2 MB | 16 MB | 400 KB | 1 MiB | Up to 1 GB per field (JSONB) |
| **Free tier** | 1000 RU/s + 25 GB | 512 MB | 25 GB + 25 WCU/RCU | 1 GiB + 50K reads/day | 0.5 GB (Neon) |
| **Migration effort** | — | **Medium** (query rewrite) | **Hard** (paradigm shift) | **Hard** (no SQL) | **Hard** (schema design) |

### 4.2 Object Storage: Blob → Alternatives

| Feature | Azure Blob (current) | AWS S3 | GCP Cloud Storage | Cloudflare R2 | MinIO (self-hosted) |
| --- | --- | --- | --- | --- | --- |
| **API compatibility** | Azure Blob API | S3 API | GCS API / S3-compat | S3-compatible | S3-compatible |
| **SAS tokens** | Yes (Azure SAS) | Pre-signed URLs | Signed URLs | Pre-signed URLs | Pre-signed URLs |
| **CDN integration** | Azure CDN | CloudFront | Cloud CDN | Built-in | Manual |
| **Cost (per GB)** | $0.018 (Cool) | $0.023 (Standard) | $0.020 | $0.015 (no egress) | Self-hosted |
| **Migration effort** | — | **Easy** | **Easy** | **Easy** | **Easy** |

### 4.3 Speech-to-Text: Azure Speech → Alternatives

| Feature | Azure Speech (current) | AWS Transcribe | Google Speech-to-Text | Deepgram | Whisper (local) |
| --- | --- | --- | --- | --- | --- |
| **Streaming STT** | Yes (push stream) | Yes (WebSocket) | Yes (streaming) | Yes (WebSocket) | No (batch only) |
| **Languages** | 100+ | 100+ | 125+ | 36+ | 99+ |
| **Auto-detect lang** | Up to 10 at once | Yes | Yes | Yes | Yes |
| **Custom vocabulary** | PhraseListGrammar | Custom vocabulary | Speech adaptation | Keywords | No |
| **Native SDK** | Python, Swift (SPX), JS | Python, no iOS SDK | Python, iOS, JS | REST/WebSocket | Python only |
| **iOS native SDK** | SPX framework (ObjC) | No native SDK | Yes (gRPC) | No native SDK | No |
| **Free tier** | 5 hrs/month (F0) | 60 min/month | 60 min/month | None | Free (local GPU) |
| **Latency** | ~200ms | ~300ms | ~200ms | ~100ms | ~500ms+ (local) |
| **Migration effort** | — | **Hard** (no iOS SDK) | **Medium** (has iOS SDK) | **Medium** (REST only) | **Hard** (no streaming) |

### 4.4 LLM / AI: Azure OpenAI → Alternatives

| Feature | Azure OpenAI (current) | OpenAI API (direct) | Google Gemini | AWS Bedrock | Anthropic Claude |
| --- | --- | --- | --- | --- | --- |
| **Models** | GPT-4o, GPT-4o-mini | Same models | Gemini 2.5 | Claude, Llama, Titan | Claude 3.5/4 |
| **API compatibility** | OpenAI SDK (azure mode) | OpenAI SDK (native) | Google SDK | AWS SDK | Anthropic SDK |
| **Data residency** | Azure regions | US only | Google regions | AWS regions | US/EU |
| **Cost (GPT-4o-mini)** | $0.15/$0.60 per M tokens | $0.15/$0.60 per M tokens | ~$0.10/$0.40 (Flash) | Varies | ~$0.25/$1.25 (Haiku) |
| **Migration effort** | — | **Trivial** (change endpoint) | **Easy** (SDK swap) | **Medium** | **Easy** (SDK swap) |

### 4.5 Secrets Management: Key Vault → Alternatives

| Feature | Azure Key Vault (current) | AWS Secrets Manager | GCP Secret Manager | HashiCorp Vault | Doppler / Infisical |
| --- | --- | --- | --- | --- | --- |
| **Cost** | $0.03/10K ops | $0.40/secret/month | $0.06/secret version/month + $0.03/10K ops | Free (OSS) | Free tier |
| **SDK** | `@azure/keyvault-secrets` | `@aws-sdk/client-secrets-manager` | `@google-cloud/secret-manager` | HTTP API | SDK / CLI |
| **Migration effort** | — | **Easy** | **Easy** | **Medium** | **Easy** |

**Note:** The codebase already falls back to env vars when Key Vault is unavailable. This means Key Vault can be replaced by **any** secrets manager, or simply by `.env` files, without code changes to application logic.

### 4.6 Push Notifications: Notification Hubs → Alternatives

| Feature | Azure NH (current) | AWS SNS | Firebase Cloud Messaging | OneSignal | Expo Push |
| --- | --- | --- | --- | --- | --- |
| **APNs + FCM** | Yes | Yes | FCM only (APNs via FCM) | Yes | Yes |
| **Free tier** | 1M pushes/month | 1M publishes | Unlimited | 10K subscribers | Unlimited |
| **Migration effort** | — | **Easy** | **Easy** | **Easy** | **Easy** (NomGap uses Expo) |

---

## 5. Per-Service Migration Analysis

### 5.1 Cosmos DB → MongoDB Atlas

**Difficulty: MEDIUM-HIGH** | **Effort: 3–5 weeks** | **Risk: MEDIUM**

This is the **single largest migration task**. Here's why:

#### What needs to change

| Layer | Current (Cosmos SQL API) | Target (MongoDB) | Files |
| --- | --- | --- | --- |
| Client package | `@azure/cosmos` → `CosmosClient` | `mongodb` → `MongoClient` | `packages/cosmos/src/client.ts` |
| Container registry | `getContainer(name)` | `db.collection(name)` | `packages/cosmos/src/containers.ts` |
| All repository files | `container.items.query('SELECT...')` | `collection.find({...})` | **56+ files** in platform-service |
| Dashboard Cosmos clients | `@azure/cosmos` direct | `mongodb` direct | 2 files (admin, MindLyst) |
| Python clients | `azure.cosmos.CosmosClient` | `pymongo.MongoClient` | 2 files |
| Query syntax | SQL-like (`SELECT c.id FROM c WHERE c.productId = @pid AND c.userId = @uid ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20`) | MQL (`collection.find({productId: pid, userId: uid}).sort({createdAt: -1}).skip(0).limit(20)`) | All repository files |
| Partition keys | Explicit partition key in every query | Shard key (auto-routed) | All repository files |
| Upsert | `container.items.upsert(doc)` | `collection.replaceOne({_id: id}, doc, {upsert: true})` (a `$set`-based `updateOne` would merge fields rather than replace the document, which differs from Cosmos upsert) | ~20 files |
| Read by ID | `container.item(id, partitionKey).read()` | `collection.findOne({_id: id})` | All repository files |

#### What stays the same

- Document structure (JSON documents with `id`, `productId`, partition keys), apart from the Cosmos `id` typically being stored as MongoDB's `_id`
- Data model (no schema changes needed — MongoDB is also schemaless)
- Partition key concept maps to shard key
- Serverless pricing model available on both

#### Key migration steps

1. Update `@bytelyst/cosmos` package to export MongoDB-compatible API
2. Rewrite all SQL queries to MQL (56+ files)
3. Replace `container.items.query()` → `collection.find()`
4. Replace `container.item(id, pk).read()` → `collection.findOne({_id: id})`
5. Replace `container.items.create()` → `collection.insertOne()`
6. Replace `container.items.replace()` → `collection.replaceOne()`
7. Replace `container.items.upsert()` → `collection.replaceOne(filter, doc, {upsert: true})` (whole-document replacement, matching Cosmos upsert semantics)
8. Update Python clients similarly
9. Migrate data (use Azure Data Factory or custom script)
10. Update all test mocks

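To make the query rewrite in steps 2 and 3 concrete, here is a minimal sketch that derives both representations from the same inputs. All names in it (`Filter`, `PageOptions`, `toCosmosQuery`, `toMongoFind`) are hypothetical and not part of the codebase; the Mongo side is returned as plain data rather than calling the driver, so the block runs without any packages installed.

```typescript
// Hypothetical helper types for illustration only.
type Filter = Record<string, string | number | boolean>;

interface PageOptions {
  sortField: string;
  descending: boolean;
  offset: number;
  limit: number;
}

// Same shape as the query spec passed to container.items.query().
interface CosmosQuerySpec {
  query: string;
  parameters: { name: string; value: string | number | boolean }[];
}

// Plain-data description of a MongoDB find(); a repository would feed it to
// collection.find(filter).sort(sort).skip(skip).limit(limit).
interface MongoFindSpec {
  filter: Filter;
  sort: Record<string, 1 | -1>;
  skip: number;
  limit: number;
}

function toCosmosQuery(filter: Filter, page: PageOptions): CosmosQuerySpec {
  const fields = Object.keys(filter);
  // Equality conditions become parameterized predicates on the alias `c`.
  const where = fields.map((f) => `c.${f} = @${f}`).join(" AND ");
  const whereClause = where.length > 0 ? ` WHERE ${where}` : "";
  const direction = page.descending ? "DESC" : "ASC";
  return {
    query:
      `SELECT * FROM c${whereClause}` +
      ` ORDER BY c.${page.sortField} ${direction}` +
      ` OFFSET ${page.offset} LIMIT ${page.limit}`,
    parameters: fields.map((f) => ({ name: `@${f}`, value: filter[f] })),
  };
}

function toMongoFind(filter: Filter, page: PageOptions): MongoFindSpec {
  return {
    // Equality conditions map one-to-one onto a MongoDB filter document.
    filter: { ...filter },
    sort: { [page.sortField]: page.descending ? -1 : 1 },
    skip: page.offset,
    limit: page.limit,
  };
}
```

Only equality filters are covered here. Patterns such as `ARRAY_CONTAINS()` or `SELECT VALUE COUNT(1)` would need their own mapping, which is where most of the per-file effort is likely to go.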
#### Why MongoDB Atlas is the best DB alternative

- **Closest query model** to Cosmos SQL API (both are document DBs)
- **Cosmos DB offers a MongoDB-compatible API** (Azure Cosmos DB for MongoDB), which could serve as a stepping stone, but going native on Atlas is better
- Cosmos DB's JSON document model is close to MongoDB's, so existing documents carry over largely unchanged
- MongoDB's `find()` queries map closely to Cosmos SQL `SELECT` queries
- Both support partition/shard keys, TTL indexes, change streams
- MongoDB Atlas Serverless pricing is competitive
- MongoDB has excellent TypeScript and Python SDKs

### 5.2 Azure Speech → Google Cloud Speech-to-Text

**Difficulty: HIGH** | **Effort: 2–3 weeks** | **Risk: HIGH**

#### Why this is hard

- The Azure Speech SDK uses a **push-stream architecture** (`PushAudioInputStream`) that is deeply integrated into the audio pipeline
- The `SpeechRecognizer` has event-driven callbacks (`recognizing`, `recognized`, `canceled`, `session_stopped`) that the code relies on for real-time partial/final transcript delivery
- Custom vocabulary via `PhraseListGrammar` is Azure-specific
- Auto-language detection config is Azure-specific
- The **iOS SPX framework** (Objective-C) is used in LysnrAI keyboard extension and MindLyst — there's no direct equivalent for most alternatives

#### Best alternative: Google Cloud Speech-to-Text

- Has streaming recognition with similar event model
- Has an iOS SDK (gRPC-based)
- Supports custom vocabulary (speech adaptation)
- Supports auto-language detection
- Similar pricing and free tier

#### What needs to change

- `src/audio/azure_stt.py` — complete rewrite (~248 lines)
- `iosApp/Services/AzureSpeechTranscriber.swift` — complete rewrite
- `LysnrAI/LysnrKeyboard/` — keyboard extension STT integration
- Audio format handling (may differ between providers)
- Connection test code in settings UI

### 5.3 Blob Storage → AWS S3 or Cloudflare R2

**Difficulty: LOW** | **Effort: 2–3 days** | **Risk: LOW**

#### Why this is easy

- `@bytelyst/blob` package is a thin wrapper (162 lines)
- Only 3 files need changes
- S3 API is the de facto standard — R2, MinIO, GCS all support S3-compatible API
- SAS tokens → Pre-signed URLs (same concept, different implementation)

#### What needs to change

- `packages/blob/src/blob.ts` — swap `@azure/storage-blob` → `@aws-sdk/client-s3` + `@aws-sdk/s3-request-presigner`
- `src/cloud/blob_client.py` — swap `azure.storage.blob` → `boto3`
- `services/platform-service/src/modules/blob/` — update routes for pre-signed URL format
- Environment variables: `AZURE_BLOB_*` → `AWS_S3_*` or `S3_*`

### 5.4 Azure OpenAI → OpenAI API (direct) or Gemini

**Difficulty: TRIVIAL** | **Effort: < 1 day** | **Risk: VERY LOW**

#### Why this is trivial

- The `openai` Python SDK supports both Azure and OpenAI endpoints — just change config
- MindLyst web `llm.ts` **already auto-detects** Azure vs OpenAI and builds the correct URL
- LysnrAI desktop uses `AzureOpenAI` class from `openai` SDK — switch to `OpenAI` class
- Same models, same API shape, same pricing

#### What needs to change

- Set `OPENAI_API_KEY` instead of `AZURE_OPENAI_*` env vars
- Change `AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)` → `OpenAI(api_key=...)`
- Change `AsyncAzureOpenAI(...)` → `AsyncOpenAI(...)`
- Remove `api_version` parameter
- Check the `model` argument: with Azure it is the deployment name, so it may need to change to the OpenAI model name (e.g. `gpt-4o-mini`). Beyond that, the `openai` SDK handles the rest.

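The auto-detection that MindLyst web is described as doing could look roughly like the sketch below. This is not the actual `llm.ts` code: the function name, the `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_DEPLOYMENT`, and `AZURE_OPENAI_API_VERSION` variable names, and the default API version are assumptions chosen for illustration. The URL shapes and auth headers are the ones each service uses for chat completions.

```typescript
type Env = Record<string, string | undefined>;

interface ChatEndpoint {
  provider: "azure" | "openai";
  url: string;
  headers: Record<string, string>;
}

// Picks Azure when its settings are complete, otherwise falls back to OpenAI.
// Env var names other than OPENAI_API_KEY are assumed, not taken from the repo.
function resolveChatEndpoint(env: Env): ChatEndpoint {
  const azureEndpoint = env["AZURE_OPENAI_ENDPOINT"];
  const azureKey = env["AZURE_OPENAI_API_KEY"];
  const deployment = env["AZURE_OPENAI_DEPLOYMENT"];

  if (azureEndpoint && azureKey && deployment) {
    // Example default only; pin whichever api-version the deployment supports.
    const apiVersion = env["AZURE_OPENAI_API_VERSION"] ?? "2024-02-01";
    const base = azureEndpoint.replace(/\/+$/, "");
    return {
      provider: "azure",
      // Azure routes by deployment name and requires an api-version parameter.
      url: `${base}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`,
      headers: { "api-key": azureKey },
    };
  }

  const openaiKey = env["OPENAI_API_KEY"];
  if (!openaiKey) {
    throw new Error("No LLM credentials configured");
  }
  return {
    provider: "openai",
    // OpenAI selects the model in the request body, not in the URL.
    url: "https://api.openai.com/v1/chat/completions",
    headers: { Authorization: `Bearer ${openaiKey}` },
  };
}
```

Passing the environment in as a parameter (for example `resolveChatEndpoint(process.env)` in Node) keeps the function easy to test and makes the provider switch a pure configuration change.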
### 5.5 Key Vault → Environment Variables / Any Secrets Manager

**Difficulty: TRIVIAL** | **Effort: < 1 day** | **Risk: VERY LOW**

Both `keyvault.ts` and `keyvault.py` already implement graceful fallback:

- If `AZURE_KEYVAULT_URL` is not set → uses env vars directly
- If Key Vault is unreachable → falls back to env vars

**To migrate:** Simply stop setting `AZURE_KEYVAULT_URL`. Everything works via env vars. Then optionally adopt any other secrets manager (AWS Secrets Manager, Doppler, Infisical, etc.).

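The two fallback rules above can be sketched as a single function. This is an illustration of the behaviour described, not the code in `keyvault.ts`: `resolveSecret` and `VaultLookup` are hypothetical names, and the lookup is synchronous here for brevity, whereas a real Key Vault call is asynchronous.

```typescript
type Env = Record<string, string | undefined>;

// Stand-in for a Key Vault read. Returns undefined when the secret does not
// exist and throws when the vault cannot be reached. Hypothetical signature.
type VaultLookup = (name: string) => string | undefined;

function resolveSecret(name: string, env: Env, lookup?: VaultLookup): string | undefined {
  // Rule 1: no vault URL configured, so env vars are the only source.
  if (!env["AZURE_KEYVAULT_URL"] || !lookup) {
    return env[name];
  }
  try {
    const fromVault = lookup(name);
    // A secret missing from the vault also falls through to the env var.
    return fromVault !== undefined ? fromVault : env[name];
  } catch {
    // Rule 2: vault unreachable, fall back to the env var.
    return env[name];
  }
}
```

Under this shape, swapping in another secrets manager would mean supplying a different `lookup` function while callers stay unchanged.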
### 5.6 Notification Hubs → Firebase Cloud Messaging

**Difficulty: LOW** | **Effort: 1–2 days** | **Risk: LOW**

Not yet deeply integrated. The platform-service notification module sends via REST API. Swap the push provider client.

### 5.7 Application Insights → Self-hosted / Grafana

**Difficulty: TRIVIAL** | **Effort: Already done** | **Risk: NONE**

The ecosystem already has:

- Custom telemetry system (`@bytelyst/telemetry-client` → platform-service → Cosmos)
- Loki + Grafana in `services/monitoring/`
- App Insights is supplementary, can be dropped with zero code changes

---

## 6. Migration Scenario Scoring

### Scenario A: Stay on Azure (Status Quo)

| Dimension | Score (1-5) | Notes |
| --- | --- | --- |
| Migration effort | **5** (none) | No work needed |
| Cost | **4** | ~$15/month at current scale, competitive |
| Vendor diversity | **1** | Single cloud vendor |
| Feature parity | **5** | Everything works today |
| **Total** | **15/20** | |

### Scenario B: Full Migration to AWS

| Dimension | Score (1-5) | Notes |
| --- | --- | --- |
| Migration effort | **2** | 6–8 weeks, Cosmos→DynamoDB is painful |
| Cost | **3** | Similar or slightly higher at small scale |
| Vendor diversity | **1** | Still single cloud, just different |
| Feature parity | **3** | No native iOS Speech SDK, DynamoDB query model is very different |
| **Total** | **9/20** | |

### Scenario C: Multi-Cloud (MongoDB Atlas + OpenAI + R2 + Google STT)

| Dimension | Score (1-5) | Notes |
| --- | --- | --- |
| Migration effort | **2** | 5–7 weeks, Cosmos→MongoDB is medium |
| Cost | **4** | MongoDB Atlas free tier, R2 no egress fees |
| Vendor diversity | **5** | No single-vendor dependency |
| Feature parity | **4** | MongoDB is a better document DB than Cosmos in many ways |
| **Total** | **15/20** | |

### Scenario D: Stay Azure + Add Abstraction Layers

| Dimension | Score (1-5) | Notes |
| --- | --- | --- |
| Migration effort | **4** | 1–2 weeks to add repository interface pattern |
| Cost | **4** | No change |
| Vendor diversity | **3** | Ready to switch, but still on Azure |
| Feature parity | **5** | Everything works today |
| **Total** | **16/20** | **Winner** |

### Scenario E: Migrate DB Only (Cosmos → MongoDB Atlas, keep rest on Azure)

| Dimension | Score (1-5) | Notes |
| --- | --- | --- |
| Migration effort | **3** | 3–5 weeks for DB migration |
| Cost | **4** | MongoDB Atlas Serverless may be cheaper |
| Vendor diversity | **3** | DB is independent, other services still Azure |
| Feature parity | **5** | MongoDB is very capable |
| **Total** | **15/20** | |

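The totals in the five tables can be re-derived with a few lines of arithmetic. The sketch below copies the scores from the tables and sums them with equal weights, which is what the tables implicitly do. The optional `weights` argument is an addition for illustration: it shows how sensitive the ranking is to that assumption. For instance, counting vendor diversity twice would put Scenario C (20) ahead of Scenario D (19). Type and function names are hypothetical.

```typescript
type Dimension = "effort" | "cost" | "diversity" | "parity";
type Scores = Record<Dimension, number>;

interface Scenario {
  name: string;
  scores: Scores;
}

// Scores copied from the Scenario A to E tables.
const scenarios: Scenario[] = [
  { name: "A", scores: { effort: 5, cost: 4, diversity: 1, parity: 5 } },
  { name: "B", scores: { effort: 2, cost: 3, diversity: 1, parity: 3 } },
  { name: "C", scores: { effort: 2, cost: 4, diversity: 5, parity: 4 } },
  { name: "D", scores: { effort: 4, cost: 4, diversity: 3, parity: 5 } },
  { name: "E", scores: { effort: 3, cost: 4, diversity: 3, parity: 5 } },
];

const dimensions: Dimension[] = ["effort", "cost", "diversity", "parity"];
const equalWeights: Scores = { effort: 1, cost: 1, diversity: 1, parity: 1 };

// Weighted sum over the four dimensions; equal weights reproduce the tables.
function total(s: Scenario, weights: Scores = equalWeights): number {
  return dimensions.reduce((sum, d) => sum + s.scores[d] * weights[d], 0);
}

// Highest-scoring scenario; on a tie the earlier entry wins.
// Expects a non-empty list.
function best(list: Scenario[], weights: Scores = equalWeights): Scenario {
  return list.reduce((top, s) => (total(s, weights) > total(top, weights) ? s : top));
}
```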
---

## 7. Cost Comparison

### Current Azure Costs (MVP / Low Usage)

| Service | Monthly Cost | Notes |
| --- | --- | --- |
| Cosmos DB (Serverless) | ~$4–10 | 3 databases, ~45 containers |
| Blob Storage (Cool, RAGRS) | ~$0.20 | 9+ containers |
| Azure OpenAI (GPT-4o-mini) | ~$5–10 | Pay per token |
| Speech (F0) | $0 | 5 hrs/month free |
| Key Vault | ~$0.06 | ~25 secrets |
| Notification Hubs (Free) | $0 | 1M pushes/month |
| App Insights | $0 | 5 GB/month free |
| **Total** | **~$10–20/month** | |

### Equivalent AWS Costs

| Service | AWS Equivalent | Monthly Cost |
| --- | --- | --- |
| Cosmos DB → DynamoDB (On-Demand) | DynamoDB | ~$5–15 |
| Blob → S3 Standard | S3 | ~$0.25 |
| Azure OpenAI → OpenAI API | Same pricing | ~$5–10 |
| Speech → Transcribe | Transcribe | ~$1–3 |
| Key Vault → Secrets Manager | Secrets Manager | ~$10 (per-secret pricing) |
| Notification Hubs → SNS | SNS | ~$0.50 |
| App Insights → CloudWatch | CloudWatch | ~$3 |
| **Total** | | **~$25–42/month** |

### Equivalent Multi-Cloud Costs

| Service | Provider | Monthly Cost |
| --- | --- | --- |
| Cosmos DB → MongoDB Atlas Serverless | MongoDB | ~$3–8 |
| Blob → Cloudflare R2 | Cloudflare | ~$0.15 (no egress) |
| Azure OpenAI → OpenAI API (direct) | OpenAI | ~$5–10 |
| Speech → Google STT | Google Cloud | ~$1–3 |
| Key Vault → Doppler (free tier) | Doppler | $0 |
| Push → Firebase FCM | Google | $0 |
| Monitoring → Grafana Cloud (free) | Grafana | $0 |
| **Total** | | **~$10–22/month** |

### Cost Summary

| Scenario | Monthly Cost | vs Current |
| --- | --- | --- |
| **Azure (current)** | ~$10–20 | Baseline |
| **Full AWS** | ~$25–42 | +110–150% |
| **Multi-cloud** | ~$10–22 | ~Same |
| **MongoDB Atlas + Azure rest** | ~$10–18 | ~Same |

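As a cross-check on the monthly figures, the sketch below sums the per-service line items from the three cost tables as low/high ranges and compares Full AWS to the Azure baseline endpoint by endpoint, using the rounded totals: $25 vs $10 at the low end and $42 vs $20 at the high end, which gives roughly +150% and +110%. The helper names are hypothetical, and the unrounded sums land slightly below the rounded table totals.

```typescript
interface CostRange {
  low: number;
  high: number;
}

// A fixed monthly price expressed as a degenerate range.
const exact = (value: number): CostRange => ({ low: value, high: value });

function sumRanges(items: CostRange[]): CostRange {
  return items.reduce(
    (acc, item) => ({ low: acc.low + item.low, high: acc.high + item.high }),
    { low: 0, high: 0 },
  );
}

// Percentage by which `other` exceeds `baseline`, rounded to a whole number.
function percentAbove(baseline: number, other: number): number {
  return Math.round(((other - baseline) / baseline) * 100);
}

// Line items in USD per month, copied from the tables in this section.
const azureItems: CostRange[] = [
  { low: 4, high: 10 }, exact(0.2), { low: 5, high: 10 },
  exact(0), exact(0.06), exact(0), exact(0),
];
const awsItems: CostRange[] = [
  { low: 5, high: 15 }, exact(0.25), { low: 5, high: 10 },
  { low: 1, high: 3 }, exact(10), exact(0.5), exact(3),
];
const multiCloudItems: CostRange[] = [
  { low: 3, high: 8 }, exact(0.15), { low: 5, high: 10 },
  { low: 1, high: 3 }, exact(0), exact(0), exact(0),
];
```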
**Verdict:** At current scale, cost is not a compelling reason to migrate. All options are under $50/month. Cost becomes more significant at scale (10K+ users), where MongoDB Atlas and R2 would likely be cheaper due to no egress fees and better serverless pricing.

---

## 8. Abstraction Layer Assessment

### Current State: Partially Abstracted

The codebase already has meaningful abstraction:

| Layer | Abstraction Level | Notes |
| --- | --- | --- |
| **Cosmos DB** | **Partial** — `@bytelyst/cosmos` package | Application code still writes raw SQL queries and uses `@azure/cosmos` types |
| **Blob Storage** | **Good** — `@bytelyst/blob` package | Thin wrapper, easy to swap internals |
| **OpenAI/LLM** | **Good** — MindLyst has provider auto-detection | LysnrAI desktop/backend hardcodes `AzureOpenAI` |
| **Key Vault** | **Excellent** — graceful fallback to env vars | Already cloud-agnostic in practice |
| **Speech** | **None** — raw SDK usage | Deep Azure SDK coupling in 3 files |
| **Auth (JWT)** | **Excellent** — uses `jose` library | No cloud dependency |
| **Push notifications** | **Good** — platform-service abstraction | Swap provider client only |

### What's Missing: Repository Interface Pattern

The biggest gap is that repository files directly use `@azure/cosmos` types and SQL query syntax. To make the DB layer swappable, you'd need:

```typescript
// Proposed: packages/cosmos/src/repository.ts

// Assumed shape: the proposal references QueryOptions without defining it.
export interface QueryOptions {
  limit?: number;
  offset?: number;
  orderBy?: { field: string; direction: "asc" | "desc" };
}

export interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}
```

Once the 56+ repository files are written against this interface, swapping Cosmos → MongoDB → PostgreSQL happens behind it without touching them again.

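To illustrate what sitting behind the interface means in practice, here is a hedged sketch of an in-memory backend that satisfies `DocumentRepository<T>`. The class name, the `partitionKeyOf` callback, and the two-field `QueryOptions` are assumptions made for this example. A Cosmos or MongoDB implementation would expose the same seven methods, and an in-memory one like this could double as a test fake for the existing suite.

```typescript
interface QueryOptions {
  limit?: number;
  offset?: number;
}

interface DocumentRepository<T> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  findMany(filter: Record<string, unknown>, opts?: QueryOptions): Promise<T[]>;
  create(doc: T): Promise<T>;
  replace(id: string, doc: T, partitionKey: string): Promise<T>;
  upsert(doc: T): Promise<T>;
  delete(id: string, partitionKey: string): Promise<void>;
  count(filter: Record<string, unknown>): Promise<number>;
}

// Hypothetical backend that keeps documents in an array.
class InMemoryRepository<T extends { id: string }> implements DocumentRepository<T> {
  private docs: T[] = [];
  private partitionKeyOf: (doc: T) => string;

  constructor(partitionKeyOf: (doc: T) => string) {
    this.partitionKeyOf = partitionKeyOf;
  }

  // Documents are identified by id within a partition, as in Cosmos.
  private indexOf(id: string, partitionKey: string): number {
    return this.docs.findIndex((d) => d.id === id && this.partitionKeyOf(d) === partitionKey);
  }

  // Equality-only filter matching on top-level fields.
  private matches(doc: T, filter: Record<string, unknown>): boolean {
    const fields = doc as unknown as Record<string, unknown>;
    return Object.keys(filter).every((key) => fields[key] === filter[key]);
  }

  findById(id: string, partitionKey: string): Promise<T | null> {
    const i = this.indexOf(id, partitionKey);
    return Promise.resolve(i < 0 ? null : this.docs[i]);
  }

  findMany(filter: Record<string, unknown>, opts: QueryOptions = {}): Promise<T[]> {
    const offset = opts.offset ?? 0;
    const end = opts.limit === undefined ? undefined : offset + opts.limit;
    return Promise.resolve(this.docs.filter((d) => this.matches(d, filter)).slice(offset, end));
  }

  create(doc: T): Promise<T> {
    if (this.indexOf(doc.id, this.partitionKeyOf(doc)) >= 0) {
      return Promise.reject(new Error(`conflict: ${doc.id}`));
    }
    this.docs.push(doc);
    return Promise.resolve(doc);
  }

  replace(id: string, doc: T, partitionKey: string): Promise<T> {
    const i = this.indexOf(id, partitionKey);
    if (i < 0) {
      return Promise.reject(new Error(`not found: ${id}`));
    }
    this.docs[i] = doc;
    return Promise.resolve(doc);
  }

  // Whole-document replacement when the document exists, insert otherwise.
  upsert(doc: T): Promise<T> {
    const i = this.indexOf(doc.id, this.partitionKeyOf(doc));
    if (i < 0) {
      this.docs.push(doc);
    } else {
      this.docs[i] = doc;
    }
    return Promise.resolve(doc);
  }

  delete(id: string, partitionKey: string): Promise<void> {
    const i = this.indexOf(id, partitionKey);
    if (i >= 0) {
      this.docs.splice(i, 1);
    }
    return Promise.resolve();
  }

  count(filter: Record<string, unknown>): Promise<number> {
    return Promise.resolve(this.docs.filter((d) => this.matches(d, filter)).length);
  }
}
```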
**Effort to add:** 1–2 weeks. This is the **highest-ROI investment** regardless of migration decision.

---

## 9. Risk Analysis

### 9.1 Risks of Staying on Azure

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Azure pricing increases | Low | Medium | Add abstraction layer for future portability |
| Azure outage | Low | High | Multi-region already possible (Cosmos global distribution) |
| Feature stagnation | Very Low | Low | Azure is investing heavily in AI services |
| Vendor lock-in deepens over time | Medium | Medium | Add abstraction layers proactively |

### 9.2 Risks of Migrating

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Data loss during migration | Low | Critical | Test migration on staging first, keep Azure as backup |
| Query performance differences | Medium | Medium | Benchmark before committing |
| Feature gaps in new provider | Medium | Medium | Prototype critical features first |
| Wasted engineering time | Medium | High | Only migrate if there's a clear business driver |
| Regression bugs in 56+ repository files | High | Medium | Comprehensive test suite (1,029 tests) catches most issues |
| Speech quality degradation | Medium | High | A/B test both providers before committing |

### 9.3 Azure-Specific Lock-in Risks (ranked)

| # | Component | Lock-in Level | Escape Hatch |
| --- | --- | --- | --- |
| 1 | **Cosmos DB SQL API** | High | Rewrite queries to MongoDB MQL or add repository interface |
| 2 | **Azure Speech SDK (streaming)** | High | Google STT has comparable streaming API |
| 3 | **Azure Identity (DefaultAzureCredential)** | Medium | Only used by Key Vault, which is already optional |
| 4 | **Blob Storage SAS tokens** | Low | Pre-signed URLs are equivalent across all providers |
| 5 | **Azure OpenAI** | Very Low | OpenAI SDK works with both — 1-line config change |
| 6 | **Key Vault** | Very Low | Already has env var fallback |
| 7 | **Notification Hubs** | Very Low | Not deeply integrated yet |
| 8 | **Application Insights** | None | Custom telemetry already built |

---

## 10. Recommendations

### Recommended Strategy: **Stay on Azure + Invest in Abstraction** (Scenario D)

This is the highest-scoring approach. Here's the prioritized action plan:

#### Phase 1: Add Repository Interface (1–2 weeks)

- Create `DocumentRepository<T>` interface in `@bytelyst/cosmos`
- Implement `CosmosDocumentRepository<T>` that wraps current `@azure/cosmos` calls
- Gradually migrate the 56 repository files to use the interface
- This makes future DB migration a matter of implementing `MongoDocumentRepository<T>` — no application code changes needed

#### Phase 2: Normalize LLM Abstraction (2–3 days)

- Move LysnrAI desktop/backend from `AzureOpenAI` → auto-detecting provider pattern (like MindLyst web already does)
- Support `OPENAI_PROVIDER=azure|openai|gemini` across all repos
- This makes LLM provider swappable via config

#### Phase 3: Speech Abstraction Layer (1 week, optional)

- Create `SpeechTranscriber` protocol/interface
- Implement `AzureSpeechTranscriber` (current code, extracted)
- Prepare `GoogleSpeechTranscriber` stub for future use
- This is lower priority since Azure Speech F0 tier is free

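One possible shape for the `SpeechTranscriber` interface is sketched below, shown in TypeScript for consistency with the repository interface even though the real implementations would live in Python and Swift. Everything here is an assumption: the event names, the three methods, and the `ScriptedTranscriber` fake. The intent is that Azure's `recognizing`, `recognized`, and `canceled` callbacks would map onto `partial`, `final`, and `error` events, so the audio pipeline only depends on the neutral event type.

```typescript
// Provider-neutral events; Azure's recognizing/recognized/canceled callbacks
// would be translated into these by an Azure-backed implementation.
type TranscriptEvent =
  | { kind: "partial"; text: string }
  | { kind: "final"; text: string }
  | { kind: "error"; message: string };

type TranscriptListener = (event: TranscriptEvent) => void;

interface SpeechTranscriber {
  start(listener: TranscriptListener): void;
  pushAudio(chunk: Uint8Array): void;
  stop(): void;
}

// Test fake: each non-empty audio chunk "recognizes" the next scripted word.
class ScriptedTranscriber implements SpeechTranscriber {
  private script: string[];
  private heard: string[] = [];
  private listener: TranscriptListener | null = null;

  constructor(script: string[]) {
    this.script = script.slice();
  }

  start(listener: TranscriptListener): void {
    this.listener = listener;
    this.heard = [];
  }

  pushAudio(chunk: Uint8Array): void {
    if (this.listener === null || chunk.length === 0) {
      return;
    }
    const next = this.script[this.heard.length];
    if (next === undefined) {
      return;
    }
    this.heard.push(next);
    // Partial results carry the running transcript so far.
    this.listener({ kind: "partial", text: this.heard.join(" ") });
  }

  stop(): void {
    if (this.listener === null) {
      return;
    }
    // Stopping flushes one final result and detaches the listener.
    this.listener({ kind: "final", text: this.heard.join(" ") });
    this.listener = null;
  }
}
```

A fake like this lets the pipeline and UI be exercised without a network connection, and an `AzureSpeechTranscriber` or `GoogleSpeechTranscriber` would then only need to honour the same three methods.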
#### Phase 4: Document Decision Criteria for Future Migration

- Define triggers that would justify migration (e.g., cost > $X/month, Azure outage > Y hours, need for feature Z)
- Review annually

### Why NOT Migrate Now

1. **Cost is negligible** — ~$10–20/month doesn't justify weeks of engineering
2. **No business driver** — Azure isn't blocking any feature development
3. **Risk/reward is unfavorable** — 4–8 weeks of migration work for ~$0 cost savings
4. **Test coverage is good but not perfect** — 1,029 tests cover most paths, but query-level changes in 56 files still risk regressions
5. **Azure free tiers are generous** — Speech F0, Notification Hubs Free, App Insights free tier

### When Migration WOULD Make Sense

- **Cosmos DB costs exceed $100/month** → Consider MongoDB Atlas Serverless
- **Azure Speech quality is insufficient** → Evaluate Google STT or Deepgram
- **Enterprise customer requires specific cloud** → Build the repository interface, then implement their cloud backend
- **Azure has extended outage affecting your region** → Multi-region or multi-cloud
- **You want to go fully open-source** → PostgreSQL (Supabase) + Whisper + MinIO (significant rewrite)

---

## 11. Migration Playbook (If Chosen)

If you decide to migrate in the future, here's the execution order (shortest critical path):

### Week 1–2: Database Abstraction

1. Create `DocumentRepository<T>` interface
2. Implement `CosmosDocumentRepository<T>` (wraps current code)
3. Migrate all 56 repository files to use interface
4. Verify all 1,029 tests pass
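
The interface in step 1 could be sketched roughly as follows. The method names and the explicit `partitionKey` argument are assumptions chosen for illustration, not the existing `@bytelyst/cosmos` API, and the in-memory class is only a stand-in showing that any backend can satisfy the same contract.

```typescript
// Hypothetical provider-neutral contract; method names are illustrative.
interface DocumentRepository<T extends { id: string }> {
  getById(id: string, partitionKey: string): Promise<T | undefined>;
  upsert(doc: T, partitionKey: string): Promise<T>;
  delete(id: string, partitionKey: string): Promise<boolean>;
  findByPartition(partitionKey: string): Promise<T[]>;
}

// In-memory stand-in, handy for unit tests. A `CosmosDocumentRepository<T>`
// would implement the same interface by delegating to the current Cosmos code.
class InMemoryDocumentRepository<T extends { id: string }> implements DocumentRepository<T> {
  // partitionKey -> (id -> document)
  private readonly partitions = new Map<string, Map<string, T>>();

  async getById(id: string, partitionKey: string): Promise<T | undefined> {
    return this.partitions.get(partitionKey)?.get(id);
  }

  async upsert(doc: T, partitionKey: string): Promise<T> {
    let partition = this.partitions.get(partitionKey);
    if (!partition) {
      partition = new Map<string, T>();
      this.partitions.set(partitionKey, partition);
    }
    partition.set(doc.id, doc);
    return doc;
  }

  async delete(id: string, partitionKey: string): Promise<boolean> {
    return this.partitions.get(partitionKey)?.delete(id) ?? false;
  }

  async findByPartition(partitionKey: string): Promise<T[]> {
    const partition = this.partitions.get(partitionKey);
    return partition ? Array.from(partition.values()) : [];
  }
}
```

Keeping query-language details (Cosmos SQL, MongoDB filters) behind methods like these is what would let the 56 repository files stay unchanged when the backend is swapped.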
|
||||||
|
|
||||||
|
### Week 3–4: Database Migration (Cosmos → MongoDB)

1. Implement `MongoDocumentRepository<T>`
2. Set up MongoDB Atlas Serverless cluster
3. Write data migration script (Cosmos → MongoDB)
4. Run migration on staging, verify data integrity
5. Switch repository implementation via config flag
6. Run full test suite against MongoDB
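
Step 5 (switching via a config flag) might look like the sketch below. The env var name `DOCUMENT_BACKEND` and the function name are hypothetical; the only design point is that the default stays on the current backend, so nothing changes until the flag is set explicitly.

```typescript
type DocumentBackend = "cosmos" | "mongo";

// Hypothetical flag name. Unknown values fail fast instead of silently
// falling back, so a typo cannot route traffic to the wrong database.
function resolveDocumentBackend(env: Record<string, string | undefined>): DocumentBackend {
  const raw = (env["DOCUMENT_BACKEND"] ?? "cosmos").trim().toLowerCase();
  if (raw === "cosmos" || raw === "mongo") {
    return raw;
  }
  throw new Error(`Unsupported DOCUMENT_BACKEND: ${raw}`);
}
```

A composition root would call this once at startup and construct the matching repository implementation, which also makes rollback a one-line config change.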
|
||||||
|
|
||||||
|
### Week 5: Storage + Secrets

1. Swap `@bytelyst/blob` internals to S3-compatible client
2. Migrate blobs (`azcopy` → `aws s3 sync` or similar)
3. Replace Key Vault with new secrets manager (or just env vars)
4. Update all environment variable names

### Week 6: LLM + Speech (if needed)

1. Switch OpenAI from Azure endpoint to direct (config change only)
2. If migrating Speech: rewrite `azure_stt.py` and Swift `AzureSpeechTranscriber`
3. A/B test new speech provider against Azure

### Week 7–8: Cleanup + Verification

1. Remove all `@azure/*` npm packages
2. Remove all `azure-*` pip packages
3. Update Docker configs, CI/CD
4. Update documentation
5. Monitor production for 2 weeks
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Appendix A: File-Level Azure Dependency Map

### TypeScript — `@azure/cosmos` (CRITICAL)

| File                                                                | Repo        | Direct Import          |
| ------------------------------------------------------------------- | ----------- | ---------------------- |
| `packages/cosmos/src/client.ts`                                     | common-plat | `@azure/cosmos`        |
| `packages/cosmos/src/containers.ts`                                 | common-plat | `@azure/cosmos`        |
| `services/platform-service/src/modules/*/repository.ts` (56 files)  | common-plat | Via `@bytelyst/cosmos` |
| `services/extraction-service/src/modules/*/repository.ts` (2 files) | common-plat | Via `@bytelyst/cosmos` |
| `dashboards/admin-web/src/lib/cosmos.ts`                            | common-plat | `@azure/cosmos`        |
| `dashboards/admin-web/src/lib/repositories/*.ts` (4 files)          | common-plat | Via cosmos.ts          |
| `mindlyst-native/web/src/lib/cosmos.ts`                             | MindLyst    | `@azure/cosmos`        |
|
||||||
|
|
||||||
|
### TypeScript — `@azure/storage-blob`

| File                        | Repo        | Direct Import         |
| --------------------------- | ----------- | --------------------- |
| `packages/blob/src/blob.ts` | common-plat | `@azure/storage-blob` |
|
||||||
|
|
||||||
|
### TypeScript — `@azure/identity` + `@azure/keyvault-secrets`

| File                                                    | Repo        | Direct Import             |
| ------------------------------------------------------- | ----------- | ------------------------- |
| `packages/config/src/keyvault.ts`                       | common-plat | Dynamic import (both)     |
| `dashboards/admin-web/src/app/api/ops/secrets/route.ts` | common-plat | Both (Secrets Manager UI) |
|
||||||
|
|
||||||
|
### Python — Azure SDKs

| File                                   | Repo    | SDK                                        |
| -------------------------------------- | ------- | ------------------------------------------ |
| `src/audio/azure_stt.py`               | LysnrAI | `azure.cognitiveservices.speech`           |
| `src/cloud/cosmos_client.py`           | LysnrAI | `azure.cosmos`                             |
| `src/cloud/blob_client.py`             | LysnrAI | `azure.storage.blob`                       |
| `src/secrets/keyvault.py`              | LysnrAI | `azure.identity`, `azure.keyvault.secrets` |
| `backend/src/secrets/keyvault.py`      | LysnrAI | `azure.identity`, `azure.keyvault.secrets` |
| `backend/src/cloud/cosmos.py`          | LysnrAI | `azure.cosmos`                             |
| `src/llm/text_cleaner.py`              | LysnrAI | `openai.AzureOpenAI`                       |
| `backend/src/clients/openai_client.py` | LysnrAI | `openai.AsyncAzureOpenAI`                  |
|
||||||
|
|
||||||
|
### Swift — Azure Speech SDK

| File                                                 | Repo     | SDK                                |
| ---------------------------------------------------- | -------- | ---------------------------------- |
| `iosApp/Services/AzureSpeechTranscriber.swift`       | MindLyst | `MicrosoftCognitiveServicesSpeech` |
| `LysnrAI/LysnrKeyboard/KeyboardViewController.swift` | LysnrAI  | SPX framework (via CocoaPods)      |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Appendix B: SDK & Package Inventory

### npm packages (TypeScript)

| Package                   | Version | Used By                                     | Swappable              |
| ------------------------- | ------- | ------------------------------------------- | ---------------------- |
| `@azure/cosmos`           | ≥4.0.0  | `@bytelyst/cosmos`, admin-web, MindLyst web | Medium (query rewrite) |
| `@azure/storage-blob`     | ≥12.0.0 | `@bytelyst/blob`                            | Easy (S3 compat)       |
| `@azure/identity`         | latest  | `@bytelyst/config`, admin-web secrets       | Easy (remove)          |
| `@azure/keyvault-secrets` | latest  | `@bytelyst/config`, admin-web secrets       | Easy (remove)          |
|
||||||
|
|
||||||
|
### pip packages (Python)

| Package                          | Version  | Used By                            | Swappable                   |
| -------------------------------- | -------- | ---------------------------------- | --------------------------- |
| `azure-cognitiveservices-speech` | ≥1.42.0  | Desktop STT                        | Hard (deep SDK integration) |
| `azure-cosmos`                   | latest   | Desktop + backend Cosmos client    | Medium (pymongo swap)       |
| `azure-storage-blob`             | ≥12.24.0 | Desktop blob client                | Easy (boto3 swap)           |
| `azure-identity`                 | ≥1.19.0  | Key Vault auth                     | Easy (remove)               |
| `azure-keyvault-secrets`         | ≥4.9.0   | Secrets resolver                   | Easy (remove)               |
| `openai`                         | ≥1.60.0  | `AzureOpenAI` / `AsyncAzureOpenAI` | Trivial (change class name) |
| `opencensus-ext-azure`           | ≥1.1.0   | Optional telemetry                 | Trivial (remove)            |
|
||||||
|
|
||||||
|
### Swift packages / CocoaPods

| Package                                  | Used By                   | Swappable                             |
| ---------------------------------------- | ------------------------- | ------------------------------------- |
| `MicrosoftCognitiveServicesSpeech` (SPX) | LysnrAI iOS, MindLyst iOS | Hard (need alternative streaming STT) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
_Document generated by automated codebase analysis. Numbers are accurate as of 2026-03-01. Update as the codebase evolves._
|
||||||
186
docs/audits/AZURE_CONNECTION_AUDIT.md
Normal file
@ -0,0 +1,186 @@
|
|||||||
|
# Azure Connection Audit — Full Workspace Report

> **Date:** 2026-02-22
> **Scope:** `learning_ai_common_plat`, `learning_voice_ai_agent`, `learning_multimodal_memory_agents`, `learning_ai_clock`, `learning_ai_fastgap`
> **Auditor:** Cascade (AI)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Executive Summary

| Category               | Issues Found | Fixed (session 1) | Fixed (session 2)                         | Remaining           |
| ---------------------- | ------------ | ----------------- | ----------------------------------------- | ------------------- |
| `x-request-id` missing | 12 clients   | 2 (MindLyst)      | **9** (root cause + feature-flags)        | 0 ✅                |
| `x-product-id` missing | 6 clients    | 0                 | **6** (admin + user dashboards + Python)  | 0 ✅                |
| Cosmos PK mismatch     | 1 container  | 0 (flagged)       | 0                                         | 1 (needs migration) |
| `.env.example` gaps    | 4 files      | 1 (MindLyst)      | **3** (ChronoMind, user-dash, admin-dash) | 0 ✅                |
| Hardcoded productId    | 2 instances  | 0                 | **2** (telemetry.ts, platform_client.py)  | 0 ✅                |
| Python client gaps     | 1 file       | 0                 | **1** (headers + config)                  | 0 ✅                |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. `x-request-id` Header — Root Cause

### Finding

**`@bytelyst/api-client` does NOT auto-inject `x-request-id`.**

The `createApiClient()` factory in `packages/api-client/src/client.ts` only sets `Content-Type`, auth token (via `getToken`), and caller-supplied `defaultHeaders`. No `x-request-id` is generated. This means **every consumer** that relies on `@bytelyst/api-client` without explicitly adding the header is missing request tracing.

### Root Cause Fix

Add `x-request-id: crypto.randomUUID()` to `buildHeaders()` in `packages/api-client/src/client.ts`. This single change propagates to all consumers automatically.
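
A hedged sketch of what the patched `buildHeaders()` might look like. The real function's signature is not shown in this audit, so the `ApiClientConfig` shape, the synchronous `getToken`, and the `Bearer` scheme are assumptions; `randomUUID` is imported from `node:crypto` here so the snippet runs under Node, whereas browser code would use the global `crypto.randomUUID()`.

```typescript
import { randomUUID } from "node:crypto";

// Assumed config shape; only `getToken` and `defaultHeaders` are named in the audit.
interface ApiClientConfig {
  getToken?: () => string | undefined;
  defaultHeaders?: Record<string, string>;
}

function buildHeaders(config: ApiClientConfig): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    // Fresh ID per call. Spreading `defaultHeaders` afterwards lets a caller
    // that already sets a lowercase `x-request-id` keep its own value.
    "x-request-id": randomUUID(),
    ...config.defaultHeaders,
  };
  const token = config.getToken?.();
  if (token) {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return headers;
}
```

Placing the generated value before the spread is a deliberate choice: clients that were already fixed by hand via `defaultHeaders` continue to behave exactly as before.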
|
||||||
|
|
||||||
|
### Affected Clients (missing `x-request-id`)

| Repo             | File                                               | Client Pattern                        |
| ---------------- | -------------------------------------------------- | ------------------------------------- |
| `common_plat`    | `dashboards/admin-web/src/lib/billing-client.ts`   | `createApiClient` — no `x-request-id` |
| `common_plat`    | `dashboards/admin-web/src/lib/growth-client.ts`    | `createApiClient` — no `x-request-id` |
| `common_plat`    | `dashboards/admin-web/src/lib/platform-client.ts`  | `createApiClient` — no `x-request-id` |
| `common_plat`    | `dashboards/tracker-web/src/lib/tracker-client.ts` | `createApiClient` — no `x-request-id` |
| `common_plat`    | `packages/extraction/src/client.ts`                | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/billing-client.ts`     | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/growth-client.ts`      | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/platform-client.ts`    | `createApiClient` — no `x-request-id` |
| `voice_ai_agent` | `user-dashboard-web/src/lib/feature-flags.ts`      | Custom `fetch` — no `x-request-id`    |
| `voice_ai_agent` | `backend/src/clients/platform_client.py`           | `httpx` — no `x-request-id`           |
|
||||||
|
|
||||||
|
### Already Fixed (previous session)

| Repo                | File                            | Status                        |
| ------------------- | ------------------------------- | ----------------------------- |
| `multimodal_memory` | `web/src/lib/billing-client.ts` | ✅ Added via `defaultHeaders` |
| `multimodal_memory` | `web/src/lib/feature-flags.ts`  | ✅ Added manually             |
|
||||||
|
|
||||||
|
### Already Correct

| Repo                    | File                                       | Status                                      |
| ----------------------- | ------------------------------------------ | ------------------------------------------- |
| `ai_fastgap` (NomGap)   | `src/api/client.ts`                        | ✅ Custom client with `crypto.randomUUID()` |
| `ai_clock` (ChronoMind) | `web/src/lib/platform-sync.ts`             | ✅ Custom client with `crypto.randomUUID()` |
| `voice_ai_agent`        | `backend/src/main.py`                      | ✅ Middleware propagates/generates          |
| `voice_ai_agent`        | `backend/src/clients/extraction_client.py` | ✅ Passes `request_id` param                |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. `x-product-id` Header Gaps

### Clients Missing `x-product-id`

| Repo             | File                                            | Impact                            |
| ---------------- | ----------------------------------------------- | --------------------------------- |
| `common_plat`    | `admin-web/src/lib/billing-client.ts`           | Server can't filter by product    |
| `common_plat`    | `admin-web/src/lib/growth-client.ts`            | Server can't filter by product    |
| `voice_ai_agent` | `user-dashboard-web/src/lib/billing-client.ts`  | Server can't filter by product    |
| `voice_ai_agent` | `user-dashboard-web/src/lib/growth-client.ts`   | Server can't filter by product    |
| `voice_ai_agent` | `user-dashboard-web/src/lib/platform-client.ts` | Passes in body, not header        |
| `voice_ai_agent` | `backend/src/clients/platform_client.py`        | Passes in body/params, not header |
|
||||||
|
|
||||||
|
### Already Correct

| Repo                           | File                                                          |
| ------------------------------ | ------------------------------------------------------------- |
| `ai_fastgap` (NomGap)          | `src/api/client.ts` — `x-product-id: API_CONFIG.productId`    |
| `ai_clock` (ChronoMind)        | `web/src/lib/platform-sync.ts` — `x-product-id` header        |
| `multimodal_memory` (MindLyst) | `web/src/lib/billing-client.ts` — via `defaultHeaders`        |
| `multimodal_memory` (MindLyst) | `web/src/lib/feature-flags.ts` — explicit header              |
| `common_plat`                  | `tracker-web/src/lib/tracker-client.ts` — from `localStorage` |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. Cosmos DB Partition Key Mismatch

### `referrals` Container — 3-way Mismatch

| Location                                              | Partition Key |
| ----------------------------------------------------- | ------------- |
| `platform-service/src/lib/cosmos-init.ts`             | `/id`         |
| MindLyst `web/src/lib/cosmos.ts`                      | `/userId`     |
| Admin dashboard `admin-web/src/lib/cosmos.ts`         | `/referrerId` |
| User dashboard `user-dashboard-web/src/lib/cosmos.ts` | `/referrerId` |

**Status:** Flagged in previous session. Cannot be fixed without data migration. Comment added to `cosmos-init.ts`.

**Risk:** Queries may succeed silently yet return incomplete results, and point reads can fail, whenever a client supplies a value for the wrong partition key.
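
One way to keep this kind of drift from recurring is a small consistency check over the declared partition keys. The sketch below is illustrative only: the `PartitionKeyDeclaration` type and `findPartitionKeyMismatches` helper are hypothetical, and the sample data is copied from the table above rather than read from the code.

```typescript
interface PartitionKeyDeclaration {
  location: string; // file that declares the container
  container: string;
  partitionKey: string;
}

// Groups declarations by container and returns only the containers that are
// declared with more than one distinct partition key path.
function findPartitionKeyMismatches(
  declarations: PartitionKeyDeclaration[],
): Map<string, string[]> {
  const keysByContainer = new Map<string, Set<string>>();
  for (const d of declarations) {
    const keys = keysByContainer.get(d.container) ?? new Set<string>();
    keys.add(d.partitionKey);
    keysByContainer.set(d.container, keys);
  }
  const mismatches = new Map<string, string[]>();
  keysByContainer.forEach((keys, container) => {
    if (keys.size > 1) {
      mismatches.set(container, Array.from(keys).sort());
    }
  });
  return mismatches;
}

// Values copied from the table above.
const referralsDeclarations: PartitionKeyDeclaration[] = [
  { location: "platform-service/src/lib/cosmos-init.ts", container: "referrals", partitionKey: "/id" },
  { location: "MindLyst web/src/lib/cosmos.ts", container: "referrals", partitionKey: "/userId" },
  { location: "admin-web/src/lib/cosmos.ts", container: "referrals", partitionKey: "/referrerId" },
  { location: "user-dashboard-web/src/lib/cosmos.ts", container: "referrals", partitionKey: "/referrerId" },
];
```

Run against the four declarations, the helper reports `referrals` with three distinct paths, which matches the "3-way" label even though four files are involved.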
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Missing Environment Variables in `.env.example` Files

### ChronoMind `web/.env.example`

Currently only has:

```
NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003/api
```

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID=chronomind` — used implicitly by `platform-sync.ts` (hardcoded there, but should be env-driven for consistency)

### LysnrAI `user-dashboard-web/.env.example`

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID=lysnrai` — referenced by `feature-flags.ts` line 10
- `NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003` — referenced by `feature-flags.ts` line 11

Has `PLATFORM_SERVICE_URL` (server-side) but not the `NEXT_PUBLIC_` variant (client-side).

### LysnrAI root `.env.example`

**Missing:**

- `NEXT_PUBLIC_PRODUCT_ID` — not needed at root level (desktop app), so this is informational only.

### Admin dashboard `.env.example`

**Missing:**

- `AZURE_KEYVAULT_URL` — referenced by `instrumentation.ts` but not in `.env.example`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 5. Hardcoded `productId` Values

| Repo                | File                                     | Line    | Value                         | Should Use                           |
| ------------------- | ---------------------------------------- | ------- | ----------------------------- | ------------------------------------ |
| `multimodal_memory` | `web/src/lib/telemetry.ts`               | 19      | `productId: 'mindlyst'`       | `process.env.NEXT_PUBLIC_PRODUCT_ID` |
| `voice_ai_agent`    | `backend/src/clients/platform_client.py` | 86, 101 | `product_id: str = "lysnrai"` | `settings.PRODUCT_ID` or config      |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 6. Python Backend Client Gaps (`platform_client.py`)

The `PlatformClient` class in `backend/src/clients/platform_client.py` has several issues:

1. **No `x-request-id` header** on any request
2. **No `x-product-id` header** on any request
3. **Creates new `httpx.AsyncClient` per request** — no connection pooling
4. **Hardcoded `product_id="lysnrai"` defaults** — should use config
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Previously Fixed (Session 1)

| Fix                                         | Repo                | File                                               |
| ------------------------------------------- | ------------------- | -------------------------------------------------- |
| Added `x-request-id` to billing client      | `multimodal_memory` | `web/src/lib/billing-client.ts`                    |
| Added `x-request-id` to feature flags       | `multimodal_memory` | `web/src/lib/feature-flags.ts`                     |
| Added 13 MindLyst containers to cosmos-init | `common_plat`       | `services/platform-service/src/lib/cosmos-init.ts` |
| Added Blob Storage creds to Python config   | `voice_ai_agent`    | `backend/src/config.py`                            |
| Added missing env vars to MindLyst          | `multimodal_memory` | `web/.env.example`                                 |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Recommended Fix Order

1. **P0 — Root cause:** Add `x-request-id` auto-generation to `@bytelyst/api-client` `buildHeaders()` → fixes 9 TS clients at once
2. **P0 — LysnrAI feature-flags:** Add `x-request-id` to the custom `fetch` call in `user-dashboard-web/src/lib/feature-flags.ts`
3. **P1 — Python backend:** Add `x-request-id` and `x-product-id` headers to `platform_client.py`
4. **P1 — Env vars:** Add missing `NEXT_PUBLIC_*` vars to ChronoMind, LysnrAI user-dashboard, admin-dashboard `.env.example` files
5. **P2 — `x-product-id`:** Add to admin/user dashboard clients via `defaultHeaders` in `createApiClient` config
6. **P2 — Hardcoded productId:** Replace in `telemetry.ts` and `platform_client.py`
7. **P3 — Referrals PK mismatch:** Requires data migration strategy (separate task)
|
||||||
@ -544,21 +544,21 @@ The following gaps were identified by scanning every import in the actual codeba
|
|||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
| Phase | Packages | Tasks | Done | Status |
|
| Phase | Packages | Tasks | Done | Status |
|
||||||
| --------- | ------------------------------------------------ | ------- | ------- | --------------------------------------------------------------------------------- |
|
| --------- | ------------------------------------------------ | ------- | ------- | ------------------------------------------------------------------------ |
|
||||||
| **0** | Repo scaffolding + branching + rollback strategy | 14 | 14 | ✅ Complete |
|
| **0** | Repo scaffolding + branching + rollback strategy | 14 | 14 | ✅ Complete |
|
||||||
| **1A** | `@bytelyst/errors` | 23 | 22 | ✅ Complete (Docker verify pending) |
|
| **1A** | `@bytelyst/errors` | 23 | 22 | ✅ Complete (Docker verify pending) |
|
||||||
| **1B** | `@bytelyst/cosmos` | 33 | 32 | ✅ Complete (Docker verify pending) |
|
| **1B** | `@bytelyst/cosmos` | 33 | 32 | ✅ Complete (Docker verify pending) |
|
||||||
| **2A** | `@bytelyst/config` (34 files to rewire) | 25 | 24 | ✅ Complete (Docker verify pending) |
|
| **2A** | `@bytelyst/config` (34 files to rewire) | 25 | 24 | ✅ Complete (Docker verify pending) |
|
||||||
| **2B** | `@bytelyst/auth` (20+ admin routes affected) | 29 | 29 | ✅ Complete (25 tests, tracker migrated) |
|
| **2B** | `@bytelyst/auth` (20+ admin routes affected) | 29 | 29 | ✅ Complete (25 tests, tracker migrated) |
|
||||||
| **2C** | `@bytelyst/fastify-core` | 24 | 22 | ✅ Services refactored, health-check verified (Docker pending) |
|
| **2C** | `@bytelyst/fastify-core` | 24 | 22 | ✅ Services refactored, health-check verified (Docker pending) |
|
||||||
| **3A** | `@bytelyst/api-client` | 17 | 17 | ✅ Complete |
|
| **3A** | `@bytelyst/api-client` | 17 | 17 | ✅ Complete |
|
||||||
| **3B** | `@bytelyst/react-auth` (24 consumer files) | 28 | 25 | ✅ Admin uses factory; user/tracker keep custom |
|
| **3B** | `@bytelyst/react-auth` (24 consumer files) | 28 | 25 | ✅ Admin uses factory; user/tracker keep custom |
|
||||||
| **4** | `@bytelyst/design-tokens` (4 platforms) | 24 | 23 | ✅ CSS synced to MindLyst; CONTRIBUTING updated; visual verify pending |
|
| **4** | `@bytelyst/design-tokens` (4 platforms) | 24 | 23 | ✅ CSS synced to MindLyst; CONTRIBUTING updated; visual verify pending |
|
||||||
| **5** | CI/CD + Docker (pre-copy strategy) | 23 | 23 | ✅ Docker build + compose up verified on home network |
|
| **5** | CI/CD + Docker (pre-copy strategy) | 23 | 23 | ✅ Docker build + compose up verified on home network |
|
||||||
| **6** | Verification + docs + cleanup | 28 | 25 | ⚠️ Remaining E2E: admin + user portal flows |
|
| **6** | Verification + docs + cleanup | 28 | 25 | ⚠️ Remaining E2E: admin + user portal flows |
|
||||||
| **7** | Future enhancements (+testing pkg) | 10 | 3 | 🔲 @bytelyst/testing (10 tests) + token pre-commit hook + AGENTS updated |
|
| **7** | Future enhancements (+testing pkg) | 10 | 3 | 🔲 @bytelyst/testing (10 tests) + token pre-commit hook + AGENTS updated |
|
||||||
| **Total** | **10 packages (+1 bonus: logger)** | **278** | **257** | **~92% complete** |
|
| **Total** | **10 packages (+1 bonus: logger)** | **278** | **257** | **~92% complete** |
|
||||||
|
|
||||||
### Bonus Package (not in original roadmap)
|
### Bonus Package (not in original roadmap)
|
||||||
|
|
||||||
@ -11,13 +11,13 @@
|
|||||||
|
|
||||||
## Why Consolidate
|
## Why Consolidate
|
||||||
|
|
||||||
| Problem | Impact |
|
| Problem | Impact |
|
||||||
|---------|--------|
|
| ---------------------------------------- | ---------------------------------------------- |
|
||||||
| 5 separate Node processes for 2 products | Unnecessary operational overhead |
|
| 5 separate Node processes for 2 products | Unnecessary operational overhead |
|
||||||
| 5 ports to manage (4001–4005) | Complex docker-compose, run scripts, env files |
|
| 5 ports to manage (4001–4005) | Complex docker-compose, run scripts, env files |
|
||||||
| 5 separate Cosmos connections | Wasted connection pool resources |
|
| 5 separate Cosmos connections | Wasted connection pool resources |
|
||||||
| 5 CI pipelines | Slow feedback, more config to maintain |
|
| 5 CI pipelines | Slow feedback, more config to maintain |
|
||||||
| 5 config schemas with duplicate env vars | Inconsistent config, easy to miss vars |
|
| 5 config schemas with duplicate env vars | Inconsistent config, easy to miss vars |
|
||||||
|
|
||||||
**After consolidation:** 2 services — `platform-service` (port 4003) + `extraction-service` (port 4005)
|
**After consolidation:** 2 services — `platform-service` (port 4003) + `extraction-service` (port 4005)
|
||||||
|
|
||||||
@ -31,12 +31,12 @@
|
|||||||
|
|
||||||
Services export product ID differently — modules reference different names:
|
Services export product ID differently — modules reference different names:
|
||||||
|
|
||||||
| Service | Export Name | Source |
|
| Service | Export Name | Source |
|
||||||
|---------|-----------|--------|
|
| -------------------- | -------------------- | ---------------------------------------------------------------------------- |
|
||||||
| **platform-service** | `PRODUCT_ID` | `loadProductIdentity().productId` from `@bytelyst/config` |
|
| **platform-service** | `PRODUCT_ID` | `loadProductIdentity().productId` from `@bytelyst/config` |
|
||||||
| **growth-service** | `PRODUCT_ID` | same as platform ✅ |
|
| **growth-service** | `PRODUCT_ID` | same as platform ✅ |
|
||||||
| **billing-service** | `PRODUCT_ID` | same as platform ✅ |
|
| **billing-service** | `PRODUCT_ID` | same as platform ✅ |
|
||||||
| **tracker-service** | `DEFAULT_PRODUCT_ID` | `process.env.DEFAULT_PRODUCT_ID \|\| getProductId()` — **different name** ⚠️ |
|
| **tracker-service** | `DEFAULT_PRODUCT_ID` | `process.env.DEFAULT_PRODUCT_ID \|\| getProductId()` — **different name** ⚠️ |
|
||||||
|
|
||||||
**Fix:** When merging tracker modules, change all `DEFAULT_PRODUCT_ID` imports to `PRODUCT_ID` in the copied module files, and add `DEFAULT_PRODUCT_ID` env var support to platform-service's `product-config.ts` for backward compat.
|
**Fix:** When merging tracker modules, change all `DEFAULT_PRODUCT_ID` imports to `PRODUCT_ID` in the copied module files, and add `DEFAULT_PRODUCT_ID` env var support to platform-service's `product-config.ts` for backward compat.
|
||||||
|
|
||||||
@ -44,15 +44,16 @@ Services export product ID differently — modules reference different names:
|
|||||||
|
|
||||||
Platform-service `package.json` is **missing** these deps needed by merged modules:
|
Platform-service `package.json` is **missing** these deps needed by merged modules:
|
||||||
|
|
||||||
| Dep | Needed By | Currently In |
|
| Dep | Needed By | Currently In |
|
||||||
|-----|-----------|-------------|
|
| ------------------------------- | ------------------------------------------- | ------------------------------- |
|
||||||
| `stripe` (^17.5.0) | billing modules (stripe webhooks, checkout) | billing-service, growth-service |
|
| `stripe` (^17.5.0) | billing modules (stripe webhooks, checkout) | billing-service, growth-service |
|
||||||
| `@bytelyst/auth` (workspace:*) | tracker modules (`extractAuth`) | tracker-service |
|
| `@bytelyst/auth` (workspace:\*) | tracker modules (`extractAuth`) | tracker-service |
|
||||||
| `@fastify/rate-limit` (^10.3.0) | tracker rate limiting | tracker-service |
|
| `@fastify/rate-limit` (^10.3.0) | tracker rate limiting | tracker-service |
|
||||||
|
|
||||||
### Gap 3: Billing Internal Key Auth (Global Hook)
|
### Gap 3: Billing Internal Key Auth (Global Hook)
|
||||||
|
|
||||||
`billing-service/src/server.ts` has a **global** `onRequest` hook:
|
`billing-service/src/server.ts` has a **global** `onRequest` hook:
|
||||||
|
|
||||||
```typescript
|
```typescript
|
||||||
app.addHook('onRequest', async (req, reply) => {
|
app.addHook('onRequest', async (req, reply) => {
|
||||||
if (path === '/health' || path.includes('/stripe/webhook')) return;
|
if (path === '/health' || path.includes('/stripe/webhook')) return;
|
||||||
@ -60,6 +61,7 @@ app.addHook('onRequest', async (req, reply) => {
|
|||||||
if (key !== INTERNAL_KEY) reply.code(401).send(...)
|
if (key !== INTERNAL_KEY) reply.code(401).send(...)
|
||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
This **cannot** be a global hook after merge — it would block auth, audit, tracker, etc. routes.
|
This **cannot** be a global hook after merge — it would block auth, audit, tracker, etc. routes.
|
||||||
|
|
||||||
**Fix:** Convert to a Fastify plugin registered only on billing route prefixes, or add `x-internal-key` check inside each billing route handler.
|
**Fix:** Convert to a Fastify plugin registered only on billing route prefixes, or add `x-internal-key` check inside each billing route handler.
|
||||||
@ -67,6 +69,7 @@ This **cannot** be a global hook after merge — it would block auth, audit, tra
|
|||||||
### Gap 4: Growth Webhooks Library
|
### Gap 4: Growth Webhooks Library
|
||||||
|
|
||||||
`growth-service/src/lib/webhooks.ts` dispatches fire-and-forget HTTP callbacks on invitation redeem. References env vars:
|
`growth-service/src/lib/webhooks.ts` dispatches fire-and-forget HTTP callbacks on invitation redeem. References env vars:
|
||||||
|
|
||||||
- `WEBHOOK_INVITATION_REDEEMED_URL`
|
- `WEBHOOK_INVITATION_REDEEMED_URL`
|
||||||
- `WEBHOOK_REFERRAL_STATUS_URL`
|
- `WEBHOOK_REFERRAL_STATUS_URL`
|
||||||
|
|
||||||
@ -82,26 +85,26 @@ Growth-service config requires `STRIPE_SECRET_KEY` as **required** (not optional
|
|||||||
|
|
||||||
**Dashboard API clients (TypeScript):**
|
**Dashboard API clients (TypeScript):**
|
||||||
|
|
||||||
| File | Current Env Var | Current Default |
|
| File | Current Env Var | Current Default |
|
||||||
|------|----------------|-----------------|
|
| -------------------------------------------------------------- | --------------------- | ---------------------------------- |
|
||||||
| `admin-dashboard-web/src/lib/billing-client.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `admin-dashboard-web/src/lib/billing-client.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
| `admin-dashboard-web/src/lib/growth-client.ts` | `GROWTH_SERVICE_URL` | `http://localhost:4001` |
|
| `admin-dashboard-web/src/lib/growth-client.ts` | `GROWTH_SERVICE_URL` | `http://localhost:4001` |
|
||||||
| `user-dashboard-web/src/lib/billing-client.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `user-dashboard-web/src/lib/billing-client.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
| `user-dashboard-web/src/lib/growth-client.ts` | `GROWTH_SERVICE_URL` | `http://localhost:4001` |
|
| `user-dashboard-web/src/lib/growth-client.ts` | `GROWTH_SERVICE_URL` | `http://localhost:4001` |
|
||||||
| `user-dashboard-web/src/app/api/stripe/webhook/route.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `user-dashboard-web/src/app/api/stripe/webhook/route.ts` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
| `admin-dashboard-web/src/app/api/stripe/config/route.ts` | — | `http://localhost:4002` inline |
|
| `admin-dashboard-web/src/app/api/stripe/config/route.ts` | — | `http://localhost:4002` inline |
|
||||||
| `admin-dashboard-web/src/lib/stripe-context.tsx` | — | `http://localhost:4002` (3 places) |
|
| `admin-dashboard-web/src/lib/stripe-context.tsx` | — | `http://localhost:4002` (3 places) |
|
||||||
| `tracker-dashboard-web/src/app/api/tracker/[...path]/route.ts` | `TRACKER_API_URL` | `http://localhost:4004` |
|
| `tracker-dashboard-web/src/app/api/tracker/[...path]/route.ts` | `TRACKER_API_URL` | `http://localhost:4004` |
|
||||||
| `tracker-dashboard-web/src/app/api/auth/login/route.ts` | `PLATFORM_API_URL` | `http://localhost:4003` ✅ |
|
| `tracker-dashboard-web/src/app/api/auth/login/route.ts` | `PLATFORM_API_URL` | `http://localhost:4003` ✅ |
|
||||||
| `tracker-dashboard-web/src/app/api/auth/me/route.ts` | `PLATFORM_API_URL` | `http://localhost:4003` ✅ |
|
| `tracker-dashboard-web/src/app/api/auth/me/route.ts` | `PLATFORM_API_URL` | `http://localhost:4003` ✅ |
|
||||||
|
|
||||||
**Python clients (desktop + backend):**
|
**Python clients (desktop + backend):**
|
||||||
|
|
||||||
| File | Current Env Var | Current Default |
|
| File | Current Env Var | Current Default |
|
||||||
|------|----------------|-----------------|
|
| --------------------------------------- | --------------------- | ----------------------- |
|
||||||
| `backend/src/clients/billing_client.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `backend/src/clients/billing_client.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
| `src/cloud/api_sync.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `src/cloud/api_sync.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
| `src/cloud/plan_resolver.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
| `src/cloud/plan_resolver.py` | `BILLING_SERVICE_URL` | `http://localhost:4002` |
|
||||||
|
|
||||||
All these must change to `PLATFORM_SERVICE_URL` / `http://localhost:4003`.
|
All these must change to `PLATFORM_SERVICE_URL` / `http://localhost:4003`.
|
||||||
|
|
||||||
@ -112,10 +115,12 @@ All these must change to `PLATFORM_SERVICE_URL` / `http://localhost:4003`.
|
|||||||
### Gap 8: Stripe Webhook Test Hardcodes Port
|
### Gap 8: Stripe Webhook Test Hardcodes Port
|
||||||
|
|
||||||
`user-dashboard-web/src/__tests__/stripe-webhook.test.ts` sets:
|
`user-dashboard-web/src/__tests__/stripe-webhook.test.ts` sets:
|
||||||
|
|
||||||
```typescript
|
```typescript
|
||||||
process.env.BILLING_SERVICE_URL = 'http://localhost:4002';
|
process.env.BILLING_SERVICE_URL = 'http://localhost:4002';
|
||||||
expect(url).toBe('http://localhost:4002/api/stripe/webhook');
|
expect(url).toBe('http://localhost:4002/api/stripe/webhook');
|
||||||
```
|
```
|
||||||
|
|
||||||
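Both the env var and the expected URL in this test must be updated to port 4003.
|
Both the env var and the expected URL in this test must be updated to port 4003.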
|
||||||
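One way to keep the test from hardcoding a port again is to derive the expected URL from a single base URL. The following is a minimal sketch, not code from the repository: `stripeWebhookUrl` and `DEFAULT_PLATFORM_URL` are hypothetical names, and only the `/api/stripe/webhook` path and port 4003 come from the text above.

```typescript
// Hypothetical constant: the platform-service default after the merge.
const DEFAULT_PLATFORM_URL = 'http://localhost:4003';

// Hypothetical helper: builds the Stripe webhook URL from a base URL.
function stripeWebhookUrl(baseUrl: string = DEFAULT_PLATFORM_URL): string {
  // Strip trailing slashes so the joined URL never contains "//api".
  const trimmed = baseUrl.replace(/\/+$/, '');
  return trimmed + '/api/stripe/webhook';
}
```

A test could then assert against `stripeWebhookUrl(baseUrl)` for whatever base URL it configures, so a future port change touches one constant only.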
|
|
||||||
### Gap 9: Load Test Scripts
|
### Gap 9: Load Test Scripts
|
||||||
@ -133,6 +138,7 @@ Must update defaults to port 4003.
|
|||||||
### Gap 11: LysnrAI Services Stubs
|
### Gap 11: LysnrAI Services Stubs
|
||||||
|
|
||||||
`learning_voice_ai_agent/services/` contains `.env.example` stubs for each service:
|
`learning_voice_ai_agent/services/` contains `.env.example` stubs for each service:
|
||||||
|
|
||||||
- `services/billing-service/.env.example`
|
- `services/billing-service/.env.example`
|
||||||
- `services/growth-service/.env.example`
|
- `services/growth-service/.env.example`
|
||||||
- `services/tracker-service/.env.example`
|
- `services/tracker-service/.env.example`
|
||||||
@ -154,6 +160,7 @@ Mobile apps call the Python backend (`localhost:8000`), which calls billing-serv
|
|||||||
### Gap 14: Docker Compose `depends_on` for Tracker Dashboard
|
### Gap 14: Docker Compose `depends_on` for Tracker Dashboard
|
||||||
|
|
||||||
`learning_voice_ai_agent/docker-compose.yml` has:
|
`learning_voice_ai_agent/docker-compose.yml` has:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
tracker-dashboard:
|
tracker-dashboard:
|
||||||
depends_on:
|
depends_on:
|
||||||
@ -162,17 +169,23 @@ tracker-dashboard:
|
|||||||
platform-service:
|
platform-service:
|
||||||
condition: service_started
|
condition: service_started
|
||||||
```
|
```
|
||||||
|
|
||||||
After the merge, the `tracker-service` container no longer exists, so `depends_on` must list only `platform-service`.
|
After the merge, the `tracker-service` container no longer exists, so `depends_on` must list only `platform-service`.
|
||||||
|
|
||||||
### Gap 15: Admin Dashboard `docs.ts` Service Directory List
|
### Gap 15: Admin Dashboard `docs.ts` Service Directory List
|
||||||
|
|
||||||
`admin-dashboard-web/src/lib/docs.ts` has a hardcoded list of service directories:
|
`admin-dashboard-web/src/lib/docs.ts` has a hardcoded list of service directories:
|
||||||
|
|
||||||
```typescript
|
```typescript
|
||||||
const serviceDirs = [
|
const serviceDirs = [
|
||||||
'admin-dashboard-web', 'user-dashboard-web', 'mobile_app',
|
'admin-dashboard-web',
|
||||||
'services/billing-service', 'services/growth-service',
|
'user-dashboard-web',
|
||||||
|
'mobile_app',
|
||||||
|
'services/billing-service',
|
||||||
|
'services/growth-service',
|
||||||
];
|
];
|
||||||
```
|
```
|
||||||
|
|
||||||
The list must be updated: remove the old service names or replace them with `services/platform-service`.
|
The list must be updated: remove the old service names or replace them with `services/platform-service`.
|
||||||
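The replacement option can be sketched as a small transformation over the directory list. This is an illustration under assumptions: `migrateServiceDirs` and `MERGED_SERVICE_DIRS` are hypothetical names, and the set of merged services (billing, growth, tracker) is taken from the merge plan rather than from `docs.ts` itself.

```typescript
// Service directories merged into platform-service according to this plan.
const MERGED_SERVICE_DIRS: readonly string[] = [
  'services/billing-service',
  'services/growth-service',
  'services/tracker-service',
];

// Hypothetical helper: maps merged service dirs to platform-service,
// preserving order and dropping duplicates.
function migrateServiceDirs(dirs: readonly string[]): string[] {
  const result: string[] = [];
  for (const dir of dirs) {
    const merged = MERGED_SERVICE_DIRS.indexOf(dir) !== -1;
    const next = merged ? 'services/platform-service' : dir;
    if (result.indexOf(next) === -1) {
      result.push(next);
    }
  }
  return result;
}
```

In practice the list would likely just be edited by hand; the function form only makes the intended end state explicit.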
|
|
||||||
### Gap 16: MindLyst Docs Reference Old Services
|
### Gap 16: MindLyst Docs Reference Old Services
|
||||||
@ -195,6 +208,7 @@ Platform-service's Dockerfile only copies `services/platform-service/` — it do
|
|||||||
### Route Path Collision Check ✅
|
### Route Path Collision Check ✅
|
||||||
|
|
||||||
All services use unique route prefixes — **no collisions**:
|
All services use unique route prefixes — **no collisions**:
|
||||||
|
|
||||||
- platform: `/auth/*`, `/audit/*`, `/notifications/*`, `/flags/*`, `/ratelimit/*`, `/blob/*`, `/devices/*`
|
- platform: `/auth/*`, `/audit/*`, `/notifications/*`, `/flags/*`, `/ratelimit/*`, `/blob/*`, `/devices/*`
|
||||||
- billing: `/subscriptions/*`, `/usage/*`, `/plans/*`, `/licenses/*`, `/payments/*`, `/stripe/*`
|
- billing: `/subscriptions/*`, `/usage/*`, `/plans/*`, `/licenses/*`, `/payments/*`, `/stripe/*`
|
||||||
- growth: `/invitations/*`, `/referrals/*`, `/promos/*`
|
- growth: `/invitations/*`, `/referrals/*`, `/promos/*`
|
||||||
@ -244,12 +258,12 @@ services/
|
|||||||
|
|
||||||
All containers served by one Cosmos client in platform-service:
|
All containers served by one Cosmos client in platform-service:
|
||||||
|
|
||||||
| Origin | Containers |
|
| Origin | Containers |
|
||||||
|--------|-----------|
|
| ----------------------- | ----------------------------------------------------------------------------------- |
|
||||||
| **platform** (existing) | `users`, `audit_log`, `feature_flags`, `notification_devices`, `notification_prefs` |
|
| **platform** (existing) | `users`, `audit_log`, `feature_flags`, `notification_devices`, `notification_prefs` |
|
||||||
| **billing** → platform | `subscriptions`, `payments`, `plans`, `licenses`, `usage_daily` |
|
| **billing** → platform | `subscriptions`, `payments`, `plans`, `licenses`, `usage_daily` |
|
||||||
| **growth** → platform | `invitation_codes`, `referrals`, `promo_codes` |
|
| **growth** → platform | `invitation_codes`, `referrals`, `promo_codes` |
|
||||||
| **tracker** → platform | `tracker_items`, `tracker_comments`, `tracker_votes` |
|
| **tracker** → platform | `tracker_items`, `tracker_comments`, `tracker_votes` |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -390,7 +404,7 @@ All containers served by one Cosmos client in platform-service:
|
|||||||
|
|
||||||
- [x] **3.3.1** Created `platform-service/src/lib/auth.ts` re-exporting from `@bytelyst/auth`
|
- [x] **3.3.1** Created `platform-service/src/lib/auth.ts` re-exporting from `@bytelyst/auth`
|
||||||
- [x] **3.3.2** Copied from tracker-service (identical content)
|
- [x] **3.3.2** Copied from tracker-service (identical content)
|
||||||
- [x] **3.3.3** Added `@bytelyst/auth` (workspace:*) to package.json
|
- [x] **3.3.3** Added `@bytelyst/auth` (workspace:\*) to package.json
|
||||||
- [x] **3.3.4** Added `@fastify/rate-limit` (^10.3.0) to package.json
|
- [x] **3.3.4** Added `@fastify/rate-limit` (^10.3.0) to package.json
|
||||||
- [x] **3.3.5** `jose` already in platform ✅
|
- [x] **3.3.5** `jose` already in platform ✅
|
||||||
|
|
||||||
@ -560,29 +574,30 @@ Also fixed: monitoring/health.ts, AI.dev/SKILLS docs, MIGRATION_GUIDE.md [`81609
|
|||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
| Phase | What | Effort | Tests Moved | Critical Gaps Addressed |
|
| Phase | What | Effort | Tests Moved | Critical Gaps Addressed |
|
||||||
|-------|------|--------|-------------|------------------------|
|
| --------- | ------------------------------------------- | ------------- | --------------- | ------------------------------------ |
|
||||||
| **0** | Preparation & backup | 30 min | — | — |
|
| **0** | Preparation & backup | 30 min | — | — |
|
||||||
| **1** | Merge growth-service (3 modules) | 2–3 hrs | ~14 | Gap 4 (webhooks), Gap 5 (Stripe key) |
|
| **1** | Merge growth-service (3 modules) | 2–3 hrs | ~14 | Gap 4 (webhooks), Gap 5 (Stripe key) |
|
||||||
| **2** | Merge billing-service (5 modules) | 4–5 hrs | ~11 | Gap 3 (internal key auth) |
|
| **2** | Merge billing-service (5 modules) | 4–5 hrs | ~11 | Gap 3 (internal key auth) |
|
||||||
| **3** | Merge tracker-service (4 modules) | 3–4 hrs | ~45 | Gap 1 (product ID), Gap 2 (deps) |
|
| **3** | Merge tracker-service (4 modules) | 3–4 hrs | ~45 | Gap 1 (product ID), Gap 2 (deps) |
|
||||||
| **4** | Update consumers (20+ files across 3 repos) | 4–5 hrs | — | Gaps 6–11, 13–17 |
|
| **4** | Update consumers (20+ files across 3 repos) | 4–5 hrs | — | Gaps 6–11, 13–17 |
|
||||||
| **5** | Documentation & final verification | 2–3 hrs | — | — |
|
| **5** | Documentation & final verification | 2–3 hrs | — | — |
|
||||||
| **Total** | **5 services → 2** | **~4–5 days** | **~125+ tests** | **17 gaps addressed** |
|
| **Total** | **5 services → 2** | **~4–5 days** | **~125+ tests** | **17 gaps addressed** |
|
||||||
|
|
||||||
## Port Allocation (After)
|
## Port Allocation (After)
|
||||||
|
|
||||||
| Service | Port |
|
| Service | Port |
|
||||||
|---------|------|
|
| -------------------------------------------- | -------- |
|
||||||
| **platform-service** | **4003** |
|
| **platform-service** | **4003** |
|
||||||
| **extraction-service** | **4005** |
|
| **extraction-service** | **4005** |
|
||||||
| extraction-service python sidecar (internal) | 4006 |
|
| extraction-service python sidecar (internal) | 4006 |
|
||||||
|
|
||||||
Ports 4001, 4002, and 4004 are freed up.
|
Ports 4001, 4002, and 4004 are freed up.
|
||||||
|
|
||||||
## Rollback Strategy
|
## Rollback Strategy
|
||||||
|
|
||||||
Each phase has its own commit. If a phase breaks something:
|
Each phase has its own commit. If a phase breaks something:
|
||||||
|
|
||||||
1. `git revert <commit>` to undo that phase
|
1. `git revert <commit>` to undo that phase
|
||||||
2. The old service code is in git history
|
2. The old service code is in git history
|
||||||
3. Backup branches created in Phase 0
|
3. Backup branches created in Phase 0
|
||||||
@ -590,13 +605,13 @@ Each phase has its own commit. If a phase breaks something:
|
|||||||
|
|
||||||
## Risks & Mitigations
|
## Risks & Mitigations
|
||||||
|
|
||||||
| Risk | Mitigation |
|
| Risk | Mitigation |
|
||||||
|------|-----------|
|
| ---------------------------------------- | ----------------------------------------------------------------------------- |
|
||||||
| Route path collisions | Verified ✅ — all services use unique prefixes |
|
| Route path collisions | Verified ✅ — all services use unique prefixes |
|
||||||
| Config schema gets large | Group env vars by domain with clear section comments |
|
| Config schema gets large | Group env vars by domain with clear section comments |
|
||||||
| Stripe webhook raw body | Fastify handles this — verify after move |
|
| Stripe webhook raw body | Fastify handles this — verify after move |
|
||||||
| Billing internal key blocks other routes | Scoped Fastify plugin (Phase 2.2) isolates key check to billing prefixes only |
|
| Billing internal key blocks other routes | Scoped Fastify plugin (Phase 2.2) isolates key check to billing prefixes only |
|
||||||
| Public tracker routes skip auth | Register outside scoped plugins — verify in Phase 3.5.3 |
|
| Public tracker routes skip auth | Register outside scoped plugins — verify in Phase 3.5.3 |
|
||||||
| Python billing client breaks | Change env var name, keep same API paths — transparent to Python code |
|
| Python billing client breaks | Change env var name, keep same API paths — transparent to Python code |
|
||||||
| Stripe webhook test fails | Explicit port update in Phase 4.4 |
|
| Stripe webhook test fails | Explicit port update in Phase 4.4 |
|
||||||
| Product ID mismatch | Alias `DEFAULT_PRODUCT_ID = PRODUCT_ID` in Phase 3.2.4 |
|
| Product ID mismatch | Alias `DEFAULT_PRODUCT_ID = PRODUCT_ID` in Phase 3.2.4 |
|
||||||
@ -82,6 +82,7 @@ routes.ts ────────► │ container() │
|
|||||||
```
|
```
|
||||||
|
|
||||||
**Problems:**
|
**Problems:**
|
||||||
|
|
||||||
- 38 platform-service repository files write raw Cosmos SQL queries
|
- 38 platform-service repository files write raw Cosmos SQL queries
|
||||||
- 6 additional repository files in dashboards + MindLyst web
|
- 6 additional repository files in dashboards + MindLyst web
|
||||||
- Blob, Speech, OpenAI all have direct Azure SDK imports
|
- Blob, Speech, OpenAI all have direct Azure SDK imports
|
||||||
@ -112,6 +113,7 @@ routes.ts ────────► │ collection.findMany({ │
|
|||||||
```
|
```
|
||||||
|
|
||||||
**Benefits:**
|
**Benefits:**
|
||||||
|
|
||||||
- Repositories use a generic query API — no SQL strings, no Azure types
|
- Repositories use a generic query API — no SQL strings, no Azure types
|
||||||
- Switching provider = implement a new adapter (~200 lines) + change env var
|
- Switching provider = implement a new adapter (~200 lines) + change env var
|
||||||
- In-memory adapter makes tests fast and cloud-free
|
- In-memory adapter makes tests fast and cloud-free
|
||||||
@ -121,16 +123,16 @@ routes.ts ────────► │ collection.findMany({ │
|
|||||||
|
|
||||||
## 3. Sprint Plan Overview
|
## 3. Sprint Plan Overview
|
||||||
|
|
||||||
| Sprint | Package / Scope | Effort | Files Changed | Risk |
|
| Sprint | Package / Scope | Effort | Files Changed | Risk |
|
||||||
|--------|----------------|--------|---------------|------|
|
| --------- | ------------------------------------------------- | --------------- | ----------------------------------- | -------- |
|
||||||
| **1** | `@bytelyst/datastore` — DB abstraction | 5–7 days | 44 repository files + 1 new package | Medium |
|
| **1** | `@bytelyst/datastore` — DB abstraction | 5–7 days | 44 repository files + 1 new package | Medium |
|
||||||
| **2** | `@bytelyst/storage` — Blob/Object abstraction | 2 days | 3 files + 1 new package | Low |
|
| **2** | `@bytelyst/storage` — Blob/Object abstraction | 2 days | 3 files + 1 new package | Low |
|
||||||
| **3** | `@bytelyst/llm` — LLM provider abstraction | 2 days | 4 files + 1 new package | Low |
|
| **3** | `@bytelyst/llm` — LLM provider abstraction | 2 days | 4 files + 1 new package | Low |
|
||||||
| **4** | `@bytelyst/secrets` — Secrets manager abstraction | 1 day | 2 files (refactor existing) | Very Low |
|
| **4** | `@bytelyst/secrets` — Secrets manager abstraction | 1 day | 2 files (refactor existing) | Very Low |
|
||||||
| **5** | `@bytelyst/speech` — Speech STT abstraction | 3–4 days | 3 files + 1 new package | Medium |
|
| **5** | `@bytelyst/speech` — Speech STT abstraction | 3–4 days | 3 files + 1 new package | Medium |
|
||||||
| **6** | `@bytelyst/push` — Push notification abstraction | 1 day | 1 file + 1 new package | Very Low |
|
| **6** | `@bytelyst/push` — Push notification abstraction | 1 day | 1 file + 1 new package | Very Low |
|
||||||
| **7** | Monitoring/Telemetry cleanup | 0.5 days | Already done (custom telemetry) | None |
|
| **7** | Monitoring/Telemetry cleanup | 0.5 days | Already done (custom telemetry) | None |
|
||||||
| **Total** | | **~15–17 days** | ~55 files | |
|
| **Total** | | **~15–17 days** | ~55 files | |
|
||||||
|
|
||||||
### Priority Order
|
### Priority Order
|
||||||
|
|
||||||
@ -211,8 +213,8 @@ export type SortMap = Record<string, 1 | -1>; // 1 = ASC, -1 = DESC
|
|||||||
export interface AggregateOptions {
|
export interface AggregateOptions {
|
||||||
filter: FilterMap;
|
filter: FilterMap;
|
||||||
groupBy?: string[];
|
groupBy?: string[];
|
||||||
count?: string; // alias for COUNT(1)
|
count?: string; // alias for COUNT(1)
|
||||||
sum?: string; // field to SUM
|
sum?: string; // field to SUM
|
||||||
}
|
}
|
||||||
|
|
||||||
/** Factory that creates collections — one per provider. */
|
/** Factory that creates collections — one per provider. */
|
||||||
@ -412,6 +414,7 @@ export async function create(doc: FeatureFlagDoc): Promise<FeatureFlagDoc> {
|
|||||||
```
|
```
|
||||||
|
|
||||||
**Key observations:**
|
**Key observations:**
|
||||||
|
|
||||||
- No SQL strings
|
- No SQL strings
|
||||||
- No `@azure/cosmos` types
|
- No `@azure/cosmos` types
|
||||||
- No `.items.query().fetchAll()` chaining
|
- No `.items.query().fetchAll()` chaining
|
||||||
@ -445,11 +448,11 @@ export function createDatastoreProvider(): DatastoreProvider {
|
|||||||
const provider = process.env.DB_PROVIDER || 'cosmos';
|
const provider = process.env.DB_PROVIDER || 'cosmos';
|
||||||
switch (provider) {
|
switch (provider) {
|
||||||
case 'cosmos':
|
case 'cosmos':
|
||||||
return new CosmosDatastoreProvider(); // uses existing COSMOS_ENDPOINT, COSMOS_KEY
|
return new CosmosDatastoreProvider(); // uses existing COSMOS_ENDPOINT, COSMOS_KEY
|
||||||
case 'mongo':
|
case 'mongo':
|
||||||
return new MongoDatastoreProvider(); // uses MONGO_URI
|
return new MongoDatastoreProvider(); // uses MONGO_URI
|
||||||
case 'memory':
|
case 'memory':
|
||||||
return new MemoryDatastoreProvider(); // no config needed
|
return new MemoryDatastoreProvider(); // no config needed
|
||||||
default:
|
default:
|
||||||
throw new Error(`Unknown DB_PROVIDER: ${provider}`);
|
throw new Error(`Unknown DB_PROVIDER: ${provider}`);
|
||||||
}
|
}
|
||||||
@ -459,6 +462,7 @@ export function createDatastoreProvider(): DatastoreProvider {
|
|||||||
### 4.7 Migration Plan for 38 Repository Files
|
### 4.7 Migration Plan for 38 Repository Files
|
||||||
|
|
||||||
Migrate in batches, one module per commit. Each commit:
|
Migrate in batches, one module per commit. Each commit:
|
||||||
|
|
||||||
1. Update the repository file to use `getCollection()` instead of `getContainer()`
|
1. Update the repository file to use `getCollection()` instead of `getContainer()`
|
||||||
2. Replace SQL queries with `findMany()` / `findOne()` / `count()` / `aggregate()`
|
2. Replace SQL queries with `findMany()` / `findOne()` / `count()` / `aggregate()`
|
||||||
3. Run the module's test file — must pass
|
3. Run the module's test file — must pass
|
||||||
@ -466,40 +470,40 @@ Migrate in batches, one module per commit. Each commit:
|
|||||||
|
|
||||||
**Batch order** (simplest first, complex last):
|
**Batch order** (simplest first, complex last):
|
||||||
|
|
||||||
| Batch | Modules | Complexity | Notes |
|
| Batch | Modules | Complexity | Notes |
|
||||||
|-------|---------|-----------|-------|
|
| ----- | ------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ | --------------- |
|
||||||
| 1 | flags, plans, settings, changelog, products | Simple CRUD | 5 files, warmup |
|
| 1 | flags, plans, settings, changelog, products | Simple CRUD | 5 files, warmup |
|
||||||
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | 5 files |
|
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | 5 files |
|
||||||
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | 5 files |
|
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | 5 files |
|
||||||
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | 5 files |
|
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | 5 files |
|
||||||
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | 5 files |
|
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | 5 files |
|
||||||
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | 4 files |
|
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | 4 files |
|
||||||
| 7 | timers, shared-timers, routines, households | Sync logic, batch ops | 4 files |
|
| 7 | timers, shared-timers, routines, households | Sync logic, batch ops | 4 files |
|
||||||
| 8 | fasting-sessions, fasting-protocols, meal-log, social-fasting, daily-briefs, streaks, push-triggers, impersonation, status, memory, analytics, waitlist | Product-specific + remaining | 12 files |
|
| 8 | fasting-sessions, fasting-protocols, meal-log, social-fasting, daily-briefs, streaks, push-triggers, impersonation, status, memory, analytics, waitlist | Product-specific + remaining | 12 files |
|
||||||
| 9 | Dashboard cosmos clients (admin-web, MindLyst web) | Direct `@azure/cosmos` | 6 files |
|
| 9 | Dashboard cosmos clients (admin-web, MindLyst web) | Direct `@azure/cosmos` | 6 files |
|
||||||
| 10 | Python clients (desktop cosmos, backend cosmos) | `azure.cosmos` → abstracted | 2 files |
|
| 10 | Python clients (desktop cosmos, backend cosmos) | `azure.cosmos` → abstracted | 2 files |
|
||||||
|
|
||||||
### 4.8 Handling Complex Queries
|
### 4.8 Handling Complex Queries
|
||||||
|
|
||||||
Some repository files use advanced Cosmos SQL features. Here's how the interface handles them:
|
Some repository files use advanced Cosmos SQL features. Here's how the interface handles them:
|
||||||
|
|
||||||
| Cosmos SQL Pattern | Datastore Interface Equivalent |
|
| Cosmos SQL Pattern | Datastore Interface Equivalent |
|
||||||
|--------------------|-------------------------------|
|
| ---------------------------------------------------- | -------------------------------------------------------- |
|
||||||
| `SELECT * FROM c WHERE c.x = @v` | `findMany({ filter: { x: v } })` |
|
| `SELECT * FROM c WHERE c.x = @v` | `findMany({ filter: { x: v } })` |
|
||||||
| `SELECT * FROM c WHERE c.x = @v AND c.y = @w` | `findMany({ filter: { x: v, y: w } })` |
|
| `SELECT * FROM c WHERE c.x = @v AND c.y = @w` | `findMany({ filter: { x: v, y: w } })` |
|
||||||
| `ORDER BY c.x ASC` | `findMany({ sort: { x: 1 } })` |
|
| `ORDER BY c.x ASC` | `findMany({ sort: { x: 1 } })` |
|
||||||
| `ORDER BY c.x DESC` | `findMany({ sort: { x: -1 } })` |
|
| `ORDER BY c.x DESC` | `findMany({ sort: { x: -1 } })` |
|
||||||
| `OFFSET @o LIMIT @l` | `findMany({ offset: o, limit: l })` |
|
| `OFFSET @o LIMIT @l` | `findMany({ offset: o, limit: l })` |
|
||||||
| `SELECT VALUE COUNT(1) FROM c WHERE ...` | `count({ filter })` |
|
| `SELECT VALUE COUNT(1) FROM c WHERE ...` | `count({ filter })` |
|
||||||
| `SELECT c.plan, COUNT(1) AS cnt ... GROUP BY c.plan` | `aggregate({ filter, groupBy: ['plan'], count: 'cnt' })` |
|
| `SELECT c.plan, COUNT(1) AS cnt ... GROUP BY c.plan` | `aggregate({ filter, groupBy: ['plan'], count: 'cnt' })` |
|
||||||
| `NOT IS_DEFINED(c.usedAt)` | `findMany({ filter: { usedAt: { $exists: false } } })` |
|
| `NOT IS_DEFINED(c.usedAt)` | `findMany({ filter: { usedAt: { $exists: false } } })` |
|
||||||
| `c.x >= @v` | `findMany({ filter: { x: { $gte: v } } })` |
|
| `c.x >= @v` | `findMany({ filter: { x: { $gte: v } } })` |
|
||||||
| `ARRAY_CONTAINS(c.tags, @tag)` | `findMany({ filter: { tags: { $contains: tag } } })` |
|
| `ARRAY_CONTAINS(c.tags, @tag)` | `findMany({ filter: { tags: { $contains: tag } } })` |
|
||||||
| `container().item(id, pk).read()` | `findById(id, pk)` |
|
| `container().item(id, pk).read()` | `findById(id, pk)` |
|
||||||
| `container().items.create(doc)` | `create(doc)` |
|
| `container().items.create(doc)` | `create(doc)` |
|
||||||
| `container().item(id, pk).replace(doc)` | `replace(id, pk, doc)` |
|
| `container().item(id, pk).replace(doc)` | `replace(id, pk, doc)` |
|
||||||
| `container().items.upsert(doc)` | `upsert(doc)` |
|
| `container().items.upsert(doc)` | `upsert(doc)` |
|
||||||
| `container().item(id, pk).delete()` | `delete(id, pk)` |
|
| `container().item(id, pk).delete()` | `delete(id, pk)` |
|
||||||
|
|
||||||
For the filter operators, use a simple operator convention:
|
For the filter operators, use a simple operator convention:
|
||||||
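How an in-memory adapter might evaluate these operators can be sketched as below. This is a hedged illustration, not the package's actual API: only `$exists`, `$gte`, and `$contains` appear in the mapping table, and `matchesFilter`, `FilterOperators`, and the exact `FilterMap` shape are assumptions made for the example.

```typescript
// Assumed shapes: plain values mean equality; objects carry operators.
type FilterOperators = { $exists?: boolean; $gte?: number | string; $contains?: unknown };
type FilterValue = string | number | boolean | null | FilterOperators;
type FilterMap = Record<string, FilterValue>;
type Doc = Record<string, unknown>;

function isOperators(value: FilterValue): value is FilterOperators {
  return typeof value === 'object' && value !== null;
}

// Returns true when the document satisfies every condition (implicit AND).
function matchesFilter(doc: Doc, filter: FilterMap): boolean {
  for (const field of Object.keys(filter)) {
    const condition = filter[field];
    const actual = doc[field];
    if (!isOperators(condition)) {
      if (actual !== condition) return false; // plain equality
      continue;
    }
    // $exists mirrors IS_DEFINED: a field is "defined" unless it is undefined.
    if (condition.$exists !== undefined && (actual !== undefined) !== condition.$exists) {
      return false;
    }
    // $gte compares numbers numerically and strings lexicographically.
    const bound = condition.$gte;
    if (typeof bound === 'number') {
      if (typeof actual !== 'number' || actual < bound) return false;
    } else if (typeof bound === 'string') {
      if (typeof actual !== 'string' || actual < bound) return false;
    }
    // $contains mirrors ARRAY_CONTAINS for a single element.
    if (condition.$contains !== undefined) {
      if (!Array.isArray(actual) || actual.indexOf(condition.$contains) === -1) return false;
    }
  }
  return true;
}
```

A memory provider's `findMany` could then be little more than `docs.filter((d) => matchesFilter(d, filter))`, followed by sorting and paging. Lexicographic string comparison only orders timestamps correctly if they are stored in a sortable format such as ISO 8601.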
|
|
||||||
@ -645,6 +649,7 @@ export interface ChatCompletionResponse {
|
|||||||
MindLyst `web/src/lib/llm.ts` already auto-detects Azure vs OpenAI based on env vars. This pattern should be promoted to a shared package.
|
MindLyst `web/src/lib/llm.ts` already auto-detects Azure vs OpenAI based on env vars. This pattern should be promoted to a shared package.
|
||||||
|
|
||||||
**Provider implementations:**
|
**Provider implementations:**
|
||||||
|
|
||||||
- `AzureOpenAIProvider` — uses `api-key` header + deployment-scoped URL
|
- `AzureOpenAIProvider` — uses `api-key` header + deployment-scoped URL
|
||||||
- `OpenAIProvider` — uses `Authorization: Bearer` header + model param
|
- `OpenAIProvider` — uses `Authorization: Bearer` header + model param
|
||||||
- `GeminiProvider` — uses Google Generative AI SDK (future)
|
- `GeminiProvider` — uses Google Generative AI SDK (future)
|
||||||
@ -676,6 +681,7 @@ The `openai` Python SDK already has a common interface between `OpenAI` and `Azu
|
|||||||
### 7.1 Key Insight: Already 90% Done
|
### 7.1 Key Insight: Already 90% Done
|
||||||
|
|
||||||
The current `resolveKeyVaultSecrets()` already:
|
The current `resolveKeyVaultSecrets()` already:
|
||||||
|
|
||||||
- Skips if `AZURE_KEYVAULT_URL` is not set
|
- Skips if `AZURE_KEYVAULT_URL` is not set
|
||||||
- Falls back to env vars for each secret
|
- Falls back to env vars for each secret
|
||||||
- Logs warnings but doesn't throw
|
- Logs warnings but doesn't throw
|
||||||
@ -691,19 +697,19 @@ export interface SecretsProvider {
|
|||||||
|
|
||||||
export async function resolveSecrets(
|
export async function resolveSecrets(
|
||||||
secrets: SecretMapping[],
|
secrets: SecretMapping[],
|
||||||
opts?: { provider?: string },
|
opts?: { provider?: string }
|
||||||
): Promise<void> {
|
): Promise<void> {
|
||||||
const provider = opts?.provider || process.env.SECRETS_PROVIDER || 'env';
|
const provider = opts?.provider || process.env.SECRETS_PROVIDER || 'env';
|
||||||
|
|
||||||
switch (provider) {
|
switch (provider) {
|
||||||
case 'azure-keyvault':
|
case 'azure-keyvault':
|
||||||
return resolveFromAzureKeyVault(secrets); // existing code
|
return resolveFromAzureKeyVault(secrets); // existing code
|
||||||
case 'aws-secrets-manager':
|
case 'aws-secrets-manager':
|
||||||
return resolveFromAWSSecretsManager(secrets); // future
|
return resolveFromAWSSecretsManager(secrets); // future
|
||||||
case 'gcp-secret-manager':
|
case 'gcp-secret-manager':
|
||||||
return resolveFromGCPSecretManager(secrets); // future
|
return resolveFromGCPSecretManager(secrets); // future
|
||||||
case 'doppler':
|
case 'doppler':
|
||||||
return resolveFromDoppler(secrets); // future
|
return resolveFromDoppler(secrets); // future
|
||||||
case 'env':
|
case 'env':
|
||||||
default:
|
default:
|
||||||
return; // All secrets already in env — nothing to resolve
|
return; // All secrets already in env — nothing to resolve
|
||||||
@ -720,14 +726,14 @@ The current env vars have Azure-specific names. Add **generic aliases** that fal
|
|||||||
|
|
||||||
export const ENV_ALIASES: Record<string, string[]> = {
|
export const ENV_ALIASES: Record<string, string[]> = {
|
||||||
// Generic name → fallback names (checked in order)
|
// Generic name → fallback names (checked in order)
|
||||||
'BLOB_CONNECTION_STRING': ['AZURE_BLOB_CONNECTION_STRING'],
|
BLOB_CONNECTION_STRING: ['AZURE_BLOB_CONNECTION_STRING'],
|
||||||
'BLOB_ACCOUNT_NAME': ['AZURE_BLOB_ACCOUNT_NAME'],
|
BLOB_ACCOUNT_NAME: ['AZURE_BLOB_ACCOUNT_NAME'],
|
||||||
'BLOB_ACCOUNT_KEY': ['AZURE_BLOB_ACCOUNT_KEY'],
|
BLOB_ACCOUNT_KEY: ['AZURE_BLOB_ACCOUNT_KEY'],
|
||||||
'SPEECH_KEY': ['AZURE_SPEECH_KEY'],
|
SPEECH_KEY: ['AZURE_SPEECH_KEY'],
|
||||||
'SPEECH_REGION': ['AZURE_SPEECH_REGION'],
|
SPEECH_REGION: ['AZURE_SPEECH_REGION'],
|
||||||
'LLM_API_KEY': ['AZURE_OPENAI_KEY', 'OPENAI_API_KEY'],
|
LLM_API_KEY: ['AZURE_OPENAI_KEY', 'OPENAI_API_KEY'],
|
||||||
'LLM_ENDPOINT': ['AZURE_OPENAI_ENDPOINT', 'OPENAI_BASE_URL'],
|
LLM_ENDPOINT: ['AZURE_OPENAI_ENDPOINT', 'OPENAI_BASE_URL'],
|
||||||
'LLM_MODEL': ['AZURE_OPENAI_DEPLOYMENT', 'OPENAI_MODEL'],
|
LLM_MODEL: ['AZURE_OPENAI_DEPLOYMENT', 'OPENAI_MODEL'],
|
||||||
};
|
};
|
||||||
|
|
||||||
export function getEnv(name: string): string | undefined {
|
export function getEnv(name: string): string | undefined {
|
||||||
@ -829,6 +835,7 @@ protocol SpeechTranscriber {
|
|||||||
### 8.4 Note on Complexity
|
### 8.4 Note on Complexity
|
||||||
|
|
||||||
Speech is the hardest abstraction because:
|
Speech is the hardest abstraction because:
|
||||||
|
|
||||||
- Azure Speech SDK has a unique push-stream architecture
|
- Azure Speech SDK has a unique push-stream architecture
|
||||||
- Google Cloud Speech uses gRPC streaming
|
- Google Cloud Speech uses gRPC streaming
|
||||||
- Deepgram uses WebSockets
|
- Deepgram uses WebSockets
|
||||||
@ -871,11 +878,13 @@ Implementations: `AzureNotificationHubProvider`, `FirebaseProvider` (future), `E
|
|||||||
**Effort:** 0.5 days (mostly done already)
|
**Effort:** 0.5 days (mostly done already)
|
||||||
|
|
||||||
The ecosystem already has cloud-agnostic monitoring:
|
The ecosystem already has cloud-agnostic monitoring:
|
||||||
|
|
||||||
- **Custom telemetry** via `@bytelyst/telemetry-client` → platform-service → Cosmos
|
- **Custom telemetry** via `@bytelyst/telemetry-client` → platform-service → Cosmos
|
||||||
- **Loki + Grafana** in `services/monitoring/`
|
- **Loki + Grafana** in `services/monitoring/`
|
||||||
- **Health checks** via `/health` endpoints on all services
|
- **Health checks** via `/health` endpoints on all services
|
||||||
|
|
||||||
**Remaining work:**
|
**Remaining work:**
|
||||||
|
|
||||||
- Remove `opencensus-ext-azure` from Python requirements (optional, only used for App Insights)
|
- Remove `opencensus-ext-azure` from Python requirements (optional, only used for App Insights)
|
||||||
- Ensure all structured logging uses `pino` (TS) or `structlog` (Python) — no Azure-specific loggers
|
- Ensure all structured logging uses `pino` (TS) or `structlog` (Python) — no Azure-specific loggers
|
||||||
|
|
||||||
@ -887,43 +896,43 @@ Once all sprints are complete, here's how much work each cloud migration scenari
|
|||||||
|
|
||||||
### Scenario: Switch DB from Cosmos to MongoDB Atlas
|
### Scenario: Switch DB from Cosmos to MongoDB Atlas
|
||||||
|
|
||||||
| Step | Effort | Description |
|
| Step | Effort | Description |
|
||||||
|------|--------|-------------|
|
| ----------------------------------------- | ------------- | -------------------------------------------------- |
|
||||||
| Implement `MongoDatastoreProvider` | 1 day | ~200 lines — translate FilterMap to MongoDB find() |
|
| Implement `MongoDatastoreProvider` | 1 day | ~200 lines — translate FilterMap to MongoDB find() |
|
||||||
| Set `DB_PROVIDER=mongo` + `MONGO_URI=...` | 5 minutes | Config change |
|
| Set `DB_PROVIDER=mongo` + `MONGO_URI=...` | 5 minutes | Config change |
|
||||||
| Run data migration script | 2–4 hours | Export Cosmos JSON → import to MongoDB |
|
| Run data migration script | 2–4 hours | Export Cosmos JSON → import to MongoDB |
|
||||||
| Run full test suite | 30 minutes | Verify all 1,029+ tests pass |
|
| Run full test suite | 30 minutes | Verify all 1,029+ tests pass |
|
||||||
| **Total** | **~1.5 days** | vs 3–5 weeks without abstraction |
|
| **Total** | **~1.5 days** | vs 3–5 weeks without abstraction |
|
||||||
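The "translate FilterMap to MongoDB find()" step can be sketched as a pure function that builds a MongoDB query document. This is a sketch under assumptions: `toMongoFilter` and the `FilterMap` shape are hypothetical, and only the operators shown in the mapping table (`$exists`, `$gte`, `$contains`) are covered. No MongoDB driver is needed to build the query object itself.

```typescript
// Assumed shapes, matching the operator convention used in this plan.
type FilterOperators = { $exists?: boolean; $gte?: number | string; $contains?: unknown };
type FilterValue = string | number | boolean | null | FilterOperators;
type FilterMap = Record<string, FilterValue>;
type MongoFilter = Record<string, unknown>;

// Hypothetical translator from the generic filter to a MongoDB query document.
function toMongoFilter(filter: FilterMap): MongoFilter {
  const query: MongoFilter = {};
  for (const field of Object.keys(filter)) {
    const condition = filter[field];
    if (typeof condition !== 'object' || condition === null) {
      query[field] = condition; // plain equality
      continue;
    }
    if (condition.$contains !== undefined) {
      // MongoDB equality on an array field matches when any element equals
      // the value. It also matches a scalar field with that value, and this
      // sketch ignores other operators combined with $contains.
      query[field] = condition.$contains;
      continue;
    }
    // $exists and $gte share their names with MongoDB query operators.
    const ops: Record<string, unknown> = {};
    if (condition.$exists !== undefined) ops['$exists'] = condition.$exists;
    if (condition.$gte !== undefined) ops['$gte'] = condition.$gte;
    query[field] = ops;
  }
  return query;
}
```

The result would be passed to the driver's `find()`; sort, offset, and limit translate separately. Partition keys, which Cosmos requires for point reads, have no direct MongoDB counterpart and are not handled here.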
|
|
||||||
### Scenario: Switch Storage from Azure Blob to S3
|
### Scenario: Switch Storage from Azure Blob to S3
|
||||||
|
|
||||||
| Step | Effort | Description |
|
| Step | Effort | Description |
|
||||||
|------|--------|-------------|
|
| -------------------------------------------- | ------------- | ------------------------------- |
|
||||||
| Implement `S3StorageProvider` | 0.5 day | ~100 lines |
|
| Implement `S3StorageProvider` | 0.5 day | ~100 lines |
|
||||||
| Set `STORAGE_PROVIDER=s3` + `AWS_*` env vars | 5 minutes | Config change |
|
| Set `STORAGE_PROVIDER=s3` + `AWS_*` env vars | 5 minutes | Config change |
|
||||||
| Migrate blobs | 1–2 hours | azcopy or rclone |
|
| Migrate blobs | 1–2 hours | azcopy or rclone |
|
||||||
| **Total** | **~0.5 days** | vs 2–3 days without abstraction |
|
| **Total** | **~0.5 days** | vs 2–3 days without abstraction |
|
||||||
|
|
||||||
### Scenario: Switch LLM from Azure OpenAI to OpenAI Direct
|
### Scenario: Switch LLM from Azure OpenAI to OpenAI Direct
|
||||||
|
|
||||||
| Step | Effort | Description |
|
| Step | Effort | Description |
|
||||||
|------|--------|-------------|
|
| ------------------------------------------------ | -------------- | ----------------------- |
|
||||||
| Set `LLM_PROVIDER=openai` + `OPENAI_API_KEY=...` | 5 minutes | Config change only |
|
| Set `LLM_PROVIDER=openai` + `OPENAI_API_KEY=...` | 5 minutes | Config change only |
|
||||||
| Remove `AZURE_OPENAI_*` env vars | 5 minutes | Cleanup |
|
| Remove `AZURE_OPENAI_*` env vars | 5 minutes | Cleanup |
|
||||||
| **Total** | **10 minutes** | Already near-zero today |
|
| **Total** | **10 minutes** | Already near-zero today |
|
||||||
|
|
||||||
### Scenario: Full Cloud Migration (Azure → AWS)

| Step | Effort | Description |
| --- | --- | --- |
| Implement MongoDB/DynamoDB provider | 1–2 days | |
| Implement S3 storage provider | 0.5 days | |
| Implement AWS Secrets Manager provider | 0.5 days | |
| Switch LLM to OpenAI direct | 10 minutes | |
| Implement Google STT or AWS Transcribe | 2–3 days | Speech is still the hardest |
| Implement SNS push provider | 0.5 days | |
| Data migration + testing | 2–3 days | |
| **Total** | **~7–10 days** | vs 4–8 weeks without abstraction |

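As a quick check on the total, the per-step ranges can be summed directly. The snippet below is only a worked calculation over the figures in the table; the 8-hour working day used to convert the 10-minute step is an assumption.

```typescript
interface EffortRange {
  step: string;
  low: number; // working days
  high: number; // working days
}

const MINUTES_PER_DAY = 8 * 60; // assumption: 8-hour working day

const steps: EffortRange[] = [
  { step: 'MongoDB/DynamoDB provider', low: 1, high: 2 },
  { step: 'S3 storage provider', low: 0.5, high: 0.5 },
  { step: 'AWS Secrets Manager provider', low: 0.5, high: 0.5 },
  { step: 'Switch LLM to OpenAI direct', low: 10 / MINUTES_PER_DAY, high: 10 / MINUTES_PER_DAY },
  { step: 'Google STT or AWS Transcribe', low: 2, high: 3 },
  { step: 'SNS push provider', low: 0.5, high: 0.5 },
  { step: 'Data migration + testing', low: 2, high: 3 },
];

// Adds up the optimistic and pessimistic ends of every range.
function totalEffort(ranges: EffortRange[]): { low: number; high: number } {
  return ranges.reduce(
    (sum, r) => ({ low: sum.low + r.low, high: sum.high + r.high }),
    { low: 0, high: 0 },
  );
}

const total = totalEffort(steps);
console.log(`${total.low.toFixed(1)} to ${total.high.toFixed(1)} days`);
// prints "6.5 to 9.5 days"
```

The sum comes to roughly 6.5 to 9.5 days, which the table rounds to ~7–10.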
---

@ -938,7 +947,7 @@ Every repository test should work against **any** provider. The test setup picks

```typescript
import { setTestProvider } from '@bytelyst/datastore/testing';

beforeAll(() => {
  setTestProvider('memory'); // Fast, no network, deterministic
});
```

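One way to make "every test works against any provider" concrete is a shared contract check that takes a provider instance and exercises only the interface. The sketch below assumes a deliberately small, hypothetical `DataStore` interface and a `MemoryDataStore` implementation; the real `@bytelyst/datastore` API may look different.

```typescript
// Hypothetical minimal contract, for illustration only.
interface Doc {
  id: string;
  [key: string]: unknown;
}

interface DataStore {
  upsert(doc: Doc): Promise<void>;
  getById(id: string): Promise<Doc | undefined>;
  remove(id: string): Promise<void>;
}

class MemoryDataStore implements DataStore {
  private readonly docs = new Map<string, Doc>();

  async upsert(doc: Doc): Promise<void> {
    // Store a copy so later mutation by the caller cannot change stored state.
    this.docs.set(doc.id, { ...doc });
  }

  async getById(id: string): Promise<Doc | undefined> {
    const doc = this.docs.get(id);
    return doc ? { ...doc } : undefined;
  }

  async remove(id: string): Promise<void> {
    this.docs.delete(id);
  }
}

// Provider-agnostic contract: returns the names of failed expectations,
// so the same function can be run against memory, Cosmos, or any adapter.
async function checkDataStoreContract(store: DataStore): Promise<string[]> {
  const failures: string[] = [];

  await store.upsert({ id: 'a1', title: 'first' });
  const created = await store.getById('a1');
  if (created?.['title'] !== 'first') failures.push('upsert then getById');

  await store.upsert({ id: 'a1', title: 'second' });
  const updated = await store.getById('a1');
  if (updated?.['title'] !== 'second') failures.push('upsert overwrites');

  await store.remove('a1');
  if ((await store.getById('a1')) !== undefined) failures.push('remove deletes');

  return failures;
}
```

In a test suite, the same check would be invoked once per implemented provider, with the memory provider covering fast unit runs and a real backend covering integration runs.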
@ -957,6 +966,7 @@ __tests__/

### 12.3 Migration Verification Checklist

For each sprint, before merging:

1. All existing tests pass (no regressions)
2. New interface tests pass with all implemented providers
3. Manual smoke test against Azure (dev environment)
@ -1046,14 +1056,14 @@ AZURE_SPEECH_REGION=eastus

## 14. Risk Mitigation

| Risk | Mitigation |
| --- | --- |
| **FilterMap can't express complex Cosmos SQL** | Add `rawQuery()` escape hatch for edge cases. Track usage: if >5% of queries need it, expand FilterMap operators |
| **Performance regression from abstraction layer** | Benchmark critical queries before/after. The abstraction adds one function call, which is negligible |
| **Team unfamiliar with new patterns** | Each sprint includes updating AGENTS.md with new conventions. Old pattern (direct Cosmos) still works during migration |
| **In-memory provider behaves differently** | Integration test suite runs against real Cosmos in CI. Memory provider is for unit tests only |
| **Stale data during DB migration** | Use dual-write pattern: write to both old and new provider during transition. Read from new, fall back to old |
| **Sprint 1 takes too long** | The 38 repository files can be migrated incrementally; even 5 files at a time is progress. Old and new patterns coexist |

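The dual-write mitigation in the table can be sketched as a thin wrapper around two providers. This is an illustrative outline under simplifying assumptions: `KeyValueStore`, `MapStore`, and `DualWriteStore` are hypothetical names, and error handling for a write that succeeds on only one side is left out.

```typescript
interface KeyValueStore<T> {
  get(id: string): Promise<T | undefined>;
  put(id: string, value: T): Promise<void>;
}

// Simple in-memory store standing in for the old and new providers.
class MapStore<T> implements KeyValueStore<T> {
  private readonly items = new Map<string, T>();

  async get(id: string): Promise<T | undefined> {
    return this.items.get(id);
  }

  async put(id: string, value: T): Promise<void> {
    this.items.set(id, value);
  }
}

class DualWriteStore<T> implements KeyValueStore<T> {
  private readonly oldStore: KeyValueStore<T>;
  private readonly newStore: KeyValueStore<T>;

  constructor(oldStore: KeyValueStore<T>, newStore: KeyValueStore<T>) {
    this.oldStore = oldStore;
    this.newStore = newStore;
  }

  // Write to both providers so neither falls behind during the transition.
  async put(id: string, value: T): Promise<void> {
    await Promise.all([
      this.oldStore.put(id, value),
      this.newStore.put(id, value),
    ]);
  }

  // Read from the new provider first; fall back to the old one for
  // records that have not been migrated yet.
  async get(id: string): Promise<T | undefined> {
    const fresh = await this.newStore.get(id);
    if (fresh !== undefined) return fresh;
    return this.oldStore.get(id);
  }
}
```

`Promise.all` rejects if either write fails, so a production version would need to decide which side is authoritative and how to reconcile partial writes.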
---

@ -1116,32 +1126,81 @@ packages/llm/

```typescript
// Exact match
{ field: value }

// Comparison
{ field: { $gt: value } }   // >
{ field: { $gte: value } }  // >=
{ field: { $lt: value } }   // <
{ field: { $lte: value } }  // <=
{ field: { $ne: value } }   // !=

// Existence
{ field: { $exists: true } }   // IS_DEFINED(c.field)
{ field: { $exists: false } }  // NOT IS_DEFINED(c.field)

// String
{ field: { $startsWith: 'prefix' } }
{ field: { $contains: 'substr' } }

// Array
{ field: { $contains: value } }   // ARRAY_CONTAINS
{ field: { $in: [v1, v2, v3] } }  // IN operator

// Logical (for complex queries)
{ $or: [{ field1: v1 }, { field2: v2 }] }
```

**Cosmos adapter** translates each operator to SQL:

- `{ $gte: v }` → `c.field >= @pN`
- `{ $exists: false }` → `NOT IS_DEFINED(c.field)`
- `{ $contains: v }` on array → `ARRAY_CONTAINS(c.field, @pN)`

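The translation rules above can be illustrated with a small function that turns a filter into a parameterized `WHERE` clause. This is a hedged sketch covering only exact match, the comparison operators, and `$exists`; the type names and `toCosmosWhere` are hypothetical, and the string, array, and logical operators are intentionally omitted.

```typescript
type Scalar = string | number | boolean | null;
type Condition = Scalar | { [op: string]: Scalar };
type FilterMap = { [field: string]: Condition };

interface SqlParameter {
  name: string;
  value: Scalar;
}

interface SqlQuery {
  where: string;
  parameters: SqlParameter[];
}

const COMPARISON_SQL: Record<string, string> = {
  $gt: '>',
  $gte: '>=',
  $lt: '<',
  $lte: '<=',
  $ne: '!=',
};

// Field names are interpolated directly, so they must come from trusted
// code, never from user input. Values always go through parameters.
function toCosmosWhere(filter: FilterMap): SqlQuery {
  const clauses: string[] = [];
  const parameters: SqlParameter[] = [];

  const bind = (value: Scalar): string => {
    const name = `@p${parameters.length}`;
    parameters.push({ name, value });
    return name;
  };

  for (const [field, condition] of Object.entries(filter)) {
    const path = `c.${field}`;
    if (condition === null || typeof condition !== 'object') {
      clauses.push(`${path} = ${bind(condition)}`); // exact match
      continue;
    }
    for (const [op, operand] of Object.entries(condition)) {
      const sqlOp = COMPARISON_SQL[op];
      if (sqlOp !== undefined) {
        clauses.push(`${path} ${sqlOp} ${bind(operand)}`);
      } else if (op === '$exists') {
        clauses.push(operand ? `IS_DEFINED(${path})` : `NOT IS_DEFINED(${path})`);
      } else {
        throw new Error(`Unsupported operator in this sketch: ${op}`);
      }
    }
  }

  return { where: clauses.join(' AND '), parameters };
}
```

For example, `toCosmosWhere({ status: 'active', age: { $gte: 18 }, deletedAt: { $exists: false } })` yields the clause `c.status = @p0 AND c.age >= @p1 AND NOT IS_DEFINED(c.deletedAt)` with two bound parameters.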
@ -1155,19 +1214,19 @@ packages/llm/

## Summary

| Sprint | What | Days | After This Sprint... |
| --- | --- | --- | --- |
| 1 | Database abstraction | 5–7 | DB swap = implement 1 adapter (~200 LOC) + config change |
| 2 | Storage abstraction | 2 | Blob swap = implement 1 adapter (~100 LOC) + config change |
| 3 | LLM abstraction | 2 | LLM swap = config change only (10 minutes) |
| 4 | Secrets abstraction | 1 | Secrets swap = config change only |
| 5 | Speech abstraction | 3–4 | Speech swap = implement 1 adapter (~300 LOC) |
| 6 | Push abstraction | 1 | Push swap = implement 1 adapter (~50 LOC) |
| 7 | Monitoring cleanup | 0.5 | Already cloud-agnostic |
| **Total** | | **~15–17 days** | **Full cloud migration = ~7–10 days instead of 4–8 weeks** |

The key insight: **~80% of migration effort is in Sprint 1 (database)**. If you only do one sprint, do that one. Everything else is comparatively easy.

---

_Document generated by automated codebase analysis. Companion to `CLOUD_PROVIDER_MIGRATION_ANALYSIS.md`. Review as the codebase evolves._

@ -10,7 +10,7 @@ You currently have 3 repos checked out side-by-side:

The goal is to **systematically refactor into a “platform repo”** (common libraries + common services) while keeping **product-specific code in product repos**, with a workflow that feels like how high-performing AI companies build: small PRs, strong automation, stable internal interfaces, and “golden paths” for shipping.

Important constraint: we cannot know exactly how OpenAI/Anthropic run their internal engineering, but we _can_ adopt the common patterns used by top-tier product+platform orgs: platform teams, strong CI gates, typed service contracts, SDK generation, trunk-based integration, feature flags, and opinionated templates.

---

@ -283,4 +283,3 @@ Pick one of these patterns:

3. Decide whether LysnrAI FastAPI backend is:
   - product-only (dictation/transcripts), or
   - a transitional legacy backend to be decomposed into platform services.