# Cloud-Agnostic Refactor Roadmap — ByteLyst Ecosystem

> **Author:** AI Analysis (Cascade)
> **Date:** 2026-03-01
> **Companion doc:** [`CLOUD_PROVIDER_MIGRATION_ANALYSIS.md`](./CLOUD_PROVIDER_MIGRATION_ANALYSIS.md)
> **Goal:** Refactor the codebase so it continues to work on Azure today, but switching to any other cloud provider requires **minimum effort** (days, not weeks).

---

## Table of Contents

1. [Philosophy](#1-philosophy)
2. [Current State vs Target State](#2-current-state-vs-target-state)
3. [Sprint Plan Overview](#3-sprint-plan-overview)
4. [Sprint 1: Database Abstraction Layer](#4-sprint-1-database-abstraction-layer)
5. [Sprint 2: Storage Abstraction Layer](#5-sprint-2-storage-abstraction-layer)
6. [Sprint 3: LLM Provider Abstraction](#6-sprint-3-llm-provider-abstraction)
7. [Sprint 4: Secrets Manager Abstraction](#7-sprint-4-secrets-manager-abstraction)
8. [Sprint 5: Speech Provider Abstraction](#8-sprint-5-speech-provider-abstraction)
9. [Sprint 6: Push Notification Abstraction](#9-sprint-6-push-notification-abstraction)
10. [Sprint 7: Monitoring & Telemetry Abstraction](#10-sprint-7-monitoring--telemetry-abstraction)
11. [Migration Effort After Refactor](#11-migration-effort-after-refactor)
12. [Testing Strategy](#12-testing-strategy)
13. [Env Var Naming Convention](#13-env-var-naming-convention)
14. [Risk Mitigation](#14-risk-mitigation)
15. [Appendix: Interface Specifications](#appendix-interface-specifications)

---

## 1. Philosophy

### Core Principle: Provider-Agnostic Interfaces, Provider-Specific Implementations

```
Application Code (routes, business logic)
        │
        ▼
  @bytelyst/* interfaces          ◄── Cloud-agnostic contracts
        │
        ▼
  Provider implementations        ◄── Azure today, swap tomorrow
        ├── cosmos-provider/      (Azure Cosmos DB)
        ├── mongo-provider/       (MongoDB Atlas — future)
        ├── s3-provider/          (AWS S3 — future)
        └── ...
```

### Design Rules

1. **Application code NEVER imports cloud SDKs** — only `@bytelyst/*` interfaces
2. **Provider chosen at startup via env var** — `DB_PROVIDER=cosmos`, `STORAGE_PROVIDER=azure`, etc.
3. **All interfaces have an in-memory mock** — for testing without any cloud dependency
4. **Zero breaking changes** — every sprint keeps all existing tests passing
5. **Incremental adoption** — modules migrate one at a time, old and new patterns coexist

### What This Is NOT

- This is **not** a migration to another cloud — Azure continues to be the production provider
- This is **not** a rewrite — it's a series of refactors that insert interfaces between app code and cloud SDKs
- This is **not** over-engineering — each interface is thin (30–60 lines) and directly maps to patterns already in the codebase

---

## 2. Current State vs Target State

### Current: Direct Azure SDK Usage

```
                     38 repository.ts files
                    ┌──────────────────────┐
routes.ts ────────► │ container()          │
                    │   .items.query(SQL)  │ ◄── @azure/cosmos types leak everywhere
                    │   .items.create(doc) │
                    │   .item(id,pk).read()│
                    └──────────────────────┘
                              │
                              ▼
                    @bytelyst/cosmos (client.ts)
                              │
                              ▼
                    @azure/cosmos SDK
```

**Problems:**

- 38 platform-service repository files write raw Cosmos SQL queries
- 6 additional repository files in dashboards + MindLyst web
- Blob, Speech, OpenAI all have direct Azure SDK imports
- Switching DB means rewriting 44+ files

### Target: Provider-Agnostic Interfaces

```
                     38 repository.ts files
                    ┌──────────────────────────┐
routes.ts ────────► │ collection.findMany({    │
                    │   filter: {productId},   │ ◄── Cloud-agnostic API
                    │   sort: {createdAt: -1}, │
                    │   limit: 20,             │
                    │ })                       │
                    └──────────────────────────┘
                              │
                              ▼
                    @bytelyst/datastore (interface)
                              │
                    ┌─────────┼─────────┐
                    ▼         ▼         ▼
            CosmosAdapter MongoAdapter MemoryAdapter
               (Azure)     (MongoDB)    (Testing)
                    │
                    ▼
            @azure/cosmos SDK
```

**Benefits:**

- Repositories use a generic query API — no SQL strings, no Azure types
- Switching provider = implement a new adapter (~200 lines) + change env var
- In-memory adapter makes tests fast and cloud-free
- Azure continues to work exactly as before

---
## 3. Sprint Plan Overview

| Sprint | Package / Scope | Effort | Files Changed | Risk |
|--------|----------------|--------|---------------|------|
| **1** | `@bytelyst/datastore` — DB abstraction | 5–7 days | 44 repository files + 1 new package | Medium |
| **2** | `@bytelyst/storage` — Blob/Object abstraction | 2 days | 3 files + 1 new package | Low |
| **3** | `@bytelyst/llm` — LLM provider abstraction | 2 days | 4 files + 1 new package | Low |
| **4** | `@bytelyst/secrets` — Secrets manager abstraction | 1 day | 2 files (refactor existing) | Very Low |
| **5** | `@bytelyst/speech` — Speech STT abstraction | 3–4 days | 3 files + 1 new package | Medium |
| **6** | `@bytelyst/push` — Push notification abstraction | 1 day | 1 file + 1 new package | Very Low |
| **7** | Monitoring/Telemetry cleanup | 0.5 days | Already done (custom telemetry) | None |
| **Total** | | **~15–17 days** | ~55 files | |

### Priority Order

```
Sprint 1 (DB) ──► Sprint 2 (Storage) ──► Sprint 3 (LLM) ──► Sprint 4 (Secrets)
   ▲ HIGHEST ROI       EASY                  EASY               TRIVIAL
   │
   └── 80% of migration effort lives here. Do this first.

Sprint 5 (Speech) ──► Sprint 6 (Push) ──► Sprint 7 (Monitoring)
   MEDIUM               LOW PRIORITY        ALREADY DONE
```

---

## 4. Sprint 1: Database Abstraction Layer

**Package:** `@bytelyst/datastore`
**Effort:** 5–7 days
**This is the most important sprint — it eliminates 80% of cloud lock-in.**

### 4.1 Interface Design

```typescript
// packages/datastore/src/types.ts

/** A cloud-agnostic document collection (like a Cosmos container or Mongo collection). */
export interface DocumentCollection<T extends BaseDocument = BaseDocument> {
  /** Find a single document by ID + partition key. */
  findById(id: string, partitionKey: string): Promise<T | null>;

  /** Find multiple documents matching a filter. */
  findMany(opts: FindManyOptions): Promise<T[]>;

  /** Find one document matching a filter. */
  findOne(opts: FindOneOptions): Promise<T | null>;

  /** Count documents matching a filter. */
  count(filter: FilterMap): Promise<number>;

  /** Insert a new document. */
  create(doc: T): Promise<T>;

  /** Replace an entire document (full overwrite). */
  replace(id: string, partitionKey: string, doc: T): Promise<T>;

  /** Upsert: create if not exists, replace if exists. */
  upsert(doc: T): Promise<T>;

  /** Delete a document by ID + partition key. */
  delete(id: string, partitionKey: string): Promise<boolean>;

  /** Run an aggregation (COUNT, SUM, GROUP BY). */
  aggregate(opts: AggregateOptions): Promise<Array<Record<string, unknown>>>;
}

export interface BaseDocument {
  id: string;
  [key: string]: unknown;
}

export interface FindManyOptions {
  filter: FilterMap;
  sort?: SortMap;
  limit?: number;
  offset?: number;
  partitionKey?: string;
}

export interface FindOneOptions {
  filter: FilterMap;
  partitionKey?: string;
}

export type FilterMap = Record<string, unknown>;
export type SortMap = Record<string, 1 | -1>; // 1 = ASC, -1 = DESC

export interface AggregateOptions {
  filter: FilterMap;
  groupBy?: string[];
  count?: string; // alias for COUNT(1)
  sum?: string;   // field to SUM
}

/** Factory that creates collections — one per provider. */
export interface DatastoreProvider {
  collection<T extends BaseDocument>(name: string): DocumentCollection<T>;
  initialize?(configs: Record<string, CollectionConfig>): Promise<void>;
  close?(): Promise<void>;
}

export interface CollectionConfig {
  partitionKeyPath: string;
  defaultTtl?: number | null;
}
```

### 4.2 Cosmos Adapter (keeps everything working today)

```typescript
// packages/datastore/src/providers/cosmos.ts
import type { Container } from '@azure/cosmos';
import type { BaseDocument, DocumentCollection, FindManyOptions, FilterMap, /* ... */ } from '../types.js';

export class CosmosCollection<T extends BaseDocument> implements DocumentCollection<T> {
  constructor(private container: Container) {}

  async findById(id: string, partitionKey: string): Promise<T | null> {
    try {
      const { resource } = await this.container.item(id, partitionKey).read<T>();
      return resource ?? null;
    } catch {
      return null;
    }
  }

  async findMany(opts: FindManyOptions): Promise<T[]> {
    const { sql, params } = buildSqlQuery(opts); // ◄── converts FilterMap → Cosmos SQL
    const { resources } = await this.container
      .items.query<T>({ query: sql, parameters: params })
      .fetchAll();
    return resources;
  }

  async create(doc: T): Promise<T> {
    const { resource } = await this.container.items.create(doc);
    return resource as T;
  }

  async replace(id: string, partitionKey: string, doc: T): Promise<T> {
    const { resource } = await this.container.item(id, partitionKey).replace(doc);
    return resource as T;
  }

  async upsert(doc: T): Promise<T> {
    const { resource } = await this.container.items.upsert(doc);
    return resource as T;
  }

  async delete(id: string, partitionKey: string): Promise<boolean> {
    try {
      await this.container.item(id, partitionKey).delete();
      return true;
    } catch {
      return false;
    }
  }

  // ... count(), findOne(), aggregate()
}

/** Convert a FilterMap to Cosmos SQL. */
function buildSqlQuery(opts: FindManyOptions): { sql: string; params: SqlParam[] } {
  // { productId: 'x', userId: 'y' }
  // → "SELECT * FROM c WHERE c.productId = @p0 AND c.userId = @p1 ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20"
  // This is a mechanical translation — no query language exposed to application code.
}
```

### 4.3 In-Memory Adapter (for testing)

```typescript
// packages/datastore/src/providers/memory.ts
export class MemoryCollection<T extends BaseDocument> implements DocumentCollection<T> {
  private docs: Map<string, T> = new Map();

  async findById(id: string): Promise<T | null> {
    return this.docs.get(id) ?? null;
  }

  async findMany(opts: FindManyOptions): Promise<T[]> {
    let results = [...this.docs.values()].filter(doc => matchesFilter(doc, opts.filter));
    if (opts.sort) results = sortDocs(results, opts.sort);
    if (opts.offset) results = results.slice(opts.offset);
    if (opts.limit) results = results.slice(0, opts.limit);
    return results;
  }

  async create(doc: T): Promise<T> {
    this.docs.set(doc.id, doc);
    return doc;
  }

  // ... etc
}
```

### 4.4 MongoDB Adapter (future — ready to implement when needed)

```typescript
// packages/datastore/src/providers/mongo.ts (STUB — implement when migrating)
import type { Collection as MongoCollection } from 'mongodb';
import type { BaseDocument, DocumentCollection, FindManyOptions } from '../types.js';

export class MongoDocumentCollection<T extends BaseDocument> implements DocumentCollection<T> {
  constructor(private collection: MongoCollection) {}

  async findById(id: string): Promise<T | null> {
    return this.collection.findOne({ _id: id } as any) as Promise<T | null>;
  }

  async findMany(opts: FindManyOptions): Promise<T[]> {
    let cursor = this.collection.find(opts.filter);
    if (opts.sort) cursor = cursor.sort(opts.sort);
    if (opts.offset) cursor = cursor.skip(opts.offset);
    if (opts.limit) cursor = cursor.limit(opts.limit);
    return cursor.toArray() as Promise<T[]>;
  }

  // ... etc
}
```

### 4.5 How Repository Files Change

**Before (Cosmos SQL in every file):**

```typescript
// services/platform-service/src/modules/flags/repository.ts
import { getContainer } from '../../lib/cosmos.js';

function container() {
  return getContainer('feature_flags');
}

export async function list(productId: string): Promise<FeatureFlagDoc[]> {
  const { resources } = await container()
    .items.query<FeatureFlagDoc>({
      query: 'SELECT * FROM c WHERE c.productId = @productId ORDER BY c.key ASC',
      parameters: [{ name: '@productId', value: productId }],
    })
    .fetchAll();
  return resources;
}

export async function getByKey(key: string, productId: string): Promise<FeatureFlagDoc | null> {
  const { resources } = await container()
    .items.query<FeatureFlagDoc>({
      query: 'SELECT * FROM c WHERE c.productId = @productId AND c.key = @key',
      parameters: [
        { name: '@productId', value: productId },
        { name: '@key', value: key },
      ],
    })
    .fetchAll();
  return resources[0] ?? null;
}

export async function create(doc: FeatureFlagDoc): Promise<FeatureFlagDoc> {
  const { resource } = await container().items.create(doc);
  return resource as FeatureFlagDoc;
}
```

**After (cloud-agnostic):**

```typescript
// services/platform-service/src/modules/flags/repository.ts
import { getCollection } from '../../lib/datastore.js';
import type { FeatureFlagDoc } from './types.js';

function collection() {
  return getCollection<FeatureFlagDoc>('feature_flags');
}

export async function list(productId: string): Promise<FeatureFlagDoc[]> {
  return collection().findMany({
    filter: { productId },
    sort: { key: 1 },
  });
}

export async function getByKey(key: string, productId: string): Promise<FeatureFlagDoc | null> {
  return collection().findOne({
    filter: { productId, key },
  });
}

export async function create(doc: FeatureFlagDoc): Promise<FeatureFlagDoc> {
  return collection().create(doc);
}
```

**Key observations:**

- No SQL strings
- No `@azure/cosmos` types
- No `.items.query().fetchAll()` chaining
- The `getCollection()` function returns the right provider based on `DB_PROVIDER` env var
- **All existing behavior is preserved** — the Cosmos adapter generates the same SQL under the hood

### 4.6 Service Wiring

```typescript
// services/platform-service/src/lib/datastore.ts (replaces lib/cosmos.ts)
import { createDatastoreProvider } from '@bytelyst/datastore';
import type { DocumentCollection, BaseDocument } from '@bytelyst/datastore';

let _provider: ReturnType<typeof createDatastoreProvider> | null = null;

export function getProvider() {
  if (!_provider) {
    _provider = createDatastoreProvider(); // reads DB_PROVIDER env var
  }
  return _provider;
}

export function getCollection<T extends BaseDocument>(name: string): DocumentCollection<T> {
  return getProvider().collection<T>(name);
}
```

```typescript
// packages/datastore/src/factory.ts
export function createDatastoreProvider(): DatastoreProvider {
  const provider = process.env.DB_PROVIDER || 'cosmos';
  switch (provider) {
    case 'cosmos':
      return new CosmosDatastoreProvider(); // uses existing COSMOS_ENDPOINT, COSMOS_KEY
    case 'mongo':
      return new MongoDatastoreProvider(); // uses MONGO_URI
    case 'memory':
      return new MemoryDatastoreProvider(); // no config needed
    default:
      throw new Error(`Unknown DB_PROVIDER: ${provider}`);
  }
}
```

### 4.7 Migration Plan for 38 Repository Files

Migrate in batches, one module per commit. Each commit:

1. Update the repository file to use `getCollection()` instead of `getContainer()`
2. Replace SQL queries with `findMany()` / `findOne()` / `count()` / `aggregate()`
3. Run the module's test file — must pass
4. Commit: `refactor(module-name): migrate to datastore abstraction`

**Batch order** (simplest first, complex last):

| Batch | Modules | Complexity | Notes |
|-------|---------|-----------|-------|
| 1 | flags, plans, settings, changelog, products | Simple CRUD | 5 files, warmup |
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | 5 files |
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | 5 files |
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | 5 files |
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | 5 files |
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | 4 files |
| 7 | timers, shared-timers, routines, households | Sync logic, batch ops | 4 files |
| 8 | fasting-sessions, fasting-protocols, meal-log, social-fasting, daily-briefs, streaks, push-triggers, impersonation, status, memory, analytics, waitlist | Product-specific + remaining | 12 files |
| 9 | Dashboard cosmos clients (admin-web, MindLyst web) | Direct `@azure/cosmos` | 6 files |
| 10 | Python clients (desktop cosmos, backend cosmos) | `azure.cosmos` → abstracted | 2 files |

### 4.8 Handling Complex Queries

Some repository files use advanced Cosmos SQL features.
Here's how the interface handles them:

| Cosmos SQL Pattern | Datastore Interface Equivalent |
|--------------------|-------------------------------|
| `SELECT * FROM c WHERE c.x = @v` | `findMany({ filter: { x: v } })` |
| `SELECT * FROM c WHERE c.x = @v AND c.y = @w` | `findMany({ filter: { x: v, y: w } })` |
| `ORDER BY c.x ASC` | `findMany({ sort: { x: 1 } })` |
| `ORDER BY c.x DESC` | `findMany({ sort: { x: -1 } })` |
| `OFFSET @o LIMIT @l` | `findMany({ offset: o, limit: l })` |
| `SELECT VALUE COUNT(1) FROM c WHERE ...` | `count({ filter })` |
| `SELECT c.plan, COUNT(1) AS cnt ... GROUP BY c.plan` | `aggregate({ filter, groupBy: ['plan'], count: 'cnt' })` |
| `NOT IS_DEFINED(c.usedAt)` | `findMany({ filter: { usedAt: { $exists: false } } })` |
| `c.x >= @v` | `findMany({ filter: { x: { $gte: v } } })` |
| `ARRAY_CONTAINS(c.tags, @tag)` | `findMany({ filter: { tags: { $contains: tag } } })` |
| `container().item(id, pk).read()` | `findById(id, pk)` |
| `container().items.create(doc)` | `create(doc)` |
| `container().item(id, pk).replace(doc)` | `replace(id, pk, doc)` |
| `container().items.upsert(doc)` | `upsert(doc)` |
| `container().item(id, pk).delete()` | `delete(id, pk)` |

For the filter operators, use a simple operator convention:

```typescript
// Exact match
{ productId: 'lysnrai' }

// Comparison operators
{ syncVersion: { $gte: 5 } }
{ createdAt: { $gte: '2026-01-01', $lt: '2026-02-01' } }

// Exists check (replaces NOT IS_DEFINED)
{ usedAt: { $exists: false } }

// Array contains
{ tags: { $contains: 'important' } }
```

The Cosmos adapter translates these to SQL. The MongoDB adapter passes the operators that MQL shares (`$gte`, `$lt`, `$exists`) through unchanged; `$contains` is not an MQL operator, so the adapter must translate it. The memory adapter does in-memory filtering.

---

## 5. Sprint 2: Storage Abstraction Layer

**Package:** `@bytelyst/storage`
**Effort:** 2 days
**Files changed:** `packages/blob/src/blob.ts`, `src/cloud/blob_client.py`, `services/platform-service/src/modules/blob/`

### 5.1 Interface Design

```typescript
// packages/storage/src/types.ts

export interface StorageProvider {
  /** Get or create a bucket/container. */
  getBucket(name: string): StorageBucket;

  /** Check if storage is configured. */
  isConfigured(): boolean;
}

export interface StorageBucket {
  /** Upload a blob/object. */
  upload(path: string, data: Buffer | ReadableStream, contentType?: string): Promise<void>;

  /** Download a blob/object. */
  download(path: string): Promise<Buffer>;

  /** Delete a blob/object. */
  delete(path: string): Promise<void>;

  /** Check if a blob/object exists. */
  exists(path: string): Promise<boolean>;

  /** List blobs/objects with optional prefix. */
  list(prefix?: string): Promise<StorageObjectInfo[]>;

  /** Generate a time-limited signed URL for direct access. */
  getSignedUrl(path: string, opts: SignedUrlOptions): Promise<string>;
}

export interface SignedUrlOptions {
  permissions: 'read' | 'write' | 'readwrite';
  expiresInMinutes?: number; // default: 60
}

export interface StorageObjectInfo {
  name: string;
  size: number;
  lastModified: Date;
  contentType?: string;
}
```

### 5.2 Provider Implementations

```typescript
// packages/storage/src/providers/azure-blob.ts
// Wraps existing @bytelyst/blob code — nearly 1:1 mapping

// packages/storage/src/providers/s3.ts (future)
// Uses @aws-sdk/client-s3 + @aws-sdk/s3-request-presigner

// packages/storage/src/providers/r2.ts (future)
// S3-compatible — extends S3 provider with Cloudflare-specific config

// packages/storage/src/providers/memory.ts
// In-memory Map for testing
```

### 5.3 Migration

The existing `@bytelyst/blob` package (162 lines) becomes the Azure Blob provider inside `@bytelyst/storage`.
Consumers switch from:

```typescript
// Before
import { generateSasUrl, getContainerClient } from '@bytelyst/blob';
```

```typescript
// After
import { getStorage } from '@bytelyst/storage';

const bucket = getStorage().getBucket('audio');
const url = await bucket.getSignedUrl('user123/recording.wav', { permissions: 'read' });
```

**Python equivalent:** Refactor `src/cloud/blob_client.py` to use a `StorageProvider` ABC with `AzureBlobProvider` implementation.

---

## 6. Sprint 3: LLM Provider Abstraction

**Package:** `@bytelyst/llm`
**Effort:** 2 days
**Files changed:** `src/llm/text_cleaner.py`, `backend/src/clients/openai_client.py`, MindLyst `web/src/lib/llm.ts`, extraction-service config

### 6.1 Interface Design

```typescript
// packages/llm/src/types.ts

export interface LLMProvider {
  chatCompletion(req: ChatCompletionRequest): Promise<ChatCompletionResponse>;
  chatCompletionStream?(req: ChatCompletionRequest): AsyncIterable<string>;
  isConfigured(): boolean;
}

export interface ChatCompletionRequest {
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  temperature?: number;
  maxTokens?: number;
  model?: string; // override default model
}

export interface ChatCompletionResponse {
  content: string;
  usage?: { promptTokens: number; completionTokens: number };
}
```

### 6.2 Key Insight: MindLyst Already Has This Pattern

MindLyst `web/src/lib/llm.ts` already auto-detects Azure vs OpenAI based on env vars. This pattern should be promoted to a shared package.
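
As an illustration of what such auto-detection could look like once shared, here is a minimal sketch. It is not MindLyst's actual `llm.ts`: the helper names (`detectLlmProvider`, `buildChatRequest`) and the `AZURE_OPENAI_API_VERSION` variable are hypothetical, while the other env var names come from this roadmap. It only builds the HTTP request, so it runs without network access.

```typescript
// Hypothetical helpers for illustration; not the contents of MindLyst's llm.ts.
type Env = Record<string, string | undefined>;

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

/** An explicit LLM_PROVIDER wins; otherwise infer the provider from which keys are set. */
function detectLlmProvider(env: Env): 'azure' | 'openai' {
  const explicit = env.LLM_PROVIDER;
  if (explicit === 'azure' || explicit === 'openai') return explicit;
  if (env.AZURE_OPENAI_ENDPOINT && env.AZURE_OPENAI_KEY) return 'azure';
  if (env.OPENAI_API_KEY) return 'openai';
  throw new Error('No LLM provider configured');
}

/** Build the chat-completions request for whichever provider was detected. */
function buildChatRequest(env: Env, messages: ChatMessage[]): HttpRequest {
  if (detectLlmProvider(env) === 'azure') {
    // Azure OpenAI: deployment-scoped URL and an `api-key` header.
    const endpoint = (env.AZURE_OPENAI_ENDPOINT ?? '').replace(/\/+$/, '');
    const deployment = env.AZURE_OPENAI_DEPLOYMENT ?? '';
    // AZURE_OPENAI_API_VERSION is an assumed variable; the default matches the Python snippet in 6.3.
    const apiVersion = env.AZURE_OPENAI_API_VERSION ?? '2024-10-21';
    return {
      url: `${endpoint}/openai/deployments/${deployment}/chat/completions?api-version=${apiVersion}`,
      headers: { 'Content-Type': 'application/json', 'api-key': env.AZURE_OPENAI_KEY ?? '' },
      body: JSON.stringify({ messages }),
    };
  }
  // OpenAI direct: Bearer auth and the model passed in the request body.
  const baseUrl = (env.OPENAI_BASE_URL ?? 'https://api.openai.com/v1').replace(/\/+$/, '');
  return {
    url: `${baseUrl}/chat/completions`,
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${env.OPENAI_API_KEY ?? ''}`,
    },
    body: JSON.stringify({ model: env.OPENAI_MODEL ?? 'gpt-4o-mini', messages }),
  };
}
```

Calling `buildChatRequest(process.env, messages)` and passing the result to `fetch` would be one way to wire this up. Keeping request construction separate from the network call also makes the provider choice easy to unit test.
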
**Provider implementations:** - `AzureOpenAIProvider` — uses `api-key` header + deployment-scoped URL - `OpenAIProvider` — uses `Authorization: Bearer` header + model param - `GeminiProvider` — uses Google Generative AI SDK (future) - `OllamaProvider` — for local development (future) ### 6.3 Python Migration ```python # Before (text_cleaner.py) from openai import AzureOpenAI self._client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key, api_version="2024-10-21") # After from bytelyst.llm import create_llm_client self._client = create_llm_client() # reads LLM_PROVIDER, OPENAI_API_KEY, etc. # Returns OpenAI() or AzureOpenAI() based on config — same API surface ``` The `openai` Python SDK already has a common interface between `OpenAI` and `AzureOpenAI`. The abstraction is just a factory function that picks the right class. --- ## 7. Sprint 4: Secrets Manager Abstraction **Package:** Refactor existing `@bytelyst/config` **Effort:** 1 day **Files changed:** `packages/config/src/keyvault.ts`, `src/secrets/keyvault.py` ### 7.1 Key Insight: Already 90% Done The current `resolveKeyVaultSecrets()` already: - Skips if `AZURE_KEYVAULT_URL` is not set - Falls back to env vars for each secret - Logs warnings but doesn't throw **Refactor:** Rename to `resolveSecrets()` with provider dispatch: ```typescript // packages/config/src/secrets.ts export interface SecretsProvider { getSecret(name: string): Promise; } export async function resolveSecrets( secrets: SecretMapping[], opts?: { provider?: string }, ): Promise { const provider = opts?.provider || process.env.SECRETS_PROVIDER || 'env'; switch (provider) { case 'azure-keyvault': return resolveFromAzureKeyVault(secrets); // existing code case 'aws-secrets-manager': return resolveFromAWSSecretsManager(secrets); // future case 'gcp-secret-manager': return resolveFromGCPSecretManager(secrets); // future case 'doppler': return resolveFromDoppler(secrets); // future case 'env': default: return; // All secrets already in env — 
nothing to resolve } } ``` ### 7.2 Rename Azure-Prefixed Env Vars The current env vars have Azure-specific names. Add **generic aliases** that fall back to the Azure names: ```typescript // packages/config/src/env-aliases.ts export const ENV_ALIASES: Record = { // Generic name → fallback names (checked in order) 'BLOB_CONNECTION_STRING': ['AZURE_BLOB_CONNECTION_STRING'], 'BLOB_ACCOUNT_NAME': ['AZURE_BLOB_ACCOUNT_NAME'], 'BLOB_ACCOUNT_KEY': ['AZURE_BLOB_ACCOUNT_KEY'], 'SPEECH_KEY': ['AZURE_SPEECH_KEY'], 'SPEECH_REGION': ['AZURE_SPEECH_REGION'], 'LLM_API_KEY': ['AZURE_OPENAI_KEY', 'OPENAI_API_KEY'], 'LLM_ENDPOINT': ['AZURE_OPENAI_ENDPOINT', 'OPENAI_BASE_URL'], 'LLM_MODEL': ['AZURE_OPENAI_DEPLOYMENT', 'OPENAI_MODEL'], }; export function getEnv(name: string): string | undefined { if (process.env[name]) return process.env[name]; const aliases = ENV_ALIASES[name]; if (aliases) { for (const alias of aliases) { if (process.env[alias]) return process.env[alias]; } } return undefined; } ``` This means existing `.env` files with `AZURE_*` names continue to work. New deployments can use generic names. --- ## 8. 
Sprint 5: Speech Provider Abstraction **Package:** `@bytelyst/speech` **Effort:** 3–4 days **Files changed:** `src/audio/azure_stt.py`, `iosApp/Services/AzureSpeechTranscriber.swift` ### 8.1 Interface Design (Python) ```python # bytelyst/speech/types.py from abc import ABC, abstractmethod from typing import Callable, Optional class SpeechTranscriber(ABC): """Cloud-agnostic streaming speech-to-text interface.""" @abstractmethod def start(self, language: str = "en-US", languages: list[str] | None = None) -> None: """Start continuous recognition.""" @abstractmethod def stop(self) -> None: """Stop recognition.""" @abstractmethod def push_audio(self, data: bytes) -> None: """Push raw audio data (PCM 16-bit, 16kHz, mono).""" @abstractmethod def on_partial(self, callback: Callable[[str], None]) -> None: """Register callback for partial (interim) results.""" @abstractmethod def on_final(self, callback: Callable[[str], None]) -> None: """Register callback for final (committed) results.""" @abstractmethod def on_error(self, callback: Callable[[Exception], None]) -> None: """Register callback for errors.""" @abstractmethod def set_vocabulary(self, phrases: list[str]) -> None: """Set custom vocabulary / phrase hints.""" ``` ### 8.2 Provider Implementations ```python # bytelyst/speech/azure_provider.py # Wraps existing azure_stt.py code — PushAudioInputStream, SpeechRecognizer, events # bytelyst/speech/google_provider.py (future) # Uses google-cloud-speech streaming_recognize # bytelyst/speech/deepgram_provider.py (future) # Uses Deepgram WebSocket API # bytelyst/speech/whisper_provider.py (future) # Uses faster-whisper for local transcription (already in requirements.txt!) ``` ### 8.3 Swift Protocol (iOS) ```swift // Shared/Speech/SpeechTranscriberProtocol.swift protocol SpeechTranscriber { func start(language: String, languages: [String]?) 
async throws func stop() async func onPartial(_ handler: @escaping (String) -> Void) func onFinal(_ handler: @escaping (String) -> Void) func onError(_ handler: @escaping (Error) -> Void) func setVocabulary(_ phrases: [String]) } // Shared/Speech/AzureSpeechTranscriber.swift — existing code, implements protocol // Shared/Speech/AppleSpeechTranscriber.swift — future, uses Apple's SFSpeechRecognizer ``` ### 8.4 Note on Complexity Speech is the hardest abstraction because: - Azure Speech SDK has a unique push-stream architecture - Google Cloud Speech uses gRPC streaming - Deepgram uses WebSockets - Each has different audio format requirements and event models The abstraction hides these differences behind a unified push-audio + callback interface. The Azure implementation wraps existing code with zero behavior changes. --- ## 9. Sprint 6: Push Notification Abstraction **Package:** `@bytelyst/push` **Effort:** 1 day **Files changed:** Platform-service push-triggers module ### 9.1 Interface Design ```typescript export interface PushProvider { send(notification: PushNotification): Promise; sendBatch(notifications: PushNotification[]): Promise; } export interface PushNotification { deviceToken: string; platform: 'ios' | 'android' | 'web'; title: string; body: string; data?: Record; badge?: number; } ``` Implementations: `AzureNotificationHubProvider`, `FirebaseProvider` (future), `ExpoProvider` (for NomGap), `OneSignalProvider` (future). --- ## 10. 
Sprint 7: Monitoring & Telemetry Abstraction **Effort:** 0.5 days (mostly done already) The ecosystem already has cloud-agnostic monitoring: - **Custom telemetry** via `@bytelyst/telemetry-client` → platform-service → Cosmos - **Loki + Grafana** in `services/monitoring/` - **Health checks** via `/health` endpoints on all services **Remaining work:** - Remove `opencensus-ext-azure` from Python requirements (optional, only used for App Insights) - Ensure all structured logging uses `pino` (TS) or `structlog` (Python) — no Azure-specific loggers --- ## 11. Migration Effort After Refactor Once all sprints are complete, here's how much work each cloud migration scenario requires: ### Scenario: Switch DB from Cosmos to MongoDB Atlas | Step | Effort | Description | |------|--------|-------------| | Implement `MongoDatastoreProvider` | 1 day | ~200 lines — translate FilterMap to MongoDB find() | | Set `DB_PROVIDER=mongo` + `MONGO_URI=...` | 5 minutes | Config change | | Run data migration script | 2–4 hours | Export Cosmos JSON → import to MongoDB | | Run full test suite | 30 minutes | Verify all 1,029+ tests pass | | **Total** | **~1.5 days** | vs 3–5 weeks without abstraction | ### Scenario: Switch Storage from Azure Blob to S3 | Step | Effort | Description | |------|--------|-------------| | Implement `S3StorageProvider` | 0.5 day | ~100 lines | | Set `STORAGE_PROVIDER=s3` + `AWS_*` env vars | 5 minutes | Config change | | Migrate blobs | 1–2 hours | azcopy or rclone | | **Total** | **~0.5 days** | vs 2–3 days without abstraction | ### Scenario: Switch LLM from Azure OpenAI to OpenAI Direct | Step | Effort | Description | |------|--------|-------------| | Set `LLM_PROVIDER=openai` + `OPENAI_API_KEY=...` | 5 minutes | Config change only | | Remove `AZURE_OPENAI_*` env vars | 5 minutes | Cleanup | | **Total** | **10 minutes** | Already near-zero today | ### Scenario: Full Cloud Migration (Azure → AWS) | Step | Effort | Description | |------|--------|-------------| | 
Implement MongoDB/DynamoDB provider | 1–2 days | |
| Implement S3 storage provider | 0.5 days | |
| Implement AWS Secrets Manager provider | 0.5 days | |
| Switch LLM to OpenAI direct | 10 minutes | |
| Implement Google STT or AWS Transcribe | 2–3 days | Speech is still the hardest |
| Implement SNS push provider | 0.5 days | |
| Data migration + testing | 2–3 days | |
| **Total** | **~7–10 days** | vs 4–8 weeks without abstraction |

---

## 12. Testing Strategy

### 12.1 Provider-Agnostic Tests

Every repository test should work against **any** provider. The test setup picks the provider:

```typescript
// Test setup: use in-memory provider
import { beforeAll } from 'vitest';
import { setTestProvider } from '@bytelyst/datastore/testing';

beforeAll(() => {
  setTestProvider('memory'); // Fast, no network, deterministic
});
```

### 12.2 Provider Integration Tests

Separate test suites verify that each provider works correctly:

```
__tests__/
  datastore/
    cosmos.integration.test.ts   # Runs against real Cosmos (CI only)
    mongo.integration.test.ts    # Runs against real MongoDB (CI only)
    memory.test.ts               # Always runs; verifies memory provider
```

### 12.3 Migration Verification Checklist

For each sprint, before merging:

1. All existing tests pass (no regressions)
2. New interface tests pass with all implemented providers
3. Manual smoke test against Azure (dev environment)
4. No new `@azure/*` imports in application code (only in provider files)

### 12.4 CI Gate

Add a CI check that fails when application code imports the Azure SDK directly, outside of provider directories:

```bash
#!/usr/bin/env bash
# scripts/check-cloud-agnostic.sh
# Fail if any TypeScript file outside a providers/ directory imports @azure/*
if rg '@azure/' services/ dashboards/ --glob='*.ts' \
    --glob='!**/providers/**' --glob='!**/node_modules/**'; then
  echo "FAIL: Direct Azure SDK import found outside provider layer"
  exit 1
fi
echo "PASS: No direct Azure imports in application code"
```

---

## 13. Env Var Naming Convention

### Current (Azure-specific)

```bash
COSMOS_ENDPOINT=https://cosmos-mywisprai.documents.azure.com:443/
COSMOS_KEY=...
COSMOS_DATABASE=lysnrai
AZURE_BLOB_CONNECTION_STRING=...
AZURE_BLOB_ACCOUNT_NAME=bytelystblobs
AZURE_BLOB_ACCOUNT_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
AZURE_SPEECH_KEY=...
AZURE_SPEECH_REGION=eastus
AZURE_KEYVAULT_URL=...
```

### Target (generic with Azure fallbacks)

```bash
# ── Provider Selection ────────────────────────────
DB_PROVIDER=cosmos               # cosmos | mongo | memory
STORAGE_PROVIDER=azure           # azure | s3 | r2 | memory
LLM_PROVIDER=azure               # azure | openai | gemini
SECRETS_PROVIDER=azure-keyvault  # azure-keyvault | aws | doppler | env
SPEECH_PROVIDER=azure            # azure | google | deepgram | whisper
PUSH_PROVIDER=azure-nh           # azure-nh | firebase | expo

# ── Database (provider-specific) ──────────────────
# Cosmos (when DB_PROVIDER=cosmos):
COSMOS_ENDPOINT=...
COSMOS_KEY=...
COSMOS_DATABASE=lysnrai
# MongoDB (when DB_PROVIDER=mongo):
# MONGO_URI=mongodb+srv://...

# ── Storage (provider-specific) ───────────────────
# Azure (when STORAGE_PROVIDER=azure):
AZURE_BLOB_CONNECTION_STRING=...
# S3 (when STORAGE_PROVIDER=s3):
# AWS_ACCESS_KEY_ID=...
# AWS_SECRET_ACCESS_KEY=...
# S3_BUCKET_PREFIX=bytelyst-

# ── LLM (provider-specific) ──────────────────────
# Azure OpenAI:
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
# OpenAI direct:
# OPENAI_API_KEY=...
# OPENAI_MODEL=gpt-4o-mini

# ── Secrets (optional) ───────────────────────────
AZURE_KEYVAULT_URL=...           # only if SECRETS_PROVIDER=azure-keyvault

# ── Speech ────────────────────────────────────────
AZURE_SPEECH_KEY=...
AZURE_SPEECH_REGION=eastus
```

**Backward compatibility:** All existing `COSMOS_*` and `AZURE_*` env vars continue to work. The generic `*_PROVIDER` vars are additive, with Azure remaining the fallback.

---

## 14. Risk Mitigation

| Risk | Mitigation |
|------|-----------|
| **FilterMap can't express complex Cosmos SQL** | Add a `rawQuery()` escape hatch for edge cases. Track usage: if more than 5% of queries need it, expand the FilterMap operators |
| **Performance regression from abstraction layer** | Benchmark critical queries before and after. The abstraction adds one function call, which is negligible |
| **Team unfamiliar with new patterns** | Each sprint includes updating AGENTS.md with the new conventions. The old pattern (direct Cosmos) still works during migration |
| **In-memory provider behaves differently** | The integration test suite runs against real Cosmos in CI. The memory provider is for unit tests only |
| **Stale data during DB migration** | Use a dual-write pattern: write to both the old and the new provider during the transition. Read from the new one and fall back to the old |
| **Sprint 1 takes too long** | The 38 repository files can be migrated incrementally; even 5 files at a time is progress. Old and new patterns coexist |

---

## Appendix: Interface Specifications

### A.1 `@bytelyst/datastore`: Package Structure

```
packages/datastore/
├── src/
│   ├── index.ts          # Public exports
│   ├── types.ts          # All interfaces (DocumentCollection, DatastoreProvider, etc.)
│   ├── factory.ts        # createDatastoreProvider() factory
│   ├── filter.ts         # FilterMap → provider-specific query translation
│   ├── providers/
│   │   ├── cosmos.ts     # CosmosDatastoreProvider + CosmosCollection
│   │   ├── mongo.ts      # MongoDatastoreProvider + MongoCollection (stub)
│   │   └── memory.ts     # MemoryDatastoreProvider + MemoryCollection
│   └── testing.ts        # Test helpers (setTestProvider, seedCollection, etc.)
├── package.json          # peer deps: @azure/cosmos (optional), mongodb (optional)
├── tsconfig.json
└── vitest.config.ts
```

### A.2 `@bytelyst/storage`: Package Structure

```
packages/storage/
├── src/
│   ├── index.ts
│   ├── types.ts          # StorageProvider, StorageBucket, SignedUrlOptions
│   ├── factory.ts        # createStorageProvider()
│   ├── providers/
│   │   ├── azure-blob.ts # Wraps existing @bytelyst/blob code
│   │   ├── s3.ts         # AWS S3 (stub)
│   │   └── memory.ts     # In-memory for testing
│   └── testing.ts
├── package.json
└── tsconfig.json
```

### A.3 `@bytelyst/llm`: Package Structure

```
packages/llm/
├── src/
│   ├── index.ts
│   ├── types.ts            # LLMProvider, ChatCompletionRequest/Response
│   ├── factory.ts          # createLLMProvider()
│   ├── providers/
│   │   ├── azure-openai.ts # AzureOpenAI endpoint + api-key auth
│   │   ├── openai.ts       # OpenAI direct + Bearer auth
│   │   └── gemini.ts       # Google Generative AI (stub)
│   └── testing.ts          # MockLLMProvider for tests
├── package.json
└── tsconfig.json
```

### A.4 Complete Interface: FilterMap Operators

```typescript
// Exact match
{ field: value }

// Comparison
{ field: { $gt: value } }   // >
{ field: { $gte: value } }  // >=
{ field: { $lt: value } }   // <
{ field: { $lte: value } }  // <=
{ field: { $ne: value } }   // !=

// Existence
{ field: { $exists: true } }   // IS_DEFINED(c.field)
{ field: { $exists: false } }  // NOT IS_DEFINED(c.field)

// String
{ field: { $startsWith: 'prefix' } }
{ field: { $contains: 'substr' } }

// Array
{ field: { $contains: value } }     // ARRAY_CONTAINS
{ field: { $in: [v1, v2, v3] } }    // IN operator

// Logical (for complex queries)
{ $or: [{ field1: v1 }, { field2: v2 }] }
```

**Cosmos adapter** translates each operator to SQL:

- `{ $gte: v }` → `c.field >= @pN`
- `{ $exists: false }` → `NOT IS_DEFINED(c.field)`
- `{ $contains: v }` on array → `ARRAY_CONTAINS(c.field, @pN)`
- `{ $in: [...] }` → `c.field IN (@pN, @pM, ...)`

**MongoDB adapter** passes most operators natively (MQL uses the same `$gte`, `$exists`, `$in`, and `$or` syntax). `$startsWith` and `$contains` are not MQL operators, so the adapter has to translate them (for example, to `$regex`).
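The Cosmos translation rules listed above can be illustrated with a minimal sketch. Everything named here (`toCosmosWhere`, `TranslatedQuery`, the `arrayFields` hint) is hypothetical and not taken from the actual `@bytelyst/datastore` code. The sketch assumes the adapter is told which fields are arrays, because `$contains` appears under both the String and Array operators and needs `CONTAINS` in one case and `ARRAY_CONTAINS` in the other. `$or` is left out for brevity.

```typescript
// Hypothetical sketch: these names and shapes are assumptions, not the actual
// @bytelyst/datastore API. `$or` is omitted for brevity.
type Primitive = string | number | boolean | null;

interface Operators {
  $gt?: Primitive; $gte?: Primitive; $lt?: Primitive; $lte?: Primitive; $ne?: Primitive;
  $exists?: boolean;
  $startsWith?: string;
  $contains?: Primitive;
  $in?: Primitive[];
}

type FilterMap = Record<string, Primitive | Operators>;

interface TranslatedQuery {
  where: string;
  parameters: { name: string; value: Primitive }[];
}

const COMPARISONS = new Map<string, string>([
  ['$gt', '>'], ['$gte', '>='], ['$lt', '<'], ['$lte', '<='], ['$ne', '!='],
]);

function toCosmosWhere(
  filter: FilterMap,
  arrayFields: ReadonlySet<string> = new Set(), // assumed hint: which fields hold arrays
): TranslatedQuery {
  const clauses: string[] = [];
  const parameters: TranslatedQuery['parameters'] = [];

  // Values always travel as parameters (@p0, @p1, ...), never inline in the SQL text.
  const param = (value: Primitive): string => {
    const name = `@p${parameters.length}`;
    parameters.push({ name, value });
    return name;
  };

  for (const [field, condition] of Object.entries(filter)) {
    // Field names are interpolated into the SQL, so restrict them to plain identifiers.
    if (!/^[A-Za-z_][A-Za-z0-9_.]*$/.test(field)) {
      throw new Error(`Unsupported field name: ${field}`);
    }
    const path = `c.${field}`;

    if (condition === null || typeof condition !== 'object') {
      clauses.push(`${path} = ${param(condition)}`); // exact match
      continue;
    }

    for (const [op, value] of Object.entries(condition)) {
      const comparison = COMPARISONS.get(op);
      if (comparison !== undefined) {
        clauses.push(`${path} ${comparison} ${param(value)}`);
      } else if (op === '$exists') {
        clauses.push(value ? `IS_DEFINED(${path})` : `NOT IS_DEFINED(${path})`);
      } else if (op === '$startsWith') {
        clauses.push(`STARTSWITH(${path}, ${param(value)})`);
      } else if (op === '$contains') {
        const fn = arrayFields.has(field) ? 'ARRAY_CONTAINS' : 'CONTAINS';
        clauses.push(`${fn}(${path}, ${param(value)})`);
      } else if (op === '$in') {
        const names = (value as Primitive[]).map((v) => param(v));
        if (names.length === 0) throw new Error(`Empty $in list for field: ${field}`);
        clauses.push(`${path} IN (${names.join(', ')})`);
      } else {
        throw new Error(`Unsupported operator: ${op}`);
      }
    }
  }
  return { where: clauses.join(' AND '), parameters };
}

const example = toCosmosWhere({
  productId: 'p1',
  createdAt: { $gte: 100 },
  deletedAt: { $exists: false },
});
console.log(example.where);
// prints "c.productId = @p0 AND c.createdAt >= @p1 AND NOT IS_DEFINED(c.deletedAt)"
```

Keeping values in a parameter list and validating field names means the generated SQL text never contains caller-supplied data, which is one reason to centralise this translation in the adapter rather than in each repository.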
**Memory adapter** evaluates operators with simple JS comparisons.

---

## Summary

| Sprint | What | Days | After This Sprint... |
|--------|------|------|---------------------|
| 1 | Database abstraction | 5–7 | DB swap = implement 1 adapter (~200 LOC) + config change |
| 2 | Storage abstraction | 2 | Blob swap = implement 1 adapter (~100 LOC) + config change |
| 3 | LLM abstraction | 2 | LLM swap = config change only (10 minutes) |
| 4 | Secrets abstraction | 1 | Secrets swap = config change only |
| 5 | Speech abstraction | 3–4 | Speech swap = implement 1 adapter (~300 LOC) |
| 6 | Push abstraction | 1 | Push swap = implement 1 adapter (~50 LOC) |
| 7 | Monitoring cleanup | 0.5 | Already cloud-agnostic |
| **Total** | | **~15–17 days** | **Full cloud migration = ~7–10 days instead of 4–8 weeks** |

The key insight: **~80% of migration effort is in Sprint 1 (database)**. If you only do one sprint, do that one. Everything else is comparatively easy.

---

*Document generated by automated codebase analysis. Companion to `CLOUD_PROVIDER_MIGRATION_ANALYSIS.md`. Review as the codebase evolves.*