Cloud-Agnostic Refactor Roadmap — ByteLyst Ecosystem
Author: AI Analysis (Cascade)
Date: 2026-03-01
Companion doc: CLOUD_PROVIDER_MIGRATION_ANALYSIS.md
Goal: Refactor the codebase so it continues to work on Azure today, but switching to any other cloud provider requires minimum effort (days, not weeks).
Table of Contents
- Philosophy
- Current State vs Target State
- Sprint Plan Overview
- Sprint 1: Database Abstraction Layer
- Sprint 2: Storage Abstraction Layer
- Sprint 3: LLM Provider Abstraction
- Sprint 4: Secrets Manager Abstraction
- Sprint 5: Speech Provider Abstraction
- Sprint 6: Push Notification Abstraction
- Sprint 7: Monitoring & Telemetry Abstraction
- Migration Effort After Refactor
- Testing Strategy
- Env Var Naming Convention
- Risk Mitigation
- Appendix: Interface Specifications
1. Philosophy
Core Principle: Provider-Agnostic Interfaces, Provider-Specific Implementations
Application Code (routes, business logic)
│
▼
@bytelyst/* interfaces ◄── Cloud-agnostic contracts
│
▼
Provider implementations ◄── Azure today, swap tomorrow
├── cosmos-provider/ (Azure Cosmos DB)
├── mongo-provider/ (MongoDB Atlas — future)
├── s3-provider/ (AWS S3 — future)
└── ...
Design Rules
- Application code NEVER imports cloud SDKs, only @bytelyst/* interfaces
- Provider chosen at startup via env var: DB_PROVIDER=cosmos, STORAGE_PROVIDER=azure, etc.
- All interfaces have an in-memory mock, for testing without any cloud dependency
- Zero breaking changes: every sprint keeps all existing tests passing
- Incremental adoption: modules migrate one at a time, old and new patterns coexist
What This Is NOT
- This is not a migration to another cloud — Azure continues to be the production provider
- This is not a rewrite — it's a series of refactors that insert interfaces between app code and cloud SDKs
- This is not over-engineering — each interface is thin (30–60 lines) and directly maps to patterns already in the codebase
2. Current State vs Target State
Current: Direct Azure SDK Usage
38 repository.ts files
┌──────────────────────┐
routes.ts ────────► │ container() │
│ .items.query(SQL) │ ◄── @azure/cosmos types leak everywhere
│ .items.create(doc) │
│ .item(id,pk).read() │
└──────────────────────┘
│
▼
@bytelyst/cosmos (client.ts)
│
▼
@azure/cosmos SDK
Problems:
- 38 platform-service repository files write raw Cosmos SQL queries
- 6 additional repository files in dashboards + MindLyst web
- Blob, Speech, OpenAI all have direct Azure SDK imports
- Switching DB means rewriting 44+ files
Target: Provider-Agnostic Interfaces
38 repository.ts files
┌──────────────────────────┐
routes.ts ────────► │ collection.findMany({ │
│ filter: {productId}, │ ◄── Cloud-agnostic API
│ sort: {createdAt: -1}, │
│ limit: 20, │
│ }) │
└──────────────────────────┘
│
▼
@bytelyst/datastore (interface)
│
┌─────────┼─────────┐
▼ ▼ ▼
CosmosAdapter MongoAdapter MemoryAdapter
(Azure) (MongoDB) (Testing)
│
▼
@azure/cosmos SDK
Benefits:
- Repositories use a generic query API — no SQL strings, no Azure types
- Switching provider = implement a new adapter (~200 lines) + change env var
- In-memory adapter makes tests fast and cloud-free
- Azure continues to work exactly as before
3. Sprint Plan Overview
| Sprint | Package / Scope | Effort | Files Changed | Risk |
|---|---|---|---|---|
| 1 | @bytelyst/datastore (DB abstraction) | 5–7 days | 44 repository files + 1 new package | Medium |
| 2 | @bytelyst/storage (Blob/Object abstraction) | 2 days | 3 files + 1 new package | Low |
| 3 | @bytelyst/llm (LLM provider abstraction) | 2 days | 4 files + 1 new package | Low |
| 4 | @bytelyst/secrets (Secrets manager abstraction) | 1 day | 2 files (refactor existing) | Very Low |
| 5 | @bytelyst/speech (Speech STT abstraction) | 3–4 days | 3 files + 1 new package | Medium |
| 6 | @bytelyst/push (Push notification abstraction) | 1 day | 1 file + 1 new package | Very Low |
| 7 | Monitoring/Telemetry cleanup | 0.5 days | Already done (custom telemetry) | None |
| Total | | ~15–17 days | ~55 files | |
Priority Order
Sprint 1 (DB) ──► Sprint 2 (Storage) ──► Sprint 3 (LLM) ──► Sprint 4 (Secrets)
▲ HIGHEST ROI EASY EASY TRIVIAL
│
└── 80% of migration effort lives here. Do this first.
Sprint 5 (Speech) ──► Sprint 6 (Push) ──► Sprint 7 (Monitoring)
MEDIUM LOW PRIORITY ALREADY DONE
4. Sprint 1: Database Abstraction Layer
Package: @bytelyst/datastore
Effort: 5–7 days
This is the most important sprint: the database layer accounts for roughly 80% of the migration effort.
4.1 Interface Design
// packages/datastore/src/types.ts
/** A cloud-agnostic document collection (like a Cosmos container or Mongo collection). */
export interface DocumentCollection<T extends BaseDocument = BaseDocument> {
/** Find a single document by ID + partition key. */
findById(id: string, partitionKey: string): Promise<T | null>;
/** Find multiple documents matching a filter. */
findMany(opts: FindManyOptions): Promise<T[]>;
/** Find one document matching a filter. */
findOne(opts: FindOneOptions): Promise<T | null>;
/** Count documents matching a filter. */
count(filter: FilterMap): Promise<number>;
/** Insert a new document. */
create(doc: T): Promise<T>;
/** Replace an entire document (full overwrite). */
replace(id: string, partitionKey: string, doc: T): Promise<T>;
/** Upsert: create if not exists, replace if exists. */
upsert(doc: T): Promise<T>;
/** Delete a document by ID + partition key. */
delete(id: string, partitionKey: string): Promise<boolean>;
/** Run an aggregation (COUNT, SUM, GROUP BY). */
aggregate<R = unknown>(opts: AggregateOptions): Promise<R[]>;
}
export interface BaseDocument {
id: string;
[key: string]: unknown;
}
export interface FindManyOptions {
filter: FilterMap;
sort?: SortMap;
limit?: number;
offset?: number;
partitionKey?: string;
}
export interface FindOneOptions {
filter: FilterMap;
partitionKey?: string;
}
export type FilterMap = Record<string, unknown>;
export type SortMap = Record<string, 1 | -1>; // 1 = ASC, -1 = DESC
export interface AggregateOptions {
filter: FilterMap;
groupBy?: string[];
count?: string; // alias for COUNT(1)
sum?: string; // field to SUM
}
/** Factory that creates collections — one per provider. */
export interface DatastoreProvider {
collection<T extends BaseDocument>(name: string): DocumentCollection<T>;
initialize?(configs: Record<string, CollectionConfig>): Promise<void>;
close?(): Promise<void>;
}
export interface CollectionConfig {
partitionKeyPath: string;
defaultTtl?: number | null;
}
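The AggregateOptions contract above leaves the output shape open. The following is a minimal sketch of how an in-memory adapter might evaluate groupBy plus count, assuming equality-only filters. The helper names (matchesExact, aggregateInMemory) and the output key used for sums (SUM_ALIAS) are illustrative assumptions, not part of the specified interface.

```typescript
type FilterMap = Record<string, unknown>;
type Row = Record<string, unknown>;

interface AggregateOptions {
  filter: FilterMap;
  groupBy?: string[];
  count?: string; // output alias for COUNT(1)
  sum?: string;   // field to SUM
}

// Assumption: the source does not say which key holds the SUM result.
const SUM_ALIAS = 'sum';

/** Equality-only filter check; operators are out of scope for this sketch. */
function matchesExact(doc: Row, filter: FilterMap): boolean {
  return Object.keys(filter).every((key) => doc[key] === filter[key]);
}

/** Group the matching docs by the groupBy fields and emit one row per group. */
function aggregateInMemory(docs: Row[], opts: AggregateOptions): Row[] {
  const groups = new Map<string, Row>();
  const groupBy = opts.groupBy ?? [];
  for (const doc of docs) {
    if (!matchesExact(doc, opts.filter)) continue;
    const keyValues = groupBy.map((field) => doc[field]);
    const groupKey = JSON.stringify(keyValues);
    let row = groups.get(groupKey);
    if (!row) {
      // First document of this group: seed the row with the group fields.
      const fresh: Row = {};
      groupBy.forEach((field, i) => { fresh[field] = keyValues[i]; });
      if (opts.count) fresh[opts.count] = 0;
      if (opts.sum) fresh[SUM_ALIAS] = 0;
      groups.set(groupKey, fresh);
      row = fresh;
    }
    if (opts.count) row[opts.count] = (row[opts.count] as number) + 1;
    if (opts.sum) row[SUM_ALIAS] = (row[SUM_ALIAS] as number) + Number(doc[opts.sum] ?? 0);
  }
  return Array.from(groups.values());
}
```

Called with { filter: { productId }, groupBy: ['plan'], count: 'cnt' }, this yields one row per plan carrying a cnt field, which is the shape a GROUP BY query would return.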
4.2 Cosmos Adapter (keeps everything working today)
// packages/datastore/src/providers/cosmos.ts
import type { Container } from '@azure/cosmos';
import type { BaseDocument, DocumentCollection, FindManyOptions, FilterMap, ... } from '../types.js';
export class CosmosCollection<T extends BaseDocument> implements DocumentCollection<T> {
constructor(private container: Container) {}
async findById(id: string, partitionKey: string): Promise<T | null> {
try {
const { resource } = await this.container.item(id, partitionKey).read<T>();
return resource ?? null;
} catch { return null; }
}
async findMany(opts: FindManyOptions): Promise<T[]> {
const { sql, params } = buildSqlQuery(opts); // ◄── converts FilterMap → Cosmos SQL
const { resources } = await this.container
.items.query<T>({ query: sql, parameters: params })
.fetchAll();
return resources;
}
async create(doc: T): Promise<T> {
const { resource } = await this.container.items.create(doc);
return resource as T;
}
async replace(id: string, partitionKey: string, doc: T): Promise<T> {
const { resource } = await this.container.item(id, partitionKey).replace<T>(doc);
return resource as T;
}
async upsert(doc: T): Promise<T> {
const { resource } = await this.container.items.upsert<T>(doc);
return resource as T;
}
async delete(id: string, partitionKey: string): Promise<boolean> {
try {
await this.container.item(id, partitionKey).delete();
return true;
} catch { return false; }
}
// ... count(), findOne(), aggregate()
}
/** Convert a FilterMap to Cosmos SQL. */
function buildSqlQuery(opts: FindManyOptions): { sql: string; params: SqlParam[] } {
// { productId: 'x', userId: 'y' }
// → "SELECT * FROM c WHERE c.productId = @p0 AND c.userId = @p1 ORDER BY c.createdAt DESC OFFSET 0 LIMIT 20"
// This is a mechanical translation — no query language exposed to application code.
}
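The buildSqlQuery stub above only describes the translation in comments. A minimal sketch of an equality-only version follows, assuming parameterized values and a conservative field-name check. The fieldRef helper, the FIELD_NAME pattern, and the @offset/@limit parameter names are assumptions made for illustration.

```typescript
type FilterMap = Record<string, unknown>;
type SortMap = Record<string, 1 | -1>;

interface FindManyOptions {
  filter: FilterMap;
  sort?: SortMap;
  limit?: number;
  offset?: number;
}

interface SqlParam {
  name: string;
  value: unknown;
}

// Field names are interpolated into the SQL text, so only plain identifiers are allowed.
const FIELD_NAME = /^[A-Za-z_][A-Za-z0-9_]*$/;

function fieldRef(field: string): string {
  if (!FIELD_NAME.test(field)) throw new Error(`Unsupported field name: ${field}`);
  return `c.${field}`;
}

/** Translate equality filters, sort, and paging into a parameterized Cosmos SQL query. */
function buildSqlQuery(opts: FindManyOptions): { sql: string; params: SqlParam[] } {
  const params: SqlParam[] = [];
  const clauses = Object.keys(opts.filter).map((field, i) => {
    params.push({ name: `@p${i}`, value: opts.filter[field] });
    return `${fieldRef(field)} = @p${i}`;
  });

  let sql = 'SELECT * FROM c';
  if (clauses.length > 0) sql += ` WHERE ${clauses.join(' AND ')}`;

  const sort = opts.sort ?? {};
  const sortParts = Object.keys(sort).map(
    (field) => `${fieldRef(field)} ${sort[field] === 1 ? 'ASC' : 'DESC'}`,
  );
  if (sortParts.length > 0) sql += ` ORDER BY ${sortParts.join(', ')}`;

  // Cosmos SQL needs OFFSET and LIMIT together, so paging is emitted only when a limit is given.
  if (opts.limit !== undefined) {
    params.push({ name: '@offset', value: opts.offset ?? 0 });
    params.push({ name: '@limit', value: opts.limit });
    sql += ' OFFSET @offset LIMIT @limit';
  }
  return { sql, params };
}
```

Values never reach the SQL text; only validated field names do. In this sketch an offset without a limit is ignored, and operator conditions such as $gte are left to a separate translation step.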
4.3 In-Memory Adapter (for testing)
// packages/datastore/src/providers/memory.ts
export class MemoryCollection<T extends BaseDocument> implements DocumentCollection<T> {
private docs: Map<string, T> = new Map();
async findById(id: string): Promise<T | null> {
return this.docs.get(id) ?? null;
}
async findMany(opts: FindManyOptions): Promise<T[]> {
let results = [...this.docs.values()].filter(doc => matchesFilter(doc, opts.filter));
if (opts.sort) results = sortDocs(results, opts.sort);
if (opts.offset) results = results.slice(opts.offset);
if (opts.limit) results = results.slice(0, opts.limit);
return results;
}
async create(doc: T): Promise<T> {
this.docs.set(doc.id, doc);
return doc;
}
// ... etc
}
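MemoryCollection relies on two helpers, matchesFilter and sortDocs, that are not shown. One possible shape is sketched below, assuming equality-only filters and a simple ordering in which missing values sort first. The compareValues helper is an illustrative assumption.

```typescript
type Doc = Record<string, unknown>;
type FilterMap = Record<string, unknown>;
type SortMap = Record<string, 1 | -1>;

/** True when every filter field equals the document's value (no operators). */
function matchesFilter(doc: Doc, filter: FilterMap): boolean {
  return Object.keys(filter).every((field) => doc[field] === filter[field]);
}

/** Order two values: missing values first, numbers numerically, the rest as strings. */
function compareValues(a: unknown, b: unknown): number {
  if (a === b) return 0;
  if (a === undefined || a === null) return -1;
  if (b === undefined || b === null) return 1;
  if (typeof a === 'number' && typeof b === 'number') return a - b;
  return String(a) < String(b) ? -1 : 1;
}

/** Return a sorted copy; fields are applied in SortMap key order, 1 = ASC, -1 = DESC. */
function sortDocs<T extends Doc>(docs: T[], sort: SortMap): T[] {
  const fields = Object.keys(sort);
  return docs.slice().sort((x, y) => {
    for (const field of fields) {
      const cmp = compareValues(x[field], y[field]) * sort[field];
      if (cmp !== 0) return cmp;
    }
    return 0;
  });
}
```

sortDocs copies the array before sorting, so the caller's array is not reordered as a side effect of a query.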
4.4 MongoDB Adapter (future — ready to implement when needed)
// packages/datastore/src/providers/mongo.ts (STUB — implement when migrating)
import type { Collection as MongoCollection } from 'mongodb';
import type { BaseDocument, DocumentCollection, FindManyOptions } from '../types.js';
export class MongoDocumentCollection<T extends BaseDocument> implements DocumentCollection<T> {
constructor(private collection: MongoCollection<T>) {}
async findById(id: string): Promise<T | null> {
return this.collection.findOne({ _id: id } as any) as Promise<T | null>;
}
async findMany(opts: FindManyOptions): Promise<T[]> {
let cursor = this.collection.find(opts.filter);
if (opts.sort) cursor = cursor.sort(opts.sort);
if (opts.offset) cursor = cursor.skip(opts.offset);
if (opts.limit) cursor = cursor.limit(opts.limit);
return cursor.toArray() as Promise<T[]>;
}
// ... etc
}
4.5 How Repository Files Change
Before (Cosmos SQL in every file):
// services/platform-service/src/modules/flags/repository.ts
import { getContainer } from '../../lib/cosmos.js';
function container() {
return getContainer('feature_flags');
}
export async function list(productId: string): Promise<FeatureFlagDoc[]> {
const { resources } = await container()
.items.query<FeatureFlagDoc>({
query: 'SELECT * FROM c WHERE c.productId = @productId ORDER BY c.key ASC',
parameters: [{ name: '@productId', value: productId }],
})
.fetchAll();
return resources;
}
export async function getByKey(key: string, productId: string): Promise<FeatureFlagDoc | null> {
const { resources } = await container()
.items.query<FeatureFlagDoc>({
query: 'SELECT * FROM c WHERE c.productId = @productId AND c.key = @key',
parameters: [
{ name: '@productId', value: productId },
{ name: '@key', value: key },
],
})
.fetchAll();
return resources[0] ?? null;
}
export async function create(doc: FeatureFlagDoc): Promise<FeatureFlagDoc> {
const { resource } = await container().items.create(doc);
return resource as FeatureFlagDoc;
}
After (cloud-agnostic):
// services/platform-service/src/modules/flags/repository.ts
import { getCollection } from '../../lib/datastore.js';
import type { FeatureFlagDoc } from './types.js';
function collection() {
return getCollection<FeatureFlagDoc>('feature_flags');
}
export async function list(productId: string): Promise<FeatureFlagDoc[]> {
return collection().findMany({
filter: { productId },
sort: { key: 1 },
});
}
export async function getByKey(key: string, productId: string): Promise<FeatureFlagDoc | null> {
return collection().findOne({
filter: { productId, key },
});
}
export async function create(doc: FeatureFlagDoc): Promise<FeatureFlagDoc> {
return collection().create(doc);
}
Key observations:
- No SQL strings
- No @azure/cosmos types
- No .items.query().fetchAll() chaining
- The getCollection() function returns the right provider based on the DB_PROVIDER env var
- All existing behavior is preserved: the Cosmos adapter generates the same SQL under the hood
4.6 Service Wiring
// services/platform-service/src/lib/datastore.ts (replaces lib/cosmos.ts)
import { createDatastoreProvider } from '@bytelyst/datastore';
import type { DocumentCollection, BaseDocument } from '@bytelyst/datastore';
let _provider: ReturnType<typeof createDatastoreProvider> | null = null;
export function getProvider() {
if (!_provider) {
_provider = createDatastoreProvider(); // reads DB_PROVIDER env var
}
return _provider;
}
export function getCollection<T extends BaseDocument>(name: string): DocumentCollection<T> {
return getProvider().collection<T>(name);
}
// packages/datastore/src/factory.ts
export function createDatastoreProvider(): DatastoreProvider {
const provider = process.env.DB_PROVIDER || 'cosmos';
switch (provider) {
case 'cosmos':
return new CosmosDatastoreProvider(); // uses existing COSMOS_ENDPOINT, COSMOS_KEY
case 'mongo':
return new MongoDatastoreProvider(); // uses MONGO_URI
case 'memory':
return new MemoryDatastoreProvider(); // no config needed
default:
throw new Error(`Unknown DB_PROVIDER: ${provider}`);
}
}
4.7 Migration Plan for 38 Repository Files
Migrate in batches, one module per commit. Each commit:
- Update the repository file to use getCollection() instead of getContainer()
- Replace SQL queries with findMany() / findOne() / count() / aggregate()
- Run the module's test file (it must pass)
- Commit: refactor(module-name): migrate to datastore abstraction
Batch order (simplest first, complex last):
| Batch | Modules | Complexity | Notes |
|---|---|---|---|
| 1 | flags, plans, settings, changelog, products | Simple CRUD | 5 files, warmup |
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | 5 files |
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | 5 files |
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | 5 files |
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | 5 files |
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | 4 files |
| 7 | timers, shared-timers, routines, households | Sync logic, batch ops | 4 files |
| 8 | fasting-sessions, fasting-protocols, meal-log, social-fasting, daily-briefs, streaks, push-triggers, impersonation, status, memory, analytics, waitlist | Product-specific + remaining | 12 files |
| 9 | Dashboard cosmos clients (admin-web, MindLyst web) | Direct @azure/cosmos | 6 files |
| 10 | Python clients (desktop cosmos, backend cosmos) | azure.cosmos → abstracted | 2 files |
4.8 Handling Complex Queries
Some repository files use advanced Cosmos SQL features. Here's how the interface handles them:
| Cosmos SQL Pattern | Datastore Interface Equivalent |
|---|---|
| SELECT * FROM c WHERE c.x = @v | findMany({ filter: { x: v } }) |
| SELECT * FROM c WHERE c.x = @v AND c.y = @w | findMany({ filter: { x: v, y: w } }) |
| ORDER BY c.x ASC | findMany({ sort: { x: 1 } }) |
| ORDER BY c.x DESC | findMany({ sort: { x: -1 } }) |
| OFFSET @o LIMIT @l | findMany({ offset: o, limit: l }) |
| SELECT VALUE COUNT(1) FROM c WHERE ... | count(filter) |
| SELECT c.plan, COUNT(1) AS cnt ... GROUP BY c.plan | aggregate({ filter, groupBy: ['plan'], count: 'cnt' }) |
| NOT IS_DEFINED(c.usedAt) | findMany({ filter: { usedAt: { $exists: false } } }) |
| c.x >= @v | findMany({ filter: { x: { $gte: v } } }) |
| ARRAY_CONTAINS(c.tags, @tag) | findMany({ filter: { tags: { $contains: tag } } }) |
| container().item(id, pk).read() | findById(id, pk) |
| container().items.create(doc) | create(doc) |
| container().item(id, pk).replace(doc) | replace(id, pk, doc) |
| container().items.upsert(doc) | upsert(doc) |
| container().item(id, pk).delete() | delete(id, pk) |
For the filter operators, use a simple operator convention:
// Exact match
{ productId: 'lysnrai' }
// Comparison operators
{ syncVersion: { $gte: 5 } }
{ createdAt: { $gte: '2026-01-01', $lt: '2026-02-01' } }
// Exists check (replaces NOT IS_DEFINED)
{ usedAt: { $exists: false } }
// Array contains
{ tags: { $contains: 'important' } }
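The Cosmos adapter translates these to SQL. The MongoDB adapter can pass $gte, $lt, and $exists through unchanged because MQL uses the same names; $contains is not an MQL operator and needs a small translation (for an array field, a plain equality match on the element). The memory adapter does in-memory filtering.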
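A minimal sketch of how the memory adapter might evaluate the operator convention above, covering only the four operators shown in this section ($gte, $lt, $exists, $contains). The helper names are assumptions, and any other operator is rejected rather than silently ignored.

```typescript
type Doc = Record<string, unknown>;
type FilterMap = Record<string, unknown>;

/** An operator object is a non-empty plain object whose keys all start with "$". */
function isOperatorObject(cond: unknown): cond is Record<string, unknown> {
  if (typeof cond !== 'object' || cond === null || Array.isArray(cond)) return false;
  const keys = Object.keys(cond);
  return keys.length > 0 && keys.every((k) => k.charAt(0) === '$');
}

/** Compare two numbers or two strings; null means the pair is not comparable. */
function compare(a: unknown, b: unknown): number | null {
  if (typeof a === 'number' && typeof b === 'number') return a - b;
  if (typeof a === 'string' && typeof b === 'string') return a < b ? -1 : a > b ? 1 : 0;
  return null;
}

function matchesOperator(value: unknown, op: string, operand: unknown): boolean {
  switch (op) {
    case '$exists':
      return (value !== undefined) === operand;
    case '$contains':
      return Array.isArray(value) && value.indexOf(operand) !== -1;
    case '$gte':
    case '$lt': {
      const cmp = compare(value, operand);
      if (cmp === null) return false;
      return op === '$gte' ? cmp >= 0 : cmp < 0;
    }
    default:
      throw new Error(`Unsupported operator: ${op}`);
  }
}

/** Exact match for plain values; every operator in an operator object must hold. */
function matchesFilter(doc: Doc, filter: FilterMap): boolean {
  return Object.keys(filter).every((field) => {
    const cond = filter[field];
    if (!isOperatorObject(cond)) return doc[field] === cond;
    return Object.keys(cond).every((op) => matchesOperator(doc[field], op, cond[op]));
  });
}
```

ISO-8601 date strings compare correctly as plain strings, which is why the createdAt range example works without date parsing.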
5. Sprint 2: Storage Abstraction Layer
Package: @bytelyst/storage
Effort: 2 days
Files changed: packages/blob/src/blob.ts, src/cloud/blob_client.py, services/platform-service/src/modules/blob/
5.1 Interface Design
// packages/storage/src/types.ts
export interface StorageProvider {
/** Get or create a bucket/container. */
getBucket(name: string): StorageBucket;
/** Check if storage is configured. */
isConfigured(): boolean;
}
export interface StorageBucket {
/** Upload a blob/object. */
upload(path: string, data: Buffer | ReadableStream, contentType?: string): Promise<void>;
/** Download a blob/object. */
download(path: string): Promise<Buffer>;
/** Delete a blob/object. */
delete(path: string): Promise<boolean>;
/** Check if a blob/object exists. */
exists(path: string): Promise<boolean>;
/** List blobs/objects with optional prefix. */
list(prefix?: string): Promise<StorageObjectInfo[]>;
/** Generate a time-limited signed URL for direct access. */
getSignedUrl(path: string, opts: SignedUrlOptions): Promise<string>;
}
export interface SignedUrlOptions {
permissions: 'read' | 'write' | 'readwrite';
expiresInMinutes?: number; // default: 60
}
export interface StorageObjectInfo {
name: string;
size: number;
lastModified: Date;
contentType?: string;
}
5.2 Provider Implementations
// packages/storage/src/providers/azure-blob.ts
// Wraps existing @bytelyst/blob code — nearly 1:1 mapping
// packages/storage/src/providers/s3.ts (future)
// Uses @aws-sdk/client-s3 + @aws-sdk/s3-request-presigner
// packages/storage/src/providers/r2.ts (future)
// S3-compatible — extends S3 provider with Cloudflare-specific config
// packages/storage/src/providers/memory.ts
// In-memory Map<string, Buffer> for testing
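The in-memory provider is described above only as a Map-backed store. A minimal sketch of such a bucket follows, assuming Uint8Array payloads in place of Buffer so the example carries no Node-specific types. The MemoryBucket class name and the memory:// URL format are illustrative assumptions.

```typescript
interface StorageObjectInfo {
  name: string;
  size: number;
  lastModified: Date;
  contentType?: string;
}

interface SignedUrlOptions {
  permissions: 'read' | 'write' | 'readwrite';
  expiresInMinutes?: number; // default: 60
}

interface StoredObject {
  data: Uint8Array;
  contentType?: string;
  lastModified: Date;
}

/** Map-backed bucket for tests; nothing leaves the process. */
class MemoryBucket {
  private objects = new Map<string, StoredObject>();

  constructor(private name: string) {}

  async upload(path: string, data: Uint8Array, contentType?: string): Promise<void> {
    // Copy the bytes so later mutation by the caller cannot change stored content.
    this.objects.set(path, { data: data.slice(), contentType, lastModified: new Date() });
  }

  async download(path: string): Promise<Uint8Array> {
    const obj = this.objects.get(path);
    if (!obj) throw new Error(`Object not found: ${path}`);
    return obj.data.slice();
  }

  async delete(path: string): Promise<boolean> {
    return this.objects.delete(path);
  }

  async exists(path: string): Promise<boolean> {
    return this.objects.has(path);
  }

  async list(prefix = ''): Promise<StorageObjectInfo[]> {
    const out: StorageObjectInfo[] = [];
    this.objects.forEach((obj, name) => {
      if (name.indexOf(prefix) === 0) {
        out.push({ name, size: obj.data.length, lastModified: obj.lastModified, contentType: obj.contentType });
      }
    });
    return out;
  }

  /** Not a real signed URL: a deterministic placeholder that tests can assert on. */
  async getSignedUrl(path: string, opts: SignedUrlOptions): Promise<string> {
    const minutes = opts.expiresInMinutes ?? 60;
    return `memory://${this.name}/${path}?perm=${opts.permissions}&ttl=${minutes}`;
  }
}
```

Because the placeholder URL is deterministic, code that only forwards signed URLs to clients can be tested without any cloud credentials.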
5.3 Migration
The existing @bytelyst/blob package (162 lines) becomes the Azure Blob provider inside @bytelyst/storage. Consumers switch from:
// Before
import { generateSasUrl, getContainerClient } from '@bytelyst/blob';
// After
import { getStorage } from '@bytelyst/storage';
const bucket = getStorage().getBucket('audio');
const url = await bucket.getSignedUrl('user123/recording.wav', { permissions: 'read' });
Python equivalent: Refactor src/cloud/blob_client.py to use a StorageProvider ABC with AzureBlobProvider implementation.
6. Sprint 3: LLM Provider Abstraction
Package: @bytelyst/llm
Effort: 2 days
Files changed: src/llm/text_cleaner.py, backend/src/clients/openai_client.py, MindLyst web/src/lib/llm.ts, extraction-service config
6.1 Interface Design
// packages/llm/src/types.ts
export interface LLMProvider {
chatCompletion(req: ChatCompletionRequest): Promise<ChatCompletionResponse>;
chatCompletionStream?(req: ChatCompletionRequest): AsyncIterable<string>;
isConfigured(): boolean;
}
export interface ChatCompletionRequest {
messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
temperature?: number;
maxTokens?: number;
model?: string; // override default model
}
export interface ChatCompletionResponse {
content: string;
usage?: { promptTokens: number; completionTokens: number };
}
6.2 Key Insight: MindLyst Already Has This Pattern
MindLyst web/src/lib/llm.ts already auto-detects Azure vs OpenAI based on env vars. This pattern should be promoted to a shared package.
Provider implementations:
- AzureOpenAIProvider: uses api-key header + deployment-scoped URL
- OpenAIProvider: uses Authorization: Bearer header + model param
- GeminiProvider: uses Google Generative AI SDK (future)
- OllamaProvider: for local development (future)
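The difference between the first two providers comes down to how the HTTP request is assembled. The sketch below shows that difference as two pure functions, assuming the public REST shapes of both services. The config interfaces and function names are illustrative, and in this sketch the Azure variant ignores the per-request model override because the deployment already fixes the model.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatCompletionRequest {
  messages: ChatMessage[];
  temperature?: number;
  maxTokens?: number;
  model?: string;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical config shapes, for illustration only.
interface AzureOpenAIConfig { endpoint: string; apiKey: string; deployment: string; apiVersion: string; }
interface OpenAIConfig { apiKey: string; defaultModel: string; baseUrl?: string; }

function trimTrailingSlash(url: string): string {
  return url.replace(/\/+$/, '');
}

/** Azure OpenAI: the deployment is part of the URL and the key goes in an api-key header. */
function buildAzureRequest(cfg: AzureOpenAIConfig, req: ChatCompletionRequest): HttpRequest {
  const base = trimTrailingSlash(cfg.endpoint);
  const deployment = encodeURIComponent(cfg.deployment);
  return {
    url: `${base}/openai/deployments/${deployment}/chat/completions?api-version=${cfg.apiVersion}`,
    headers: { 'Content-Type': 'application/json', 'api-key': cfg.apiKey },
    body: JSON.stringify({ messages: req.messages, temperature: req.temperature, max_tokens: req.maxTokens }),
  };
}

/** OpenAI direct: a fixed URL, a Bearer token, and the model named in the request body. */
function buildOpenAIRequest(cfg: OpenAIConfig, req: ChatCompletionRequest): HttpRequest {
  const base = trimTrailingSlash(cfg.baseUrl ?? 'https://api.openai.com/v1');
  return {
    url: `${base}/chat/completions`,
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${cfg.apiKey}` },
    body: JSON.stringify({
      model: req.model ?? cfg.defaultModel,
      messages: req.messages,
      temperature: req.temperature,
      max_tokens: req.maxTokens,
    }),
  };
}
```

Keeping request construction separate from the network call makes each provider trivially unit-testable, and the response parsing can be shared because both services return the same chat-completion JSON shape.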
6.3 Python Migration
# Before (text_cleaner.py)
from openai import AzureOpenAI
self._client = AzureOpenAI(azure_endpoint=endpoint, api_key=api_key, api_version="2024-10-21")
# After
from bytelyst.llm import create_llm_client
self._client = create_llm_client() # reads LLM_PROVIDER, OPENAI_API_KEY, etc.
# Returns OpenAI() or AzureOpenAI() based on config — same API surface
The openai Python SDK already has a common interface between OpenAI and AzureOpenAI. The abstraction is just a factory function that picks the right class.
7. Sprint 4: Secrets Manager Abstraction
Package: Refactor existing @bytelyst/config
Effort: 1 day
Files changed: packages/config/src/keyvault.ts, src/secrets/keyvault.py
7.1 Key Insight: Already 90% Done
The current resolveKeyVaultSecrets() already:
- Skips if AZURE_KEYVAULT_URL is not set
- Falls back to env vars for each secret
- Logs warnings but doesn't throw
Refactor: Rename to resolveSecrets() with provider dispatch:
// packages/config/src/secrets.ts
export interface SecretsProvider {
getSecret(name: string): Promise<string | null>;
}
export async function resolveSecrets(
secrets: SecretMapping[],
opts?: { provider?: string },
): Promise<void> {
const provider = opts?.provider || process.env.SECRETS_PROVIDER || 'env';
switch (provider) {
case 'azure-keyvault':
return resolveFromAzureKeyVault(secrets); // existing code
case 'aws-secrets-manager':
return resolveFromAWSSecretsManager(secrets); // future
case 'gcp-secret-manager':
return resolveFromGCPSecretManager(secrets); // future
case 'doppler':
return resolveFromDoppler(secrets); // future
case 'env':
default:
return; // All secrets already in env — nothing to resolve
}
}
7.2 Rename Azure-Prefixed Env Vars
The current env vars have Azure-specific names. Add generic aliases that fall back to the Azure names:
// packages/config/src/env-aliases.ts
export const ENV_ALIASES: Record<string, string[]> = {
// Generic name → fallback names (checked in order)
'BLOB_CONNECTION_STRING': ['AZURE_BLOB_CONNECTION_STRING'],
'BLOB_ACCOUNT_NAME': ['AZURE_BLOB_ACCOUNT_NAME'],
'BLOB_ACCOUNT_KEY': ['AZURE_BLOB_ACCOUNT_KEY'],
'SPEECH_KEY': ['AZURE_SPEECH_KEY'],
'SPEECH_REGION': ['AZURE_SPEECH_REGION'],
'LLM_API_KEY': ['AZURE_OPENAI_KEY', 'OPENAI_API_KEY'],
'LLM_ENDPOINT': ['AZURE_OPENAI_ENDPOINT', 'OPENAI_BASE_URL'],
'LLM_MODEL': ['AZURE_OPENAI_DEPLOYMENT', 'OPENAI_MODEL'],
};
export function getEnv(name: string): string | undefined {
if (process.env[name]) return process.env[name];
const aliases = ENV_ALIASES[name];
if (aliases) {
for (const alias of aliases) {
if (process.env[alias]) return process.env[alias];
}
}
return undefined;
}
This means existing .env files with AZURE_* names continue to work. New deployments can use generic names.
8. Sprint 5: Speech Provider Abstraction
Package: @bytelyst/speech
Effort: 3–4 days
Files changed: src/audio/azure_stt.py, iosApp/Services/AzureSpeechTranscriber.swift
8.1 Interface Design (Python)
# bytelyst/speech/types.py
from abc import ABC, abstractmethod
from typing import Callable, Optional
class SpeechTranscriber(ABC):
"""Cloud-agnostic streaming speech-to-text interface."""
@abstractmethod
def start(self, language: str = "en-US", languages: list[str] | None = None) -> None:
"""Start continuous recognition."""
@abstractmethod
def stop(self) -> None:
"""Stop recognition."""
@abstractmethod
def push_audio(self, data: bytes) -> None:
"""Push raw audio data (PCM 16-bit, 16kHz, mono)."""
@abstractmethod
def on_partial(self, callback: Callable[[str], None]) -> None:
"""Register callback for partial (interim) results."""
@abstractmethod
def on_final(self, callback: Callable[[str], None]) -> None:
"""Register callback for final (committed) results."""
@abstractmethod
def on_error(self, callback: Callable[[Exception], None]) -> None:
"""Register callback for errors."""
@abstractmethod
def set_vocabulary(self, phrases: list[str]) -> None:
"""Set custom vocabulary / phrase hints."""
8.2 Provider Implementations
# bytelyst/speech/azure_provider.py
# Wraps existing azure_stt.py code — PushAudioInputStream, SpeechRecognizer, events
# bytelyst/speech/google_provider.py (future)
# Uses google-cloud-speech streaming_recognize
# bytelyst/speech/deepgram_provider.py (future)
# Uses Deepgram WebSocket API
# bytelyst/speech/whisper_provider.py (future)
# Uses faster-whisper for local transcription (already in requirements.txt!)
8.3 Swift Protocol (iOS)
// Shared/Speech/SpeechTranscriberProtocol.swift
protocol SpeechTranscriber {
func start(language: String, languages: [String]?) async throws
func stop() async
func onPartial(_ handler: @escaping (String) -> Void)
func onFinal(_ handler: @escaping (String) -> Void)
func onError(_ handler: @escaping (Error) -> Void)
func setVocabulary(_ phrases: [String])
}
// Shared/Speech/AzureSpeechTranscriber.swift — existing code, implements protocol
// Shared/Speech/AppleSpeechTranscriber.swift — future, uses Apple's SFSpeechRecognizer
8.4 Note on Complexity
Speech is the hardest abstraction because:
- Azure Speech SDK has a unique push-stream architecture
- Google Cloud Speech uses gRPC streaming
- Deepgram uses WebSockets
- Each has different audio format requirements and event models
The abstraction hides these differences behind a unified push-audio + callback interface. The Azure implementation wraps existing code with zero behavior changes.
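To make the push-audio plus callback shape concrete, here is a minimal sketch of a test double, written in TypeScript for consistency with the other packages even though the real implementations are Python and Swift. FakeTranscriber and the text it emits are assumptions for illustration; it performs no recognition.

```typescript
type TextHandler = (text: string) => void;
type ErrorHandler = (err: Error) => void;

/** TypeScript mirror of the push-audio + callback contract (vocabulary hints omitted). */
interface SpeechTranscriber {
  start(language?: string): void;
  stop(): void;
  pushAudio(data: Uint8Array): void;
  onPartial(cb: TextHandler): void;
  onFinal(cb: TextHandler): void;
  onError(cb: ErrorHandler): void;
}

/** Test double: reports how many bytes it received instead of transcribing them. */
class FakeTranscriber implements SpeechTranscriber {
  private running = false;
  private bytes = 0;
  private partialHandler: TextHandler = () => {};
  private finalHandler: TextHandler = () => {};
  private errorHandler: ErrorHandler = () => {};

  onPartial(cb: TextHandler): void { this.partialHandler = cb; }
  onFinal(cb: TextHandler): void { this.finalHandler = cb; }
  onError(cb: ErrorHandler): void { this.errorHandler = cb; }

  start(_language = 'en-US'): void {
    this.running = true;
    this.bytes = 0;
  }

  pushAudio(data: Uint8Array): void {
    if (!this.running) {
      // Errors flow through the callback, matching the event-driven contract.
      this.errorHandler(new Error('pushAudio called before start'));
      return;
    }
    this.bytes += data.length;
    this.partialHandler(`received ${this.bytes} bytes`);
  }

  stop(): void {
    if (!this.running) return;
    this.running = false;
    this.finalHandler(`final after ${this.bytes} bytes`);
  }
}
```

Callers only ever see start, pushAudio, stop, and the three callbacks, so whether a provider streams over gRPC, WebSockets, or an SDK push stream stays an implementation detail.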
9. Sprint 6: Push Notification Abstraction
Package: @bytelyst/push
Effort: 1 day
Files changed: Platform-service push-triggers module
9.1 Interface Design
export interface PushProvider {
send(notification: PushNotification): Promise<PushResult>;
sendBatch(notifications: PushNotification[]): Promise<PushResult[]>;
}
export interface PushNotification {
deviceToken: string;
platform: 'ios' | 'android' | 'web';
title: string;
body: string;
data?: Record<string, string>;
badge?: number;
}
Implementations: AzureNotificationHubProvider, FirebaseProvider (future), ExpoProvider (for NomGap), OneSignalProvider (future).
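PushResult is referenced by the interface above but its shape is not given. The sketch below assumes a small per-device result and shows an in-memory provider in which sendBatch is built on send; both the PushResult fields and the MemoryPushProvider class are illustrative assumptions.

```typescript
interface PushNotification {
  deviceToken: string;
  platform: 'ios' | 'android' | 'web';
  title: string;
  body: string;
  data?: Record<string, string>;
  badge?: number;
}

// Assumed shape: the source does not define PushResult.
interface PushResult {
  deviceToken: string;
  success: boolean;
  error?: string;
}

interface PushProvider {
  send(notification: PushNotification): Promise<PushResult>;
  sendBatch(notifications: PushNotification[]): Promise<PushResult[]>;
}

/** Records notifications instead of delivering them; intended for tests. */
class MemoryPushProvider implements PushProvider {
  readonly sent: PushNotification[] = [];

  async send(notification: PushNotification): Promise<PushResult> {
    if (notification.deviceToken.length === 0) {
      // Report failures in the result instead of throwing, so one bad token cannot abort a batch.
      return { deviceToken: notification.deviceToken, success: false, error: 'empty device token' };
    }
    this.sent.push(notification);
    return { deviceToken: notification.deviceToken, success: true };
  }

  async sendBatch(notifications: PushNotification[]): Promise<PushResult[]> {
    // Results come back in the same order as the input notifications.
    return Promise.all(notifications.map((n) => this.send(n)));
  }
}
```

Returning one result per notification, in input order, lets callers prune invalid device tokens without parsing provider-specific error formats.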
10. Sprint 7: Monitoring & Telemetry Abstraction
Effort: 0.5 days (mostly done already)
The ecosystem already has cloud-agnostic monitoring:
- Custom telemetry via @bytelyst/telemetry-client → platform-service → Cosmos
- Loki + Grafana in services/monitoring/
- Health checks via /health endpoints on all services
Remaining work:
- Remove opencensus-ext-azure from Python requirements (optional, only used for App Insights)
- Ensure all structured logging uses pino (TS) or structlog (Python), with no Azure-specific loggers
11. Migration Effort After Refactor
Once all sprints are complete, here's how much work each cloud migration scenario requires:
Scenario: Switch DB from Cosmos to MongoDB Atlas
| Step | Effort | Description |
|---|---|---|
| Implement MongoDatastoreProvider | 1 day | ~200 lines: translate FilterMap to MongoDB find() |
| Set DB_PROVIDER=mongo + MONGO_URI=... | 5 minutes | Config change |
| Run data migration script | 2–4 hours | Export Cosmos JSON → import to MongoDB |
| Run full test suite | 30 minutes | Verify all 1,029+ tests pass |
| Total | ~1.5 days | vs 3–5 weeks without abstraction |
Scenario: Switch Storage from Azure Blob to S3
| Step | Effort | Description |
|---|---|---|
| Implement S3StorageProvider | 0.5 day | ~100 lines |
| Set STORAGE_PROVIDER=s3 + AWS_* env vars | 5 minutes | Config change |
| Migrate blobs | 1–2 hours | azcopy or rclone |
| Total | ~0.5 days | vs 2–3 days without abstraction |
Scenario: Switch LLM from Azure OpenAI to OpenAI Direct
| Step | Effort | Description |
|---|---|---|
| Set LLM_PROVIDER=openai + OPENAI_API_KEY=... | 5 minutes | Config change only |
| Remove AZURE_OPENAI_* env vars | 5 minutes | Cleanup |
| Total | 10 minutes | Already near-zero today |
Scenario: Full Cloud Migration (Azure → AWS)
| Step | Effort | Description |
|---|---|---|
| Implement MongoDB/DynamoDB provider | 1–2 days | |
| Implement S3 storage provider | 0.5 days | |
| Implement AWS Secrets Manager provider | 0.5 days | |
| Switch LLM to OpenAI direct | 10 minutes | |
| Implement Google STT or AWS Transcribe | 2–3 days | Speech is still the hardest |
| Implement SNS push provider | 0.5 days | |
| Data migration + testing | 2–3 days | |
| Total | ~7–10 days | vs 4–8 weeks without abstraction |
12. Testing Strategy
12.1 Provider-Agnostic Tests
Every repository test should work against any provider. The test setup picks the provider:
// Test setup: use in-memory provider
import { setTestProvider } from '@bytelyst/datastore/testing';
beforeAll(() => {
setTestProvider('memory'); // Fast, no network, deterministic
});
12.2 Provider Integration Tests
Separate test suites verify each provider works correctly:
__tests__/
datastore/
cosmos.integration.test.ts # Runs against real Cosmos (CI only)
mongo.integration.test.ts # Runs against real MongoDB (CI only)
memory.test.ts # Always runs — verifies memory provider
12.3 Migration Verification Checklist
For each sprint, before merging:
- All existing tests pass (no regressions)
- New interface tests pass with all implemented providers
- Manual smoke test against Azure (dev environment)
- No new @azure/* imports in application code (only in provider files)
12.4 CI Gate
Add a lint rule to prevent direct Azure SDK imports outside of provider directories:
# scripts/check-cloud-agnostic.sh
# Fail if any file outside packages/*/providers/ imports @azure/*
rg '@azure/' services/ dashboards/ --glob='*.ts' \
--glob='!**/providers/**' --glob='!**/node_modules/**' \
&& echo "FAIL: Direct Azure SDK import found outside provider layer" && exit 1 \
|| echo "PASS: No direct Azure imports in application code"
13. Env Var Naming Convention
Current (Azure-specific)
COSMOS_ENDPOINT=https://cosmos-mywisprai.documents.azure.com:443/
COSMOS_KEY=...
COSMOS_DATABASE=lysnrai
AZURE_BLOB_CONNECTION_STRING=...
AZURE_BLOB_ACCOUNT_NAME=bytelystblobs
AZURE_BLOB_ACCOUNT_KEY=...
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
AZURE_SPEECH_KEY=...
AZURE_SPEECH_REGION=eastus
AZURE_KEYVAULT_URL=...
Target (generic with Azure fallbacks)
# ── Provider Selection ────────────────────────────
DB_PROVIDER=cosmos # cosmos | mongo | memory
STORAGE_PROVIDER=azure # azure | s3 | r2 | memory
LLM_PROVIDER=azure # azure | openai | gemini
SECRETS_PROVIDER=azure-keyvault # azure-keyvault | aws-secrets-manager | gcp-secret-manager | doppler | env
SPEECH_PROVIDER=azure # azure | google | deepgram | whisper
PUSH_PROVIDER=azure-nh # azure-nh | firebase | expo
# ── Database (provider-specific) ──────────────────
# Cosmos (when DB_PROVIDER=cosmos):
COSMOS_ENDPOINT=...
COSMOS_KEY=...
COSMOS_DATABASE=lysnrai
# MongoDB (when DB_PROVIDER=mongo):
# MONGO_URI=mongodb+srv://...
# ── Storage (provider-specific) ───────────────────
# Azure (when STORAGE_PROVIDER=azure):
AZURE_BLOB_CONNECTION_STRING=...
# S3 (when STORAGE_PROVIDER=s3):
# AWS_ACCESS_KEY_ID=...
# AWS_SECRET_ACCESS_KEY=...
# S3_BUCKET_PREFIX=bytelyst-
# ── LLM (provider-specific) ──────────────────────
# Azure OpenAI:
AZURE_OPENAI_ENDPOINT=...
AZURE_OPENAI_KEY=...
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
# OpenAI direct:
# OPENAI_API_KEY=...
# OPENAI_MODEL=gpt-4o-mini
# ── Secrets (optional) ───────────────────────────
AZURE_KEYVAULT_URL=... # only if SECRETS_PROVIDER=azure-keyvault
# ── Speech ────────────────────────────────────────
AZURE_SPEECH_KEY=...
AZURE_SPEECH_REGION=eastus
Backward compatibility: All existing AZURE_* env vars continue to work. The generic *_PROVIDER vars are additive.
14. Risk Mitigation
| Risk | Mitigation |
|---|---|
| FilterMap can't express complex Cosmos SQL | Add rawQuery() escape hatch for edge cases. Track usage — if >5% of queries need it, expand FilterMap operators |
| Performance regression from abstraction layer | Benchmark critical queries before/after. The abstraction adds one function call — negligible |
| Team unfamiliar with new patterns | Each sprint includes updating AGENTS.md with new conventions. Old pattern (direct Cosmos) still works during migration |
| In-memory provider behaves differently | Integration test suite runs against real Cosmos in CI. Memory provider is for unit tests only |
| Stale data during DB migration | Use dual-write pattern: write to both old and new provider during transition. Read from new, fall back to old |
| Sprint 1 takes too long | The 38 repository files can be migrated incrementally — even 5 files at a time is progress. Old and new patterns coexist |
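The dual-write mitigation in the table above can be expressed as a thin wrapper around two collections. A minimal sketch follows, assuming the old store stays the source of truth and is therefore written first. The DualWriteCollection and MapCollection names and the reduced two-method interface are illustrative assumptions.

```typescript
interface BaseDocument {
  id: string;
  [key: string]: unknown;
}

/** The subset of DocumentCollection that this sketch needs. */
interface ReadWriteCollection<T extends BaseDocument> {
  findById(id: string, partitionKey: string): Promise<T | null>;
  upsert(doc: T): Promise<T>;
}

/** Tiny Map-backed collection standing in for a real provider. */
class MapCollection<T extends BaseDocument> implements ReadWriteCollection<T> {
  readonly docs = new Map<string, T>();

  async findById(id: string, _partitionKey: string): Promise<T | null> {
    return this.docs.get(id) ?? null;
  }

  async upsert(doc: T): Promise<T> {
    this.docs.set(doc.id, doc);
    return doc;
  }
}

/** Writes go to both stores; reads prefer the new store and fall back to the old one. */
class DualWriteCollection<T extends BaseDocument> implements ReadWriteCollection<T> {
  constructor(
    private oldStore: ReadWriteCollection<T>,
    private newStore: ReadWriteCollection<T>,
  ) {}

  async upsert(doc: T): Promise<T> {
    // Old store first: if this write fails, the new store has not diverged.
    await this.oldStore.upsert(doc);
    return this.newStore.upsert(doc);
  }

  async findById(id: string, partitionKey: string): Promise<T | null> {
    const fromNew = await this.newStore.findById(id, partitionKey);
    if (fromNew !== null) return fromNew;
    return this.oldStore.findById(id, partitionKey);
  }
}
```

Documents written before the transition remain readable through the fallback, so the bulk data copy can run in the background while traffic already flows through the wrapper.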
Appendix: Interface Specifications
A.1 @bytelyst/datastore — Package Structure
packages/datastore/
├── src/
│ ├── index.ts # Public exports
│ ├── types.ts # All interfaces (DocumentCollection, DatastoreProvider, etc.)
│ ├── factory.ts # createDatastoreProvider() factory
│ ├── filter.ts # FilterMap → provider-specific query translation
│ ├── providers/
│ │ ├── cosmos.ts # CosmosDatastoreProvider + CosmosCollection
│ │ ├── mongo.ts # MongoDatastoreProvider + MongoCollection (stub)
│ │ └── memory.ts # MemoryDatastoreProvider + MemoryCollection
│ └── testing.ts # Test helpers (setTestProvider, seedCollection, etc.)
├── package.json # peer deps: @azure/cosmos (optional), mongodb (optional)
├── tsconfig.json
└── vitest.config.ts
A.2 @bytelyst/storage — Package Structure
packages/storage/
├── src/
│ ├── index.ts
│ ├── types.ts # StorageProvider, StorageBucket, SignedUrlOptions
│ ├── factory.ts # createStorageProvider()
│ ├── providers/
│ │ ├── azure-blob.ts # Wraps existing @bytelyst/blob code
│ │ ├── s3.ts # AWS S3 (stub)
│ │ └── memory.ts # In-memory for testing
│ └── testing.ts
├── package.json
└── tsconfig.json
A.3 @bytelyst/llm — Package Structure
packages/llm/
├── src/
│ ├── index.ts
│ ├── types.ts # LLMProvider, ChatCompletionRequest/Response
│ ├── factory.ts # createLLMProvider()
│ ├── providers/
│ │ ├── azure-openai.ts # AzureOpenAI endpoint + api-key auth
│ │ ├── openai.ts # OpenAI direct + Bearer auth
│ │ └── gemini.ts # Google Generative AI (stub)
│ └── testing.ts # MockLLMProvider for tests
├── package.json
└── tsconfig.json
A.4 Complete Interface: FilterMap Operators
// Exact match
{ field: value }
// Comparison
{ field: { $gt: value } } // >
{ field: { $gte: value } } // >=
{ field: { $lt: value } } // <
{ field: { $lte: value } } // <=
{ field: { $ne: value } } // !=
// Existence
{ field: { $exists: true } } // IS_DEFINED(c.field)
{ field: { $exists: false } } // NOT IS_DEFINED(c.field)
// String
{ field: { $startsWith: 'prefix' } }
{ field: { $contains: 'substr' } }
// Array
{ field: { $contains: value } } // ARRAY_CONTAINS
{ field: { $in: [v1, v2, v3] } } // IN operator
// Logical (for complex queries)
{ $or: [{ field1: v1 }, { field2: v2 }] }
Cosmos adapter translates each operator to SQL:
- { $gte: v } → c.field >= @pN
- { $exists: false } → NOT IS_DEFINED(c.field)
- { $contains: v } on array → ARRAY_CONTAINS(c.field, @pN)
- { $in: [...] } → c.field IN (@pN, @pM, ...)
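MongoDB adapter passes the comparison operators, $exists, $in, and $or natively (MQL uses the same syntax). $startsWith and $contains are not MQL operators and must be translated, for example to an anchored $regex or an element match.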
Memory adapter evaluates operators with simple JS comparisons.
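The Cosmos translation rules listed above can be sketched as a small recursive function. This is a minimal version under stated assumptions: it covers the comparison operators, $exists, the array form of $contains, $in, and $or, it leaves $startsWith and the string form of $contains out, and it assumes field names have already been validated. The function names are illustrative.

```typescript
interface SqlParam {
  name: string;
  value: unknown;
}

const COMPARISON = new Map<string, string>([
  ['$gt', '>'], ['$gte', '>='], ['$lt', '<'], ['$lte', '<='], ['$ne', '!='],
]);

/** Translate one field condition, appending bound values to params. */
function translateField(field: string, cond: unknown, params: SqlParam[]): string {
  const ref = `c.${field}`; // assumes the field name was validated upstream
  const bind = (value: unknown): string => {
    const name = `@p${params.length}`;
    params.push({ name, value });
    return name;
  };

  // Plain values (including arrays and null) are exact matches.
  if (typeof cond !== 'object' || cond === null || Array.isArray(cond)) {
    return `${ref} = ${bind(cond)}`;
  }

  const ops = cond as Record<string, unknown>;
  const parts = Object.keys(ops).map((op) => {
    const operand = ops[op];
    const sqlOp = COMPARISON.get(op);
    if (sqlOp !== undefined) return `${ref} ${sqlOp} ${bind(operand)}`;
    if (op === '$exists') return operand ? `IS_DEFINED(${ref})` : `NOT IS_DEFINED(${ref})`;
    if (op === '$contains') return `ARRAY_CONTAINS(${ref}, ${bind(operand)})`;
    if (op === '$in') {
      if (!Array.isArray(operand)) throw new Error('$in expects an array');
      return `${ref} IN (${operand.map((v) => bind(v)).join(', ')})`;
    }
    throw new Error(`Unsupported operator: ${op}`);
  });
  return parts.join(' AND ');
}

/** Translate a whole FilterMap into a WHERE clause body plus its parameters. */
function translateFilter(
  filter: Record<string, unknown>,
  params: SqlParam[] = [],
): { where: string; params: SqlParam[] } {
  const clauses = Object.keys(filter).map((key) => {
    if (key === '$or') {
      const branches = filter[key] as Record<string, unknown>[];
      // Each branch is parenthesized so AND/OR precedence stays explicit.
      return `(${branches.map((b) => `(${translateFilter(b, params).where})`).join(' OR ')})`;
    }
    return translateField(key, filter[key], params);
  });
  return { where: clauses.join(' AND '), params };
}
```

A single params array is threaded through the recursion, so parameter names stay unique across nested $or branches.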
Summary
| Sprint | What | Days | After This Sprint... |
|---|---|---|---|
| 1 | Database abstraction | 5–7 | DB swap = implement 1 adapter (~200 LOC) + config change |
| 2 | Storage abstraction | 2 | Blob swap = implement 1 adapter (~100 LOC) + config change |
| 3 | LLM abstraction | 2 | LLM swap = config change only (10 minutes) |
| 4 | Secrets abstraction | 1 | Secrets swap = config change only |
| 5 | Speech abstraction | 3–4 | Speech swap = implement 1 adapter (~300 LOC) |
| 6 | Push abstraction | 1 | Push swap = implement 1 adapter (~50 LOC) |
| 7 | Monitoring cleanup | 0.5 | Already cloud-agnostic |
| Total | | ~15–17 days | Full cloud migration = ~7–10 days instead of 4–8 weeks |
The key insight: ~80% of migration effort is in Sprint 1 (database). If you only do one sprint, do that one. Everything else is comparatively easy.
Document generated by automated codebase analysis. Companion to CLOUD_PROVIDER_MIGRATION_ANALYSIS.md. Review as the codebase evolves.