docs(roadmap): update cloud-agnostic refactor roadmap with implementation progress — move to in-progress

saravanakumardb1 2026-03-02 01:14:17 -08:00
parent b69abf44c7
commit 78cb13d9c3


@@ -6,7 +6,7 @@
> **Repos scanned:** `learning_ai_common_plat` (platform-service, 23 packages) · `learning_voice_ai_agent` (LysnrAI) · `learning_multimodal_memory_agents` (MindLyst) · `learning_ai_clock` (ChronoMind) · `learning_ai_jarvis_jr` (JarvisJr) · `learning_ai_fastgap` (NomGap) · `learning_ai_peakpulse` (PeakPulse)
> **Goal:** Refactor the codebase so it continues to work on Azure today, but switching to any other cloud provider requires **minimum effort** (days, not weeks).
>
> **Status as of 2026-03-02:** None of the 7 sprints have been started. All Azure SDK usage remains direct. 2 partial precursors exist (LysnrAI STT router, MindLyst LLM provider auto-detect). Monitoring (Sprint 7) is already cloud-agnostic.
> **Status as of 2026-03-02:** Sprint 1 is **~60% complete** — `@bytelyst/datastore` package built (Cosmos + Memory providers, 58 tests), all 47 platform-service repository files migrated (746 tests pass), 4 test files updated to use in-memory provider. Remaining: 21 product-backend repos (batches 7–11), 8 dashboard cosmos clients (batch 12), 2 Python clients (batch 13). Packages also created for Sprints 2–4, 6 (`@bytelyst/storage`, `@bytelyst/llm`, `@bytelyst/push`, `@bytelyst/config` secrets refactor) but consumer migration not started. Sprint 5 (Speech) not started. Sprint 7 already done.
---
@@ -132,14 +132,14 @@ routes.ts ────────► │ collection.findMany({ │
| Sprint | Package / Scope | Status | Effort | Files Changed (updated) | Risk |
| --------- | ------------------------------------------------- | ------------------- | --------------- | ----------------------------------------------------------------------------------------- | -------- |
| **1**     | `@bytelyst/datastore` — DB abstraction            | ❌ NOT STARTED      | 7–10 days       | **78** repository files + 1 new package (was 44 — now includes product backends)          | Medium   |
| **2** | `@bytelyst/storage` — Blob/Object abstraction | ❌ NOT STARTED | 2 days | 3 files + 1 new package | Low |
| **3** | `@bytelyst/llm` — LLM provider abstraction | ⚠️ PRECURSOR EXISTS | 2 days | 4 files + 1 new package. MindLyst `llm.ts` already auto-detects Azure vs OpenAI | Low |
| **4** | `@bytelyst/secrets` — Secrets manager abstraction | ⚠️ PRECURSOR EXISTS | 1 day | 2 files (refactor existing `resolveKeyVaultSecrets()`) | Very Low |
| **1**     | `@bytelyst/datastore` — DB abstraction            | 🔶 IN PROGRESS      | 7–10 days       | **78** repository files + 1 new package (47/78 done — platform-service complete)          | Medium   |
| **2** | `@bytelyst/storage` — Blob/Object abstraction | 🔶 PACKAGE BUILT | 2 days | 3 files + 1 new package (package done, consumer migration pending) | Low |
| **3** | `@bytelyst/llm` — LLM provider abstraction | 🔶 PACKAGE BUILT | 2 days | 4 files + 1 new package (package done, consumer migration pending) | Low |
| **4** | `@bytelyst/secrets` — Secrets manager abstraction | ✅ PACKAGE DONE | 1 day | `@bytelyst/config` `resolveKeyVaultSecrets()` refactored with provider dispatch | Very Low |
| **5**     | `@bytelyst/speech` — Speech STT abstraction       | ⚠️ PRECURSOR EXISTS | 3–4 days        | 3 files + 1 new package. LysnrAI `stt_router.py` already routes Azure↔Whisper             | Medium   |
| **6** | `@bytelyst/push` — Push notification abstraction | ❌ NOT STARTED | 1 day | 1 file + 1 new package. No push infra exists yet | Very Low |
| **6** | `@bytelyst/push` — Push notification abstraction | 🔶 PACKAGE BUILT | 1 day | 1 file + 1 new package (package done, no push infra to migrate yet) | Very Low |
| **7** | Monitoring/Telemetry cleanup | ✅ ALREADY DONE | 0 days | Custom telemetry via `@bytelyst/telemetry-client`, Loki+Grafana in `services/monitoring/` | None |
| **Total** |                                                   |                     | **~16–20 days** | ~90 files                                                                                 |          |
| **Total** |                                                   |                     | **~16–20 days** | ~90 files (47 done, ~43 remaining)                                                        |          |
### Priority Order
@@ -155,13 +155,13 @@ Sprint 5 (Speech) ──► Sprint 6 (Push) ──► Sprint 7 (Monitoring)
---
## 4. Sprint 1: Database Abstraction Layer ❌ NOT STARTED
## 4. Sprint 1: Database Abstraction Layer 🔶 IN PROGRESS
**Package:** `@bytelyst/datastore`
**Effort:** 7–10 days (revised up from 5–7 — now 78 files vs original 44)
**This is the most important sprint — it eliminates 80% of cloud lock-in.**
> **Current state (2026-03-02):** No `@bytelyst/datastore` package exists. All 78 repository files use `getContainer()` from `@bytelyst/cosmos` → raw Cosmos SQL. The `@bytelyst/cosmos` package itself wraps `@azure/cosmos` directly. No in-memory adapter exists for testing.
> **Current state (2026-03-02):** `@bytelyst/datastore` package built with `CosmosDatastoreProvider` + `MemoryDatastoreProvider` (58 package tests). All **47 platform-service repository files** migrated from `getContainer()` → `getCollection()` (746 tests pass, zero `cosmos.js` imports remain in `modules/`). Remaining: 21 product-backend repos, 8 dashboard cosmos clients, 2 Python clients.
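
As a rough illustration of why an in-memory provider makes repository tests independent of Cosmos, the sketch below puts a `Map`-backed collection behind a small shared interface. Everything here is an assumption made for illustration: the `Doc` and `Collection` shapes, the `upsert` method, the equality-only filter, and the `MemoryDatastore` class are hypothetical, and the real `@bytelyst/datastore` interfaces may look different. Only the `getCollection()` and `findMany()` names come from this document.

```typescript
// Hypothetical sketch: the real @bytelyst/datastore interfaces may differ.
interface Doc {
  id: string;
}

interface Collection<T extends Doc> {
  upsert(doc: T): Promise<T>;
  findMany(query?: { filter?: Partial<T> }): Promise<T[]>;
}

// In-memory implementation, usable in unit tests where no database is available.
class MemoryCollection<T extends Doc> implements Collection<T> {
  private docs = new Map<string, T>();

  async upsert(doc: T): Promise<T> {
    // Store a shallow copy so later mutation by the caller does not leak in.
    this.docs.set(doc.id, { ...doc });
    return doc;
  }

  async findMany(query: { filter?: Partial<T> } = {}): Promise<T[]> {
    const filter = (query.filter ?? {}) as unknown as Record<string, unknown>;
    const keys = Object.keys(filter);
    // Equality-only matching; a real adapter would translate richer operators.
    return Array.from(this.docs.values()).filter((doc) => {
      const fields = doc as unknown as Record<string, unknown>;
      return keys.every((key) => fields[key] === filter[key]);
    });
  }
}

// Hands out named collections, mirroring the getCollection() call style.
class MemoryDatastore {
  private collections = new Map<string, MemoryCollection<Doc>>();

  getCollection<T extends Doc>(name: string): Collection<T> {
    let found = this.collections.get(name);
    if (!found) {
      found = new MemoryCollection<Doc>();
      this.collections.set(name, found);
    }
    return found as unknown as Collection<T>;
  }
}
```

A repository written against `Collection<T>` would not need to know which provider sits behind it, which is the property the migrated test files rely on.
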
### 4.1 Interface Design
@@ -479,21 +479,21 @@ Migrate in batches, one module per commit. Each commit:
**Batch order** (simplest first, complex last):
| Batch | Modules | Complexity | Notes |
| ----- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ | --------------- |
| 1 | flags, plans, settings, changelog, products | Simple CRUD | 5 files, warmup |
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | 5 files |
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | 5 files |
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | 5 files |
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | 5 files |
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | 4 files |
| 7 | ChronoMind backend: timers, routines, households, shared-timers, webhooks | Sync logic, batch ops | 5 files |
| 8 | JarvisJr backend: jarvis-agents, jarvis-sessions, jarvis-memory | Agent memory queries | 3 files |
| 9 | NomGap backend: fasting-sessions, fasting-protocols, meal-log, social-fasting, push-triggers | Product-specific | 5 files |
| 10 | PeakPulse + MindLyst backends: peak-sessions, peak-routes, brains, memory, reflections, daily-briefs, streaks | Product-specific | 7 files |
| 11 | LysnrAI backend + user-dashboard: transcripts, sessions, organizations, api-tokens, webhooks, themes, export + 5 dashboard repos | Product-specific + dashboard | 10 files |
| 12 | Dashboard cosmos clients (admin-web, tracker-web, MindLyst web) | Direct `@azure/cosmos` | 3 files |
| 13 | Python clients (desktop cosmos, backend cosmos) | `azure.cosmos` → abstracted | 2 files |
| Batch | Modules | Complexity | Notes |
| ----- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ | -------- |
| 1 | flags, plans, settings, changelog, products | Simple CRUD | ✅ DONE |
| 2 | licenses, sessions, ip-rules, maintenance, feedback | Simple CRUD + filters | ✅ DONE |
| 3 | items, comments, votes, brains, reflections | CRUD + filter combos | ✅ DONE |
| 4 | audit, delivery, notifications, exports, jobs | CRUD + time queries | ✅ DONE |
| 5 | tokens, usage, invitations, referrals, webhooks | More complex queries | ✅ DONE |
| 6 | auth, subscriptions, telemetry, experiments | Complex (GROUP BY, aggregates) | ✅ DONE |
| 7 | ChronoMind backend: timers, routines, households, shared-timers, webhooks | Sync logic, batch ops | 5 files |
| 8 | JarvisJr backend: jarvis-agents, jarvis-sessions, jarvis-memory | Agent memory queries | 3 files |
| 9 | NomGap backend: fasting-sessions, fasting-protocols, meal-log, social-fasting, push-triggers | Product-specific | 5 files |
| 10 | PeakPulse + MindLyst backends: peak-sessions, peak-routes, brains, memory, reflections, daily-briefs, streaks | Product-specific | 7 files |
| 11 | LysnrAI backend + user-dashboard: transcripts, sessions, organizations, api-tokens, webhooks, themes, export + 5 dashboard repos | Product-specific + dashboard | 10 files |
| 12 | Dashboard cosmos clients (admin-web, tracker-web, MindLyst web) | Direct `@azure/cosmos` | 3 files |
| 13 | Python clients (desktop cosmos, backend cosmos) | `azure.cosmos` → abstracted | 2 files |
### 4.8 Handling Complex Queries
@@ -538,13 +538,13 @@ The Cosmos adapter translates these to SQL. The MongoDB adapter passes them directly
---
## 5. Sprint 2: Storage Abstraction Layer ❌ NOT STARTED
## 5. Sprint 2: Storage Abstraction Layer 🔶 PACKAGE BUILT
**Package:** `@bytelyst/storage`
**Effort:** 2 days
**Files changed:** `packages/blob/src/blob.ts`, `src/cloud/blob_client.py`, `services/platform-service/src/modules/blob/`
> **Current state (2026-03-02):** `@bytelyst/blob` (162 LOC) wraps `@azure/storage-blob` directly with `BlobServiceClient` and `generateBlobSASQueryParameters`. `@bytelyst/blob-client` exists as a client-side package. No storage abstraction interface exists.
> **Current state (2026-03-02):** `@bytelyst/storage` package built with `AzureBlobStorageProvider` + `MemoryStorageProvider`. Consumer migration (blob package, platform-service blob module, Python blob client) not yet started.
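
To make the provider pairing concrete, here is a minimal sketch of an in-memory bucket that could stand in for a blob store in tests. The `StorageBucket` interface, the `SignedUrlOptions` fields, and the `memory://` URL scheme are all hypothetical; they are not taken from `@bytelyst/storage`, whose actual interface is described in section 5.1.

```typescript
// Hypothetical sketch: method and option names are illustrative only.
interface SignedUrlOptions {
  access: "read" | "write";
  expiresInSeconds: number;
}

interface StorageBucket {
  put(key: string, data: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array | undefined>;
  getSignedUrl(key: string, options: SignedUrlOptions): Promise<string>;
}

class MemoryBucket implements StorageBucket {
  private objects = new Map<string, Uint8Array>();
  private name: string;

  constructor(name: string) {
    this.name = name;
  }

  async put(key: string, data: Uint8Array): Promise<void> {
    // Copy the bytes so later mutation by the caller is not visible here.
    this.objects.set(key, data.slice());
  }

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.objects.get(key);
  }

  async getSignedUrl(key: string, options: SignedUrlOptions): Promise<string> {
    if (!this.objects.has(key)) {
      throw new Error(`no such object: ${key}`);
    }
    // Nothing is actually signed; the URL only echoes the request for tests.
    const path = encodeURIComponent(key);
    return `memory://${this.name}/${path}?access=${options.access}&ttl=${options.expiresInSeconds}`;
  }
}
```
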
### 5.1 Interface Design
@@ -628,13 +628,13 @@ const url = await bucket.getSignedUrl('user123/recording.wav', { permissions: 'r
---
## 6. Sprint 3: LLM Provider Abstraction ⚠️ PRECURSOR EXISTS
## 6. Sprint 3: LLM Provider Abstraction 🔶 PACKAGE BUILT
**Package:** `@bytelyst/llm`
**Effort:** 2 days
**Files changed:** `src/llm/text_cleaner.py`, `backend/src/clients/openai_client.py`, MindLyst `web/src/lib/llm.ts`, extraction-service config
> **Current state (2026-03-02):** MindLyst `web/src/lib/llm.ts` already implements a dual-provider pattern with `resolveProvider()` that reads `OPENAI_PROVIDER` / `LLM_PROVIDER` env var and auto-detects Azure vs OpenAI from endpoint URLs. This is exactly the pattern this sprint proposes to extract into a shared `@bytelyst/llm` package. LysnrAI `text_cleaner.py` still imports `AzureOpenAI` directly. Extraction-service uses Azure OpenAI via `server.ts` config.
> **Current state (2026-03-02):** `@bytelyst/llm` package built with `AzureOpenAIProvider`, `OpenAIProvider`, and `MockLLMProvider`. Consumer migration (MindLyst llm.ts, extraction-service, Python text_cleaner) not yet started.
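
The provider auto-detection pattern mentioned for MindLyst's `llm.ts` can be sketched roughly as follows. This is not the MindLyst code: the `OPENAI_ENDPOINT` variable name, the precedence of `LLM_PROVIDER` over `OPENAI_PROVIDER`, and the default of `openai` are assumptions. The one external fact relied on is that Azure OpenAI resource endpoints use the `*.openai.azure.com` host suffix.

```typescript
type LlmProviderName = "azure" | "openai";
type Env = Record<string, string | undefined>;

// Hypothetical sketch of provider resolution. The endpoint variable name and
// the precedence between the two override variables are assumptions.
function resolveProvider(env: Env): LlmProviderName {
  const explicit = (env["LLM_PROVIDER"] ?? env["OPENAI_PROVIDER"] ?? "").trim().toLowerCase();
  if (explicit === "azure" || explicit === "openai") {
    return explicit; // an explicit override always wins
  }

  // Fall back to sniffing the endpoint host for the Azure OpenAI suffix.
  const endpoint = (env["OPENAI_ENDPOINT"] ?? "").toLowerCase();
  return endpoint.includes(".openai.azure.com") ? "azure" : "openai";
}
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the resolution logic testable without mutating global state.
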
### 6.1 Interface Design
@@ -688,13 +688,13 @@ The `openai` Python SDK already has a common interface between `OpenAI` and `Azu
---
## 7. Sprint 4: Secrets Manager Abstraction ⚠️ PRECURSOR EXISTS
## 7. Sprint 4: Secrets Manager Abstraction ✅ PACKAGE DONE
**Package:** Refactor existing `@bytelyst/config`
**Effort:** 1 day
**Files changed:** `packages/config/src/keyvault.ts`, `src/secrets/keyvault.py`
> **Current state (2026-03-02):** `@bytelyst/config` exports `resolveKeyVaultSecrets()` which already skips if `AZURE_KEYVAULT_URL` is unset and falls back to env vars. It has 9 unit tests. The function name and imports are Azure-specific (`@azure/identity`, `@azure/keyvault-secrets`) but the fallback-to-env behavior means services already work without Azure Key Vault. This sprint is essentially a rename + provider dispatch — lowest effort of all sprints.
> **Current state (2026-03-02):** `@bytelyst/config` `resolveKeyVaultSecrets()` refactored with provider dispatch pattern. Falls back to env vars when no vault provider is configured. Existing consumers already work without changes.
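
A provider dispatch with an environment fallback might look like the sketch below. The `SecretsProvider` interface, `StaticVaultProvider`, and `resolveSecrets()` are hypothetical names, not the refactored `resolveKeyVaultSecrets()` itself. The sketch also falls back to the environment per secret when the vault has no value, which is an assumption beyond the documented "no vault configured" case.

```typescript
type Env = Record<string, string | undefined>;

interface SecretsProvider {
  getSecret(name: string): Promise<string | undefined>;
}

// Hypothetical vault stand-in; a real provider would call a cloud secrets service.
class StaticVaultProvider implements SecretsProvider {
  private secrets: Record<string, string>;

  constructor(secrets: Record<string, string>) {
    this.secrets = secrets;
  }

  async getSecret(name: string): Promise<string | undefined> {
    return this.secrets[name];
  }
}

// Resolve each name from the vault when one is configured, else from env.
async function resolveSecrets(
  names: string[],
  env: Env,
  vault?: SecretsProvider,
): Promise<Record<string, string>> {
  const resolved: Record<string, string> = {};
  for (const name of names) {
    const fromVault = vault ? await vault.getSecret(name) : undefined;
    const value = fromVault ?? env[name]; // env var is the fallback
    if (value !== undefined) {
      resolved[name] = value;
    }
  }
  return resolved;
}
```
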
### 7.1 Key Insight: Already 90% Done
@@ -865,13 +865,13 @@ The abstraction hides these differences behind a unified push-audio + callback i
---
## 9. Sprint 6: Push Notification Abstraction ❌ NOT STARTED
## 9. Sprint 6: Push Notification Abstraction 🔶 PACKAGE BUILT
**Package:** `@bytelyst/push`
**Effort:** 1 day
**Files changed:** Platform-service push-triggers module
> **Current state (2026-03-02):** No push notification infrastructure exists in platform-service. NomGap has `push-triggers` module in its product backend that stores push trigger rules in Cosmos, but no actual delivery mechanism (APNS/FCM) is implemented. The `@bytelyst/delivery` module handles email but not push. No Azure Notification Hub, Firebase, or Expo push integration exists.
> **Current state (2026-03-02):** `@bytelyst/push` package built with `ExpoPushProvider` and `MockPushProvider`. No push delivery infrastructure exists in platform-service yet (NomGap has trigger rules but no APNS/FCM integration).
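
Since no delivery infrastructure exists yet, a recording provider is enough to exercise calling code. The sketch below is illustrative only: `PushMessage`, `PushResult`, `PushProvider`, `RecordingPushProvider`, and `notifyAll()` are hypothetical names and shapes, not the `@bytelyst/push` API from section 9.1.

```typescript
// Hypothetical sketch: shapes are illustrative, not the @bytelyst/push API.
interface PushMessage {
  to: string; // device token
  title: string;
  body: string;
}

interface PushResult {
  ok: boolean;
  error?: string;
}

interface PushProvider {
  send(message: PushMessage): Promise<PushResult>;
}

// Records messages instead of delivering them, for use in tests.
class RecordingPushProvider implements PushProvider {
  sent: PushMessage[] = [];

  async send(message: PushMessage): Promise<PushResult> {
    if (!message.to) {
      return { ok: false, error: "missing device token" };
    }
    this.sent.push(message);
    return { ok: true };
  }
}

// Sends one message per token and returns how many sends succeeded.
async function notifyAll(
  provider: PushProvider,
  tokens: string[],
  title: string,
  body: string,
): Promise<number> {
  let delivered = 0;
  for (const to of tokens) {
    const result = await provider.send({ to, title, body });
    if (result.ok) {
      delivered += 1;
    }
  }
  return delivered;
}
```
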
### 9.1 Interface Design
@@ -1237,18 +1237,18 @@ packages/llm/
## Summary
| Sprint | What | Status | Days | After This Sprint... |
| --------- | -------------------- | ------------------- | --------------- | ---------------------------------------------------------- |
| 1         | Database abstraction | ❌ NOT STARTED      | 7–10            | DB swap = implement 1 adapter (~200 LOC) + config change   |
| 2 | Storage abstraction | ❌ NOT STARTED | 2 | Blob swap = implement 1 adapter (~100 LOC) + config change |
| 3 | LLM abstraction | ⚠️ PRECURSOR EXISTS | 2 | LLM swap = config change only (10 minutes) |
| 4 | Secrets abstraction | ⚠️ PRECURSOR EXISTS | 1 | Secrets swap = config change only |
| 5         | Speech abstraction   | ⚠️ PRECURSOR EXISTS | 3–4             | Speech swap = implement 1 adapter (~300 LOC)               |
| 6 | Push abstraction | ❌ NOT STARTED | 1 | Push swap = implement 1 adapter (~50 LOC) |
| 7 | Monitoring cleanup | ✅ ALREADY DONE | 0 | Already cloud-agnostic |
| **Total** |                      | **1/7 done**        | **~16–20 days** | **Full cloud migration = ~7–10 days instead of 4–8 weeks** |
| Sprint | What | Status | Days | After This Sprint... |
| --------- | -------------------- | ------------------- | --------------- | -------------------------------------------------------------------- |
| 1         | Database abstraction | 🔶 IN PROGRESS      | 7–10            | Package + platform-service done (47/78 files). Product backends next |
| 2 | Storage abstraction | 🔶 PACKAGE BUILT | 2 | Package done. Consumer migration pending |
| 3 | LLM abstraction | 🔶 PACKAGE BUILT | 2 | Package done. Consumer migration pending |
| 4 | Secrets abstraction | ✅ DONE | 1 | Config refactored with provider dispatch |
| 5         | Speech abstraction   | ⚠️ PRECURSOR EXISTS | 3–4             | Speech swap = implement 1 adapter (~300 LOC)                         |
| 6 | Push abstraction | 🔶 PACKAGE BUILT | 1 | Package done. No push infra to migrate yet |
| 7 | Monitoring cleanup | ✅ ALREADY DONE | 0 | Already cloud-agnostic |
| **Total** |                      | **~3.5/7 done**     | **~16–20 days** | **Full cloud migration = ~7–10 days instead of 4–8 weeks**           |
The key insight: **~80% of migration effort is in Sprint 1 (database)**. If you only do one sprint, do that one. Everything else is comparatively easy.
The key insight: **~80% of migration effort is in Sprint 1 (database)**. Platform-service is fully migrated. Product backends (21 repos) are next.
### Key Changes Since Original Document (2026-03-02 Review)
@@ -1259,6 +1259,16 @@ The key insight: **~80% of migration effort is in Sprint 1 (database)**. If you
5. **Sprint 7 confirmed complete** — custom telemetry, Loki+Grafana, pino/structlog logging all in place.
6. **Migration effort table updated** — test count changed from 1,029 → ~1,713 (includes product backend tests).
### Progress Update (2026-03-02 Implementation Session)
7. **`@bytelyst/datastore` package built** — `CosmosDatastoreProvider` + `MemoryDatastoreProvider` + 58 tests (commit `dfa5eb7`).
8. **`@bytelyst/storage` package built** — `AzureBlobStorageProvider` + `MemoryStorageProvider` (commit `dfa5eb7`).
9. **`@bytelyst/llm` package built** — `AzureOpenAIProvider` + `OpenAIProvider` + `MockLLMProvider` (commit `dfa5eb7`).
10. **`@bytelyst/push` package built** — `ExpoPushProvider` + `MockPushProvider` (commit `dfa5eb7`).
11. **`@bytelyst/config` secrets refactored** — provider dispatch pattern added (commit `dfa5eb7`).
12. **Platform-service Sprint 1 batches 1–6 complete** — all 47 repository files migrated from `getContainer()` → `getCollection()`. Zero `cosmos.js` imports remain in `modules/`. 66 test files, 746 tests pass (commits `4d126cb`, `e355cb0`, `b69abf4`).
13. **4 test files updated** — notifications, subscriptions, tokens, usage tests rewritten from Cosmos SDK mocks to `MemoryDatastoreProvider`.
---
_Document generated by automated codebase analysis. Last reviewed 2026-03-02 (comprehensive workspace scan). Companion to `CLOUD_PROVIDER_MIGRATION_ANALYSIS.md`._