Referrals Container — Partition Key Migration Plan

Status: Planned
Priority: P1
Risk: Medium (silent data failures on point reads)
Date: 2026-03-01
Discovered: Azure Connection Audit (see docs/WINDSURF/AZURE_CONNECTION_AUDIT.md)


1. Problem Statement

The referrals Cosmos DB container has a 3-way partition key mismatch across the ecosystem. Four codebases declare three different partition keys (/id, /referrerId, /userId) for the same container name, and platform-service itself is internally inconsistent: its container definition declares one key while its repository code uses another.

Current State

| Codebase | File | Declared PK | PK Used in Point Reads |
| --- | --- | --- | --- |
| platform-service cosmos-init.ts | services/platform-service/src/lib/cosmos-init.ts:15 | /id | |
| platform-service repository.ts | services/platform-service/src/modules/referrals/repository.ts:63,81,84 | | referrerId |
| admin-web cosmos.ts | dashboards/admin-web/src/lib/cosmos.ts:33 | /referrerId | N/A (calls platform-service API) |
| user-dashboard-web cosmos.ts | user-dashboard-web/src/lib/cosmos.ts:30 | /referrerId | N/A (calls platform-service API) |
| MindLyst cosmos.ts | mindlyst-native/web/src/lib/cosmos.ts:60 | /userId | userId |

Critical Bug

Platform-service cosmos-init.ts declares /id but repository.ts uses referrerId as the partition key value in point reads.

// cosmos-init.ts — declares /id
referrals: { partitionKeyPath: '/id' },

// repository.ts — uses referrerId as partition key
export async function getById(id: string, referrerId: string): Promise<ReferralDoc | null> {
  const { resource } = await container().item(id, referrerId).read<ReferralDoc>();
  //                                         ^^^  ^^^^^^^^^^
  //                                         id   PK value = referrerId
  // But container PK is /id, so this should be container().item(id, id)
  return resource ?? null;
}

If the container was created by cosmos-init.ts (with /id), then getById() and update() pass the wrong partition key value. Cosmos DB will:

  • Return 404 Not Found on reads (silent failure, returns null)
  • Fail on replace() operations

This means the platform-service referral module may already be silently broken in production if cosmos-init.ts was the first to create the container.
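
To make the failure mode concrete, here is a minimal in-memory sketch of how a point read is addressed by the pair (partition key value, id). It is not the Cosmos SDK: `FakeContainer` and its methods are hypothetical names introduced only for illustration, and only the `/id` versus `referrerId` mismatch comes from the code above.

```typescript
// Minimal in-memory model of a partitioned container (illustrative only, not the Cosmos SDK).
type Doc = { id: string; referrerId: string };

class FakeContainer {
  private store = new Map<string, Doc>();
  private pkField: keyof Doc;

  // pkField is the partition key path without the leading slash, e.g. "id" for "/id".
  constructor(pkField: keyof Doc) {
    this.pkField = pkField;
  }

  create(doc: Doc): void {
    // A document is stored under (value of its partition key field, id).
    this.store.set(`${doc[this.pkField]}|${doc.id}`, doc);
  }

  // Mirrors the argument order of container.item(id, partitionKeyValue).read().
  read(id: string, partitionKeyValue: string): Doc | null {
    return this.store.get(`${partitionKeyValue}|${id}`) ?? null;
  }
}

const doc: Doc = { id: "ref-1", referrerId: "user-42" };

// Container created with /id, as cosmos-init.ts declares.
const byId = new FakeContainer("id");
byId.create(doc);
const wrongKeyRead = byId.read(doc.id, doc.referrerId); // what repository.ts does today
const rightKeyRead = byId.read(doc.id, doc.id); // what a /id container would need

// Container created with /referrerId, as repository.ts assumes.
const byReferrer = new FakeContainer("referrerId");
byReferrer.create(doc);
const fixedRead = byReferrer.read(doc.id, doc.referrerId);
```

In this model the first read misses even though the document exists, which is the same shape as the silent null returned by getById().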


2. Schema Differences

The two main consumers (platform-service and MindLyst) use fundamentally different document schemas in the same container.

Platform-Service Schema (LysnrAI)

Single document type per referral:

interface ReferralDoc {
  id: string; // Unique referral ID
  productId: string; // "lysnrai"
  referrerId: string; // User ID of the referrer
  referrerEmail: string;
  referredUserId: string | null;
  referredEmail: string;
  status: 'pending' | 'signed_up' | 'subscribed' | 'rewarded';
  referrerRewardTokens: number;
  referrerRewarded: boolean;
  referredRewarded: boolean;
  createdAt: string;
  completedAt: string | null;
}

MindLyst Schema

Two document types in the same container, distinguished by docType:

// Referral links
interface ReferralLink {
  id: string; // "reflink_<uuid>"
  userId: string; // Referrer's user ID (= partition key)
  productId: string; // "mindlyst"
  docType: 'link';
  code: string; // "ML-XXXXXX"
  url: string;
  createdAt: string;
}

// Referral events
interface ReferralEvent {
  id: string; // "ref_<uuid>"
  userId: string; // Referrer's user ID (= partition key)
  productId: string; // "mindlyst"
  docType: 'event';
  referrerId: string; // Same as userId
  referredEmail: string;
  referralCode: string;
  status: 'invited' | 'installed' | 'activated' | 'rewarded';
  createdAt: string;
  activatedAt: string | null;
}

Key Differences

| Aspect | Platform-Service | MindLyst |
| --- | --- | --- |
| Doc types | 1 (referral) | 2 (link + event) |
| Referrer field | referrerId | userId |
| Status values | pending, signed_up, subscribed, rewarded | invited, installed, activated, rewarded |
| Reward model | Token-based (referrerRewardTokens) | Pro month extension |
| Referral code | Generated externally | ML-XXXXXX code in doc |
| ID format | UUID | reflink_<uuid> / ref_<uuid> |

3. Root Cause

The mismatch happened because:

  1. Platform-service growth module was built first with /id in cosmos-init.ts (generic pattern for lookup-by-ID containers).
  2. Platform-service repository code was written to use referrerId for point reads, assuming the partition key was /referrerId. This internal inconsistency was never caught because:
    • Tests mock the Cosmos client, so point reads succeed regardless
    • The container may not have been used in production yet
  3. Admin/user dashboards declared /referrerId in their local cosmos.ts, matching the repository code's intent (not the cosmos-init.ts definition).
  4. MindLyst was built independently with its own referral model, using /userId as the partition key (consistent with all other MindLyst containers).

4. Migration Options

Option A: Separate Containers per Product

Create a dedicated mindlyst_referrals container for MindLyst. Fix the platform-service container to use /referrerId.

| Action | Scope | Risk |
| --- | --- | --- |
| Fix cosmos-init.ts to use /referrerId | platform-service | Low (if no prod data exists) |
| Add mindlyst_referrals container with /userId | platform-service + MindLyst | Low |
| Update MindLyst to use mindlyst_referrals | MindLyst | Low |

Pros:

  • No data migration needed (each product gets its own container)
  • Each product keeps its own schema and partition strategy
  • Clean separation by productId
  • Follows the existing pattern (e.g., jarvis_agents, peak_sessions)

Cons:

  • Extra container cost (~$0.25/month on Serverless)
  • Two containers to manage for referrals

Option B: Unified Container with /referrerId

Migrate all referral data to a single container with /referrerId. MindLyst renames userId → referrerId in its documents.

| Action | Scope | Risk |
| --- | --- | --- |
| Fix cosmos-init.ts to use /referrerId | platform-service | Low |
| Create new container referrals_v2 with /referrerId | Azure | Low |
| Migrate existing documents, mapping userId → referrerId | Migration script | Medium |
| Update MindLyst code to use referrerId field | MindLyst | Medium |
| Delete old referrals container | Azure | High (after validation) |

Pros:

  • Single container for all products
  • Unified partition strategy

Cons:

  • Requires data migration script
  • MindLyst code changes across multiple files
  • Schema unification needed (different status values, ID formats)
  • Higher risk of breaking changes
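
The userId → referrerId mapping that Option B depends on could look roughly like the sketch below. It covers only the partition key field: `toUnified` and the two reduced types are hypothetical names for illustration, and the differing status values and ID formats are deliberately left untouched because the plan does not define how they would be unified.

```typescript
// Shapes reduced to the fields that matter for the partition key (illustrative only).
type MindLystDoc = { id: string; userId: string; referrerId?: string; [key: string]: unknown };
type UnifiedDoc = { id: string; referrerId: string; [key: string]: unknown };

// Rename userId to referrerId so the document can live in a /referrerId container.
function toUnified(doc: MindLystDoc): UnifiedDoc {
  const { userId, ...rest } = doc;
  // ReferralEvent docs already carry referrerId (same value as userId);
  // ReferralLink docs only have userId, so fall back to it.
  return { ...rest, referrerId: doc.referrerId || userId };
}

const link = toUnified({ id: "reflink_1", userId: "u1", docType: "link" });
const event = toUnified({ id: "ref_1", userId: "u1", referrerId: "u1", docType: "event" });
```

Every field other than userId is copied through unchanged, so a migration built on this would still need a separate decision about status values.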

Option C: Migrate MindLyst to Platform-Service API

Long-term goal: MindLyst calls the platform-service referrals API instead of accessing Cosmos directly.

This is the correct architectural direction but requires:

  1. Extending platform-service referrals module to support MindLyst's link+event model
  2. Adding docType support or separate endpoints
  3. MindLyst drops direct Cosmos access for referrals

This should be Phase 2, after Option A stabilizes the immediate partition key issue.


Phase 1: Immediate Fix (Option A) — Low Risk

Goal: Fix the partition key mismatch so no code is silently broken.

Step 1.1: Fix platform-service cosmos-init.ts

Change the partition key from /id to /referrerId:

- referrals: { partitionKeyPath: '/id' },
+ referrals: { partitionKeyPath: '/referrerId' },

⚠️ Prerequisite: Determine if a referrals container already exists in production Cosmos.

  • If no existing container: Just change the definition. initializeAllContainers() will create it correctly on next startup.
  • If existing container with /id: You must create a new container referrals_v2 with /referrerId, migrate the data, then delete and recreate referrals with the new key, because containers cannot be renamed (see Step 1.1b below).

To check (Azure CLI):

az cosmosdb sql container show \
  --account-name cosmos-mywisprai \
  --database-name lysnrai \
  --name referrals \
  --resource-group rg-mywisprai \
  --query "resource.partitionKey.paths" 2>&1

If the container doesn't exist, skip to Step 1.2.
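
The branching in this prerequisite can be summarised as a small decision function. This is a sketch under assumptions: `decidePkAction` and `PkAction` are hypothetical names, and `paths` stands for the JSON array printed by the CLI query above (for example `["/id"]`), with `null` used when the container was not found.

```typescript
type PkAction = "create" | "keep" | "recreate";

// paths: parsed output of the CLI query, or null when the container does not exist.
function decidePkAction(paths: string[] | null, expected = "/referrerId"): PkAction {
  if (paths === null) return "create"; // no container yet: fixing cosmos-init.ts is enough
  const matches = paths.length === 1 && paths[0] === expected;
  return matches ? "keep" : "recreate"; // wrong key: follow Step 1.1b
}

const whenMissing = decidePkAction(null);
const whenWrongKey = decidePkAction(["/id"]);
const whenCorrect = decidePkAction(["/referrerId"]);
```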

Step 1.1b: Container Recreation (only if existing container has wrong PK)

Cosmos DB does not allow changing partition keys on existing containers. If the container exists with /id:

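# 1. Inspect the existing container. This shows metadata only and does NOT export documents;
#    back up the documents separately before continuing (see the Pre-Migration Checklist).
az cosmosdb sql container show --account-name cosmos-mywisprai \
  --database-name lysnrai --name referrals --resource-group rg-mywisprai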

# 2. Create new container with correct PK
az cosmosdb sql container create \
  --account-name cosmos-mywisprai \
  --database-name lysnrai \
  --name referrals_v2 \
  --partition-key-path "/referrerId" \
  --resource-group rg-mywisprai

# 3. Migrate data (use Azure Data Factory or a script)
# See Section 6 for migration script

# 4. Rename containers
# Azure doesn't support rename — create `referrals` with new PK after deleting old
az cosmosdb sql container delete \
  --account-name cosmos-mywisprai \
  --database-name lysnrai \
  --name referrals \
  --resource-group rg-mywisprai --yes

az cosmosdb sql container create \
  --account-name cosmos-mywisprai \
  --database-name lysnrai \
  --name referrals \
  --partition-key-path "/referrerId" \
  --resource-group rg-mywisprai

# 5. Copy data from referrals_v2 → referrals, then delete referrals_v2

Step 1.2: Add mindlyst_referrals container

In cosmos-init.ts:

+ // MindLyst referrals (separate container — different schema from growth/referrals)
+ mindlyst_referrals: { partitionKeyPath: '/userId' },

Step 1.3: Update MindLyst code

In mindlyst-native/web/src/lib/cosmos.ts:

- { id: "referrals", partitionKey: "/userId" },
+ { id: "mindlyst_referrals", partitionKey: "/userId" },

In mindlyst-native/web/src/app/api/referral/route.ts:

- const container = isCosmosConfigured() ? getCosmosContainer("referrals") : null;
+ const container = isCosmosConfigured() ? getCosmosContainer("mindlyst_referrals") : null;

Step 1.4: Update admin/user dashboard cosmos.ts (alignment)

Both dashboards already declare /referrerId, which matches the corrected container definition, so their cosmos.ts files need no change. The only cleanup is to remove the now-obsolete NOTE comment in cosmos-init.ts:

- // NOTE: MindLyst also uses 'referrals' with /userId partition key, but
- // the growth module already registers it with /id. This mismatch needs
- // a separate migration to reconcile.

Step 1.5: Verify

# Platform-service tests
cd learning_ai_common_plat && pnpm test --filter @lysnrai/platform-service

# Admin dashboard typecheck
cd dashboards/admin-web && npx tsc --noEmit

# User dashboard typecheck
cd ../learning_voice_ai_agent/user-dashboard-web && npx tsc --noEmit

# MindLyst typecheck
cd ../learning_multimodal_memory_agents/mindlyst-native/web && npx next build --webpack

Phase 2: API Consolidation (Option C) — Future

Goal: MindLyst uses the platform-service referrals API instead of direct Cosmos access.

This requires:

  1. Extend platform-service referrals module to support MindLyst's link+event model:
    • POST /api/referrals/link — create referral link (returns code + URL)
    • POST /api/referrals/invite — track invite event
    • POST /api/referrals/activate — activate referral
    • GET /api/referrals/stats/:userId — leaderboard + stats
  2. Add x-product-id routing so the service knows which referral schema/container to use.
  3. MindLyst drops direct Cosmos for referrals, calls platform-service API via @bytelyst/api-client.
  4. Deprecate mindlyst_referrals container once all data flows through the API.

Timeline: After MindLyst auth integration with platform-service (Phase 2 in MindLyst roadmap).
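
The x-product-id routing in item 2 might be sketched as a lookup from product ID to container name. `referralContainerFor` and `REFERRAL_CONTAINERS` are hypothetical names; the product IDs and container names are the ones used elsewhere in this plan, and rejecting unknown products is an assumption rather than a documented behaviour of the service.

```typescript
// Product ID (from the x-product-id header) to referral container name.
const REFERRAL_CONTAINERS = new Map<string, string>([
  ["lysnrai", "referrals"],
  ["mindlyst", "mindlyst_referrals"],
]);

function referralContainerFor(productIdHeader: string | undefined): string {
  const productId = (productIdHeader ?? "").trim().toLowerCase();
  const containerName = REFERRAL_CONTAINERS.get(productId);
  if (containerName === undefined) {
    // Fail loudly rather than defaulting to a container with a different schema.
    throw new Error(`Unknown or missing x-product-id: "${productId}"`);
  }
  return containerName;
}

const lysnrContainer = referralContainerFor("lysnrai");
const mindlystContainer = referralContainerFor(" MindLyst ");
```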


6. Data Migration Script (if needed)

If there is existing data in a referrals container with the wrong partition key:

/**
 * migrate-referrals.ts
 * Run with: npx tsx migrate-referrals.ts
 *
 * Reads all docs from `referrals` (old PK), writes to `referrals_v2` (new PK).
 * Validates doc count before and after.
 */

import { CosmosClient } from '@azure/cosmos';

const COSMOS_ENDPOINT = process.env.COSMOS_ENDPOINT!;
const COSMOS_KEY = process.env.COSMOS_KEY!;
const COSMOS_DATABASE = process.env.COSMOS_DATABASE || 'lysnrai';

const client = new CosmosClient({ endpoint: COSMOS_ENDPOINT, key: COSMOS_KEY });
const db = client.database(COSMOS_DATABASE);

async function migrate() {
  const source = db.container('referrals');
  const target = db.container('referrals_v2');

  // 1. Count source docs
  const { resources: countRes } = await source.items
    .query<number>('SELECT VALUE COUNT(1) FROM c')
    .fetchAll();
  const totalDocs = countRes[0] ?? 0;
  console.log(`Source container has ${totalDocs} documents`);

  if (totalDocs === 0) {
    console.log('No documents to migrate. Done.');
    return;
  }

  // 2. Read all docs
  const { resources: allDocs } = await source.items.query('SELECT * FROM c').fetchAll();

  // 3. Write to target (with correct PK field populated)
  let migrated = 0;
  let alreadyPresent = 0; // 409 conflicts: doc was copied by an earlier run
  let skipped = 0; // docs that cannot be written because referrerId is missing
  for (const doc of allDocs) {
    // Ensure referrerId exists (required for new PK)
    if (!doc.referrerId) {
      console.warn(`Skipping doc ${doc.id}: no referrerId field`);
      skipped++;
      continue;
    }

    try {
      await target.items.create(doc);
      migrated++;
    } catch (err: any) {
      if (err.code === 409) {
        console.warn(`Doc ${doc.id} already exists in target, skipping`);
        alreadyPresent++;
      } else {
        throw err;
      }
    }
  }

  console.log(
    `Migrated: ${migrated}, Already present: ${alreadyPresent}, Skipped: ${skipped}, Total: ${totalDocs}`,
  );

  // 4. Validate target count. Docs copied by an earlier run count as present,
  //    so re-running the script does not raise a false mismatch warning.
  const { resources: targetCount } = await target.items
    .query<number>('SELECT VALUE COUNT(1) FROM c')
    .fetchAll();
  console.log(`Target container has ${targetCount[0]} documents`);

  if (targetCount[0] !== migrated + alreadyPresent) {
    console.error('WARNING: Count mismatch! Manual verification needed.');
  } else {
    console.log('Migration validated successfully.');
  }
}

migrate().catch(console.error);
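
The script above loads every document with fetchAll(), which is fine for a small container. If the container turns out to be large, a page-by-page copy keeps memory bounded. The sketch below is written against a generic async iterable of pages so it can be run without Cosmos; `copyInPages` and `Page` are hypothetical names. With the SDK, the pages would presumably come from the query iterator's `getAsyncIterator()`, whose responses expose a `resources` array, but treat that wiring as an assumption to verify.

```typescript
// One page of query results; only the `resources` array is used here.
type Page<T> = { resources: T[] };

// Copy documents page by page, calling `write` once per document.
async function copyInPages<T extends { id: string }>(
  pages: AsyncIterable<Page<T>>,
  write: (doc: T) => Promise<void>,
): Promise<number> {
  let copied = 0;
  for await (const page of pages) {
    for (const doc of page.resources) {
      await write(doc); // e.g. a wrapper around target.items.create(doc)
      copied++;
    }
  }
  return copied;
}

// In-memory stand-in for a paged query result.
async function* fakePages(): AsyncGenerator<Page<{ id: string }>> {
  yield { resources: [{ id: "a" }, { id: "b" }] };
  yield { resources: [{ id: "c" }] };
}

const written: string[] = [];
const copyDone = copyInPages(fakePages(), async (doc) => {
  written.push(doc.id);
});
```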

7. MindLyst Data Migration (if existing data in wrong container)

If MindLyst documents already exist in a referrals container with /userId partition key, and we're creating a separate mindlyst_referrals container:

/**
 * migrate-mindlyst-referrals.ts
 * Copies MindLyst referral docs (docType: "link"|"event") from `referrals` → `mindlyst_referrals`.
 */

import { CosmosClient } from '@azure/cosmos';

const client = new CosmosClient({
  endpoint: process.env.COSMOS_ENDPOINT!,
  key: process.env.COSMOS_KEY!,
});
const db = client.database(process.env.COSMOS_DATABASE || 'lysnrai');

async function migrate() {
  const source = db.container('referrals');
  const target = db.container('mindlyst_referrals');

  // Only migrate MindLyst docs (have productId: "mindlyst" or docType field)
  const { resources: docs } = await source.items
    .query("SELECT * FROM c WHERE c.productId = 'mindlyst' OR IS_DEFINED(c.docType)")
    .fetchAll();

  console.log(`Found ${docs.length} MindLyst referral documents to migrate`);

  let migrated = 0;
  for (const doc of docs) {
    try {
      await target.items.create(doc);
      migrated++;
    } catch (err: any) {
      if (err.code === 409) continue; // Already exists
      throw err;
    }
  }

  console.log(`Migrated ${migrated} documents to mindlyst_referrals`);

  // Optionally: delete migrated docs from source
  // for (const doc of docs) {
  //   await source.item(doc.id, doc.userId).delete();
  // }
}

migrate().catch(console.error);
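
Before running the copy, it can help to dry-run the selection against a JSON backup. The predicate below is a sketch that mirrors the WHERE clause of the query above; `isMindLystReferral` is a hypothetical name, and the sample documents are made up for illustration.

```typescript
// Mirrors: c.productId = 'mindlyst' OR IS_DEFINED(c.docType)
// In parsed JSON a property is "defined" whenever it is present, even if its value is null.
function isMindLystReferral(doc: { productId?: unknown; docType?: unknown }): boolean {
  return doc.productId === "mindlyst" || doc.docType !== undefined;
}

const sample = [
  { id: "reflink_1", productId: "mindlyst", docType: "link" },
  { id: "ref_1", docType: "event" }, // no productId, still selected via docType
  { id: "uuid-1", productId: "lysnrai" }, // platform-service doc, not selected
];
const selectedIds = sample.filter(isMindLystReferral).map((d) => d.id);
```

Note that the docType clause would also match any platform-service document that happened to carry a docType field, so the dry-run output is worth reviewing by hand.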

8. Rollback Plan

If the migration causes issues:

  1. Phase 1 rollback: Revert code changes, point MindLyst back to referrals container. No data loss since we're copying, not moving.
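  2. Container recreation rollback: Keep referrals_v2 as a backup until the recreated referrals container has been validated, and delay its deletion in Step 1.1b until then. If the new referrals container has issues, restore from referrals_v2.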
  3. API consolidation rollback (Phase 2): MindLyst can always fall back to direct Cosmos access by reverting the API client changes.

9. Pre-Migration Checklist

  • Determine if referrals container exists in production Cosmos
  • If yes, check its actual partition key: az cosmosdb sql container show ...
  • If yes, count existing documents per productId
  • Backup existing data (export to JSON)
  • Run migration script in staging first
  • Validate document counts match before/after
  • Deploy code changes after data migration completes
  • Run platform-service tests
  • Run dashboard typechecks
  • Verify point reads work in staging
  • Remove old container (only after 7-day validation period)
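
For the "count existing documents per productId" item, one option is to count over the JSON backup rather than querying production again. The helper below is a sketch; `countByProduct` is a hypothetical name, and it assumes the backup has been parsed into an array of documents.

```typescript
// Count documents per productId; docs without the field are grouped under "(missing)".
function countByProduct(docs: Array<{ productId?: string }>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const doc of docs) {
    const key = doc.productId ?? "(missing)";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

const backupCounts = countByProduct([
  { productId: "lysnrai" },
  { productId: "mindlyst" },
  { productId: "mindlyst" },
  {},
]);
```

Comparing these counts before and after the migration gives a per-product version of the total-count validation used in the scripts.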

10. Files to Modify

Phase 1 Changes

| Repo | File | Change |
| --- | --- | --- |
| common_plat | services/platform-service/src/lib/cosmos-init.ts | Change referrals PK to /referrerId, add mindlyst_referrals with /userId, remove NOTE comment |
| common_plat | services/platform-service/src/modules/referrals/repository.ts | No change (already uses referrerId correctly) |
| multimodal_memory | mindlyst-native/web/src/lib/cosmos.ts | Change referrals → mindlyst_referrals |
| multimodal_memory | mindlyst-native/web/src/app/api/referral/route.ts | Change getCosmosContainer("referrals") → getCosmosContainer("mindlyst_referrals") |

No Changes Needed

| Repo | File | Reason |
| --- | --- | --- |
| common_plat | dashboards/admin-web/src/lib/cosmos.ts | Already declares /referrerId |
| voice_ai_agent | user-dashboard-web/src/lib/cosmos.ts | Already declares /referrerId |
| common_plat | Admin/user dashboard API routes | Use platform-service API, not direct Cosmos |