docs(roadmaps): add AI diagnostic assistant, A/B testing, and churn prediction roadmaps
- AI Diagnostic Assistant: LLM-powered root cause analysis, error clustering, natural language queries
- Intelligent A/B Testing: Thompson sampling, Bayesian early stopping, AI hypothesis generation
- Predictive Churn & Health: XGBoost models, health scoring, automated retention campaigns

All roadmaps include:
- Implementation tracking tables with status/commit columns
- Quick reference sections with file structures
- Phase-by-phase task breakdowns with [X.Y.Z] codes
This commit is contained in:
parent
d510867b87
commit
e98380003b
597
docs/roadmaps/AI_DIAGNOSTIC_ASSISTANT_ROADMAP.md
Normal file
@ -0,0 +1,597 @@
|
||||
# AI Diagnostic Assistant — Implementation Roadmap
|
||||
|
||||
> **Module:** `platform-service/src/modules/ai-diagnostics/`
|
||||
> **Admin UI:** `/ops/ai-diagnostics/`
|
||||
> **Target:** LLM-powered root cause analysis from telemetry + debug sessions
|
||||
> **Estimated Effort:** 2–3 weeks
|
||||
> **Status:** 🟡 Planning
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This roadmap delivers an **AI-powered diagnostic assistant** that analyzes error patterns, debug session data, and telemetry to automatically suggest root causes—like having a senior engineer on-call 24/7. Engineers can ask natural language questions like _"Why did the iOS keyboard crash yesterday?"_ and receive AI-generated hypotheses with supporting evidence.
|
||||
|
||||
### Key Differentiators vs. Manual Debugging
|
||||
|
||||
| Feature           | Manual Debugging            | AI Diagnostic Assistant             |
| ----------------- | --------------------------- | ----------------------------------- |
| Query             | SQL + log grep              | **Natural language**                |
| Pattern Detection | Hours of manual correlation | **AI finds hidden patterns**        |
| Context Assembly  | Check 5+ systems manually   | **Auto-assembles timeline**         |
| Hypothesis        | Engineer intuition          | **LLM-generated + evidence**        |
| Learning          | Per-engineer experience     | **Accumulates across all sessions** |
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Data Pipeline & Embeddings (Week 1)
|
||||
|
||||
**Goal:** Extract, normalize, and embed error data for semantic search and clustering.
|
||||
|
||||
### 1.1 Error Fingerprinting & Clustering
|
||||
|
||||
- [ ] **1.1.1** Create `modules/ai-diagnostics/types.ts`
|
||||
- [ ] `ErrorClusterDoc` — grouped similar errors with signature
|
||||
- [ ] `ErrorFingerprint` — normalized stack trace hash
|
||||
- [ ] `ClusterAnalysis` — AI-generated pattern description
|
||||
- [ ] Zod schemas for all inputs
|
||||
|
||||
_Commit format:_ `git commit -m "feat(ai-diagnostics): add error clustering types [1.1.1]"` → `https://github.com/saravanakumardb1/learning_ai_common_plat/commit/<hash>`
|
||||
|
||||
- [ ] **1.1.2** Add Cosmos containers to `cosmos-init.ts`
|
||||
- [ ] `error_clusters` (pk: `/productId`, TTL: 90 days)
|
||||
- [ ] `error_fingerprints` (pk: `/fingerprintHash`, unique index)
|
||||
- [ ] `diagnostic_insights` (pk: `/clusterId`, AI-generated analyses)
|
||||
|
||||
_Commit format:_ `git commit -m "feat(ai-diagnostics): add cosmos containers for error clustering [1.1.2]"`
|
||||
|
||||
- [ ] **1.1.3** Implement error normalization
|
||||
- [ ] Stack trace parsing (remove line numbers, file paths)
|
||||
- [ ] Message templating (replace UUIDs, timestamps, user IDs with placeholders)
|
||||
- [ ] Fingerprint generation (SHA-256 of normalized error)
|
||||
- [ ] Similarity scoring (Levenshtein for near-matches)
|
||||
|
||||
_Commit format:_ `git commit -m "feat(ai-diagnostics): implement error normalization and fingerprinting [1.1.3]"`
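
The normalization and fingerprinting steps in 1.1.3 could fit together roughly as follows. This is a minimal sketch, not the module's actual API: the function names (`normalizeMessage`, `normalizeStack`, `fingerprint`), the placeholder tokens, the regular expressions, and the assumption of V8-style `at fn (file:line:col)` stack frames are all illustrative. Similarity scoring is left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical placeholder patterns. Order matters: UUIDs and timestamps
// are replaced before bare numbers so their digits are not split up.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
const ISO_TS_RE = /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?/g;
const NUMBER_RE = /\b\d+\b/g;

// Message templating: replace volatile values with placeholders.
function normalizeMessage(message: string): string {
  return message
    .replace(UUID_RE, "<uuid>")
    .replace(ISO_TS_RE, "<timestamp>")
    .replace(NUMBER_RE, "<num>");
}

// Stack trace parsing: keep only function names, dropping file paths
// and line/column numbers. Frames without a name become "<anonymous>".
function normalizeStack(stack: string): string {
  return stack
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("at "))
    .map((line) => {
      const match = /^at\s+(?:async\s+)?([^\s(]+)\s+\(/.exec(line);
      return match ? match[1] : "<anonymous>";
    })
    .join("|");
}

// Fingerprint: SHA-256 over the normalized error, hex-encoded.
function fingerprint(errorType: string, message: string, stack: string): string {
  const canonical = [errorType, normalizeMessage(message), normalizeStack(stack)].join("\n");
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

With this shape, two occurrences of the same error that differ only in IDs, timestamps, build paths, or line numbers hash to the same `fingerprintHash`.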
|
||||
|
||||
### 1.2 Vector Embeddings for Semantic Search
|
||||
|
||||
- [ ] **1.2.1** Create embedding pipeline
|
||||
- [ ] Azure OpenAI `text-embedding-3-small` integration
|
||||
- [ ] Error message + stack trace → 1536-dim vector
|
||||
- [ ] Batch embedding job (100 errors at a time)
|
||||
- [ ] **1.2.2** Cosmos DB vector search setup
|
||||
- [ ] Store embeddings in `error_clusters` documents
|
||||
- [ ] Cosine similarity query function
|
||||
- [ ] Similar error lookup by vector distance
|
||||
- [ ] **1.2.3** Clustering algorithm
|
||||
- [ ] HDBSCAN for density-based clustering
|
||||
- [ ] DBSCAN fallback for smaller datasets
|
||||
- [ ] Auto-determine cluster count (no manual k)
|
||||
- [ ] Re-cluster nightly as new errors arrive
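
The cosine similarity lookup in 1.2.2 can be illustrated in memory. In the real pipeline the ranking would presumably be pushed down into the database's vector search; the sketch below only shows the math. `EmbeddedCluster`, `findSimilar`, and the default `topK` and `minScore` values are assumptions made for the example.

```typescript
// Hypothetical minimal shape: only the fields the lookup needs.
interface EmbeddedCluster {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Return the topK clusters whose similarity to the query is at least minScore,
// ordered from most to least similar.
function findSimilar(
  query: readonly number[],
  clusters: readonly EmbeddedCluster[],
  topK = 5,
  minScore = 0.8,
): Array<{ id: string; score: number }> {
  return clusters
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```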
|
||||
|
||||
### 1.3 Telemetry Ingestion for Context
|
||||
|
||||
- [ ] **1.3.1** Link telemetry to errors
|
||||
- [ ] `correlationId` propagation across services
|
||||
- [ ] 5-minute window: error → preceding telemetry events
|
||||
- [ ] Session state reconstruction (what user was doing)
|
||||
- [ ] **1.3.2** Enrich error context
|
||||
- [ ] Device info (OS version, model, memory)
|
||||
- [ ] App state (screen, feature flags, config)
|
||||
- [ ] Recent API calls (network trace from diagnostics)
|
||||
- [ ] Recent user actions (breadcrumb trail)
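
The 5-minute correlation window from 1.3.1 might be expressed as a simple filter. The types `TelemetryEvent` and `ErrorOccurrence` and the function `precedingEvents` are hypothetical names for this sketch; it assumes both records carry a `correlationId` and an ISO 8601 `timestamp`.

```typescript
// Hypothetical minimal record shapes for the example.
interface TelemetryEvent {
  correlationId: string;
  timestamp: string; // ISO 8601
  name: string;
}

interface ErrorOccurrence {
  correlationId: string;
  timestamp: string; // ISO 8601
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Select telemetry events that share the error's correlationId and occurred
// at most windowMs before it, returned in chronological order.
function precedingEvents(
  error: ErrorOccurrence,
  events: readonly TelemetryEvent[],
  windowMs: number = FIVE_MINUTES_MS,
): TelemetryEvent[] {
  const errorTime = Date.parse(error.timestamp);
  return events
    .filter((e) => e.correlationId === error.correlationId)
    .filter((e) => {
      const t = Date.parse(e.timestamp);
      return t <= errorTime && errorTime - t <= windowMs;
    })
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
}
```

The ordered result is effectively the breadcrumb trail used for session state reconstruction.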
|
||||
|
||||
**Phase 1 Exit Criteria:**
|
||||
|
||||
- [ ] Errors auto-clustered with 90%+ accuracy
|
||||
- [ ] Vector search returns semantically similar errors
|
||||
- [ ] 10,000+ historical errors embedded and clustered
|
||||
- [ ] Correlation pipeline links errors to telemetry context
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: LLM Analysis Engine (Week 1–2)
|
||||
|
||||
### 2.1 Prompt Engineering & Analysis Pipeline
|
||||
|
||||
- [ ] **2.1.1** Create analysis prompts
|
||||
- [ ] `ROOT_CAUSE_ANALYSIS` prompt template
|
||||
|
||||
```
Given this error cluster:
- Error signature: {fingerprint}
- Sample stack traces: {samples}
- Common context: {deviceStats}, {appState}
- Preceding events: {breadcrumbSummary}
- Similar resolved issues: {relatedClusters}

Analyze and provide:
1. Likely root cause category (config, dependency, logic, resource, external)
2. Specific hypothesis with reasoning
3. Evidence confidence (high/medium/low)
4. Suggested investigation steps
5. Potential fix direction
```
|
||||
|
||||
- [ ] `PATTERN_SUMMARY` prompt for cluster descriptions
|
||||
- [ ] `COMPARATIVE_ANALYSIS` for error vs. baseline
|
||||
|
||||
- [ ] **2.1.2** LLM integration
|
||||
- [ ] Azure OpenAI GPT-4o-mini for analysis (cost-effective)
|
||||
- [ ] GPT-4o for complex multi-factor analysis
|
||||
- [ ] Response JSON schema enforcement
|
||||
- [ ] Retry logic with exponential backoff
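
Retry with exponential backoff, as listed in 2.1.2, could look like the sketch below. It is independent of any particular LLM client: `operation` stands in for the actual call, and `withRetry`, `backoffDelayMs`, `RetryOptions`, the default limits, and the choice of "full jitter" are assumptions for illustration.

```typescript
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // backoff ceiling for the first retry
  maxDelayMs: number; // upper bound on any single delay
}

const DEFAULT_RETRY: RetryOptions = { maxAttempts: 4, baseDelayMs: 500, maxDelayMs: 8000 };

// Exponential backoff with full jitter: the ceiling doubles per attempt
// (capped at maxDelayMs) and the delay is drawn uniformly below it.
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run operation, retrying on failure until maxAttempts is reached or
// isRetryable reports that the error should not be retried.
async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions = DEFAULT_RETRY,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt === options.maxAttempts || !isRetryable(err)) break;
      await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs));
    }
  }
  throw lastError;
}
```

In practice `isRetryable` would likely accept only rate-limit and transient server errors, so that a malformed request fails immediately instead of being retried.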
|
||||
|
||||
### 2.2 Insight Generation Service
|
||||
|
||||
- [ ] **2.2.1** Create `modules/ai-diagnostics/analyzer.ts`
|
||||
- [ ] `analyzeCluster(clusterId)` — full analysis workflow
|
||||
- [ ] `generateInsight(errorContext)` — single error analysis
|
||||
- [ ] `compareClusters(clusterA, clusterB)` — diff analysis
|
||||
- [ ] **2.2.2** Analysis workflow
|
||||
- [ ] Fetch cluster data + related telemetry
|
||||
- [ ] Build LLM context (respect token limits)
|
||||
- [ ] Call LLM with structured prompt
|
||||
- [ ] Parse and validate response
|
||||
- [ ] Store insight in `diagnostic_insights`
|
||||
- [ ] **2.2.3** Confidence scoring
|
||||
- [ ] Evidence count weighting
|
||||
- [ ] Similar resolved issue bonus
|
||||
- [ ] Recency decay (older patterns = lower confidence)
|
||||
- [ ] Multi-model consensus (if available)
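
One way the confidence factors in 2.2.3 might combine into the 0.0 to 1.0 `confidenceScore` is sketched here. The roadmap does not specify a formula, so every constant (`EVIDENCE_SCALE`, `RESOLVED_BONUS`, `HALF_LIFE_DAYS`, the label thresholds) and the functional form are assumptions; multi-model consensus is omitted.

```typescript
// All constants below are illustrative assumptions, not tuned values.
const EVIDENCE_SCALE = 5; // evidence items needed to reach ~63% of the maximum
const RESOLVED_BONUS = 0.15; // added when a similar resolved issue exists
const HALF_LIFE_DAYS = 30; // confidence halves for every 30 days of staleness

interface ConfidenceInputs {
  evidenceCount: number;
  hasSimilarResolvedIssue: boolean;
  daysSinceLastSeen: number;
}

// Evidence count weighting saturates toward 1, the resolved-issue bonus is
// additive, and recency decay scales the total down exponentially.
function confidenceScore(input: ConfidenceInputs): number {
  const evidence = 1 - Math.exp(-input.evidenceCount / EVIDENCE_SCALE);
  const bonus = input.hasSimilarResolvedIssue ? RESOLVED_BONUS : 0;
  const decay = Math.pow(0.5, input.daysSinceLastSeen / HALF_LIFE_DAYS);
  return Math.min(1, Math.max(0, (evidence + bonus) * decay));
}

// Map the numeric score to the coarse label stored alongside it.
function confidenceLabel(score: number): "high" | "medium" | "low" {
  if (score >= 0.75) return "high";
  if (score >= 0.4) return "medium";
  return "low";
}
```

The feedback loop in 2.3.1 could later recalibrate these constants against actual resolution outcomes.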
|
||||
|
||||
### 2.3 Continuous Learning
|
||||
|
||||
- [ ] **2.3.1** Feedback loop
|
||||
- [ ] Engineer feedback: "Was this insight helpful? 👍/👎"
|
||||
- [ ] Resolution tracking (link commits to clusters)
|
||||
- [ ] Confidence recalibration based on outcomes
|
||||
- [ ] **2.3.2** Pattern accumulation
|
||||
- [ ] "Known issues" database (manually curated)
|
||||
- [ ] Historical fix patterns (what solved similar issues)
|
||||
- [ ] Regression detection (old issue reappearing)
|
||||
|
||||
**Phase 2 Exit Criteria:**
|
||||
|
||||
- [ ] LLM generates root cause hypotheses with evidence
|
||||
- [ ] Confidence scores align with actual resolution rates
|
||||
- [ ] Analysis completes in < 5 seconds for typical clusters
|
||||
- [ ] Feedback loop capturing engineer ratings
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Natural Language Query Interface (Week 2)
|
||||
|
||||
### 3.1 Query Understanding
|
||||
|
||||
- [ ] **3.1.1** Create `modules/ai-diagnostics/query-parser.ts`
|
||||
- [ ] Intent classification (root cause, pattern search, comparison, trend)
|
||||
- [ ] Entity extraction (product, time range, error type, user segment)
|
||||
- [ ] Temporal parsing ("yesterday", "last week", "since v2.1")
|
||||
- [ ] Constraint identification ("only iOS", "excluding beta users")
|
||||
- [ ] **3.1.2** Query patterns
|
||||
- [ ] Root cause: _"Why did X happen?"_ → analyze cluster
|
||||
- [ ] Pattern search: _"Show me similar crashes"_ → vector search
|
||||
- [ ] Comparison: _"Did error rate increase after release?"_ → trend analysis
|
||||
- [ ] User impact: _"How many users affected by Y?"_ → aggregation query
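
As a baseline for 3.1, intent classification and temporal parsing can be approximated with plain rules before an LLM is involved. The sketch below is a rule-based stand-in under stated assumptions: the keyword lists, the `pattern_search` fallback, UTC day boundaries, and reading "last week" as the previous seven days are all choices made for the example. The intent names follow `parsedIntent` in `NaturalLanguageQueryDoc`.

```typescript
type QueryIntent = "root_cause" | "pattern_search" | "comparison" | "trend" | "impact";

// Ordered keyword rules; the first match wins. Keywords are illustrative.
const INTENT_RULES: Array<[QueryIntent, RegExp]> = [
  ["root_cause", /\bwhy\b|root cause/i],
  ["impact", /how many|affected|impact/i],
  ["pattern_search", /similar|like this/i],
  ["comparison", /increase|decrease|compare|after (the )?release|\bvs\b/i],
  ["trend", /trend|over time/i],
];

function classifyIntent(query: string): QueryIntent {
  for (const [intent, pattern] of INTENT_RULES) {
    if (pattern.test(query)) return intent;
  }
  return "pattern_search"; // assumed fallback: treat unknown queries as a search
}

interface TimeRange {
  start: string; // ISO 8601, inclusive
  end: string; // ISO 8601, exclusive
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Resolve a relative time phrase against `now`, using UTC day boundaries.
function parseTimeRange(query: string, now: Date): TimeRange | undefined {
  const today = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  const q = query.toLowerCase();
  const range = (days: number): TimeRange => ({
    start: new Date(today - days * DAY_MS).toISOString(),
    end: new Date(today).toISOString(),
  });
  if (q.includes("yesterday")) return range(1);
  if (q.includes("last week")) return range(7);
  return undefined; // phrases such as "since v2.1" need release metadata
}
```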
|
||||
|
||||
### 3.2 Query Execution Engine
|
||||
|
||||
- [ ] **3.2.1** Query → data pipeline
|
||||
- [ ] Map entities to Cosmos queries
|
||||
- [ ] Fetch relevant clusters, telemetry, sessions
|
||||
- [ ] Assemble context for response generation
|
||||
- [ ] **3.2.2** Response generation
|
||||
- [ ] Direct answers for simple queries
|
||||
- [ ] AI-generated summaries for complex analysis
|
||||
- [ ] Data + visualization suggestions
|
||||
- [ ] Drill-down links for exploration
|
||||
|
||||
### 3.3 REST API Routes
|
||||
|
||||
- [ ] **3.3.1** Create `modules/ai-diagnostics/routes.ts`
|
||||
- [ ] `POST /ai-diagnostics/query` — natural language question
|
||||
- [ ] `GET /ai-diagnostics/clusters/:id/analysis` — pre-computed insight
|
||||
- [ ] `POST /ai-diagnostics/clusters/:id/analyze` — trigger fresh analysis
|
||||
- [ ] `GET /ai-diagnostics/suggestions` — auto-suggested investigations
|
||||
- [ ] `POST /ai-diagnostics/feedback` — submit insight rating
|
||||
|
||||
**Phase 3 Exit Criteria:**
|
||||
|
||||
- [ ] Natural language queries parse correctly (90%+ intent accuracy)
|
||||
- [ ] Query → response pipeline < 3 seconds
|
||||
- [ ] Complex queries return structured answers with evidence
|
||||
- [ ] API routes tested and documented
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Admin Dashboard UI (Week 2–3)
|
||||
|
||||
### 4.1 AI Insights Page
|
||||
|
||||
- [ ] **4.1.1** Create `/ops/ai-diagnostics/page.tsx`
|
||||
- [ ] Smart search bar (natural language input)
|
||||
- [ ] Suggested queries based on recent errors
|
||||
- [ ] Recent AI-generated insights list
|
||||
- [ ] Trending clusters (auto-detected anomalies)
|
||||
- [ ] **4.1.2** Query results view
|
||||
- [ ] AI-generated answer with confidence badge
|
||||
- [ ] Supporting evidence cards (cluster stats, sample errors)
|
||||
- [ ] Related debug sessions (linked traces)
|
||||
- [ ] Timeline visualization of error pattern
|
||||
- [ ] "Investigate further" actions
|
||||
|
||||
### 4.2 Cluster Detail with AI Analysis
|
||||
|
||||
- [ ] **4.2.1** Enhance error cluster detail
|
||||
- [ ] AI-generated summary card ("This appears to be...")
|
||||
- [ ] Root cause hypothesis with confidence
|
||||
- [ ] Evidence breakdown (stack samples, device patterns, API failures)
|
||||
- [ ] Suggested fixes from similar resolved issues
|
||||
- [ ] "Request deeper analysis" button (GPT-4o)
|
||||
- [ ] **4.2.2** Interactive investigation
|
||||
- [ ] Compare with other clusters ("Show me similar issues")
|
||||
- [ ] Filter by context (OS version, app version, feature flags)
|
||||
- [ ] View affected user journeys (breadcrumb trails)
|
||||
|
||||
### 4.3 Proactive Alerts
|
||||
|
||||
- [ ] **4.3.1** Anomaly detection
|
||||
- [ ] Auto-detect emerging error clusters
|
||||
- [ ] Spike in existing cluster frequency
|
||||
- [ ] New error types after releases
|
||||
- [ ] **4.3.2** AI-generated alerts
|
||||
- [ ] Slack/Teams notification with summary
|
||||
- [ ] "Investigate in AI Diagnostics" deep link
|
||||
- [ ] Auto-started debug session recommendations
|
||||
|
||||
**Phase 4 Exit Criteria:**
|
||||
|
||||
- [ ] Admin can ask questions and get AI-generated answers
|
||||
- [ ] Cluster detail shows AI analysis with evidence
|
||||
- [ ] Proactive alerts for emerging issues
|
||||
- [ ] Full test coverage (UI + API)
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Advanced Capabilities (Future)
|
||||
|
||||
### 5.1 Multi-Modal Analysis
|
||||
|
||||
- [ ] Analyze screenshots from debug sessions for UI issues
|
||||
- [ ] Voice transcription analysis (for voice app errors)
|
||||
- [ ] Performance trace visualization with AI annotations
|
||||
|
||||
### 5.2 Predictive Diagnostics
|
||||
|
||||
- [ ] Pre-crash pattern detection (warn before crash happens)
|
||||
- [ ] Resource exhaustion prediction (memory, disk, API quotas)
|
||||
- [ ] Config drift detection ("this setting combination often fails")
|
||||
|
||||
### 5.3 Self-Healing Suggestions
|
||||
|
||||
- [ ] Auto-generated config recommendations
|
||||
- [ ] Feature flag rollback suggestions
|
||||
- [ ] Circuit breaker threshold recommendations
|
||||
|
||||
## Implementation Tracking
|
||||
|
||||
| Phase | Task                       | Status | Commit |
| ----- | -------------------------- | ------ | ------ |
| 1.1   | Error clustering types     | ⬜     | —      |
| 1.1   | Cosmos containers          | ⬜     | —      |
| 1.1   | Error normalization        | ⬜     | —      |
| 1.2   | Embedding pipeline         | ⬜     | —      |
| 1.2   | Vector search setup        | ⬜     | —      |
| 1.2   | Clustering algorithm       | ⬜     | —      |
| 1.3   | Telemetry linking          | ⬜     | —      |
| 1.3   | Error context enrichment   | ⬜     | —      |
| 2.1   | Analysis prompts           | ⬜     | —      |
| 2.1   | LLM integration            | ⬜     | —      |
| 2.2   | Insight generation service | ⬜     | —      |
| 2.2   | Analysis workflow          | ⬜     | —      |
| 2.2   | Confidence scoring         | ⬜     | —      |
| 2.3   | Feedback loop              | ⬜     | —      |
| 2.3   | Pattern accumulation       | ⬜     | —      |
| 3.1   | Query parser               | ⬜     | —      |
| 3.1   | Query patterns             | ⬜     | —      |
| 3.2   | Query execution            | ⬜     | —      |
| 3.2   | Response generation        | ⬜     | —      |
| 3.3   | REST API routes            | ⬜     | —      |
| 4.1   | AI insights page           | ⬜     | —      |
| 4.1   | Query results view         | ⬜     | —      |
| 4.2   | Cluster detail             | ⬜     | —      |
| 4.2   | Interactive investigation  | ⬜     | —      |
| 4.3   | Proactive alerts           | ⬜     | —      |
|
||||
|
||||
**Legend:** ⬜ Not started | 🟡 In progress | ✅ Complete | ⏸️ Deferred
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference for Implementing Agent
|
||||
|
||||
**📋 Full Roadmap:** `docs/roadmaps/AI_DIAGNOSTIC_ASSISTANT_ROADMAP.md`
|
||||
|
||||
**Key Files to Modify/Create:**
|
||||
|
||||
```
services/platform-service/
├── src/
│   ├── modules/ai-diagnostics/
│   │   ├── types.ts                  # [1.1.1] Error clustering types
│   │   ├── repository.ts             # [1.2] Data access layer
│   │   ├── analyzer.ts               # [2.2] LLM analysis engine
│   │   ├── query-parser.ts           # [3.1] NL query understanding
│   │   ├── query-executor.ts         # [3.2] Query execution
│   │   ├── routes.ts                 # [3.3] REST API
│   │   └── ai-diagnostics.test.ts    # Tests
│   ├── lib/
│   │   ├── cosmos-init.ts            # [1.1.2] Add containers
│   │   ├── embedding-client.ts       # [1.2.1] Azure OpenAI embeddings
│   │   └── pii-redaction.ts          # Reuse existing
│   └── server.ts                     # [3.3] Register routes
dashboards/admin-web/
├── src/
│   ├── app/(dashboard)/
│   │   ├── ai-diagnostics/
│   │   │   ├── page.tsx              # [4.1] Main insights page
│   │   │   └── [id]/
│   │   │       └── page.tsx          # [4.2] Cluster detail
│   ├── lib/
│   │   └── ai-diagnostics-client.ts  # API client
│   └── components/
│       └── ai-diagnostics/           # Reusable components
```
|
||||
|
||||
**Commit Message Format:**
|
||||
|
||||
```
feat(ai-diagnostics): <description> [<task.code>]
```
|
||||
|
||||
**Example:**
|
||||
|
||||
```bash
git add services/platform-service/src/modules/ai-diagnostics/
git commit -m "feat(ai-diagnostics): add error clustering types and cosmos containers [1.1.1-1.1.2]"
```
|
||||
|
||||
**Testing Requirements:**
|
||||
|
||||
- Unit tests: 20+ Vitest tests for clustering, embeddings, LLM responses
|
||||
- Integration tests: End-to-end query → analysis pipeline
|
||||
|
||||
**Dependencies:**
|
||||
|
||||
- Telemetry module (error events)
|
||||
- Azure OpenAI (embeddings + GPT-4o)
|
||||
- Existing diagnostics module (optional linking)
|
||||
|
||||
---
|
||||
|
||||
### ErrorClusterDoc
|
||||
|
||||
```typescript
interface ErrorClusterDoc {
  id: string; // ec_<uuid>
  productId: string; // partition key
  fingerprintHash: string; // SHA-256 of normalized error

  // Cluster metadata
  firstSeenAt: string; // ISO 8601
  lastSeenAt: string;
  occurrenceCount: number; // Total occurrences
  uniqueUsers: number; // Affected user count

  // Error signature
  errorType: string; // Exception class/name
  messageTemplate: string; // Normalized message with placeholders
  stackSignature: string; // Normalized stack frames

  // Vector embedding for semantic search
  embedding: number[]; // 1536-dim from text-embedding-3-small
  embeddingVersion: string; // Model version for re-embedding

  // Context patterns (auto-extracted)
  commonContext: {
    osVersions: Array<{ version: string; count: number }>;
    appVersions: Array<{ version: string; count: number }>;
    deviceModels: Array<{ model: string; count: number }>;
    screenContexts: Array<{ screen: string; count: number }>;
  };

  // Related data
  relatedClusterIds: string[]; // Similar clusters (vector similarity)
  mergedIntoClusterId?: string; // If deduplicated

  // Resolution tracking
  status: 'active' | 'investigating' | 'resolved' | 'ignored';
  resolvedAt?: string;
  resolutionCommit?: string; // Link to fix

  // Timestamps
  createdAt: string;
  updatedAt: string;
  ttl: number; // Cosmos DB TTL in seconds; 90 days = 7776000
}
```
|
||||
|
||||
### DiagnosticInsightDoc
|
||||
|
||||
```typescript
interface DiagnosticInsightDoc {
  id: string; // di_<uuid>
  clusterId: string; // partition key (with productId)
  productId: string;

  // AI-generated analysis
  analysisType: 'root_cause' | 'pattern' | 'comparison' | 'trend';
  generatedAt: string;

  // LLM output
  rootCauseCategory: 'config' | 'dependency' | 'logic' | 'resource' | 'external' | 'unknown';
  hypothesis: string; // Natural language explanation
  reasoning: string; // Why the LLM thinks this
  confidence: 'high' | 'medium' | 'low';
  confidenceScore: number; // 0.0–1.0

  // Evidence
  evidence: Array<{
    type:
      | 'stack_trace'
      | 'telemetry_pattern'
      | 'device_correlation'
      | 'api_failure'
      | 'similar_issue';
    description: string;
    strength: 'strong' | 'moderate' | 'weak';
    data: Record<string, unknown>;
  }>;

  // Suggested actions
  suggestedInvestigation: string[];
  potentialFixDirection?: string;
  similarResolvedIssues?: Array<{
    clusterId: string;
    resolution: string;
    confidence: number;
  }>;

  // Feedback
  feedbackStats: {
    helpful: number;
    notHelpful: number;
    engineerNotes: string[];
  };

  // LLM metadata
  modelUsed: string; // gpt-4o, gpt-4o-mini
  promptTokens: number;
  completionTokens: number;

  createdAt: string;
  ttl: number; // Cosmos DB TTL in seconds; 90 days = 7776000
}
```
|
||||
|
||||
### NaturalLanguageQueryDoc
|
||||
|
||||
```typescript
interface NaturalLanguageQueryDoc {
  id: string; // nq_<uuid>
  userId: string; // Admin who asked
  productId?: string; // Optional filter

  // Query
  rawQuery: string; // "Why did iOS keyboard crash yesterday?"
  parsedIntent: 'root_cause' | 'pattern_search' | 'comparison' | 'trend' | 'impact';
  extractedEntities: {
    products?: string[];
    timeRange?: { start: string; end: string };
    errorTypes?: string[];
    platforms?: string[];
    userSegments?: string[];
  };

  // Execution
  executedQuery: string; // Translated Cosmos query
  dataSources: string[]; // Clusters, telemetry, sessions accessed
  executionTimeMs: number;

  // Response
  aiResponse: string; // Generated answer
  confidence: number; // Overall confidence
  supportingData: Array<{
    type: 'cluster' | 'telemetry' | 'session';
    id: string;
    relevanceScore: number;
  }>;

  // Feedback
  userRating?: 'helpful' | 'not_helpful';
  userComment?: string;

  createdAt: string;
  ttl: number; // Cosmos DB TTL in seconds; 30 days = 2592000
}
```
|
||||
|
||||
---
|
||||
|
||||
## Appendix B: API Reference
|
||||
|
||||
| Method | Endpoint                                | Auth  | Description                             |
| ------ | --------------------------------------- | ----- | --------------------------------------- |
| POST   | `/ai-diagnostics/query`                 | Admin | Natural language diagnostic query       |
| GET    | `/ai-diagnostics/clusters`              | Admin | List error clusters (with AI summaries) |
| GET    | `/ai-diagnostics/clusters/:id`          | Admin | Cluster detail with AI analysis         |
| POST   | `/ai-diagnostics/clusters/:id/analyze`  | Admin | Trigger fresh LLM analysis              |
| GET    | `/ai-diagnostics/clusters/:id/analysis` | Admin | Get pre-computed insight                |
| GET    | `/ai-diagnostics/suggestions`           | Admin | AI-suggested investigations             |
| POST   | `/ai-diagnostics/feedback`              | Admin | Rate insight helpfulness                |
| POST   | `/ai-diagnostics/search`                | Admin | Semantic search across errors           |
|
||||
|
||||
---
|
||||
|
||||
## Appendix C: Integration Points
|
||||
|
||||
### With Telemetry Module
|
||||
|
||||
- Error events auto-create/update clusters
|
||||
- Telemetry context enriches error analysis
|
||||
- Correlation IDs link errors to user journeys
|
||||
|
||||
### With Diagnostics Module
|
||||
|
||||
- Debug sessions linked to error clusters
|
||||
- Screenshots from sessions aid visual analysis
|
||||
- Network traces provide API failure context
|
||||
|
||||
### With Event Bus
|
||||
|
||||
| Event                           | Action                                                    |
| ------------------------------- | --------------------------------------------------------- |
| `telemetry.error.ingested`      | Update/create cluster, trigger re-analysis if new pattern |
| `diagnostics.session.completed` | Link session to related clusters, analyze captured logs   |
| `diagnostics.ingest.fatal`      | High-priority cluster analysis, alert if novel pattern    |
|
||||
|
||||
---
|
||||
|
||||
## Appendix D: Cost Estimation
|
||||
|
||||
| Component                | Monthly Cost (est.)             |
| ------------------------ | ------------------------------- |
| Azure OpenAI embeddings  | $50–100 (10K errors/day)        |
| GPT-4o-mini analysis     | $100–200 (1K analyses/day)      |
| GPT-4o deep analysis     | $50–100 (100 deep analyses/day) |
| Cosmos DB vector storage | $20–50                          |
| **Total**                | **$220–450/month**              |
|
||||
|
||||
Optimization:
|
||||
|
||||
- Cache frequent cluster analyses (24hr TTL)
|
||||
- Use GPT-4o-mini for 90% of queries
|
||||
- Batch embedding jobs during off-peak
|
||||
|
||||
---
|
||||
|
||||
## Current Status
|
||||
|
||||
- [ ] **Design complete** — Target: 2026-03-10
|
||||
- [ ] **Phase 1: Data Pipeline** — Not started
|
||||
- [ ] **Phase 2: LLM Engine** — Not started
|
||||
- [ ] **Phase 3: Query Interface** — Not started
|
||||
- [ ] **Phase 4: Admin UI** — Not started
|
||||
- [ ] **Phase 5: Advanced Capabilities** — Future
|
||||
|
||||
**Estimated Timeline:** 2–3 weeks (Phases 1–4)
|
||||
|
||||
**Dependencies:**
|
||||
|
||||
- Telemetry module (must be collecting errors)
|
||||
- Diagnostics module (optional, for rich context)
|
||||
- Azure OpenAI deployment (embedding + GPT-4o access)
|
||||
|
||||
---
|
||||
|
||||
_Last Updated: 2026-03-03_
|
||||
719
docs/roadmaps/INTELLIGENT_AB_TESTING_ROADMAP.md
Normal file
@ -0,0 +1,719 @@
|
||||
# Intelligent A/B Testing — Implementation Roadmap
|
||||
|
||||
> **Module:** `platform-service/src/modules/ab-testing/`
|
||||
> **Admin UI:** `/ops/experiments/`
|
||||
> **Target:** AI-powered experiment management with auto-allocation, early stopping, and hypothesis generation
|
||||
> **Estimated Effort:** 2.5–3 weeks
|
||||
> **Status:** 🟡 Planning
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This roadmap delivers an **intelligent A/B testing platform** that goes beyond traditional feature flags. Unlike manual percentage rollouts, this system uses **Thompson sampling** for automatic traffic allocation, **Bayesian early stopping** when a variant clearly wins or loses, and **LLM-powered hypothesis generation** from feature flag usage patterns.
|
||||
|
||||
### Key Differentiators vs. Static Feature Flags
|
||||
|
||||
| Capability         | Static Flags (Current) | Intelligent A/B Testing                   |
| ------------------ | ---------------------- | ----------------------------------------- |
| Traffic Allocation | Manual percentage      | **Multi-armed bandit optimization**       |
| Stopping Decision  | Manual monitoring      | **Auto-stop at statistical significance** |
| Winner Selection   | Human judgment         | **Bayesian probability of superiority**   |
| Test Duration      | Fixed (often wrong)    | **Dynamic based on effect size**          |
| Hypothesis         | Human-written          | **AI-generated from usage patterns**      |
| Sample Size        | Guesswork              | **Power analysis + sequential testing**   |
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Core Experiment Engine (Week 1)
|
||||
|
||||
### 1.1 Data Model & Schemas
|
||||
|
||||
- [ ] **1.1.1** Create `modules/ab-testing/types.ts`
|
||||
- [ ] `ExperimentDoc` — experiment definition and config
|
||||
- [ ] `VariantDoc` — variant metadata + metrics
|
||||
- [ ] `AssignmentDoc` — user → variant assignments
|
||||
- [ ] `MetricDoc` — event types being tracked
|
||||
- [ ] `ExperimentResult` — statistical analysis results
|
||||
- [ ] Zod schemas for all inputs
|
||||
- [ ] **1.1.2** Add Cosmos containers to `cosmos-init.ts`
|
||||
- [ ] `experiments` (pk: `/productId`, TTL: 2 years for completed)
|
||||
- [ ] `experiment_variants` (pk: `/experimentId`)
|
||||
- [ ] `experiment_assignments` (pk: `/userId`, query by experiment)
|
||||
- [ ] `experiment_events` (pk: `/experimentId` + `/timestamp` for time-series)
|
||||
- [ ] `experiment_metrics` (pk: `/experimentId`, computed aggregates)
|
||||
|
||||
### 1.2 Assignment & Bucketing
|
||||
|
||||
- [ ] **1.2.1** Create deterministic bucketing
|
||||
- [ ] Consistent hashing (userId + experimentId → variant)
|
||||
- [ ] FNV-1a hash algorithm (same as feature flags)
|
||||
- [ ] Sticky assignments (user always sees same variant)
|
||||
- [ ] Override capability (force specific variant for QA)
|
||||
- [ ] **1.2.2** Assignment strategies
|
||||
- [ ] `random` — Simple randomization (control vs static)
|
||||
- [ ] `thompson` — Thompson sampling (multi-armed bandit)
|
||||
- [ ] `epsilon_greedy` — Epsilon-greedy exploration
|
||||
- [ ] `ucb` — Upper Confidence Bound algorithm
|
||||
- [ ] **1.2.3** Audience targeting
|
||||
- [ ] User property filters (platform, version, region, subscription tier)
|
||||
- [ ] Percentage rollout within target segment
|
||||
- [ ] Exclusion lists (beta users, internal accounts)
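
Deterministic bucketing with FNV-1a (1.2.1) might be sketched as follows. The 32-bit FNV-1a constants are the standard ones; everything else is an assumption for illustration, including the `experimentId:userId` key format, the 10,000-bucket resolution, and the names `assignVariant` and `VariantAllocation`. The hash here works on UTF-16 code units, which matches byte-wise FNV-1a only for ASCII identifiers.

```typescript
// 32-bit FNV-1a. Operates on UTF-16 code units, so results match the
// byte-wise algorithm only for ASCII input (an assumption about IDs).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

interface VariantAllocation {
  key: string;
  weight: number; // relative traffic share; weights need not sum to 1
}

const BUCKETS = 10000;

// Map (userId, experimentId) to a variant. The same inputs always yield the
// same variant, which gives sticky assignments without storing state.
function assignVariant(
  userId: string,
  experimentId: string,
  variants: readonly VariantAllocation[],
  overrides?: ReadonlyMap<string, string>,
): string {
  if (variants.length === 0) throw new Error("experiment has no variants");
  const forced = overrides?.get(userId); // QA override wins over hashing
  if (forced !== undefined) return forced;

  const bucket = fnv1a32(`${experimentId}:${userId}`) % BUCKETS;
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  let cumulative = 0;
  for (const v of variants) {
    cumulative += (v.weight / total) * BUCKETS;
    if (bucket < cumulative) return v.key;
  }
  return variants[variants.length - 1].key; // guard against rounding
}
```

Including `experimentId` in the hashed key keeps a user's bucket independent across experiments, so the same users do not always land in the treatment group.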
|
||||
|
||||
### 1.3 Event Tracking Pipeline
|
||||
|
||||
- [ ] **1.3.1** Metric definitions
|
||||
- [ ] `conversion` — Binary (did/didn't convert)
|
||||
- [ ] `count` — Integer events (sessions, messages)
|
||||
- [ ] `duration` — Time-based (session length, task time)
|
||||
- [ ] `revenue` — Monetary (purchase amount, LTV)
|
||||
- [ ] `custom` — Arbitrary numeric values
|
||||
- [ ] **1.3.2** Event ingestion
|
||||
- [ ] `POST /ab-testing/events` batch endpoint
|
||||
- [ ] Client SDK: `track(event, value, metadata)`
|
||||
- [ ] Automatic attribution (which variant caused this event)
|
||||
- [ ] Deduplication (eventId + userId uniqueness)
|
||||
|
||||
**Phase 1 Exit Criteria:**
|
||||
|
||||
- [ ] Experiments created with multiple variants
|
||||
- [ ] Users consistently assigned to variants
|
||||
- [ ] Events tracked and attributed correctly
|
||||
- [ ] 20+ tests for assignment and ingestion
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Statistical Analysis Engine (Week 1–2)
|
||||
|
||||
### 2.1 Bayesian Inference
|
||||
|
||||
- [ ] **2.1.1** Create `modules/ab-testing/statistics.ts`
|
||||
- [ ] `BetaDistribution` for conversion rates
|
||||
- [ ] `GammaDistribution` for count/duration metrics
|
||||
- [ ] `NormalDistribution` for continuous metrics
|
||||
- [ ] Monte Carlo simulation (10,000 samples)
|
||||
- [ ] **2.1.2** Probability calculations
|
||||
- [ ] `probabilityVariantBeatsControl(variant, control)`
|
||||
- [ ] `expectedLossIfChosen(variant)`
|
||||
- [ ] `probabilityBeatAllVariants(variant)`
|
||||
- [ ] **2.1.3** Credible intervals
|
||||
- [ ] 95% credible interval for each variant's true metric
|
||||
- [ ] Visualization-ready (lower, mean, upper bounds)
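
For conversion metrics, `probabilityVariantBeatsControl` from 2.1.2 can be estimated by Monte Carlo sampling from Beta posteriors. This is a sketch under assumptions: a uniform Beta(1, 1) prior, a `ConversionStats` input shape invented for the example, and a hand-rolled Marsaglia-Tsang gamma sampler rather than a statistics library.

```typescript
type Rng = () => number; // returns a uniform value in [0, 1)

// Standard normal via the Box-Muller transform.
function sampleStandardNormal(rng: Rng): number {
  const u1 = 1 - rng(); // shift to (0, 1] so log(u1) is finite
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Gamma(shape, 1) via Marsaglia-Tsang; valid for shape >= 1, which always
// holds here because the prior contributes 1 to each Beta parameter.
function sampleGamma(shape: number, rng: Rng): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleStandardNormal(rng);
    const v = (1 + c * x) ** 3;
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(alpha, beta) as a ratio of two gamma draws.
function sampleBeta(alpha: number, beta: number, rng: Rng): number {
  const x = sampleGamma(alpha, rng);
  const y = sampleGamma(beta, rng);
  return x / (x + y);
}

interface ConversionStats {
  conversions: number;
  users: number;
}

// Fraction of posterior draws in which the variant's conversion rate
// exceeds the control's.
function probabilityVariantBeatsControl(
  variant: ConversionStats,
  control: ConversionStats,
  samples = 10000,
  rng: Rng = Math.random,
): number {
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    const v = sampleBeta(1 + variant.conversions, 1 + variant.users - variant.conversions, rng);
    const c = sampleBeta(1 + control.conversions, 1 + control.users - control.conversions, rng);
    if (v > c) wins++;
  }
  return wins / samples;
}
```

Passing a seeded `rng` makes the estimate reproducible in tests; with the default `Math.random` the result varies slightly between runs.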
|
||||
|
||||
### 2.2 Early Stopping Rules
|
||||
|
||||
- [ ] **2.2.1** Stopping criteria
|
||||
- [ ] **Winner found:** Variant has > 95% probability of beating control
|
||||
- [ ] **Loser clear:** Control has > 95% probability of beating variant
|
||||
- [ ] **Practical significance:** Effect is unlikely to reach the minimum detectable effect (stop for futility)
|
||||
- [ ] **Time bound:** Max duration reached (safety limit)
|
||||
- [ ] **2.2.2** Auto-promotion
|
||||
- [ ] Auto-rollout winner to 100% when threshold hit
|
||||
- [ ] Notify admins via Slack/email
|
||||
- [ ] Create audit log entry
|
||||
- [ ] **2.2.3** Guardrails
|
||||
- [ ] Minimum sample size before early stopping (100 users/variant)
|
||||
- [ ] Business hours only for auto-actions
|
||||
- [ ] Require approval for revenue-impacting experiments
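
The stopping criteria and guardrails in 2.2 could be combined into a single decision function. In this sketch the 95% threshold and the 100 users per variant minimum come from the roadmap; the `ExperimentSnapshot` shape, the reason codes, and the order in which rules are checked are assumptions. The practical-significance rule, the business-hours guardrail, and the approval guardrail are left out.

```typescript
type StopDecision =
  | { stop: true; reason: "winner_found" | "loser_clear" | "time_bound" }
  | { stop: false; reason: "insufficient_sample" | "continue" };

// Hypothetical snapshot of an experiment's current state.
interface ExperimentSnapshot {
  probabilityVariantBeatsControl: number; // posterior probability, 0..1
  usersPerVariant: number[]; // sample size of every arm, control included
  elapsedDays: number;
  maxDurationDays: number; // safety limit
}

function evaluateStopping(
  snapshot: ExperimentSnapshot,
  threshold = 0.95,
  minUsersPerVariant = 100,
): StopDecision {
  // The time bound applies regardless of how the data looks.
  if (snapshot.elapsedDays >= snapshot.maxDurationDays) {
    return { stop: true, reason: "time_bound" };
  }
  // Guardrail: no early stopping until every arm has enough users.
  if (Math.min(...snapshot.usersPerVariant) < minUsersPerVariant) {
    return { stop: false, reason: "insufficient_sample" };
  }
  const p = snapshot.probabilityVariantBeatsControl;
  if (p > threshold) return { stop: true, reason: "winner_found" };
  if (1 - p > threshold) return { stop: true, reason: "loser_clear" };
  return { stop: false, reason: "continue" };
}
```

Returning a reason code alongside the boolean gives the audit log entry and the admin notification something concrete to record.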
|
||||
|
||||
### 2.3 Thompson Sampling
|
||||
|
||||
- [ ] **2.3.1** Multi-armed bandit implementation
|
||||
- [ ] Sample from posterior distributions
|
||||
- [ ] Assign user to variant with highest sample
|
||||
- [ ] Re-balance traffic every hour based on performance
|
||||
- [ ] **2.3.2** Exploration vs exploitation
|
||||
- [ ] Exploration rate decays over time
|
||||
- [ ] High uncertainty = more exploration
|
||||
- [ ] Clear winner = more traffic to winner
|
||||
- [ ] **2.3.3** Regret minimization
|
||||
- [ ] Track cumulative regret vs optimal variant
|
||||
- [ ] Regret bounds reporting
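
Thompson sampling for conversion metrics (2.3.1) reduces to drawing one value from each arm's posterior and picking the largest. The sketch below assumes a Beta(1, 1) prior, so both Beta parameters are positive integers and a gamma draw can be formed as a sum of exponential draws. `ArmStats`, `chooseArm`, and `allocationShares` are hypothetical names.

```typescript
type Rng = () => number; // returns a uniform value in [0, 1)

// Gamma(k, 1) for a positive integer k, as a sum of k Exp(1) draws.
function sampleGammaInt(shape: number, rng: Rng): number {
  let sum = 0;
  for (let i = 0; i < shape; i++) sum -= Math.log(1 - rng());
  return sum;
}

// Beta(a, b) for positive integers a and b, as a ratio of gamma draws.
function sampleBetaInt(a: number, b: number, rng: Rng): number {
  const x = sampleGammaInt(a, rng);
  const y = sampleGammaInt(b, rng);
  return x / (x + y);
}

interface ArmStats {
  key: string;
  successes: number; // e.g. users who converted
  failures: number; // users who did not
}

// One Thompson draw: sample each posterior, return the arm with the highest sample.
function chooseArm(arms: readonly ArmStats[], rng: Rng = Math.random): string {
  if (arms.length === 0) throw new Error("no arms to choose from");
  let bestKey = arms[0].key;
  let bestSample = -1;
  for (const arm of arms) {
    const sample = sampleBetaInt(1 + arm.successes, 1 + arm.failures, rng);
    if (sample > bestSample) {
      bestSample = sample;
      bestKey = arm.key;
    }
  }
  return bestKey;
}

// Estimate traffic shares by repeating the draw; each arm's share approximates
// the probability that it is the best arm.
function allocationShares(
  arms: readonly ArmStats[],
  draws = 1000,
  rng: Rng = Math.random,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const arm of arms) counts.set(arm.key, 0);
  for (let i = 0; i < draws; i++) {
    const key = chooseArm(arms, rng);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  for (const [key, n] of counts) counts.set(key, n / draws);
  return counts;
}
```

Drawing per user would conflict with sticky assignments, so one option for the hourly re-balance is to feed `allocationShares` into the weights used by deterministic bucketing. Exploration then falls off naturally as posteriors narrow.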
|
||||
|
||||
**Phase 2 Exit Criteria:**
|
||||
|
||||
- [ ] Bayesian probabilities calculated correctly
|
||||
- [ ] Early stopping triggers at appropriate thresholds
|
||||
- [ ] Thompson sampling re-allocates traffic dynamically
|
||||
- [ ] Statistical tests validate correctness
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: AI-Powered Hypothesis Generation (Week 2)
|
||||
|
||||
### 3.1 Pattern Detection
|
||||
|
||||
- [ ] **3.1.1** Usage pattern analysis
|
||||
- [ ] Analyze feature flag usage telemetry
|
||||
- [ ] Segment analysis (iOS vs Android, free vs pro)
|
||||
- [ ] Temporal patterns (day of week, time of day)
|
||||
- [ ] User behavior sequences (funnel analysis)
|
||||
- [ ] **3.1.2** Anomaly detection
|
||||
- [ ] Unexpected drop in feature adoption
|
||||
- [ ] Performance regression signals
|
||||
- [ ] User segment showing different behavior
|
||||
- [ ] **3.1.3** Opportunity identification
|
||||
- [ ] Underperforming features (low adoption)
|
||||
- [ ] High-dropoff flows
|
||||
- [ ] Competitor feature gaps
|
||||
|
||||
### 3.2 Hypothesis Generation
|
||||
|
||||
- [ ] **3.2.1** LLM hypothesis prompts
|
||||
|
||||
```
Given this feature usage data:
- Feature: {featureName}
- Current adoption: {adoptionRate}% (baseline: {baseline}%)
- Segment performance: {segmentData}
- User feedback: {feedbackSamples}
- Competitor analysis: {competitorFeatures}

Generate experiment hypotheses:
1. Primary hypothesis: "Changing X will improve Y because..."
2. Secondary hypotheses (2-3 alternatives)
3. Expected effect size (conservative estimate)
4. Success metric recommendation
5. Risk assessment
```
|
||||
|
||||
- [ ] **3.2.2** Hypothesis ranking
|
||||
- [ ] Expected impact scoring
|
||||
- [ ] Implementation difficulty estimate
|
||||
- [ ] Statistical power prediction
|
||||
- [ ] Risk-adjusted expected value
|
||||
- [ ] **3.2.3** Suggested experiment design
|
||||
- [ ] Variant count recommendation
|
||||
- [ ] Traffic allocation suggestion
|
||||
- [ ] Duration estimate
|
||||
- [ ] Required sample size calculation
|
||||
|
||||
### 3.3 Auto-Experiment Suggestions
|
||||
|
||||
- [ ] **3.3.1** Weekly AI reports
|
||||
- [ ] Top 5 experiment opportunities
|
||||
- [ ] Hypotheses with supporting evidence
|
||||
- [ ] Prioritized by expected impact
|
||||
- [ ] **3.3.2** One-click experiment creation
|
||||
- [ ] Pre-fill experiment from hypothesis
|
||||
- [ ] Suggested variants with descriptions
|
||||
- [ ] Pre-configured metrics
|
||||
|
||||
**Phase 3 Exit Criteria:**
|
||||
|
||||
- [ ] AI generates meaningful hypotheses from usage data
|
||||
- [ ] Hypothesis quality rated by product team (80%+ useful)
|
||||
- [ ] Auto-suggested experiments created in 1 click
|
||||
- [ ] Weekly reports generated automatically
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Admin Dashboard UI (Week 2–3)
|
||||
|
||||
### 4.1 Experiments List Page
|
||||
|
||||
- [ ] **4.1.1** Create `/ops/experiments/page.tsx`
|
||||
- [ ] Experiment cards (status, duration, sample size)
|
||||
- [ ] Quick filters (running, completed, draft)
|
||||
- [ ] AI-generated hypothesis badge
|
||||
- [ ] Health indicators (traffic balance, event flow)
|
||||
- [ ] **4.1.2** Experiment creation wizard
|
||||
- [ ] Step 1: Define hypothesis (AI suggestions available)
|
||||
- [ ] Step 2: Create variants (name, description, config)
|
||||
- [ ] Step 3: Select metrics (primary + secondary)
|
||||
- [ ] Step 4: Audience targeting
|
||||
- [ ] Step 5: Traffic allocation (manual or Thompson)
|
||||
- [ ] Step 6: Review and launch
|
||||
|
||||
### 4.2 Live Experiment Dashboard
|
||||
|
||||
- [ ] **4.2.1** Create `/ops/experiments/[id]/page.tsx`
|
||||
- [ ] Real-time metrics comparison
|
||||
- [ ] Variant performance table (conversions, counts, durations)
|
||||
- [ ] Bayesian probability visualization
|
||||
- [ ] Credible interval charts
|
||||
- [ ] **4.2.2** Statistical summary card
|
||||
- [ ] Probability of beating control (per variant)
|
||||
- [ ] Expected lift if implemented
|
||||
- [ ] Sample size progress bar
|
||||
- [ ] Days to significance estimate
|
||||
- [ ] **4.2.3** Action buttons
|
||||
- [ ] Adjust traffic allocation
|
||||
- [ ] Pause/resume experiment
|
||||
- [ ] Stop and declare winner
|
||||
- [ ] Rollout winner to 100%
|
||||
- [ ] Archive experiment
|
||||
|
||||
### 4.3 Results & Reporting
|
||||
|
||||
- [ ] **4.3.1** Results page
|
||||
- [ ] Final statistical summary
|
||||
- [ ] Variant comparison visualization
|
||||
- [ ] Segment breakdown (iOS vs Android, etc.)
|
||||
- [ ] Confidence intervals over time
|
||||
- [ ] **4.3.2** AI insights panel
|
||||
- [ ] Why this result occurred (LLM summary)
|
||||
- [ ] Unexpected findings
|
||||
- [ ] Follow-up experiment suggestions
|
||||
- [ ] **4.3.3** Export capabilities
|
||||
- [ ] CSV export of raw data
|
||||
- [ ] PDF report generation
|
||||
- [ ] API endpoint for data warehouse sync
|
||||
|
||||
**Phase 4 Exit Criteria:**
|
||||
|
||||
- [ ] Full experiment lifecycle manageable in UI
|
||||
- [ ] Real-time stats visible and accurate
|
||||
- [ ] Bayesian visualizations clear to non-statisticians
|
||||
- [ ] Export and reporting functional
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Advanced Capabilities (Future)
|
||||
|
||||
### 5.1 Multi-Variate Testing
|
||||
|
||||
- [ ] Test multiple variables simultaneously
|
||||
- [ ] Full factorial and fractional factorial designs
|
||||
- [ ] Interaction effect detection
|
||||
|
||||
### 5.2 Sequential Experimentation
|
||||
|
||||
- [ ] Multi-phase experiments (qualification → main → validation)
|
||||
- [ ] Holdout groups for long-term validation
|
||||
- [ ] Global holdout (never-exposed users)
|
||||
|
||||
### 5.3 Personalization Layer
|
||||
|
||||
- [ ] Contextual bandits (different variants for different users)
|
||||
- [ ] ML model for variant selection
|
||||
- [ ] Automatic personalization optimization
|
||||
|
||||
### 5.4 Experiment Coordination
|
||||
|
||||
- [ ] Mutually exclusive experiments
|
||||
- [ ] Experiment priority rules
|
||||
- [ ] Layered experimentation (orthogonal tests)
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: Data Models
|
||||
|
||||
### ExperimentDoc
|
||||
|
||||
```typescript
interface ExperimentDoc {
  id: string; // exp_<uuid>
  productId: string; // partition key

  // Experiment definition
  name: string;
  description: string;
  hypothesis: string;
  aiGeneratedHypothesis?: boolean; // Flag for AI-suggested

  // Status lifecycle: draft → running → paused | stopped | completed
  status: 'draft' | 'running' | 'paused' | 'stopped' | 'completed';

  // Variants
  controlVariantId: string; // Baseline variant
  variantIds: string[]; // All variant IDs

  // Configuration
  allocationStrategy: 'random' | 'thompson' | 'epsilon_greedy' | 'ucb';
  targetPercent: number; // % of eligible traffic

  // Audience targeting
  targeting: {
    platforms?: string[]; // ios, android, web
    appVersions?: { min: string; max?: string };
    regions?: string[];
    userSegments?: string[]; // pro, free, enterprise
    userProperties?: Record<string, string | number | boolean>;
  };

  // Metrics
  primaryMetric: {
    name: string;
    type: 'conversion' | 'count' | 'duration' | 'revenue' | 'custom';
    eventName: string; // Telemetry event to track
    aggregation: 'sum' | 'mean' | 'count' | 'unique';
    direction: 'increase' | 'decrease'; // Is higher better?
    minimumDetectableEffect: number; // % change we want to detect
  };
  secondaryMetrics: Array<{
    name: string;
    type: 'conversion' | 'count' | 'duration' | 'revenue' | 'custom';
    eventName: string;
  }>;

  // Guardrails
  guardrails: {
    minSampleSizePerVariant: number; // Default: 100
    maxDurationDays: number; // Safety limit, default: 30
    autoStopEnabled: boolean;
    winnerThreshold: number; // % probability to auto-stop, default: 95
    requireApprovalFor: 'none' | 'revenue' | 'all';
  };

  // Scheduling
  startAt?: string; // Scheduled start (ISO 8601)
  endAt?: string; // Scheduled end or actual stop

  // Stats (denormalized for fast reads)
  totalParticipants: number;
  totalEvents: number;

  // Timestamps
  createdAt: string;
  updatedAt: string;
  startedAt?: string;
  completedAt?: string;
  ttl: number; // 2 years for completed
}
```
|
||||
|
||||
### VariantDoc
|
||||
|
||||
```typescript
interface VariantDoc {
  id: string; // var_<uuid>
  experimentId: string; // partition key

  // Variant definition
  name: string; // "Control", "New Button Color", etc.
  description?: string;
  isControl: boolean;

  // Feature flag configuration
  flagConfig: Record<string, unknown>; // Arbitrary config payload

  // Traffic allocation (dynamic for bandit strategies)
  currentAllocationPercent: number; // 0–100%

  // Statistics (real-time computed)
  stats: {
    participants: number;
    events: number;

    // Primary metric
    primaryMetricValue: number; // Mean or conversion rate
    primaryMetricStdDev?: number;

    // For conversion metrics
    conversions?: number;
    conversionRate?: number; // 0–1

    // Bayesian posterior parameters
    betaAlpha?: number; // For Beta distribution
    betaBeta?: number;

    gammaShape?: number; // For Gamma distribution
    gammaScale?: number;
  };

  // Bayesian results
  bayesianResults?: {
    probabilityBeatsControl: number; // 0–1
    probabilityBeatsAll: number; // 0–1
    expectedLiftPercent: number; // Relative to control
    expectedLoss: number; // Risk of choosing this variant
    credibleInterval: {
      lower: number;
      mean: number;
      upper: number;
    };
  };

  createdAt: string;
  updatedAt: string;
}
```
|
||||
|
||||
### ExperimentAssignmentDoc
|
||||
|
||||
```typescript
interface ExperimentAssignmentDoc {
  id: string; // ea_<uuid>
  userId: string; // partition key (for user lookups)

  experimentId: string;
  variantId: string;

  // Assignment metadata
  assignedAt: string; // First assignment
  firstExposedAt?: string; // First actual exposure (feature use)

  // Context at assignment
  assignmentContext: {
    platform: string;
    appVersion: string;
    osVersion: string;
    deviceModel?: string;
    region?: string;
  };

  // Events attributed to this assignment
  eventCount: number;
  lastEventAt?: string;

  // TTL: Remove after experiment completes + analysis period
  ttl: number; // experimentEnd + 90 days
}
```
|
||||
|
||||
### ExperimentEventDoc
|
||||
|
||||
```typescript
interface ExperimentEventDoc {
  id: string; // ee_<uuid>
  experimentId: string; // partition key
  timestamp: string; // Sort key for time-series queries

  // Attribution
  userId: string;
  variantId: string;
  assignmentId: string;

  // Event details
  metricName: string;
  metricType: 'conversion' | 'count' | 'duration' | 'revenue' | 'custom';
  value: number; // Numeric value

  // Conversion tracking (for binary metrics)
  converted: boolean; // For conversion metrics

  // Context
  eventMetadata?: Record<string, unknown>;

  // Denormalized for filtering
  platform: string;
  appVersion: string;

  // TTL: Shorter for raw events
  ttl: number; // 90 days
}
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Tracking
|
||||
|
||||
| Phase | Task                          | Status | Commit |
| ----- | ----------------------------- | ------ | ------ |
| 1.1   | Experiment types & schemas    | ⬜     | —      |
| 1.1   | Cosmos containers             | ⬜     | —      |
| 1.2   | Deterministic bucketing       | ⬜     | —      |
| 1.2   | Assignment strategies         | ⬜     | —      |
| 1.2   | Audience targeting            | ⬜     | —      |
| 1.3   | Metric definitions            | ⬜     | —      |
| 1.3   | Event ingestion               | ⬜     | —      |
| 2.1   | Bayesian inference engine     | ⬜     | —      |
| 2.1   | Probability calculations      | ⬜     | —      |
| 2.1   | Credible intervals            | ⬜     | —      |
| 2.2   | Early stopping rules          | ⬜     | —      |
| 2.2   | Auto-promotion                | ⬜     | —      |
| 2.2   | Guardrails                    | ⬜     | —      |
| 2.3   | Thompson sampling             | ⬜     | —      |
| 2.3   | Exploration vs exploitation   | ⬜     | —      |
| 2.3   | Regret minimization           | ⬜     | —      |
| 3.1   | Pattern detection             | ⬜     | —      |
| 3.1   | Anomaly detection             | ⬜     | —      |
| 3.2   | Hypothesis generation prompts | ⬜     | —      |
| 3.2   | Hypothesis ranking            | ⬜     | —      |
| 3.3   | Auto-experiment suggestions   | ⬜     | —      |
| 4.1   | Experiments list page         | ⬜     | —      |
| 4.1   | Creation wizard               | ⬜     | —      |
| 4.2   | Live dashboard                | ⬜     | —      |
| 4.2   | Statistical summary           | ⬜     | —      |
| 4.3   | Results & reporting           | ⬜     | —      |
| 4.3   | AI insights panel             | ⬜     | —      |
|
||||
|
||||
**Legend:** ⬜ Not started | 🟡 In progress | ✅ Complete | ⏸️ Deferred
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference for Implementing Agent
|
||||
|
||||
**📋 Full Roadmap:** `docs/roadmaps/INTELLIGENT_AB_TESTING_ROADMAP.md`
|
||||
|
||||
**Key Files to Modify/Create:**
|
||||
|
||||
```
services/platform-service/
├── src/
│   ├── modules/ab-testing/
│   │   ├── types.ts                  # [1.1] Experiment, Variant, Assignment types
│   │   ├── repository.ts             # [1.2] Data access layer
│   │   ├── bucketing.ts              # [1.2] FNV-1a hash, sticky assignments
│   │   ├── statistics.ts             # [2.1] Bayesian inference, Beta/Normal distributions
│   │   ├── allocation.ts             # [2.3] Thompson sampling, bandit strategies
│   │   ├── hypothesis-generator.ts   # [3.2] LLM pattern analysis
│   │   ├── routes.ts                 # [4] REST API
│   │   └── ab-testing.test.ts        # Tests
│   ├── lib/
│   │   └── cosmos-init.ts            # [1.1] Add containers
│   └── server.ts                     # Register routes
dashboards/admin-web/
├── src/
│   ├── app/(dashboard)/
│   │   ├── experiments/
│   │   │   ├── page.tsx              # [4.1] Experiments list
│   │   │   ├── new/page.tsx          # [4.1] Creation wizard
│   │   │   └── [id]/
│   │   │       └── page.tsx          # [4.2] Live dashboard
│   ├── lib/
│   │   └── experiments-client.ts     # API client
│   └── components/
│       └── experiments/              # Bayesian charts, variant cards
```
|
||||
|
||||
**Commit Message Format:**

```
feat(ab-testing): <description> [<task.code>]
```

**Example:**

```bash
git add services/platform-service/src/modules/ab-testing/
git commit -m "feat(ab-testing): add experiment types and cosmos containers [1.1]"
```
|
||||
|
||||
**Testing Requirements:**
|
||||
|
||||
- Unit tests: 25+ Vitest tests for bucketing, statistics, bandit algorithms
|
||||
- Statistical validation: A/A tests, known distribution tests
|
||||
- Integration: End-to-end experiment lifecycle
|
||||
|
||||
**Dependencies:**
|
||||
|
||||
- Feature flags module (reuse bucketing logic)
|
||||
- Telemetry module (event tracking)
|
||||
- Azure OpenAI (hypothesis generation)
|
||||
|
||||
---
|
||||
|
||||
## Appendix B: Statistical Methods
|
||||
|
||||
### Bayesian A/B Testing
|
||||
|
||||
**Conversion Metrics (Beta-Binomial):**
|
||||
|
||||
```
Posterior: Beta(α + conversions, β + non-conversions)
Where α = β = 1 (uniform prior)

Probability variant beats control:
P(variant > control) = ∫₀¹ BetaCDF_control(x) · BetaPDF_variant(x) dx
(evaluated numerically or by Monte Carlo sampling)
```
|
||||
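As a rough illustration of the Beta-Binomial calculation, the sketch below evaluates P(variant > control) exactly with a closed-form sum that holds when the variant's α is a positive integer, which is the case under the uniform prior with whole-number counts. The names `BetaPosterior`, `posteriorFromCounts`, and `probVariantBeatsControl` are hypothetical and not part of the planned `statistics.ts`. The log-gamma helper is deliberately naive, so a real implementation would likely cache log-factorials or switch to Monte Carlo for very large counts.

```typescript
// Assumption: all Beta parameters are positive integers (uniform prior + counts).
interface BetaPosterior {
  alpha: number;
  beta: number;
}

// ln Gamma(n) for a positive integer n, i.e. ln((n - 1)!). O(n) per call.
function lnGammaInt(n: number): number {
  let sum = 0;
  for (let k = 2; k < n; k++) sum += Math.log(k);
  return sum;
}

// ln B(a, b) = ln Gamma(a) + ln Gamma(b) - ln Gamma(a + b)
function lnBeta(a: number, b: number): number {
  return lnGammaInt(a) + lnGammaInt(b) - lnGammaInt(a + b);
}

// Posterior under the uniform prior Beta(1, 1).
function posteriorFromCounts(conversions: number, participants: number): BetaPosterior {
  return { alpha: 1 + conversions, beta: 1 + (participants - conversions) };
}

// Exact P(p_variant > p_control) for independent Beta posteriors:
//   sum_{i=0}^{aV-1} B(aC + i, bC + bV) / ((bV + i) * B(1 + i, bV) * B(aC, bC))
function probVariantBeatsControl(control: BetaPosterior, variant: BetaPosterior): number {
  let total = 0;
  for (let i = 0; i < variant.alpha; i++) {
    total += Math.exp(
      lnBeta(control.alpha + i, control.beta + variant.beta) -
        Math.log(variant.beta + i) -
        lnBeta(1 + i, variant.beta) -
        lnBeta(control.alpha, control.beta)
    );
  }
  return total;
}

// Usage: 10/100 conversions in control vs 15/100 in the variant.
const controlPosterior = posteriorFromCounts(10, 100);
const variantPosterior = posteriorFromCounts(15, 100);
console.log(probVariantBeatsControl(controlPosterior, variantPosterior).toFixed(3));
```

Working in log space keeps the Beta functions from underflowing once counts reach the hundreds.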
|
||||
**Continuous Metrics (Normal):**
|
||||
|
||||
```
Posterior: Normal(μ_n, σ_n²)
Where μ_n, σ_n updated via conjugate prior

Probability variant beats control via Monte Carlo sampling
```
|
||||
|
||||
### Thompson Sampling
|
||||
|
||||
```
For each incoming user:
  For each variant:
    Sample θ_i from variant's posterior distribution
  Assign user to variant with max(θ_i)

  Update variant's posterior after observing outcome
```
|
||||
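The loop above can be sketched for conversion metrics as follows. This is a minimal illustration, not the planned `allocation.ts`: the `Arm` shape borrows the `betaAlpha`/`betaBeta` field names from `VariantDoc`, while `mulberry32`, `sampleBeta`, `thompsonAssign`, and `recordOutcome` are hypothetical helpers. The Gamma sampler uses the Marsaglia-Tsang method and assumes shape ≥ 1, which holds for posteriors built on the uniform prior.

```typescript
// Seeded PRNG (mulberry32) so the sketch is reproducible; Math.random also works.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal draw via the Box-Muller transform.
function sampleNormal(rng: () => number): number {
  const u1 = 1 - rng(); // in (0, 1], avoids log(0)
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Gamma(shape, 1) draw via Marsaglia-Tsang; valid for shape >= 1.
function sampleGamma(shape: number, rng: () => number): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(alpha, beta) draw as X / (X + Y) with X ~ Gamma(alpha), Y ~ Gamma(beta).
function sampleBeta(alpha: number, beta: number, rng: () => number): number {
  const x = sampleGamma(alpha, rng);
  const y = sampleGamma(beta, rng);
  return x / (x + y);
}

interface Arm {
  variantId: string;
  betaAlpha: number; // 1 + conversions
  betaBeta: number; // 1 + non-conversions
}

// Draw one sample per arm and return the arm with the largest draw.
function thompsonAssign(arms: Arm[], rng: () => number): string {
  let bestId = "";
  let bestSample = -Infinity;
  for (const arm of arms) {
    const theta = sampleBeta(arm.betaAlpha, arm.betaBeta, rng);
    if (theta > bestSample) {
      bestSample = theta;
      bestId = arm.variantId;
    }
  }
  return bestId;
}

// Conjugate update after observing a single outcome.
function recordOutcome(arm: Arm, converted: boolean): void {
  if (converted) arm.betaAlpha += 1;
  else arm.betaBeta += 1;
}
```

Because each assignment is a fresh posterior draw, uncertain arms still receive some traffic, and that share shrinks on its own as evidence accumulates.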
|
||||
### Early Stopping
|
||||
|
||||
```
Stop experiment when:
  days_running > max_duration                           → Time bound (safety limit)
  OR (samples_per_variant > min_sample_size
      AND (max_variant P(beats control) > 0.95          → Winner found
           OR max_variant P(beats control) < 0.05))     → No winner
```
|
||||
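One way to encode the stopping rule is a pure function over per-variant snapshots. The sketch assumes the time bound applies regardless of sample size (it is described as a safety limit) and that the minimum sample size gates only the probability checks. `VariantSnapshot`, `StopDecision`, and `evaluateEarlyStopping` are hypothetical names. The guardrail fields mirror `ExperimentDoc.guardrails`, and the "no winner" cutoff is taken as `1 - winnerThreshold`, which is also an assumption.

```typescript
interface Guardrails {
  minSampleSizePerVariant: number;
  maxDurationDays: number;
  winnerThreshold: number; // percent, e.g. 95
}

interface VariantSnapshot {
  variantId: string;
  participants: number;
  probabilityBeatsControl: number; // 0..1
}

type StopDecision =
  | { stop: false }
  | { stop: true; reason: "winner" | "no_winner" | "time_bound"; winnerId?: string };

function evaluateEarlyStopping(
  variants: VariantSnapshot[],
  controlParticipants: number,
  daysRunning: number,
  guardrails: Guardrails
): StopDecision {
  // Safety limit: stop on duration even if the sample is still small.
  if (daysRunning > guardrails.maxDurationDays) {
    return { stop: true, reason: "time_bound" };
  }
  if (variants.length === 0) return { stop: false };

  // Every arm, control included, must exceed the minimum sample size.
  const smallestArm = Math.min(controlParticipants, ...variants.map((v) => v.participants));
  if (smallestArm <= guardrails.minSampleSizePerVariant) return { stop: false };

  // Find the variant with the highest probability of beating control.
  const best = variants.reduce((a, b) =>
    b.probabilityBeatsControl > a.probabilityBeatsControl ? b : a
  );
  const threshold = guardrails.winnerThreshold / 100;

  if (best.probabilityBeatsControl > threshold) {
    return { stop: true, reason: "winner", winnerId: best.variantId };
  }
  if (best.probabilityBeatsControl < 1 - threshold) {
    return { stop: true, reason: "no_winner" };
  }
  return { stop: false };
}
```

Keeping the decision free of side effects makes it straightforward to unit test and to run in shadow mode before auto-promotion is enabled.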
|
||||
---
|
||||
|
||||
## Appendix C: API Reference
|
||||
|
||||
| Method | Endpoint                               | Auth     | Description                      |
| ------ | -------------------------------------- | -------- | -------------------------------- |
| POST   | `/ab-testing/experiments`              | Admin    | Create experiment                |
| GET    | `/ab-testing/experiments`              | Admin    | List experiments                 |
| GET    | `/ab-testing/experiments/:id`          | Admin    | Get experiment details           |
| PATCH  | `/ab-testing/experiments/:id`          | Admin    | Update experiment                |
| DELETE | `/ab-testing/experiments/:id`          | Admin    | Stop/archive experiment          |
| POST   | `/ab-testing/experiments/:id/start`    | Admin    | Start experiment                 |
| POST   | `/ab-testing/experiments/:id/pause`    | Admin    | Pause experiment                 |
| POST   | `/ab-testing/experiments/:id/complete` | Admin    | Complete with winner             |
| POST   | `/ab-testing/assign`                   | Any auth | Get variant assignment for user  |
| POST   | `/ab-testing/events`                   | Any auth | Track experiment event           |
| GET    | `/ab-testing/experiments/:id/results`  | Admin    | Get statistical results          |
| GET    | `/ab-testing/suggestions`              | Admin    | AI-generated experiment ideas    |
| POST   | `/ab-testing/hypotheses`               | Admin    | Generate hypothesis from pattern |
|
||||
|
||||
---
|
||||
|
||||
## Appendix D: Integration Points
|
||||
|
||||
### With Feature Flags Module
|
||||
|
||||
- Experiments build on feature flag infrastructure
|
||||
- Flag state = variant assignment
|
||||
- Consistent bucketing with existing flags
|
||||
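To make "consistent bucketing" concrete, here is a minimal sketch of deterministic assignment with a 32-bit FNV-1a hash. The key format (`experimentId:userId`), the 10,000-bucket granularity, the separate `gate` salt, and the names `bucketFor` and `assignVariant` are all assumptions for illustration; the existing feature flags module may hash differently. The hash here runs over UTF-16 code units, which matches byte-wise FNV-1a only for ASCII input.

```typescript
// 32-bit FNV-1a over the string's UTF-16 code units.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Stable bucket in [0, 9999] for a user within one experiment.
function bucketFor(experimentId: string, userId: string): number {
  return fnv1a32(`${experimentId}:${userId}`) % 10000;
}

interface Allocation {
  variantId: string;
  percent: number; // share of enrolled traffic; values are expected to sum to 100
}

// Returns the assigned variant, or null if the user is outside the experiment.
function assignVariant(
  experimentId: string,
  userId: string,
  allocations: Allocation[],
  targetPercent = 100
): string | null {
  // A separate salt decides enrollment so it stays independent of variant choice.
  const gate = fnv1a32(`${experimentId}:gate:${userId}`) % 10000;
  if (gate >= targetPercent * 100) return null;

  const bucket = bucketFor(experimentId, userId);
  let cumulative = 0;
  for (const allocation of allocations) {
    cumulative += allocation.percent * 100;
    if (bucket < cumulative) return allocation.variantId;
  }
  return null; // allocations sum to less than 100%
}

// Usage: the same inputs always yield the same variant.
const split: Allocation[] = [
  { variantId: "control", percent: 50 },
  { variantId: "treatment", percent: 50 },
];
console.log(assignVariant("exp_123", "user_42", split));
```

Including the experiment ID in the hash key keeps a user's bucket in one experiment uncorrelated with their bucket in another.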
|
||||
### With Telemetry Module
|
||||
|
||||
- Experiment events enriched with telemetry context
|
||||
- Automatic metric tracking from existing events
|
||||
- Funnel analysis using telemetry breadcrumbs
|
||||
|
||||
### With Event Bus
|
||||
|
||||
| Event                         | Action                              |
| ----------------------------- | ----------------------------------- |
| `ab.experiment.started`       | Notify stakeholders, log audit      |
| `ab.experiment.completed`     | Generate report, suggest follow-ups |
| `ab.variant.declared_winner`  | Trigger auto-rollout if enabled     |
| `ab.early_stopping.triggered` | Alert experiment owner              |
|
||||
|
||||
---
|
||||
|
||||
## Appendix E: Cost Estimation
|
||||
|
||||
| Component                    | Monthly Cost (est.)      |
| ---------------------------- | ------------------------ |
| Cosmos DB (experiment data)  | $100–200                 |
| LLM hypothesis generation    | $50–100 (weekly reports) |
| Compute (statistical engine) | $50 (negligible)         |
| **Total**                    | **$200–350/month**       |
|
||||
|
||||
---
|
||||
|
||||
## Current Status
|
||||
|
||||
- [ ] **Design complete** — Target: 2026-03-10
|
||||
- [ ] **Phase 1: Core Engine** — Not started
|
||||
- [ ] **Phase 2: Statistics** — Not started
|
||||
- [ ] **Phase 3: AI Hypotheses** — Not started
|
||||
- [ ] **Phase 4: Admin UI** — Not started
|
||||
- [ ] **Phase 5: Advanced** — Future
|
||||
|
||||
**Estimated Timeline:** 2.5–3 weeks (Phases 1–4)
|
||||
|
||||
**Dependencies:**
|
||||
|
||||
- Feature flags module (for assignment infrastructure)
|
||||
- Telemetry module (for event tracking)
|
||||
- Azure OpenAI (for hypothesis generation)
|
||||
|
||||
---
|
||||
|
||||
_Last Updated: 2026-03-03_
|
||||
848
docs/roadmaps/PREDICTIVE_CHURN_HEALTH_SCORING_ROADMAP.md
Normal file
@ -0,0 +1,848 @@
|
||||
# Predictive Churn & Health Scoring — Implementation Roadmap
|
||||
|
||||
> **Module:** `platform-service/src/modules/predictive-analytics/`
|
||||
> **Admin UI:** `/ops/health-dashboard/`
|
||||
> **Target:** ML-powered churn prediction, health scoring, and proactive retention
|
||||
> **Estimated Effort:** 3 weeks
|
||||
> **Status:** 🟡 Planning
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This roadmap delivers a **predictive analytics platform** that forecasts user churn 7–30 days in advance and computes product health scores from telemetry. Unlike reactive dashboards that show what happened, this system **predicts what will happen**—enabling proactive retention campaigns, resource allocation, and product improvements before users leave.
|
||||
|
||||
### Key Differentiators vs. Reactive Analytics
|
||||
|
||||
| Capability          | Traditional Analytics      | Predictive Churn & Health         |
| ------------------- | -------------------------- | --------------------------------- |
| Insight Type        | Historical (what happened) | **Predictive (what will happen)** |
| Churn Detection     | After user leaves          | **7–30 days before churn**        |
| Health View         | Current snapshot only      | **Trending + forecasted**         |
| Interventions       | Reactive recovery          | **Proactive prevention**          |
| Product Insights    | Manual pattern search      | **Auto-detected risk signals**    |
| Resource Allocation | Guesswork                  | **Risk-weighted prioritization**  |
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Feature Engineering Pipeline (Week 1)
|
||||
|
||||
### 1.1 Telemetry Feature Extraction
|
||||
|
||||
- [ ] **1.1.1** Create `modules/predictive-analytics/feature-extractor.ts`
|
||||
- [ ] User behavior features (session frequency, depth, recency)
|
||||
- [ ] Engagement features (feature usage diversity, core action completion)
|
||||
- [ ] Performance features (error rate, latency exposure, crash frequency)
|
||||
- [ ] Social features (sharing, collaboration, network effects)
|
||||
- [ ] Revenue features (payment history, plan changes, support tickets)
|
||||
- [ ] **1.1.2** Time-window aggregations
|
||||
- [ ] Last 24 hours (recent behavior)
|
||||
- [ ] Last 7 days (weekly patterns)
|
||||
- [ ] Last 30 days (monthly trends)
|
||||
- [ ] Life-to-date (all-time totals)
|
||||
- [ ] **1.1.3** Rolling window features
|
||||
- [ ] 7-day rolling average (trend smoothing)
|
||||
- [ ] Week-over-week change (acceleration)
|
||||
- [ ] Cohort-normalized scores (vs. similar users)
|
||||
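The rolling-window features can be illustrated with two small helpers over a per-user series of daily counts (oldest first). This is a sketch under the assumption that the series has one entry per calendar day with zeros filled in; `rollingAverage` and `weekOverWeekChange` are hypothetical names, not the planned `feature-extractor.ts` API.

```typescript
// Trailing rolling mean; emits one value per full window.
function rollingAverage(dailyCounts: number[], window = 7): number[] {
  const out: number[] = [];
  let sum = 0;
  for (let i = 0; i < dailyCounts.length; i++) {
    sum += dailyCounts[i];
    if (i >= window) sum -= dailyCounts[i - window]; // drop the day leaving the window
    if (i >= window - 1) out.push(sum / window);
  }
  return out;
}

// Relative change of the last 7 days vs the 7 days before them.
// Returns null when there is too little history or no prior activity.
function weekOverWeekChange(dailyCounts: number[]): number | null {
  if (dailyCounts.length < 14) return null;
  const recent = dailyCounts.slice(-14);
  const total = (xs: number[]) => xs.reduce((acc, x) => acc + x, 0);
  const previousWeek = total(recent.slice(0, 7));
  const currentWeek = total(recent.slice(7));
  if (previousWeek === 0) return null;
  return (currentWeek - previousWeek) / previousWeek;
}

// Usage: sessions per day over two weeks.
const sessions = [10, 10, 10, 10, 10, 10, 10, 4, 4, 4, 4, 4, 4, 4];
console.log(rollingAverage(sessions), weekOverWeekChange(sessions));
```

Returning `null` instead of `0` for missing history lets the model distinguish "no change" from "not enough data".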
|
||||
### 1.2 Feature Store
|
||||
|
||||
- [ ] **1.2.1** Create `modules/predictive-analytics/feature-store.ts`
|
||||
- [ ] `UserFeatureVector` — normalized feature values per user
|
||||
- [ ] `ProductHealthMetrics` — aggregated product-level scores
|
||||
- [ ] Feature versioning (track feature schema changes)
|
||||
- [ ] **1.2.2** Add Cosmos containers to `cosmos-init.ts`
|
||||
- [ ] `user_features` (pk: `/userId`, TTL: 90 days)
|
||||
- [ ] `product_health` (pk: `/productId` + `/date`, time-series)
|
||||
- [ ] `feature_definitions` (pk: `/productId`, feature metadata)
|
||||
- [ ] **1.2.3** Feature computation jobs
|
||||
- [ ] Daily feature computation (nightly batch)
|
||||
- [ ] Real-time feature updates (on key events)
|
||||
- [ ] Feature backfill (compute historical features)
|
||||
|
||||
### 1.3 Product-Specific Feature Catalog
|
||||
|
||||
- [ ] **1.3.1** Define features per product
|
||||
- [ ] **NomGap:** Fast completion rate, protocol adherence, streak length, autophagy engagement
|
||||
- [ ] **JarvisJr:** Session frequency, agent diversity, voice/text ratio, skill progression
|
||||
- [ ] **ChronoMind:** Timer completion rate, cascade effectiveness, routine adherence, urgency response
|
||||
- [ ] **MindLyst:** Brain usage diversity, triage accuracy, memory capture frequency, reflection completion
|
||||
- [ ] **PeakPulse:** Session frequency, goal completion, streak maintenance, social sharing
|
||||
- [ ] **LysnrAI:** Dictation frequency, accuracy rate, hotkey usage, vocabulary growth
|
||||
- [ ] **1.3.2** Feature importance tracking
|
||||
- [ ] Which features correlate with churn/retention
|
||||
- [ ] Feature drift detection (behavior changes over time)
|
||||
- [ ] Auto-suggest new features based on patterns
|
||||
|
||||
**Phase 1 Exit Criteria:**
|
||||
|
||||
- [ ] 50+ features extracted per product
|
||||
- [ ] Feature store populated for all active users
|
||||
- [ ] Daily feature computation job running
|
||||
- [ ] Feature importance analysis completed
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Churn Prediction Model (Week 1–2)
|
||||
|
||||
### 2.1 Model Architecture
|
||||
|
||||
- [ ] **2.1.1** Create `modules/predictive-analytics/churn-model.ts`
|
||||
- [ ] Binary classification (will churn in 7 days? 30 days?)
|
||||
- [ ] Gradient Boosted Trees (XGBoost/LightGBM) baseline
|
||||
- [ ] Neural network ensemble (for comparison)
|
||||
- [ ] **2.1.2** Training pipeline
|
||||
- [ ] Label definition: No activity for N days = churned
|
||||
- [ ] Train/validation/test split (time-based, not random)
|
||||
- [ ] Cross-validation with temporal folds
|
||||
- [ ] Hyperparameter tuning (Optuna/Ray Tune)
|
||||
- [ ] **2.1.3** Model evaluation
|
||||
- [ ] ROC-AUC (discrimination ability)
|
||||
- [ ] Precision/Recall at different thresholds
|
||||
- [ ] Calibration (predicted prob vs. actual rate)
|
||||
- [ ] Per-product performance breakdown
|
||||
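The label definition and the time-based split can be sketched as below. The point of the split is that every training row predates every validation and test row, so the model never sees the future. `isChurned` and `temporalSplit` are hypothetical helpers, and the `snapshotAt` field is an assumed name for the time at which a user's features were computed.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Label: churned if there is no activity in (snapshot, snapshot + horizonDays].
function isChurned(activityTimestamps: string[], snapshotAt: string, horizonDays: number): boolean {
  const start = Date.parse(snapshotAt);
  const end = start + horizonDays * DAY_MS;
  return !activityTimestamps.some((ts) => {
    const t = Date.parse(ts);
    return t > start && t <= end;
  });
}

// Split rows by snapshot time: train < trainEnd <= validation < validationEnd <= test.
function temporalSplit<T extends { snapshotAt: string }>(
  rows: T[],
  trainEnd: string,
  validationEnd: string
): { train: T[]; validation: T[]; test: T[] } {
  const trainCutoff = Date.parse(trainEnd);
  const validationCutoff = Date.parse(validationEnd);
  const train: T[] = [];
  const validation: T[] = [];
  const test: T[] = [];
  for (const row of rows) {
    const t = Date.parse(row.snapshotAt);
    if (t < trainCutoff) train.push(row);
    else if (t < validationCutoff) validation.push(row);
    else test.push(row);
  }
  return { train, validation, test };
}
```

A random split would leak information here, because snapshots of the same user taken on adjacent days are highly correlated.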
|
||||
### 2.2 Prediction Service
|
||||
|
||||
- [ ] **2.2.1** Real-time scoring API
|
||||
- [ ] `POST /predictive/churn-score` — single user prediction
|
||||
- [ ] `POST /predictive/churn-batch` — batch scoring
|
||||
- [ ] Latency < 100ms for single prediction
|
||||
- [ ] **2.2.2** Risk segmentation
|
||||
- [ ] Risk buckets: Critical (>80%), High (60–80%), Medium (30–60%), Low (<30%)
|
||||
- [ ] Risk score components (which features drive the score)
|
||||
- [ ] Confidence intervals on predictions
|
||||
- [ ] **2.2.3** Model versioning
|
||||
- [ ] A/B test model versions
|
||||
- [ ] Shadow mode (predict without acting)
|
||||
- [ ] Rollback capability
|
||||
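A small sketch of the risk segmentation follows. The bucket cutoffs come from the list above; how the shared boundaries are resolved (exactly 0.80 counts as High, exactly 0.60 as High, exactly 0.30 as Medium) is an assumption, as are the names `riskBucket` and `segmentByRisk`.

```typescript
type RiskBucket = "critical" | "high" | "medium" | "low";

// Maps a churn probability in [0, 1] to a risk bucket.
function riskBucket(churnProbability: number): RiskBucket {
  if (churnProbability > 0.8) return "critical";
  if (churnProbability >= 0.6) return "high";
  if (churnProbability >= 0.3) return "medium";
  return "low";
}

interface ScoredUser {
  userId: string;
  churnProbability: number;
}

// Groups users by bucket, each group sorted by descending risk.
function segmentByRisk(users: ScoredUser[]): Record<RiskBucket, ScoredUser[]> {
  const segments: Record<RiskBucket, ScoredUser[]> = {
    critical: [],
    high: [],
    medium: [],
    low: [],
  };
  for (const user of users) {
    segments[riskBucket(user.churnProbability)].push(user);
  }
  for (const bucket of Object.keys(segments) as RiskBucket[]) {
    segments[bucket].sort((a, b) => b.churnProbability - a.churnProbability);
  }
  return segments;
}
```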
|
||||
### 2.3 Explanation Engine
|
||||
|
||||
- [ ] **2.3.1** SHAP value computation
|
||||
- [ ] Feature contributions to each prediction
|
||||
- [ ] Global feature importance (what drives churn overall)
|
||||
- [ ] Local explanations (why this specific user is at risk)
|
||||
- [ ] **2.3.2** Natural language explanations
|
||||
```
"This user shows 78% churn risk because:
- Session frequency dropped 60% in the last week
- No core feature usage in 5 days
- Error rate increased 3x vs. their baseline
- Similar users who showed these patterns had 85% churn rate"
```
|
||||
- [ ] **2.3.3** Actionable insight extraction
|
||||
- [ ] Top 3 risk factors per user
|
||||
- [ ] Suggested intervention based on risk profile
|
||||
- [ ] Priority ranking (who to contact first)
|
||||
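Given per-feature contributions, the top risk factors and a plain-language explanation could be assembled roughly as follows. The sketch assumes SHAP values are computed elsewhere (for example by the model service) and arrive with a human-readable description per feature. `FeatureContribution`, `topRiskFactors`, and `explainRisk` are hypothetical names, and positive values are taken to push the prediction toward churn.

```typescript
interface FeatureContribution {
  feature: string; // machine name, e.g. "session_frequency_wow"
  description: string; // human-readable statement for this user
  shapValue: number; // > 0 pushes the prediction toward churn
}

// The n contributions that increase churn risk the most.
function topRiskFactors(contributions: FeatureContribution[], n = 3): FeatureContribution[] {
  return contributions
    .filter((c) => c.shapValue > 0)
    .sort((a, b) => b.shapValue - a.shapValue)
    .slice(0, n);
}

// Builds a short explanation in the style of the example above.
function explainRisk(churnProbability: number, contributions: FeatureContribution[]): string {
  const percent = Math.round(churnProbability * 100);
  const factors = topRiskFactors(contributions);
  if (factors.length === 0) {
    return `This user shows ${percent}% churn risk with no single dominant risk factor.`;
  }
  const bullets = factors.map((f) => `- ${f.description}`).join("\n");
  return `This user shows ${percent}% churn risk because:\n${bullets}`;
}
```

A template like this is deterministic and cheap; an LLM pass could later rephrase the bullets without changing which factors are reported.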
|
||||
**Phase 2 Exit Criteria:**
|
||||
|
||||
- [ ] Model achieves ROC-AUC > 0.75 on the test set
|
||||
- [ ] Real-time scoring API < 100ms latency
|
||||
- [ ] Explanations generated for all predictions
|
||||
- [ ] Risk segmentation validated against historical churn
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Product Health Scoring (Week 2)
|
||||
|
||||
### 3.1 Health Metric Framework
|
||||
|
||||
- [ ] **3.1.1** Create `modules/predictive-analytics/health-scoring.ts`
|
||||
- [ ] Health dimensions: Acquisition, Activation, Retention, Revenue, Engagement
|
||||
- [ ] Composite health score (weighted average)
|
||||
- [ ] Per-dimension scores with drill-down
|
||||
- [ ] **3.1.2** Product health indicators
|
||||
- [ ] Daily Active Users (DAU) trend
|
||||
- [ ] New user activation rate (Day-1, Day-7)
|
||||
- [ ] Cohort retention curves (Day-1, Day-7, Day-30)
|
||||
- [ ] Feature adoption rates (new feature uptake)
|
||||
- [ ] Error rates and stability scores
|
||||
- [ ] Support ticket volume and sentiment
|
||||
- [ ] Revenue metrics (MRR, ARPU, LTV)
|
||||
|
||||
### 3.2 Health Score Computation
|
||||
|
||||
- [ ] **3.2.1** Baseline establishment
|
||||
- [ ] Historical 90-day baseline for each metric
|
||||
- [ ] Peer product comparison (ChronoMind vs. JarvisJr benchmarks)
|
||||
- [ ] Industry benchmarks (if available)
|
||||
- [ ] **3.2.2** Scoring algorithm
|
||||
- [ ] Z-score normalization (how many std devs from baseline)
|
||||
- [ ] Trend direction (improving vs. declining)
|
||||
- [ ] Volatility adjustment (consistent vs. erratic)
|
||||
- [ ] 0–100 health score scale
|
||||
- [ ] **3.2.3** Alert thresholds
|
||||
- [ ] Critical: Score < 60 or 20% drop from baseline
|
||||
- [ ] Warning: Score 60–75 or 10% drop
|
||||
- [ ] Healthy: Score > 75 and stable
|
||||
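The scoring steps might fit together as in the sketch below. The z-score and the alert thresholds follow the lists above, but the mapping from z-score to the 0–100 scale (`80 + 10 * z`, clamped) is purely an assumed placeholder, chosen so that a metric sitting at its baseline lands in the healthy band. All function names are hypothetical.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

// Sample standard deviation (n - 1 denominator).
function stdDev(xs: number[]): number {
  const m = mean(xs);
  const variance = xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

// How many standard deviations the current value sits from its baseline.
function zScore(value: number, baseline: number[]): number {
  const sd = stdDev(baseline);
  return sd === 0 ? 0 : (value - mean(baseline)) / sd;
}

// Assumed mapping: z = 0 -> 80, each standard deviation moves the score by 10.
function healthScoreFromZ(z: number): number {
  return Math.max(0, Math.min(100, 80 + 10 * z));
}

// Weighted average of per-dimension scores.
function compositeScore(dimensions: { score: number; weight: number }[]): number {
  const totalWeight = dimensions.reduce((acc, d) => acc + d.weight, 0);
  return dimensions.reduce((acc, d) => acc + d.score * d.weight, 0) / totalWeight;
}

type HealthStatus = "critical" | "warning" | "healthy";

// Applies the alert thresholds: absolute score or relative drop from baseline.
function classifyHealth(score: number, baselineScore: number): HealthStatus {
  const drop = baselineScore > 0 ? (baselineScore - score) / baselineScore : 0;
  if (score < 60 || drop >= 0.2) return "critical";
  if (score <= 75 || drop >= 0.1) return "warning";
  return "healthy";
}
```

For metrics where lower is better, such as error rate, the z-score would need its sign flipped before mapping to a health score.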
|
||||
### 3.3 Anomaly Detection
|
||||
|
||||
- [ ] **3.3.1** Statistical anomaly detection
|
||||
- [ ] Prophet/ARIMA for time-series forecasting
|
||||
- [ ] Forecast vs. actual deviation detection
|
||||
- [ ] Seasonal pattern recognition (day-of-week, monthly)
|
||||
- [ ] **3.3.2** Multi-dimensional anomaly detection
|
||||
- [ ] Correlation breakdown detection (metrics usually correlated diverging)
|
||||
- [ ] Cohort-specific anomalies (specific region, platform, segment)
|
||||
- [ ] **3.3.3** Root cause suggestion
|
||||
- [ ] Correlation with deployments/releases
|
||||
- [ ] Error spike correlation
|
||||
- [ ] External factor detection (holidays, events)
|
||||
|
||||
**Phase 3 Exit Criteria:**
|
||||
|
||||
- [ ] Health scores computed daily for all products
|
||||
- [ ] Anomaly detection with < 5% false positive rate
|
||||
- [ ] Historical baseline established for all metrics
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Proactive Intervention System (Week 2–3)
|
||||
|
||||
### 4.1 Retention Campaign Automation
|
||||
|
||||
- [ ] **4.1.1** Campaign trigger rules
|
||||
- [ ] High-risk user enters segment → trigger email
|
||||
- [ ] Medium-risk + specific behavior → trigger in-app message
|
||||
- [ ] Critical risk → trigger personal outreach task
|
||||
- [ ] **4.1.2** Personalized messaging
|
||||
- [ ] Message variant based on risk factors
|
||||
- [ ] Feature recommendations based on unused capabilities
|
||||
- [ ] Success stories from similar users
|
||||
- [ ] **4.1.3** Campaign effectiveness tracking
|
||||
- [ ] Control group vs. treatment
|
||||
- [ ] Churn rate comparison
|
||||
- [ ] Revenue impact measurement
|
||||
|
||||
### 4.2 Auto-Trigger Flows
|
||||
|
||||
- [ ] **4.2.1** Platform integrations
|
||||
- [ ] Email delivery via existing `modules/delivery/`
|
||||
- [ ] Push notifications via `modules/notifications/`
|
||||
- [ ] Slack notifications for CS team
|
||||
- [ ] CRM integration (create outreach tasks)
|
||||
- [ ] **4.2.2** Smart scheduling
|
||||
- [ ] Optimal contact time prediction
|
||||
- [ ] Frequency capping (don't spam)
|
||||
- [ ] Multi-channel orchestration
|
||||
- [ ] **4.2.3** Feedback loop
|
||||
- [ ] Track intervention outcomes
|
||||
- [ ] Retrain model with intervention effectiveness
|
||||
- [ ] A/B test intervention strategies
|
||||
|
||||
### 4.3 Risk Dashboard for CS Team
|
||||
|
||||
- [ ] **4.3.1** At-risk user list
|
||||
- [ ] Sortable by churn probability
|
||||
- [ ] Filter by product, segment, risk factors
|
||||
- [ ] Last activity preview
|
||||
- [ ] **4.3.2** User risk profile
|
||||
- [ ] Churn probability trend over time
|
||||
- [ ] Key risk factors highlighted
|
||||
- [ ] Recommended actions
|
||||
- [ ] User activity timeline
|
||||
- [ ] **4.3.3** Intervention tracking
|
||||
- [ ] Contact history
|
||||
- [ ] Response tracking
|
||||
- [ ] Outcome recording (retained/churned)
|
||||
|
||||
**Phase 4 Exit Criteria:**
|
||||
|
||||
- [ ] Automated campaigns triggered for high-risk users
|
||||
- [ ] CS team dashboard with at-risk user queue
|
||||
- [ ] Intervention effectiveness measurement in place
|
||||
- [ ] Feedback loop improving model accuracy
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Admin Dashboard UI (Week 3)

### 5.1 Product Health Overview

- [ ] **5.1.1** Create `/ops/health-dashboard/page.tsx`
  - [ ] Health score cards for each product
  - [ ] Trend sparklines (7-day, 30-day)
  - [ ] Alert summary (critical issues count)
  - [ ] Product comparison table
- [ ] **5.1.2** Health detail view
  - [ ] Dimension breakdown (acquisition, activation, retention, etc.)
  - [ ] Metric time-series charts
  - [ ] Anomaly markers on charts
  - [ ] Cohort retention curves

### 5.2 Churn Prediction Dashboard

- [ ] **5.2.1** Churn risk overview
  - [ ] Risk distribution pie chart
  - [ ] At-risk user count by product
  - [ ] Predicted churn impact (revenue at risk)
  - [ ] Model performance metrics (AUC, calibration)
- [ ] **5.2.2** User risk explorer
  - [ ] Search/filter at-risk users
  - [ ] Risk score with explanation
  - [ ] Top risk factors
  - [ ] Recommended interventions
- [ ] **5.2.3** Model insights
  - [ ] Global feature importance chart
  - [ ] Model performance over time
  - [ ] Feature drift alerts

### 5.3 Campaign Management

- [ ] **5.3.1** Campaign list
  - [ ] Active/paused/completed campaigns
  - [ ] Trigger rules summary
  - [ ] Performance stats (sent, opened, converted)
- [ ] **5.3.2** Campaign editor
  - [ ] Trigger condition builder
  - [ ] Message template editor
  - [ ] Audience targeting
  - [ ] A/B test configuration
- [ ] **5.3.3** Campaign analytics
  - [ ] Funnel: triggered → sent → opened → retained
  - [ ] Revenue impact
  - [ ] Comparison to control group

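The campaign analytics in 5.3.3 (funnel rates and the comparison to a control group) could be computed from the `stats` block of `RetentionCampaignDoc` roughly as follows. This is a sketch under assumptions: `summarizeCampaign` and `ratio` are hypothetical helpers, and both churn rates are assumed to be fractions between 0 and 1.

```typescript
// Mirrors the `stats` block of RetentionCampaignDoc (Appendix A).
interface CampaignStats {
  triggered: number;
  sent: number;
  opened: number;
  clicked: number;
  converted: number;
  controlGroupSize: number;
  controlChurnRate: number; // assumed fraction, 0 to 1
  treatmentChurnRate: number; // assumed fraction, 0 to 1
}

// Guards against division by zero for campaigns with no traffic yet.
function ratio(numerator: number, denominator: number): number {
  return denominator > 0 ? numerator / denominator : 0;
}

function summarizeCampaign(stats: CampaignStats) {
  const absoluteChurnReduction = stats.controlChurnRate - stats.treatmentChurnRate;
  return {
    sendRate: ratio(stats.sent, stats.triggered),
    openRate: ratio(stats.opened, stats.sent),
    conversionRate: ratio(stats.converted, stats.sent),
    // Difference in churn rate between control and treated users.
    absoluteChurnReduction,
    // Share of control-group churn that the campaign avoided.
    relativeChurnReduction: ratio(absoluteChurnReduction, stats.controlChurnRate),
  };
}
```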
**Phase 5 Exit Criteria:**

- [ ] Health dashboard shows all products with trends
- [ ] Churn predictions visible with explanations
- [ ] Campaign creation and management functional
- [ ] Full test coverage

---

## Phase 6: Advanced Capabilities (Future)

### 6.1 Cohort-Specific Models

- [ ] Segment-specific churn models (iOS vs. Android, free vs. pro)
- [ ] Regional models (different behaviors by geography)
- [ ] Temporal models (seasonal churn patterns)

### 6.2 LTV Prediction

- [ ] Predict lifetime value at signup
- [ ] Predict upgrade probability (free → pro)
- [ ] Optimize acquisition channels by predicted LTV

### 6.3 Product Recommendations

- [ ] Suggest features to at-risk users based on successful cohorts
- [ ] Personalized onboarding based on predicted needs
- [ ] Next-best-action recommendations

---

## Appendix A: Data Models

### UserChurnPredictionDoc

```typescript
interface UserChurnPredictionDoc {
  id: string; // cp_<uuid>
  userId: string; // partition key
  productId: string;

  // Prediction
  predictionHorizon: 7 | 14 | 30; // Days
  churnProbability: number; // 0–1
  riskSegment: 'critical' | 'high' | 'medium' | 'low';

  // Feature vector snapshot
  features: Record<string, number>; // Normalized feature values
  featureVersion: string; // Schema version

  // Model info
  modelVersion: string;
  modelType: 'xgboost' | 'neural';
  predictionTimestamp: string;

  // Explanation (SHAP values)
  explanation: {
    topRiskFactors: Array<{
      feature: string;
      contribution: number; // SHAP value
      direction: 'positive' | 'negative'; // Increases or decreases churn risk
    }>;
    globalFeatureImportance: Array<{
      feature: string;
      importance: number;
    }>;
  };

  // Natural language summary
  nlExplanation: string; // Auto-generated explanation

  // Intervention
  suggestedActions: string[];
  interventionHistory: Array<{
    action: string;
    timestamp: string;
    outcome?: 'responded' | 'ignored' | 'churned' | 'retained';
  }>;

  // Validation (ground truth)
  actualChurned?: boolean;
  validationDate?: string;

  createdAt: string;
  ttl: number; // predictionHorizon + 90 days
}
```
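The `riskSegment` and `explanation.topRiskFactors` fields of `UserChurnPredictionDoc` could be derived from the raw model output along these lines. This is a sketch under stated assumptions: the probability cut-offs are illustrative placeholders (the roadmap does not define them), the helper names are hypothetical, and the SHAP values are assumed to arrive as a plain feature-to-contribution map.

```typescript
type RiskSegment = 'critical' | 'high' | 'medium' | 'low';

interface RiskFactor {
  feature: string;
  contribution: number; // SHAP value
  direction: 'positive' | 'negative';
}

// Illustrative cut-offs only (assumption); tune them against calibrated model output.
const RISK_THRESHOLDS: Array<[number, RiskSegment]> = [
  [0.8, 'critical'],
  [0.6, 'high'],
  [0.3, 'medium'],
];

function toRiskSegment(churnProbability: number): RiskSegment {
  if (!(churnProbability >= 0 && churnProbability <= 1)) {
    throw new RangeError('churnProbability must be between 0 and 1');
  }
  for (const [minProbability, segment] of RISK_THRESHOLDS) {
    if (churnProbability >= minProbability) return segment;
  }
  return 'low';
}

// Ranks features by absolute SHAP contribution and keeps the k largest.
function topRiskFactors(shapValues: Record<string, number>, k: number): RiskFactor[] {
  return Object.keys(shapValues)
    .map((feature): RiskFactor => {
      const contribution = shapValues[feature];
      return {
        feature,
        contribution,
        // A positive SHAP value pushes the prediction toward churn.
        direction: contribution >= 0 ? 'positive' : 'negative',
      };
    })
    .sort((a, b) => Math.abs(b.contribution) - Math.abs(a.contribution))
    .slice(0, k);
}
```

Keeping the thresholds in one table makes it straightforward to version them alongside `modelVersion`, since a recalibrated model may need different cut-offs.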
### ProductHealthScoreDoc

```typescript
interface ProductHealthScoreDoc {
  id: string; // ph_<uuid>
  productId: string; // partition key
  date: string; // Sort key (YYYY-MM-DD)

  // Composite score
  overallHealthScore: number; // 0–100
  healthStatus: 'critical' | 'warning' | 'healthy';

  // Dimension scores
  dimensions: {
    acquisition: {
      score: number; // 0–100
      metrics: {
        newUsers: number;
        activationRateDay1: number;
        activationRateDay7: number;
        cac: number;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
    activation: {
      score: number;
      metrics: {
        firstValueMomentRate: number;
        timeToFirstAction: number;
        onboardingCompletionRate: number;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
    retention: {
      score: number;
      metrics: {
        dau: number;
        mau: number;
        dauMauRatio: number;
        day7Retention: number;
        day30Retention: number;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
    engagement: {
      score: number;
      metrics: {
        avgSessionLength: number;
        sessionsPerUser: number;
        featureAdoption: Record<string, number>;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
    revenue: {
      score: number;
      metrics: {
        mrr: number;
        arpu: number;
        churnRate: number;
        upgradeRate: number;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
    stability: {
      score: number;
      metrics: {
        crashFreeRate: number;
        errorRate: number;
        avgLatency: number;
        uptimePercent: number;
      };
      trend: 'improving' | 'stable' | 'declining';
    };
  };

  // Anomalies detected
  anomalies: Array<{
    metric: string;
    expectedValue: number;
    actualValue: number;
    deviationPercent: number;
    severity: 'critical' | 'warning';
    suggestedCause?: string;
  }>;

  // Forecasts
  forecasts: {
    next7Days: {
      expectedHealthScore: number;
      confidenceInterval: [number, number];
    };
    next30Days: {
      expectedHealthScore: number;
      confidenceInterval: [number, number];
    };
  };

  // Benchmarks
  vsBaseline7Day: number; // % change vs. 7-day baseline
  vsBaseline30Day: number; // % change vs. 30-day baseline

  createdAt: string;
  ttl: number; // 2 years
}
```
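One way to combine the six dimension scores of `ProductHealthScoreDoc` into `overallHealthScore` and `healthStatus` is a weighted average with fixed status bands. The weights and the 50/75 cut-offs below are placeholders chosen for illustration, not values from this roadmap, and the function names are hypothetical.

```typescript
type Dimension =
  | 'acquisition'
  | 'activation'
  | 'retention'
  | 'engagement'
  | 'revenue'
  | 'stability';

type HealthStatus = 'critical' | 'warning' | 'healthy';

// Hypothetical weights (assumption); the result is divided by their sum,
// so they do not have to add up to exactly 1.
const DIMENSION_WEIGHTS: Record<Dimension, number> = {
  acquisition: 0.1,
  activation: 0.15,
  retention: 0.25,
  engagement: 0.2,
  revenue: 0.15,
  stability: 0.15,
};

// Weighted average of per-dimension scores, each on a 0 to 100 scale.
function overallHealthScore(scores: Record<Dimension, number>): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const dimension of Object.keys(DIMENSION_WEIGHTS) as Dimension[]) {
    weightedSum += DIMENSION_WEIGHTS[dimension] * scores[dimension];
    totalWeight += DIMENSION_WEIGHTS[dimension];
  }
  return weightedSum / totalWeight;
}

// Illustrative status bands (assumption): below 50 critical, below 75 warning.
function toHealthStatus(score: number): HealthStatus {
  if (score < 50) return 'critical';
  if (score < 75) return 'warning';
  return 'healthy';
}
```

Dividing by the total weight keeps the composite on the same 0 to 100 scale as the dimension scores even if the weights are later adjusted per product.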
### RetentionCampaignDoc

```typescript
interface RetentionCampaignDoc {
  id: string; // rc_<uuid>
  productId: string; // partition key

  // Campaign definition
  name: string;
  description: string;
  status: 'draft' | 'active' | 'paused' | 'completed';

  // Trigger conditions
  trigger: {
    type: 'churn_risk' | 'health_score_drop' | 'behavioral' | 'scheduled';
    conditions: Array<{
      field: string;
      operator: 'gt' | 'lt' | 'eq' | 'in';
      value: unknown;
    }>;
  };

  // Audience
  audience: {
    riskSegments?: string[]; // 'critical', 'high', etc.
    products?: string[];
    userSegments?: string[];
    excludeRecentContact?: number; // Hours (frequency capping)
  };

  // Message content
  messages: Array<{
    channel: 'email' | 'push' | 'in_app' | 'slack_cs';
    templateId: string;
    variant?: string; // A/B test variant
    delayHours?: number; // Delay after trigger
    conditions?: Array<{
      field: string;
      operator: string;
      value: unknown;
    }>;
  }>;

  // Performance tracking
  stats: {
    triggered: number;
    sent: number;
    opened: number;
    clicked: number;
    converted: number;
    controlGroupSize: number;
    controlChurnRate: number;
    treatmentChurnRate: number;
  };

  createdAt: string;
  updatedAt: string;
  ttl: number; // 1 year after completion
}
```
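The `trigger.conditions` array of `RetentionCampaignDoc` implies a small rule evaluator. The sketch below is one possible reading, with assumptions stated up front: conditions are combined with logical AND, `field` names a top-level key of a flat attribute record, and `matchesCondition` and `matchesTrigger` are hypothetical helper names. The roadmap does not specify any of these semantics.

```typescript
type Operator = 'gt' | 'lt' | 'eq' | 'in';

interface TriggerCondition {
  field: string;
  operator: Operator;
  value: unknown;
}

// Evaluates one condition against a flat record of user attributes.
function matchesCondition(
  subject: Record<string, unknown>,
  condition: TriggerCondition,
): boolean {
  const actual = subject[condition.field];
  const expected = condition.value;
  switch (condition.operator) {
    case 'gt':
      return typeof actual === 'number' && typeof expected === 'number' && actual > expected;
    case 'lt':
      return typeof actual === 'number' && typeof expected === 'number' && actual < expected;
    case 'eq':
      return actual === expected;
    case 'in':
      return Array.isArray(expected) && expected.indexOf(actual) !== -1;
    default:
      return false; // unknown operator: treat as no match
  }
}

// Assumed semantics: a trigger fires only when every condition matches.
function matchesTrigger(
  subject: Record<string, unknown>,
  conditions: TriggerCondition[],
): boolean {
  return conditions.every((condition) => matchesCondition(subject, condition));
}
```

With this shape, a rule such as "churn probability above 0.7 and risk segment in critical or high" becomes two entries in `conditions`, which maps directly onto the trigger condition builder planned in 5.3.2.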
### UserFeatureVectorDoc

```typescript
interface UserFeatureVectorDoc {
  id: string; // fv_<uuid>
  userId: string; // partition key
  productId: string;

  // Computed features
  features: {
    // Recency features
    daysSinceLastSession: number;
    daysSinceLastCoreAction: number;

    // Frequency features
    sessionsLast7Days: number;
    sessionsLast30Days: number;
    avgSessionsPerWeek: number;

    // Engagement depth
    avgSessionDuration: number;
    actionsPerSession: number;
    uniqueFeaturesUsed: number;

    // Product-specific (examples)
    // NomGap
    fastCompletionRate?: number;
    streakLength?: number;

    // JarvisJr
    agentDiversityScore?: number;
    voiceSessionRatio?: number;

    // ChronoMind
    timerCompletionRate?: number;
    routineAdherenceScore?: number;

    // Error/stability
    errorRateLast7Days: number;
    crashCountLast30Days: number;

    // Revenue
    planTier: number; // 0=free, 1=pro, 2=enterprise
    lifetimeValue: number;
    daysSinceLastPayment?: number;
  };

  // Normalized (0–1) for model input
  normalizedFeatures: Record<string, number>;

  // Metadata
  featureSchemaVersion: string;
  computedAt: string;

  // Time windows
  observationWindow: {
    start: string;
    end: string;
  };

  ttl: number; // 90 days
}
```
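The `normalizedFeatures` map of `UserFeatureVectorDoc` is described only as "normalized (0–1)". A common choice is min-max scaling with clamping, sketched below. The scaling method, the `FeatureBounds` type, and the `normalizeFeatures` helper are assumptions for illustration; the roadmap does not say which normalization is intended.

```typescript
// Hypothetical per-feature bounds, for example taken from the training set.
interface FeatureBounds {
  min: number;
  max: number;
}

// Min-max scales each raw feature into [0, 1]; values outside the bounds are clamped.
// Features without usable bounds are omitted from the output.
function normalizeFeatures(
  raw: Record<string, number>,
  bounds: Record<string, FeatureBounds>,
): Record<string, number> {
  const normalized: Record<string, number> = {};
  for (const key of Object.keys(raw)) {
    const value = raw[key];
    const featureBounds = bounds[key];
    if (value === undefined || featureBounds === undefined) continue;
    if (featureBounds.max <= featureBounds.min) continue; // degenerate range
    const scaled = (value - featureBounds.min) / (featureBounds.max - featureBounds.min);
    normalized[key] = Math.min(1, Math.max(0, scaled));
  }
  return normalized;
}
```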
---

## Implementation Tracking

| Phase | Task                          | Status | Commit |
| ----- | ----------------------------- | ------ | ------ |
| 1.1   | Telemetry feature extraction  | ⬜     | —      |
| 1.1   | Time-window aggregations      | ⬜     | —      |
| 1.1   | Rolling window features       | ⬜     | —      |
| 1.2   | Feature store                 | ⬜     | —      |
| 1.2   | Cosmos containers             | ⬜     | —      |
| 1.2   | Feature computation jobs      | ⬜     | —      |
| 1.3   | Product-specific features     | ⬜     | —      |
| 1.3   | Feature importance tracking   | ⬜     | —      |
| 2.1   | XGBoost model architecture    | ⬜     | —      |
| 2.1   | Training pipeline             | ⬜     | —      |
| 2.1   | Model evaluation              | ⬜     | —      |
| 2.2   | Real-time scoring API         | ⬜     | —      |
| 2.2   | Risk segmentation             | ⬜     | —      |
| 2.2   | Model versioning              | ⬜     | —      |
| 2.3   | SHAP explanations             | ⬜     | —      |
| 2.3   | Natural language explanations | ⬜     | —      |
| 2.3   | Actionable insights           | ⬜     | —      |
| 3.1   | Health metric framework       | ⬜     | —      |
| 3.1   | Health indicators             | ⬜     | —      |
| 3.2   | Baseline establishment        | ⬜     | —      |
| 3.2   | Scoring algorithm             | ⬜     | —      |
| 3.2   | Alert thresholds              | ⬜     | —      |
| 3.3   | Anomaly detection             | ⬜     | —      |
| 4.1   | Campaign trigger rules        | ⬜     | —      |
| 4.1   | Personalized messaging        | ⬜     | —      |
| 4.2   | Platform integrations         | ⬜     | —      |
| 4.3   | CS team dashboard             | ⬜     | —      |
| 5.1   | Health overview UI            | ⬜     | —      |
| 5.2   | Churn prediction dashboard    | ⬜     | —      |
| 5.3   | Campaign management           | ⬜     | —      |

**Legend:** ⬜ Not started | 🟡 In progress | ✅ Complete | ⏸️ Deferred

---

## Quick Reference for Implementing Agent

**📋 Full Roadmap:** `/Users/sd9235/code/mygh/learning_ai_common_plat/docs/roadmaps/PREDICTIVE_CHURN_HEALTH_SCORING_ROADMAP.md`

**Key Files to Modify/Create:**

```
services/platform-service/
├── src/
│   ├── modules/predictive-analytics/
│   │   ├── types.ts                     # [1.2] Feature, HealthScore, Prediction types
│   │   ├── repository.ts                # Data access layer
│   │   ├── feature-extractor.ts         # [1.1] Telemetry → features
│   │   ├── feature-store.ts             # [1.2] Feature vector storage
│   │   ├── churn-model.ts               # [2.1] XGBoost training & inference
│   │   ├── scoring-api.ts               # [2.2] Real-time prediction endpoint
│   │   ├── explanation-engine.ts        # [2.3] SHAP + NL explanations
│   │   ├── health-scoring.ts            # [3] Health dimension calculation
│   │   ├── anomaly-detection.ts         # [3.3] Prophet/ARIMA forecasting
│   │   ├── campaign-engine.ts           # [4] Retention automation
│   │   ├── routes.ts                    # [5] REST API
│   │   └── predictive-analytics.test.ts # Tests
│   ├── lib/
│   │   └── cosmos-init.ts               # [1.2] Add containers
│   └── server.ts                        # Register routes
dashboards/admin-web/
├── src/
│   ├── app/(dashboard)/
│   │   ├── health-dashboard/
│   │   │   └── page.tsx                 # [5.1] Product health overview
│   │   └── predictive/
│   │       ├── at-risk/
│   │       │   └── page.tsx             # [4.3] At-risk user list
│   │       └── campaigns/
│   │           └── page.tsx             # [5.3] Campaign management
│   ├── lib/
│   │   └── predictive-client.ts         # API client
│   └── components/
│       └── predictive/                  # Risk cards, health charts
```

**Commit Message Format:**

```
feat(predictive-analytics): <description> [<task.code>]
```

**Example:**

```bash
git add services/platform-service/src/modules/predictive-analytics/
git commit -m "feat(predictive-analytics): add feature extraction and store [1.1-1.2]"
```

**Testing Requirements:**

- Unit tests: 20+ Vitest tests for feature extraction, model inference
- Model validation: AUC, calibration, precision@k metrics
- Integration: End-to-end prediction pipeline

**Dependencies:**

- Telemetry module (feature extraction)
- Delivery module (retention campaigns)
- Azure ML or scikit-learn (model training)

---

## Appendix B: API Reference

| Method | Endpoint                               | Auth          | Description                   |
| ------ | -------------------------------------- | ------------- | ----------------------------- |
| GET    | `/predictive/health`                   | Admin         | Get all product health scores |
| GET    | `/predictive/health/:productId`        | Admin         | Get product health detail     |
| GET    | `/predictive/health/:productId/trends` | Admin         | Historical health trends      |
| POST   | `/predictive/churn-score`              | Admin/Service | Get churn prediction for user |
| POST   | `/predictive/churn-batch`              | Admin         | Batch churn scoring           |
| GET    | `/predictive/at-risk-users`            | Admin/CS      | List users by risk segment    |
| GET    | `/predictive/users/:id/risk-profile`   | Admin/CS      | User churn risk details       |
| GET    | `/predictive/model/performance`        | Admin         | Model accuracy metrics        |
| GET    | `/predictive/model/features`           | Admin         | Feature importance ranking    |
| GET    | `/predictive/campaigns`                | Admin         | List retention campaigns      |
| POST   | `/predictive/campaigns`                | Admin         | Create campaign               |
| PATCH  | `/predictive/campaigns/:id`            | Admin         | Update campaign               |
| GET    | `/predictive/campaigns/:id/stats`      | Admin         | Campaign performance          |
| POST   | `/predictive/campaigns/:id/trigger`    | Admin         | Manual trigger for testing    |

---

## Appendix C: Integration Points

### With Telemetry Module

- Raw events feed feature extraction
- Error rates flow into health scores
- Correlation IDs link behaviors to predictions

### With Diagnostics Module

- Debug sessions enrich feature vectors
- Error clusters correlate with churn risk
- Screenshot patterns analyzed for UX issues

### With Event Bus

| Event                               | Action                                   |
| ----------------------------------- | ---------------------------------------- |
| `predictive.churn.risk_detected`    | Trigger retention campaign               |
| `predictive.health.critical`        | Alert leadership, suggest debug sessions |
| `predictive.anomaly.detected`       | Create incident, notify on-call          |
| `user.retention.campaign_responded` | Update model with outcome                |

### With Delivery Module

- Retention campaigns use email templates
- Push notifications for urgent interventions
- A/B test message variants

---

## Appendix D: Cost Estimation

| Component                            | Monthly Cost (est.)        |
| ------------------------------------ | -------------------------- |
| Cosmos DB (features + predictions)   | $150–300                   |
| Model training (Azure ML)            | $100–200                   |
| Inference compute                    | $50–100                    |
| Email delivery (retention campaigns) | $50–200 (volume-dependent) |
| **Total**                            | **$350–800/month**         |

ROI: If the system prevents 5% of predicted churn at $50 LTV with 10K at-risk users/month:

- 500 users retained × $50 = $25K/month value
- 10:1+ ROI

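The ROI arithmetic above can be checked against the cost table with a few lines of code. This is a worked example only: `RoiInputs` and `estimateRoi` are hypothetical names, and the inputs are the scenario figures from this appendix. With the $25K monthly value, the ratio comes out at roughly 31:1 at the $800 cost bound and roughly 71:1 at the $350 bound, which is consistent with the stated "10:1+".

```typescript
interface RoiInputs {
  atRiskUsers: number; // at-risk users per month
  preventedShare: number; // fraction of predicted churn prevented, 0 to 1
  ltvUsd: number; // lifetime value per retained user
  monthlyCostUsd: number; // total monthly running cost
}

function estimateRoi(inputs: RoiInputs) {
  const retainedUsers = inputs.atRiskUsers * inputs.preventedShare;
  const monthlyValueUsd = retainedUsers * inputs.ltvUsd;
  return {
    retainedUsers,
    monthlyValueUsd,
    // Value generated per dollar of monthly cost.
    roiRatio: monthlyValueUsd / inputs.monthlyCostUsd,
  };
}

// Scenario from this appendix, evaluated at both ends of the cost estimate.
const roiAtUpperCost = estimateRoi({
  atRiskUsers: 10000,
  preventedShare: 0.05,
  ltvUsd: 50,
  monthlyCostUsd: 800,
});
const roiAtLowerCost = estimateRoi({
  atRiskUsers: 10000,
  preventedShare: 0.05,
  ltvUsd: 50,
  monthlyCostUsd: 350,
});
```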
---

## Appendix E: Success Metrics

### Model Performance

- [ ] AUC > 0.75 (discrimination)
- [ ] Calibration slope 0.9–1.1 (well-calibrated probabilities)
- [ ] Precision@10% > 60% (most of the top 10% highest-risk predictions actually churn)

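The discrimination and precision targets above can be measured once ground truth (`actualChurned`) is recorded. The sketch below uses the pairwise (Mann-Whitney) definition of AUC and a simple top-fraction precision. Both helpers and the `ScoredUser` type are hypothetical, and the quadratic AUC loop is meant for small validation samples rather than production-scale evaluation.

```typescript
interface ScoredUser {
  churnProbability: number;
  actualChurned: boolean;
}

// Precision among the top `fraction` of users ranked by predicted churn probability.
function precisionAtTopFraction(users: ScoredUser[], fraction: number): number {
  if (users.length === 0) return 0;
  const k = Math.max(1, Math.floor(users.length * fraction));
  const top = users
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.churnProbability - a.churnProbability)
    .slice(0, k);
  return top.filter((u) => u.actualChurned).length / top.length;
}

// AUC as the probability that a churned user is scored above a retained user
// (ties count as half). Returns NaN when one of the classes is empty.
function pairwiseAuc(users: ScoredUser[]): number {
  const churned = users.filter((u) => u.actualChurned);
  const retained = users.filter((u) => !u.actualChurned);
  if (churned.length === 0 || retained.length === 0) return NaN;
  let wins = 0;
  for (const c of churned) {
    for (const r of retained) {
      if (c.churnProbability > r.churnProbability) wins += 1;
      else if (c.churnProbability === r.churnProbability) wins += 0.5;
    }
  }
  return wins / (churned.length * retained.length);
}
```

Calling `precisionAtTopFraction(users, 0.1)` corresponds to the Precision@10% target, and `pairwiseAuc(users)` to the AUC target, on a validation set where churn outcomes are known.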
### Business Impact

- [ ] 10%+ reduction in churn rate for targeted cohorts
- [ ] 5%+ increase in re-engagement campaign response
- [ ] CS team satisfaction with at-risk user visibility

### Operational

- [ ] Prediction latency < 100ms
- [ ] Feature freshness < 24 hours
- [ ] Model retraining automation

---

## Current Status

- [ ] **Design complete** — Target: 2026-03-10
- [ ] **Phase 1: Feature Pipeline** — Not started
- [ ] **Phase 2: Churn Model** — Not started
- [ ] **Phase 3: Health Scoring** — Not started
- [ ] **Phase 4: Interventions** — Not started
- [ ] **Phase 5: Admin UI** — Not started
- [ ] **Phase 6: Advanced** — Future

**Estimated Timeline:** 3 weeks (Phases 1–5)

**Dependencies:**

- Telemetry module (for feature extraction)
- Azure ML or similar (for model training)
- Delivery module (for retention campaigns)

---

_Last Updated: 2026-03-03_