# Client Telemetry - Implementation Roadmap

> **Status:** Phases 0–3 code complete ✅ · Phase 4 (Operational Wiring) **NOT STARTED** 🔴
> **Last updated:** 2026-02-17 (reviewed for accuracy against running code)
> **Design doc:** [`CLIENT_TELEMETRY_DESIGN.md`](./CLIENT_TELEMETRY_DESIGN.md)
> **Repos:** `learning_ai_common_plat` (platform-service) · `learning_voice_ai_agent` (all clients + dashboards)

---

## Phase 0 - Design & Review

- [x] Write comprehensive telemetry design doc - schema, APIs, admin UX, privacy guardrails ([`c59049e`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c59049e))
- [x] Systematic review: identify and fix 18 bugs/gaps in the design doc ([`083cf02`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/083cf02))
  - TTL format (ISO → seconds), `regionCode` prefix format, missing `pk` field
  - Auth model for keyboard extension (`X-Install-Token`)
  - Config endpoint query params (`userId`/`anonymousInstallId`)
  - Error clustering made version-agnostic (`affectedVersions` array)
  - GDPR erasure endpoint added
  - iOS offline queue strategy (App Group UserDefaults, FIFO eviction)
  - Global defaults for `batchSize`/`flushInterval`/`maxQueueSize`

---

## Phase 1 - MVP (iOS Keyboard + Backend + Admin UI)

### Platform-Service Telemetry Module

- [x] `types.ts` - Zod schemas for events, policies, clusters, queries ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] `repository.ts` - Cosmos DB CRUD for events, policies, clusters ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] `routes.ts` - Fastify routes: ingestion, config, admin query, clusters, policy CRUD, GDPR erasure ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] `telemetry.test.ts` - 34 Vitest tests for schemas + policy evaluation
  ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Register telemetry routes in `server.ts` ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Add Cosmos containers (`telemetry_events`, `telemetry_error_clusters`, `telemetry_collection_policies`) to `cosmos-init.ts` ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))

### iOS Keyboard Telemetry Client

- [x] `LysnrTelemetry.swift` - Singleton client with App Group offline queue, `X-Install-Token` auth, 200-event cap ([`e546475`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/e546475))
- [x] Instrument `KeyboardViewController.swift` - 10+ telemetry points ([`e546475`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/e546475))
  - [x] `session_started` / `session_ended` (with full `DictationContext`)
  - [x] `backend_selected` (azure / local + reason)
  - [x] `recognition_started` / `recognition_failed`
  - [x] `mic_permission_denied`
  - [x] `insert_noop` detection
  - [x] `error_recovery_attempted` (local→azure, azure→local)
  - [x] Session summary metrics (duration, segments, words, transcript length)

### Admin Dashboard - Client Logs Page

- [x] `/ops/client-logs/page.tsx` - Events table + Error Clusters tab ([`d202f94`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/d202f94))
  - [x] Stat cards (total events, errors, warnings, keyboard events)
  - [x] Filters (platform, channel, level, module, free-text search)
  - [x] Expandable event detail rows (device, tags, metrics, dictation context)
  - [x] Error Clusters tab with severity, affected versions, user count
- [x] `/api/telemetry/route.ts` - API route proxying to platform-service ([`d202f94`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/d202f94))
- [x] `platform-client.ts` - `queryTelemetryEvents` + `queryTelemetryClusters`
  ([`d202f94`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/d202f94))
- [x] `sidebar-nav.tsx` - "Client Logs" nav item with `FileText` icon ([`d202f94`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/d202f94))

---

## Phase 2 - Full Platform Coverage

### iOS Main App

- [x] `TelemetryService.swift` - Main app telemetry service with App Group queue drain on foreground ([`a173baa`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a173baa))
- [x] `LysnrAIApp.swift` - `scenePhase` integration for activate/deactivate lifecycle ([`a173baa`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a173baa))
  - [x] `app_foregrounded` / `app_backgrounded` events
  - [x] Keyboard queue flush on every foreground transition
  - [x] 60-second periodic flush timer

### Desktop App (Python)

- [x] `platform_telemetry.py` - `PlatformTelemetry` singleton with `urllib.request` POST, threaded flush timer, persistent `install_id` in `~/.LysnrAI/install_id` ([`a173baa`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a173baa))
- [x] `main.py` instrumentation ([`a173baa`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a173baa))
  - [x] `app_started` / `app_stopped` lifecycle events
  - [x] `dictation_started` (with backend tag)
  - [x] `dictation_completed` (with duration_ms, word_count, transcript_length metrics)
  - [x] `mic_permission_denied` / `recording_start_failed` error events

### Web User Dashboard

- [x] `telemetry.ts` - Browser client with `sendBeacon`, `localStorage` install ID, auto-flush on visibility change ([`130e1d6`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/130e1d6))
- [x] `/api/telemetry/ingest/route.ts` - Server-side proxy to platform-service ([`130e1d6`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/130e1d6))
- [x] `providers.tsx` - `initTelemetry()` called on app mount
  ([`130e1d6`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/130e1d6))

### Tracker Dashboard

- [x] `telemetry.ts` - Browser client (same pattern as user dashboard) ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))
- [x] `/api/telemetry/ingest/route.ts` - Server-side proxy to platform-service ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))
- [x] `providers.tsx` - `initTelemetry()` called on app mount ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))

### Admin Dashboard Self-Telemetry

- [x] `telemetry.ts` - Browser client tracking admin page views, filter usage, policy changes ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))
- [x] `/api/telemetry/admin-ingest/route.ts` - Separate proxy from admin query route ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))
- [x] `providers.tsx` - `initTelemetry()` called on app mount ([`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609))

### Android

- [x] `TelemetryClient.kt` - Kotlin singleton with OkHttp POST, SharedPreferences offline queue, persistent install ID ([`9196f48`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/9196f48))
- [x] Instrument `LysnrInputMethodService.kt` - 10 telemetry points ([`9196f48`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/9196f48))
  - [x] `session_started` / `session_ended` (with words_inserted metric)
  - [x] `dictation_started` (with backend + reason tags)
  - [x] `dictation_completed` (with duration_ms, word_count, segment_count, transcript_length)
  - [x] `mic_permission_denied`
  - [x] `recognition_failed` (with errorCode + errorDomain)
  - [x] `error_recovery_attempted` (azure→local fallback)
- [x] Offline queue using SharedPreferences with FIFO eviction
  ([`9196f48`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/9196f48))
- [x] Flush on app foreground via `ProcessLifecycleOwner` + 60s periodic flush timer ([`9196f48`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/9196f48))

---

## Phase 3 - Intelligence & Admin Tooling

### Error Clustering & Alerting

- [x] Automated error fingerprinting (hash of `platform + channel + module + eventName + errorDomain + errorCode`) - Phase 1 ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Cluster severity escalation (`warn` → `error` → `fatal` based on count + affected users) - Phase 1 ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Webhook alerting when cluster severity escalates (Slack-compatible, env `TELEMETRY_ALERT_WEBHOOK_URL`) ([`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323))
- [x] Dashboard: cluster timeline chart (Recharts stacked bar, last 14 days, severity breakdown) ([`dc49073`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/dc49073))
- [x] Dashboard: "Resolve" / "Ignore" / "Reopen" actions on clusters ([`6d7b1d3`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/6d7b1d3))
- [x] Cluster status field (`open`/`resolved`/`ignored`) + `PATCH /telemetry/clusters/:id` endpoint ([`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323))

### Geo Enrichment

- [x] Server-side IP → country/region lookup on ingestion (configurable via `TELEMETRY_GEO_API_URL`, 24h in-memory cache, 2s timeout) ([`2f61ea5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2f61ea5))
- [x] Populate `countryCode` + `regionCode` fields (e.g., `US:WA`) on events from server-side IP lookup ([`2f61ea5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2f61ea5))
- [x] Admin UI: geographic distribution chart (horizontal bar
  chart + country table, Geo tab on client-logs page) ([`0bfd4bd`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0bfd4bd), [`82a25c0`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/82a25c0))
- [x] Policy targeting by `regionCode`/`countryCodes` ranges (schema already supports it in `TelemetryTargetingSchema`)

### Collection Policy Builder UI

- [x] Admin page: `/ops/telemetry-policies` ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))
- [x] CRUD UI for collection policies (name, enabled, targeting rules, sampling rates) ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))
- [x] Targeting builder: platform checkboxes, channel badges, release channel selection, percentage slider ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))
- [x] Live preview: "N / M clients would match this policy" - `POST /telemetry/policies/preview` + UI button ([`61c919a`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/61c919a), [`da9031b`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/da9031b))
- [x] Policy activation/deactivation toggle ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))
- [x] Scheduling: `startsAt` / `expiresAt` date pickers ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))

### Privacy & Compliance

- [x] PII regex scanner on ingestion (email, phone, SSN, credit card patterns → reject before storage) - Phase 1 ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Admin API: GDPR erasure endpoint `DELETE /telemetry/user/:userId` - Phase 1 ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Admin UI: GDPR erasure proxy route `/api/telemetry/erasure`
  ([`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9))
- [x] Retention policy enforcement (TTL-based auto-expiry, `TELEMETRY_EVENT_TTL_DAYS` env var) - Phase 1 ([`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff))
- [x] Audit log entries for policy CRUD + GDPR erasure (`telemetry.policy.created/updated/deleted`, `telemetry.gdpr.erasure`, `telemetry.cluster.resolved/ignored`) ([`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323))
- [x] Admin UI: GDPR erasure tab on Client Logs page ([`6d7b1d3`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/6d7b1d3))

### Performance & Scale

- [x] ETag caching on `GET /telemetry/config` (`If-None-Match` → 304, `Cache-Control: private, max-age=60`) ([`2fb3410`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2fb3410))
- [x] Server-side rate limiting per `installId` (100 events/min, in-memory sliding window) ([`2fb3410`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2fb3410))
- [x] Cosmos DB indexing policy tuning - `scripts/cosmos-telemetry-indexes.sh` with composite indexes for all 3 containers ([`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323))
- [x] Batch ingestion deduplication by `event.id` ([`2fb3410`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2fb3410))
- [x] In-memory ingestion metrics counters + `GET /telemetry/metrics` admin endpoint ([`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323))
- [x] Admin UI: Metrics tab on Client Logs page (ingested, rejected, PII blocked, rate limited, duplicates) ([`6d7b1d3`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/6d7b1d3))
- [x] Prometheus OpenMetrics export endpoint `GET /telemetry/metrics/prometheus` ([`2f61ea5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2f61ea5))

---

## Phase 4 - Operational Wiring (NOT STARTED 🔴)

> **This phase bridges "code exists" → "telemetry actually flows."**
> All Phases 0–3 are code-complete, but **no telemetry data has ever reached the server** from any real client.
> The items below are required before the telemetry system can be called "done."

### 4.1 - Platform-Service Deployment

- [ ] Deploy platform-service to a **publicly reachable URL** (Azure Container Apps, Azure App Service, or VM)
- [ ] Configure DNS / reverse proxy so clients can reach `https://api.lysnrai.com` (or similar)
- [ ] Set env vars: `COSMOS_ENDPOINT`, `COSMOS_KEY`, `TELEMETRY_ENABLED=true`
- [ ] Run `scripts/cosmos-telemetry-indexes.sh` against live Cosmos DB to create containers + indexes
- [ ] Verify `POST /api/telemetry/events` accepts a test payload from `curl`

### 4.2 - iOS Keyboard Extension Wiring

- [ ] **Register App Groups capability** in Apple Developer portal for both `com.bytelyst.LysnrAI` and `com.bytelyst.LysnrAI.keyboard`
- [ ] **Restore entitlements** in TestFlight builds (currently cleared because provisioning profile lacks App Groups)
  - `LysnrAI.entitlements`: `aps-environment` + `com.apple.security.application-groups`
  - `LysnrKeyboard.entitlements`: `com.apple.security.application-groups`
- [ ] **Write `platform_service_url`** to App Group UserDefaults - currently `LysnrTelemetry.swift` reads `platform_service_url` from App Group (line 80) but **nothing writes it**
  - Option A: Main app writes URL on launch from env/config
  - Option B: Hardcode URL in `LysnrTelemetry.swift` init
  - Option C: Bundle in `env.dev` and read from shared config
- [ ] **Verify mic permission flow on physical device** - keyboard extensions may not show permission prompts; main app must request mic permission first. Current "Mic error" on device likely caused by this.
- [ ] Test Full Access ON vs OFF paths on physical device

### 4.3 - iOS Main App TelemetryService Integration

- [ ] Verify `TelemetryService.swift` reads `platform_service_url` from config/env and writes to App Group
- [ ] Verify keyboard queue drain works: main app foreground → reads App Group `telemetry_event_queue` → POSTs to server
- [ ] Test lifecycle: app backgrounded → keyboard generates events → app foregrounded → events flushed

### 4.4 - Desktop App Wiring

- [ ] Set `PLATFORM_SERVICE_URL` env var in `~/.LysnrAI/.env` pointing to deployed service
- [ ] Verify `platform_telemetry.py` sends events on dictation start/stop
- [ ] Test offline → online queue drain

### 4.5 - Web Dashboard Wiring

- [ ] Set `PLATFORM_SERVICE_URL` in dashboard `.env.local` files
- [ ] Verify `/api/telemetry/ingest` proxy routes forward to deployed platform-service
- [ ] Verify admin dashboard `/ops/client-logs` page loads real data from platform-service

### 4.6 - Android Wiring

- [ ] Set platform service URL in Android app config
- [ ] Test SharedPreferences offline queue + foreground flush
- [ ] Verify keyboard instrumentation events reach server

### 4.7 - Webhook / Alert Configuration

- [ ] Set `TELEMETRY_ALERT_WEBHOOK_URL` env var (Slack webhook or equivalent)
- [ ] Test cluster severity escalation triggers webhook
- [ ] Set `TELEMETRY_GEO_API_URL` env var (ip-api.com or similar) for geo enrichment

### 4.8 - End-to-End Smoke Test

- [ ] iOS keyboard → platform-service → Cosmos → admin dashboard query - **full round-trip**
- [ ] Desktop → platform-service → Cosmos → admin dashboard query
- [ ] Web dashboard → platform-service ingest → admin dashboard query
- [ ] Trigger error cluster creation → verify cluster appears in admin UI
- [ ] Trigger rate limit → verify rejection in metrics tab
- [ ] GDPR erasure → verify events deleted from Cosmos

### Summary: What Blocks "100% Done"

| Blocker | Severity | Effort |
| --- | --- | --- |
| **Platform-service not deployed** | 🔴 Critical | Medium - needs Azure infra |
| **App Group entitlements not registered** | 🔴 Critical | Low - Apple Developer portal config |
| **`platform_service_url` not written to App Group** | 🔴 Critical | Low - one-line code change |
| **Cosmos containers not created in prod** | 🟡 High | Low - run indexing script |
| **Mic permission flow on device** | 🟡 High | Medium - needs device testing + possible UX fix |
| **Webhook URL not configured** | 🟢 Low | Trivial - env var |
| **Geo API URL not configured** | 🟢 Low | Trivial - env var |
| **Remaining test gaps (5 items)** | 🟢 Low | Medium - integration/e2e tests |

---

## Architecture Summary

```
┌─────────────────────┐    ┌──────────────────────┐    ┌───────────────────┐
│  iOS Keyboard Ext   │    │  iOS Main App        │    │  Desktop (Python) │
│  LysnrTelemetry     │───▶│  TelemetryService    │    │  PlatformTelemetry│
│  (App Group queue)  │    │  (drains queue)      │    │  (urllib POST)    │
└─────────────────────┘    └──────────┬───────────┘    └────────┬──────────┘
  Full Access ON ──┐                  │                         │
  direct POST      │                  │                         │
                   ▼                  ▼                         ▼
┌────────────────────────────────────────────────────────────────────────┐
│  Platform Service (Fastify, port 4003)                                 │
│  POST   /api/telemetry/events        - batch ingestion                 │
│  GET    /api/telemetry/config        - client collection config        │
│  GET    /api/telemetry/query         - admin event search              │
│  GET    /api/telemetry/clusters      - admin error clusters            │
│  CRUD   /api/telemetry/policies      - collection policy management    │
│  DELETE /api/telemetry/user/:userId  - GDPR erasure                    │
└────────────────────────────┬───────────────────────────────────────────┘
                             │
                             ▼
┌────────────────────────────────────────────────────────────────────────┐
│  Azure Cosmos DB                                                       │
│  telemetry_events               partitionKeyPath: /pk                  │
│       pk value = productId:yyyyMM:platform (e.g. lysnrai:202602:ios)   │
│  telemetry_error_clusters       partitionKeyPath: /pk                  │
│       pk value = productId:platform:module (e.g. lysnrai:ios:dictation)│
│  telemetry_collection_policies  partitionKeyPath: /productId           │
└────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────┐          ┌──────────────────────┐
│  Admin Dashboard        │  GET     │  User Dashboard      │  POST
│  /ops/client-logs       │─────────▶│  /api/telemetry/     │─────────▶ platform
│  (queries via           │  query/  │  ingest              │  /events  -service
│  platform-service API)  │  clusters│  (browser → proxy)   │
└─────────────────────────┘          └──────────────────────┘

┌───────────────────────┐
│  Android              │
│  TelemetryClient.kt   │──▶ POST /api/telemetry/events ──▶ platform-service
│  (SharedPreferences)  │
└───────────────────────┘
```

---

## Test Coverage

| Component | Test File | Tests | Coverage |
| --- | --- | --- | --- |
| **Platform-service telemetry** | `telemetry.test.ts` | 89 | Zod schemas (34), `containsPII` (6), `computePk` (4), `normalizeMessage` (7), `generateFingerprint` (8), `policyMatchesContext` (13), `mergePolicies` (5), `checkRateLimit` (3), plus additional route-logic tests |
| **iOS LysnrTelemetry (keyboard)** | `LysnrAITests/LysnrTelemetryTests.swift` | 18 | Identity (5), session management (2), event types (1), DictationContext (3), track (3), flush (2), queue (1), crash-safety (1) |
| **Desktop Python client** | `tests/cloud/test_platform_telemetry.py` | 19 | Event format (6), queue behavior (2), session mgmt (2), flush/HTTP (5), install ID (2), singleton (2) |
| **Web dashboard client** | `user-dashboard-web/src/__tests__/telemetry.test.ts` | 12 | `trackEvent` (3), `trackPageView` (1), `flush` (4), install ID (2), `initTelemetry` (2) |
| **Tracker dashboard client** | `tracker-dashboard-web/src/__tests__/telemetry.test.ts` | 10 | `trackEvent` (3), `trackPageView` (1), `flush` (4), `initTelemetry` (2) |
| **Admin dashboard client** | `admin-dashboard-web/src/__tests__/telemetry.test.ts` | 10 | `trackEvent` (3), `trackPageView` (1), `flush` (4), `initTelemetry` (2) |
| **Total** | | **158** | |

### Verification commands

```bash
# Platform-service (89 telemetry tests within 624 total)
cd ../learning_ai_common_plat && pnpm --filter @lysnrai/platform-service test

# iOS keyboard telemetry (18 tests)
cd learning_voice_ai_agent
xcodebuild test-without-building \
  -workspace mobile_app/ios/LysnrAI.xcworkspace \
  -scheme LysnrAITests \
  -destination 'platform=iOS Simulator,name=iPhone 17 Pro' \
  -only-testing:LysnrAITests/LysnrTelemetryTests

# Desktop Python (19 tests)
python -m pytest tests/cloud/test_platform_telemetry.py -v

# Web user-dashboard (12 tests)
cd user-dashboard-web && npx vitest run src/__tests__/telemetry.test.ts

# Tracker dashboard (10 tests)
cd tracker-dashboard-web && npx vitest run src/__tests__/telemetry.test.ts

# Admin dashboard (10 tests)
cd admin-dashboard-web && npx vitest run src/__tests__/telemetry.test.ts
```

### Not yet tested

- [x] iOS `LysnrTelemetry.swift` - ✅ 18 XCTest unit tests (`LysnrTelemetryTests.swift`, build 28)
- [ ] iOS `TelemetryService.swift` (main app) - needs XCTest target for main app
- [ ] Android `TelemetryClient.kt` - needs Android instrumented tests or Robolectric
- [ ] Admin dashboard `/api/telemetry/route.ts` - API route integration test
- [ ] Platform-service HTTP integration tests (Fastify inject for telemetry routes)
- [ ] End-to-end: client → platform-service → Cosmos read-back → admin dashboard query

---

## Bugs Found During Review

The following bugs were discovered during systematic review of the roadmap against actual code and fixed:

| # | Severity | Issue | Fix |
| --- | --- | --- | --- |
| 1 | **High** | Desktop Python `id` used `uuid.uuid4().hex` (32 hex, no dashes) - fails Zod `.uuid()` server validation | Changed to `str(uuid.uuid4())` |
| 2 | **High** | Web telemetry `osFamily='web'` not in Zod `OsFamilyEnum` - fails server validation | Changed to `'other'` |
| 3 | **Medium** | Status said "Phase 2 complete" but Android is all unchecked | Fixed status line |
| 4 | **Medium** | Architecture diagram showed wrong pk for `telemetry_error_clusters` (`/productId` → actual `/pk` = `productId:platform:module`) | Fixed diagram |
| 5 | **Medium** | Tracker dashboard telemetry missing from roadmap entirely | Added as Phase 2 pending |
| 6 | **Medium** | Admin dashboard self-telemetry (page views) not mentioned | Added as Phase 2 pending |
| 7 | **Low** | Architecture diagram missing Android client box | Added with "not yet implemented" note |
| 8 | **Low** | Architecture diagram implied Admin reads Cosmos directly (it queries Platform Service) | Fixed data flow arrows |
| 9 | **Low** | Web `telemetry.ts` JSDoc said "via the admin dashboard proxy" (wrong dashboard) | Fixed to "user dashboard's /api/telemetry/ingest proxy" |
| 10 | **Low** | Commit log missing roadmap doc commit | Added |

---

## Commit Log

| Date | Repo | Commit | Description |
| --- | --- | --- | --- |
| 2026-02-16 | common-plat | [`c59049e`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c59049e) | Design doc: client telemetry & log insights |
| 2026-02-16 | common-plat | [`083cf02`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/083cf02) | Fix 18 gaps in telemetry design doc (rev 2) |
| 2026-02-16 | common-plat | [`ce4c4ff`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/ce4c4ff) | Telemetry module - ingest, config, query, clusters, policies (34 tests) |
| 2026-02-17 | voice-agent | [`e546475`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/e546475) | iOS keyboard telemetry client + KeyboardViewController instrumentation |
| 2026-02-17 | voice-agent | [`d202f94`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/d202f94) | Admin dashboard Client Logs page + sidebar nav |
| 2026-02-17 | voice-agent | [`a173baa`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a173baa) | iOS main app TelemetryService + Desktop Python platform_telemetry |
| 2026-02-17 | voice-agent | [`130e1d6`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/130e1d6) | Web user-dashboard telemetry client + ingest proxy |
| 2026-02-17 | common-plat | [`c3d6977`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/c3d6977) | Telemetry roadmap doc (this file) |
| 2026-02-17 | voice-agent | [`ae77438`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/ae77438) | Fix: desktop uuid format + web osFamily - pass Zod validation |
| 2026-02-17 | common-plat | [`20f77d5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/20f77d5) | Tests: route-logic tests - PII, pk, fingerprint, policy matching (34→77) |
| 2026-02-17 | voice-agent | [`08efdb6`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/08efdb6) | Tests: Python client (19) + web dashboard (12) telemetry tests |
| 2026-02-17 | voice-agent | [`a102609`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/a102609) | Tracker + admin self-telemetry clients + tests (20 tests) |
| 2026-02-17 | voice-agent | [`9196f48`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/9196f48) | Android TelemetryClient + keyboard instrumentation + ProcessLifecycleOwner |
| 2026-02-17 | voice-agent | [`c7732c9`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/c7732c9) | Phase 3: Policy Builder UI + GDPR erasure proxy + sidebar nav |
| 2026-02-17 | common-plat | [`2fb3410`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2fb3410) | Phase 3: Rate limiting, batch dedup, ETag config caching (614 tests) |
| 2026-02-17 | common-plat | [`056f323`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/056f323) | Phase 3: Cluster resolve/ignore, audit logging, webhook alerts, metrics, Cosmos indexes |
| 2026-02-17 | voice-agent | [`6d7b1d3`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/6d7b1d3) | Phase 3: Cluster actions UI, metrics tab, GDPR erasure UI |
| 2026-02-17 | common-plat | [`2f61ea5`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/2f61ea5) | Phase 3: Geo enrichment, Prometheus metrics export |
| 2026-02-17 | voice-agent | [`dc49073`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/dc49073) | Phase 3: Cluster timeline chart (Recharts) |
| 2026-02-17 | common-plat | [`61c919a`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/61c919a) | Phase 3: Policy preview endpoint (count matching clients) |
| 2026-02-17 | voice-agent | [`da9031b`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/da9031b) | Phase 3: Policy builder live preview UI + API proxy |
| 2026-02-17 | common-plat | [`0bfd4bd`](https://github.com/saravanakumardb1/learning_ai_common_plat/commit/0bfd4bd) | Phase 3: Geo distribution endpoint (GET /telemetry/geo, Cosmos GROUP BY) |
| 2026-02-17 | voice-agent | [`82a25c0`](https://github.com/saravanakumardb1/learning_voice_ai_agent/commit/82a25c0) | Phase 3: Geo distribution UI - bar chart + country table on client-logs Geo tab |
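
---

## Appendix - Illustrative Key and Fingerprint Sketch

The partition-key formats shown in the Architecture Summary (`productId:yyyyMM:platform` and `productId:platform:module`) and the fingerprint inputs listed under Error Clustering can be illustrated with a short sketch. It is written in Python for brevity and is not the platform-service code, which is TypeScript. The function names, the `|` separator, the use of UTC for the month bucket, and the choice of SHA-256 are all assumptions; the roadmap only names the input fields and the key layouts.

```python
import hashlib
from datetime import datetime, timezone


def compute_event_pk(product_id: str, platform: str, timestamp: datetime) -> str:
    """Partition key for telemetry_events: productId:yyyyMM:platform.

    Assumption: the month bucket is taken in UTC. Pass a timezone-aware
    datetime; a naive one is interpreted as local time by astimezone().
    """
    month_bucket = timestamp.astimezone(timezone.utc).strftime("%Y%m")
    return f"{product_id}:{month_bucket}:{platform}"


def compute_cluster_pk(product_id: str, platform: str, module: str) -> str:
    """Partition key for telemetry_error_clusters: productId:platform:module."""
    return f"{product_id}:{platform}:{module}"


def generate_fingerprint(
    platform: str,
    channel: str,
    module: str,
    event_name: str,
    error_domain: str = "",
    error_code: str = "",
) -> str:
    """Hash the six fields the roadmap lists as fingerprint inputs.

    Assumption: fields are joined with "|" and hashed with SHA-256; the
    real separator and hash function are not stated in this document.
    App version is deliberately not an input.
    """
    raw = "|".join([platform, channel, module, event_name, error_domain, error_code])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    when = datetime(2026, 2, 17, 12, 0, tzinfo=timezone.utc)
    print(compute_event_pk("lysnrai", "ios", when))            # lysnrai:202602:ios
    print(compute_cluster_pk("lysnrai", "ios", "dictation"))   # lysnrai:ios:dictation
    # Hypothetical field values, for illustration only.
    print(generate_fingerprint("ios", "keyboard", "dictation",
                               "recognition_failed", "speech", "timeout"))
```

Because no app version enters the hash in this sketch, the same error reported by different releases maps to one fingerprint, which is consistent with the version-agnostic clustering (`affectedVersions` array) noted in Phase 0.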