- shared/realtime.ts: add SOCKET_NAMESPACES constants (/trading, /admin, root)
- shared/feature-flags.ts: add tabs.marketplace and tabs.membership to TradingFeatureFlagsResponse; add FEATURE_FLAG_KEYS constants
- .env.example: remove /api suffix from VITE/NEXT_PUBLIC trading URL vars (web appends /api itself); add tab visibility flag vars with comments
- web: add useTabFeatureFlags hook + DOM test; wire tab visibility into App.tsx
- web/vite.config.ts: finalize build config
- mobile/providers/TradingDataProvider.tsx: deriveSocketParams for proxy-safe socket origin/path resolution (already landed upstream, conflict resolved)
- docs: add CUTOVER_WEB.md, CUTOVER_MOBILE.md checklists; update OPERATIONS.md with Docker commands and resolved gap log; update ROADMAP.md to Done; add BACKEND_AUDIT_SCHEMA.md, BACKEND_API_DEPRECATION.md, CONVENTIONS.md; add audit-events container entry to AZURE_INFRASTRUCTURE.md
- README.md: full rewrite with workspace table, arch summary, env var reference

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Web Internal Adoption Checklist (Stage 2 Cutover)

## Purpose

This document is the step-by-step runbook for switching internal operators from the
legacy `bytelyst-trading-dashboard-web` to the new monorepo web dashboard (`web/`).

It covers the pre-flight gate, deployment, validation, rollback triggers, and
post-adoption monitoring. Complete every step in order.

---

## Pre-Flight Gate

Do not begin cutover until all of the following are true.

### Go / No-Go Checks

Run from the monorepo root:

```bash
pnpm verify          # typecheck + test + build; must be green
pnpm lint            # backend contract + security guards + web/mobile lint; must be green
pnpm smoke:release   # auth + kill-switch smoke tests; must pass
```

Backend-specific:

```bash
cd backend
npm run check:api-contract        # feature-flag shapes, audit events, namespace constants
npm run check:websocket-contract  # BotState lifecycle consistency
npm run check:security-guards     # tenant isolation; must be green
npm run check:tenant-isolation    # row-level access; must be green
```

### Environment Checks

- [ ] Backend is deployed and reachable (`GET /health/live` returns 200)
- [ ] Cosmos DB containers readable and writable (`dynamic_config`, `trading-profiles`, `trading-control`, `snapshots`, `capital-ledger`)
- [ ] Platform-service is reachable from the deployment environment
- [ ] `PLATFORM_AUTH_ENABLED=true` is set on the backend deployment
- [ ] `VITE_TRADING_API_URL` points to the deployed backend (not localhost)
- [ ] `VITE_PLATFORM_URL` points to the live platform-service
- [ ] `CORS_ALLOWED_ORIGINS` on the backend includes the new web dashboard origin
- [ ] Feature flags set correctly for the rollout population:
  - `TAB_MARKETPLACE_ENABLED`: set per rollout plan
  - `TAB_MEMBERSHIP_ENABLED`: set per rollout plan
  - `ENABLE_BACKTEST`: set per rollout plan
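The URL and auth-flag checks above can be partly automated. The following TypeScript sketch is one hedged way to do that: `CutoverEnv` and `findEnvProblems` are hypothetical names, the comma-separated format of `CORS_ALLOWED_ORIGINS` is an assumption, and the `/api` suffix rule is taken from the commit note at the top of this page (the web app appends `/api` itself).

```typescript
// Hypothetical pre-flight validator. Variable names come from the checklist;
// the comma-separated CORS format is an assumption.
interface CutoverEnv {
  VITE_TRADING_API_URL?: string;
  VITE_PLATFORM_URL?: string;
  PLATFORM_AUTH_ENABLED?: string;
  CORS_ALLOWED_ORIGINS?: string;
}

const LOCAL_HOST = /^https?:\/\/(localhost|127\.0\.0\.1)(:|\/|$)/i;

function findEnvProblems(env: CutoverEnv, dashboardOrigin: string): string[] {
  const problems: string[] = [];

  const apiUrl = env.VITE_TRADING_API_URL || "";
  if (!apiUrl) problems.push("VITE_TRADING_API_URL is not set");
  if (LOCAL_HOST.test(apiUrl)) problems.push("VITE_TRADING_API_URL points to localhost");
  // Per the commit note, the web app appends /api itself.
  if (/\/api\/?$/.test(apiUrl)) problems.push("VITE_TRADING_API_URL should not end with /api");

  const platformUrl = env.VITE_PLATFORM_URL || "";
  if (!platformUrl) problems.push("VITE_PLATFORM_URL is not set");
  if (LOCAL_HOST.test(platformUrl)) problems.push("VITE_PLATFORM_URL points to localhost");

  if (env.PLATFORM_AUTH_ENABLED !== "true") {
    problems.push('PLATFORM_AUTH_ENABLED is not "true"');
  }

  // Exact string match on the origin (scheme + host + optional port).
  const allowed = (env.CORS_ALLOWED_ORIGINS || "").split(",").map((entry) => entry.trim());
  if (allowed.indexOf(dashboardOrigin) === -1) {
    problems.push("CORS_ALLOWED_ORIGINS does not include " + dashboardOrigin);
  }
  return problems;
}
```

An empty result means these static checks pass. It does not replace the reachability items, which still need a live request against the deployed services.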

### Rollback Readiness

- [ ] The legacy web dashboard URL is still live and working
- [ ] You know who owns the rollback decision and how to reach them
- [ ] Backend trade-halt control is reachable (`POST /internal/trading/pause`)

---

## Step 1: Deploy the Web Dashboard

```bash
# From monorepo root: production build
pnpm build

# Or using Docker
pnpm docker:up
```

Verify the deployment:

- [ ] Web dashboard loads at the new URL without a blank screen
- [ ] Browser console shows no errors on load
- [ ] Network tab shows no 4xx/5xx on initial API calls

---

## Step 2: Internal Operator Sign-In

Have each internal operator complete the sign-in sequence:

- [ ] Navigate to the new web dashboard URL
- [ ] Sign in using platform credentials (same as the legacy dashboard)
- [ ] Session restores correctly after browser refresh (no re-login required)
- [ ] Auth token is a platform JWT (check via browser devtools: `Authorization: Bearer ...` on API calls)
- [ ] `GET /api/me/profile` returns the correct user profile and role
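To inspect the token seen in the `Authorization` header, its payload can be decoded locally. This is a minimal sketch, assuming the token is a standard three-part JWT with ASCII claims. `decodeBearerPayload` is a hypothetical helper, it does not verify the signature, and which claims the platform JWT carries is not specified in this runbook.

```typescript
// Decodes (does NOT verify) the payload of a Bearer token copied from devtools.
// Uses the global atob(), available in browsers and in recent Node.js releases.
function decodeBearerPayload(authorizationHeader: string): Record<string, unknown> | null {
  const match = /^Bearer\s+([^.\s]+)\.([^.\s]+)\.([^.\s]*)$/.exec(authorizationHeader.trim());
  if (!match) return null;

  // JWT segments are base64url without padding: map back to base64 and re-pad.
  let b64 = match[2].replace(/-/g, "+").replace(/_/g, "/");
  while (b64.length % 4 !== 0) b64 += "=";

  try {
    const parsed: unknown = JSON.parse(atob(b64));
    const isObject = typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
    return isObject ? (parsed as Record<string, unknown>) : null;
  } catch {
    // Not valid base64 or not valid JSON: treat as "not a JWT".
    return null;
  }
}
```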

---

## Step 3: Core Feature Validation

Each operator validates their own user scope:

### Trading State
- [ ] Overview tab loads with live bot state (not stale/empty)
- [ ] WebSocket connection shows "Connected" in the header
- [ ] Socket connects to `/trading` namespace (check backend logs: `[API][/trading] Client connected`)
- [ ] Positions tab shows current open positions
- [ ] Trade History tab shows closed trade history
- [ ] My Strategies tab lists the operator's trading profiles
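The namespace check above depends on the client resolving the right origin and Socket.IO path, especially behind a proxy. The sketch below illustrates one way that resolution could work. It is not the repository's own implementation: `resolveSocketTarget` is a hypothetical name, and the `/socket.io` endpoint path is Socket.IO's default, assumed here. The `/trading`, `/admin`, and root namespaces are the ones named in the commit note.

```typescript
// Hypothetical helper; the repository's own resolution logic may differ.
type SocketNamespace = "/" | "/trading" | "/admin";

interface SocketTarget {
  url: string;  // origin + namespace: the first argument to a Socket.IO client
  path: string; // HTTP path of the Socket.IO endpoint, including any proxy prefix
}

function resolveSocketTarget(apiBaseUrl: string, namespace: SocketNamespace): SocketTarget {
  const parsed = new URL(apiBaseUrl);
  // Keep a proxy prefix such as "/proxy" but drop trailing slashes.
  const prefix = parsed.pathname.replace(/\/+$/, "");
  return {
    url: parsed.origin + (namespace === "/" ? "" : namespace),
    path: prefix + "/socket.io",
  };
}
```

With the Socket.IO client, the result would be passed as `io(target.url, { path: target.path })`. Keeping the namespace in the URL and the proxy prefix in `path` is what lets the same namespace work behind different proxy mounts.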

### Real-Time Updates
- [ ] Leave the dashboard open for 60 seconds; confirm symbol prices update live
- [ ] Trigger a manual order or profile toggle; confirm the state updates without refresh

### Admin Operators (role = admin only)
- [ ] Signals tab is visible and loads correctly
- [ ] Entries tab is visible and loads correctly
- [ ] Admin Panel tab is visible
- [ ] Strategy Clusters tab is visible
- [ ] Admin Panel → Trading Control: pause and resume work correctly
- [ ] Backend logs show `[AUDIT]` entries for pause/resume actions
- [ ] "Preview as Customer" toggle hides admin-only tabs correctly
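The tab expectations above reduce to a small rule: admin-only tabs appear only for admins who are not previewing as a customer. A sketch of that rule, using the tab names from this checklist; `expectedTabs` and the `customer` role name are hypothetical, and flag-gated tabs are left out.

```typescript
// Role names are an assumption; the checklist only states "role = admin".
type OperatorRole = "admin" | "customer";

const COMMON_TABS = ["Overview", "Positions", "Trade History", "My Strategies"];
const ADMIN_ONLY_TABS = ["Signals", "Entries", "Admin Panel", "Strategy Clusters"];

// Returns the tabs an operator should see, ignoring feature-flag-gated tabs.
function expectedTabs(role: OperatorRole, previewAsCustomer: boolean): string[] {
  const showAdminTabs = role === "admin" && !previewAsCustomer;
  return showAdminTabs ? COMMON_TABS.concat(ADMIN_ONLY_TABS) : COMMON_TABS.slice();
}
```

Comparing this list against what the dashboard actually renders gives a quick pass or fail for the "Preview as Customer" item.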

### Kill-Switch Behaviour
- [ ] If platform-service maintenance mode is toggled on, the web dashboard blocks access and shows the expected maintenance UI
- [ ] After maintenance mode is lifted, the web dashboard recovers without a page reload

---

## Step 4: Config and Feature Flag Validation

- [ ] `GET /api/feature-flags` returns the correct `backtest`, `tabs.marketplace`, and `tabs.membership` values
- [ ] Backtesting tab visibility matches `ENABLE_BACKTEST` and `BACKTEST_CUSTOMER_ENABLED` config
- [ ] Marketplace tab visibility matches `TAB_MARKETPLACE_ENABLED` config
- [ ] Membership tab visibility matches `TAB_MEMBERSHIP_ENABLED` config
- [ ] Dynamic config changes via Admin Panel → Config are persisted to Cosmos and visible after a page refresh
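The flag comparisons above can be scripted once the expected values are known from the rollout plan. A hedged sketch, assuming all three values are plain booleans (the real `TradingFeatureFlagsResponse` may carry more fields or a richer `backtest` shape); `FeatureFlagsSketch` and `flagMismatches` are hypothetical names.

```typescript
// Assumed, simplified shape: only the three keys named in the checklist.
interface FeatureFlagsSketch {
  backtest: boolean;
  tabs: { marketplace: boolean; membership: boolean };
}

// Returns the dotted keys whose actual value differs from the rollout plan.
function flagMismatches(expected: FeatureFlagsSketch, actual: FeatureFlagsSketch): string[] {
  const mismatches: string[] = [];
  if (expected.backtest !== actual.backtest) mismatches.push("backtest");
  if (expected.tabs.marketplace !== actual.tabs.marketplace) mismatches.push("tabs.marketplace");
  if (expected.tabs.membership !== actual.tabs.membership) mismatches.push("tabs.membership");
  return mismatches;
}
```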

---

## Step 5: Request Tracing Spot Check

Pick any operator action (e.g., load trade history):

- [ ] Browser devtools shows `x-request-id` header on the request
- [ ] Backend response echoes the same `x-request-id`
- [ ] Search backend logs for that `x-request-id`; the full request trace appears
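The echo check can be expressed as a small comparison over the request and response headers copied from devtools. A sketch with hypothetical helper names (`findHeaderValue`, `requestIdEchoed`); the only fact it relies on is that HTTP header names are case-insensitive.

```typescript
type HeaderMap = { [headerName: string]: string };

// Case-insensitive header lookup, since devtools may show either casing.
function findHeaderValue(headers: HeaderMap, wanted: string): string | undefined {
  const lower = wanted.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === lower) return headers[key];
  }
  return undefined;
}

// True only when the request carried a non-empty id and the response echoed it unchanged.
function requestIdEchoed(requestHeaders: HeaderMap, responseHeaders: HeaderMap): boolean {
  const sent = findHeaderValue(requestHeaders, "x-request-id");
  const echoed = findHeaderValue(responseHeaders, "x-request-id");
  return sent !== undefined && sent.length > 0 && sent === echoed;
}
```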

---

## Step 6: Parallel Run Period (Recommended: 1–3 days)

Run the new and legacy dashboards in parallel before switching traffic fully:

- [ ] Operators use the new dashboard as primary
- [ ] Legacy dashboard remains accessible as a fallback
- [ ] No trading state mutations go through the legacy dashboard during this period
- [ ] Monitor for discrepancies between the data shown by the new and legacy dashboards
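Discrepancy monitoring during the parallel run can be made concrete by diffing the positions each dashboard reports. A minimal sketch, assuming positions can be exported from both dashboards as rows keyed by symbol with at most one open position per symbol; `PositionRow`, `PositionDiff`, and `diffPositions` are hypothetical names.

```typescript
// Hypothetical row shape; a real position carries more fields than these two.
interface PositionRow {
  symbol: string;
  quantity: number;
}

interface PositionDiff {
  missingInNew: string[];
  missingInLegacy: string[];
  quantityMismatch: string[];
}

// Index rows by symbol. Assumes at most one open position per symbol.
function indexBySymbol(rows: PositionRow[]): { [symbol: string]: number } {
  const index: { [symbol: string]: number } = Object.create(null);
  for (const row of rows) index[row.symbol] = row.quantity;
  return index;
}

function diffPositions(legacyRows: PositionRow[], newRows: PositionRow[]): PositionDiff {
  const legacy = indexBySymbol(legacyRows);
  const next = indexBySymbol(newRows);
  const diff: PositionDiff = { missingInNew: [], missingInLegacy: [], quantityMismatch: [] };

  for (const symbol of Object.keys(legacy)) {
    if (!(symbol in next)) diff.missingInNew.push(symbol);
    else if (legacy[symbol] !== next[symbol]) diff.quantityMismatch.push(symbol);
  }
  for (const symbol of Object.keys(next)) {
    if (!(symbol in legacy)) diff.missingInLegacy.push(symbol);
  }
  return diff;
}
```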

---

## Step 7: Traffic Cutover

Once the parallel run is complete with no issues:

- [ ] Update any bookmarks, internal links, or runbooks to point to the new URL
- [ ] Communicate to all internal users that the new dashboard is now primary
- [ ] Disable or redirect the legacy dashboard URL (do not delete it yet)

---

## Rollback Triggers

Stop cutover and revert to the legacy dashboard immediately if any of the following occur:

| Condition | Action |
|---|---|
| Sign-in or session restore fails for any operator | Rollback |
| Tenant data leak: operator sees another user's positions or history | Rollback immediately + page on-call |
| Trading control (pause/resume) does not apply correctly | Rollback |
| Dynamic config writes fail silently | Rollback |
| WebSocket disconnects repeatedly with no recovery | Rollback |
| Missing data in positions or trade history vs. legacy dashboard | Investigate before proceeding |
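The table above can be encoded so that several simultaneous observations resolve to the most severe action. A sketch with hypothetical key names for each condition; the severity ordering (investigate, then rollback, then rollback plus paging) is an assumption read off the table.

```typescript
type CutoverAction = "continue" | "investigate" | "rollback" | "rollback-and-page";

// Hypothetical short keys, one per row of the table.
type TriggerKey =
  | "signInFailure"
  | "tenantDataLeak"
  | "tradingControlFailure"
  | "silentConfigWriteFailure"
  | "websocketFlapping"
  | "missingData";

const TRIGGER_ACTIONS: Record<TriggerKey, CutoverAction> = {
  signInFailure: "rollback",
  tenantDataLeak: "rollback-and-page",
  tradingControlFailure: "rollback",
  silentConfigWriteFailure: "rollback",
  websocketFlapping: "rollback",
  missingData: "investigate",
};

// Ordered from least to most severe.
const SEVERITY: CutoverAction[] = ["continue", "investigate", "rollback", "rollback-and-page"];

function decideCutoverAction(observed: TriggerKey[]): CutoverAction {
  let worst: CutoverAction = "continue";
  for (const key of observed) {
    const action = TRIGGER_ACTIONS[key];
    if (SEVERITY.indexOf(action) > SEVERITY.indexOf(worst)) worst = action;
  }
  return worst;
}
```

Taking the maximum severity means a single tenant data leak always wins over any number of lower-severity observations.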

### Rollback Steps

1. Restore the legacy dashboard URL as primary (flip DNS or update internal links)
2. Notify all operators to switch back immediately
3. Do **not** rewrite or delete Cosmos state during first-response rollback
4. File an incident report referencing the `x-request-id` values from affected requests
5. Resolve the root cause before re-attempting cutover

---

## Post-Adoption Monitoring (First 24 Hours)

Watch the following immediately after cutting over:

### Immediate (first 30 minutes)
- [ ] Platform auth failure rate is zero
- [ ] Token refresh failures are zero
- [ ] Backend `401` / `403` error rate is baseline (no spike)
- [ ] WebSocket connection error rate is baseline
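"Baseline (no spike)" needs a concrete threshold to be checkable. The sketch below shows one possible rule, assuming request and failure counts are available for a baseline window and for the current window. `RateSample`, `isErrorRateSpike`, and the default tolerance of one percentage point are illustrative assumptions, not values from this runbook.

```typescript
interface RateSample {
  requests: number; // total requests in the window
  failures: number; // e.g. 401/403 responses, or failed socket connections
}

// Failure ratio for a window; an empty window counts as a zero rate.
function failureRate(sample: RateSample): number {
  return sample.requests === 0 ? 0 : sample.failures / sample.requests;
}

// Flags a spike when the current rate exceeds the baseline rate by more than
// `tolerance` (an absolute difference in rate; 0.01 = one percentage point).
function isErrorRateSpike(baseline: RateSample, current: RateSample, tolerance: number = 0.01): boolean {
  return failureRate(current) - failureRate(baseline) > tolerance;
}
```

An absolute tolerance is used rather than a ratio so that a near-zero baseline does not turn a handful of failures into a false alarm.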

### First Hour
- [ ] Cosmos reads and writes are completing successfully (check backend logs for Cosmos errors)
- [ ] Dynamic config refresh cycle completes without error (every `DYNAMIC_CONFIG_REFRESH_MS`)
- [ ] No tenant isolation anomalies in security guard logs

### First 24 Hours
- [ ] No runtime control drift: Cosmos control-plane state matches the in-memory trading control mode
- [ ] Kill-switch state matches platform-service state
- [ ] No stale session events (operators are not re-prompted to log in unexpectedly)
- [ ] No build or chunk-size regressions affecting web load time (check browser waterfall)
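The two state-consistency checks above amount to comparing pairs of values that should agree. A sketch, assuming both sources can be read into one snapshot; `ControlSnapshot`, `findControlDrift`, the field names, and the `running` / `paused` mode values are all hypothetical.

```typescript
// Mode values are an assumption for illustration.
type ControlMode = "running" | "paused";

interface ControlSnapshot {
  cosmosMode: ControlMode;     // mode persisted in the Cosmos control-plane state
  inMemoryMode: ControlMode;   // mode the running backend is acting on
  backendKillSwitch: boolean;  // kill-switch state as the backend sees it
  platformKillSwitch: boolean; // kill-switch state reported by platform-service
}

// Returns one message per pair of values that disagree; empty means no drift.
function findControlDrift(snapshot: ControlSnapshot): string[] {
  const drift: string[] = [];
  if (snapshot.cosmosMode !== snapshot.inMemoryMode) {
    drift.push(`control mode drift: cosmos=${snapshot.cosmosMode}, memory=${snapshot.inMemoryMode}`);
  }
  if (snapshot.backendKillSwitch !== snapshot.platformKillSwitch) {
    drift.push("kill-switch state differs from platform-service");
  }
  return drift;
}
```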

---

## Post-Cutover Sign-Off

Complete after the first 24 hours with no rollback triggers:

- [ ] All operators confirm the new dashboard is working correctly
- [ ] Monitoring checks above are all green
- [ ] Incident response runbooks updated to reference the new dashboard URL
- [ ] Legacy web repo marked as archived (not deleted; kept as reference)
- [ ] ROADMAP.md: mark "Web internal adoption" as `[x]` Done
- [ ] OPERATIONS.md: update Staged Cutover section to reflect Stage 2 complete

---

## Next Stage

After web internal adoption is confirmed:

**Stage 3: Mobile Internal Beta** (see planned `docs/CUTOVER_MOBILE.md`)
- Release mobile app to internal testers
- Validate sign-in, session restore, live state, degraded-state handling
- Gate: backend/web contracts must be stable through at least one full backend deploy cycle