Web Internal Adoption Checklist (Stage 2 Cutover)
Purpose
This document is the step-by-step runbook for switching internal operators from the
legacy bytelyst-trading-dashboard-web to the new monorepo web dashboard (web/).
It covers the pre-flight gate, deployment, validation, rollback triggers, and post-adoption monitoring. Complete every step in order.
Pre-Flight Gate
Do not begin cutover until all of the following are true.
Go / No-Go Checks
Run from the monorepo root:
pnpm verify # typecheck + test + build — must be green
pnpm lint # backend contract + security guards + web/mobile lint — must be green
pnpm smoke:release # auth + kill-switch smoke tests — must pass
Backend-specific:
cd backend
npm run check:api-contract # feature-flag shapes, audit events, namespace constants
npm run check:websocket-contract # BotState lifecycle consistency
npm run check:security-guards # tenant isolation — must be green
npm run check:tenant-isolation # row-level access — must be green
Environment Checks
- Backend is deployed and reachable (`GET /health/live` returns 200)
- Cosmos DB containers are readable and writable (`dynamic_config`, `trading-profiles`, `trading-control`, `snapshots`, `capital-ledger`)
- Platform-service is reachable from the deployment environment
- `PLATFORM_AUTH_ENABLED=true` is set on the backend deployment
- `VITE_TRADING_API_URL` points to the deployed backend (not localhost)
- `VITE_PLATFORM_URL` points to the live platform-service
- `CORS_ALLOWED_ORIGINS` on the backend includes the new web dashboard origin
- Feature flags are set correctly for the rollout population:
  - `TAB_MARKETPLACE_ENABLED`: set per rollout plan
  - `TAB_MEMBERSHIP_ENABLED`: set per rollout plan
  - `ENABLE_BACKTEST`: set per rollout plan
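Some of the environment checks above can be scripted. The following is a minimal sketch, not part of the repo: `checkCutoverEnv` is a hypothetical helper, it takes a single merged map of variables for brevity (in practice the `VITE_*` values belong to the web build and the others to the backend deployment), and it assumes `CORS_ALLOWED_ORIGINS` is a comma-separated list.

```typescript
// Hypothetical pre-flight helper. Variable names come from the checklist;
// the function name and messages are illustrative only.
type Env = Record<string, string | undefined>;

function checkCutoverEnv(env: Env, dashboardOrigin: string): string[] {
  const problems: string[] = [];

  if (env["PLATFORM_AUTH_ENABLED"] !== "true") {
    problems.push("PLATFORM_AUTH_ENABLED must be true");
  }

  const tradingUrl = env["VITE_TRADING_API_URL"];
  if (!tradingUrl) {
    problems.push("VITE_TRADING_API_URL is not set");
  } else if (/\/\/(localhost|127\.0\.0\.1)(:|\/|$)/.test(tradingUrl)) {
    problems.push("VITE_TRADING_API_URL points to localhost");
  }

  if (!env["VITE_PLATFORM_URL"]) {
    problems.push("VITE_PLATFORM_URL is not set");
  }

  // Assumption: CORS_ALLOWED_ORIGINS is a comma-separated list of origins.
  const origins = (env["CORS_ALLOWED_ORIGINS"] ?? "")
    .split(",")
    .map((origin) => origin.trim());
  if (origins.indexOf(dashboardOrigin) === -1) {
    problems.push("CORS_ALLOWED_ORIGINS is missing the dashboard origin");
  }

  return problems;
}
```

An empty result means these particular checks passed. Reachability of the backend, Cosmos, and platform-service still has to be confirmed against the live environment.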
Rollback Readiness
- The legacy web dashboard URL is still live and working
- You know who owns the rollback decision and how to reach them
- Backend trade-halt control is reachable (`POST /internal/trading/pause`)
Step 1 — Deploy the Web Dashboard
# From monorepo root — production build
pnpm build
# Or using Docker
pnpm docker:up
Verify the deployment:
- Web dashboard loads at the new URL without a blank screen
- Browser console shows no errors on load
- Network tab shows no 4xx/5xx on initial API calls
Step 2 — Internal Operator Sign-In
Have each internal operator complete the sign-in sequence:
- Navigate to the new web dashboard URL
- Sign in using platform credentials (same as the legacy dashboard)
- Session restores correctly after browser refresh (no re-login required)
- Auth token is a platform JWT (check via browser devtools: `Authorization: Bearer ...` on API calls)
- `GET /api/me/profile` returns the correct user profile and role
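When inspecting the `Authorization` header in devtools, a quick shape check can separate a JWT from an opaque token. The sketch below is illustrative only: `looksLikeBearerJwt` is a hypothetical helper that checks for three base64url segments. It does not verify the signature or confirm that platform-service issued the token.

```typescript
// Returns true when the header looks like "Bearer <header>.<payload>.<signature>"
// with base64url-encoded segments. Shape check only: no signature or issuer
// verification is performed.
function looksLikeBearerJwt(authorizationHeader: string): boolean {
  const jwtShape = /^Bearer ([A-Za-z0-9_-]+)\.([A-Za-z0-9_-]+)\.([A-Za-z0-9_-]+)$/;
  return jwtShape.test(authorizationHeader.trim());
}
```

For example, `looksLikeBearerJwt("Bearer aaa.bbb.ccc")` is true, while a header carrying a single opaque token or a `Basic` credential is rejected.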
Step 3 — Core Feature Validation
Each operator validates their own user scope:
Trading State
- Overview tab loads with live bot state (not stale/empty)
- WebSocket connection shows "Connected" in the header
- Socket connects to the `/trading` namespace (check backend logs: `[API][/trading] Client connected`)
- Positions tab shows current open positions
- Trade History tab shows closed trade history
- My Strategies tab lists the operator's trading profiles
Real-Time Updates
- Leave the dashboard open for 60 seconds; confirm symbol prices update live
- Trigger a manual order or profile toggle; confirm the state updates without refresh
Admin Operators (role = admin only)
- Signals tab is visible and loads correctly
- Entries tab is visible and loads correctly
- Admin Panel tab is visible
- Strategy Clusters tab is visible
- Admin Panel → Trading Control: pause and resume work correctly
- Backend logs show `[AUDIT]` entries for pause/resume actions
- "Preview as Customer" toggle hides admin-only tabs correctly
Kill-Switch Behaviour
- If platform-service maintenance mode is toggled on, web blocks access with correct UI
- After maintenance mode is lifted, web recovers without a page reload
Step 4 — Config and Feature Flag Validation
- `GET /api/feature-flags` returns the correct `backtest`, `tabs.marketplace`, and `tabs.membership` values
- Backtesting tab visibility matches `ENABLE_BACKTEST` and `BACKTEST_CUSTOMER_ENABLED` config
- Marketplace tab visibility matches `TAB_MARKETPLACE_ENABLED` config
- Membership tab visibility matches `TAB_MEMBERSHIP_ENABLED` config
- Dynamic config changes via Admin Panel → Config are persisted to Cosmos and visible after a page refresh
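The tab-flag comparison can be made mechanical. The sketch below assumes that `tabs.marketplace` and `tabs.membership` are booleans in the `GET /api/feature-flags` response. The `TabFlags` type and `tabFlagMismatches` function are hypothetical, and the `backtest` field is left out because its shape is not specified in this checklist.

```typescript
// Assumed fragment of the feature-flags response: only the two tab flags
// named in the checklist. The real response may carry more fields.
interface TabFlags {
  tabs: { marketplace: boolean; membership: boolean };
}

// Lists every tab flag whose actual value differs from the rollout plan.
function tabFlagMismatches(expected: TabFlags, actual: TabFlags): string[] {
  const keys = ["marketplace", "membership"] as const;
  return keys
    .filter((key) => expected.tabs[key] !== actual.tabs[key])
    .map((key) => `tabs.${key}: expected ${expected.tabs[key]}, got ${actual.tabs[key]}`);
}
```

Here `expected` would be built from the rollout plan and `actual` from the parsed response body. An empty array means the two tab flags match.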
Step 5 — Request Tracing Spot Check
Pick any operator action (e.g., load trade history):
- Browser devtools shows `x-request-id` header on the request
- Backend response echoes the same `x-request-id`
- Search backend logs for that `x-request-id`; the full request trace appears
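The echo check in this step can also be run outside the browser. This is a minimal sketch under stated assumptions: `requestIdIsEchoed`, `FetchLike`, and `ResponseLike` are hypothetical names, the caller supplies the request id, and the fetch function is injected so the example does not depend on DOM or Node typings.

```typescript
// Minimal structural types so the sketch stays self-contained.
interface ResponseLike {
  headers: { get(name: string): string | null };
}
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<ResponseLike>;

// Sends a request with a caller-chosen x-request-id and reports whether
// the response echoes the same value back.
async function requestIdIsEchoed(
  fetchFn: FetchLike,
  url: string,
  requestId: string,
): Promise<boolean> {
  const response = await fetchFn(url, { headers: { "x-request-id": requestId } });
  return response.headers.get("x-request-id") === requestId;
}
```

In a browser or on Node 18+, the global `fetch` can be passed as `fetchFn`. The log search for the same id remains a manual step.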
Step 6 — Parallel Run Period (Recommended: 1–3 days)
Run the new and legacy dashboards in parallel before switching traffic fully:
- Operators use the new dashboard as primary
- Legacy dashboard remains accessible as a fallback
- No trading state mutations go through the legacy dashboard during this period
- Monitor for discrepancies between what new and legacy dashboards show
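Discrepancy monitoring during the parallel run can be partly automated by diffing what each dashboard reports. The following sketch is hypothetical throughout: the `PositionRow` shape (`id`, `symbol`, `quantity`) is an assumption, since the real payload is not described in this checklist, and `comparePositions` is an illustrative helper.

```typescript
// Hypothetical position shape; the real payload is not specified here.
interface PositionRow {
  id: string;
  symbol: string;
  quantity: number;
}

interface Discrepancies {
  missingInNew: string[];
  missingInLegacy: string[];
  quantityMismatch: string[];
}

function indexById(rows: PositionRow[]): Map<string, PositionRow> {
  const byId = new Map<string, PositionRow>();
  rows.forEach((row) => byId.set(row.id, row));
  return byId;
}

// Compares position lists taken from the legacy and new dashboards at the
// same moment and reports ids that differ.
function comparePositions(legacy: PositionRow[], next: PositionRow[]): Discrepancies {
  const legacyById = indexById(legacy);
  const nextById = indexById(next);
  const result: Discrepancies = { missingInNew: [], missingInLegacy: [], quantityMismatch: [] };

  legacyById.forEach((legacyRow, id) => {
    const nextRow = nextById.get(id);
    if (nextRow === undefined) {
      result.missingInNew.push(id);
    } else if (nextRow.quantity !== legacyRow.quantity) {
      result.quantityMismatch.push(id);
    }
  });
  nextById.forEach((_row, id) => {
    if (!legacyById.has(id)) result.missingInLegacy.push(id);
  });

  return result;
}
```

Both snapshots should be captured as close together as possible, because live positions can change between reads and produce false mismatches.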
Step 7 — Traffic Cutover
Once parallel run is complete with no issues:
- Update any bookmarks, internal links, or runbooks to point to the new URL
- Communicate to all internal users that the new dashboard is now primary
- Disable or redirect the legacy dashboard URL (do not delete it yet)
Rollback Triggers
Stop cutover and revert to the legacy dashboard immediately if any of the following occur:
| Condition | Action |
|---|---|
| Sign-in or session restore fails for any operator | Rollback |
| Tenant data leak — operator sees another user's positions or history | Rollback immediately + page oncall |
| Trading control (pause/resume) does not apply correctly | Rollback |
| Dynamic config writes fail silently | Rollback |
| WebSocket disconnects repeatedly with no recovery | Rollback |
| Missing data in positions or trade history vs. legacy dashboard | Investigate before proceeding |
Rollback Steps
- Restore the legacy dashboard URL as primary (flip DNS or update internal links)
- Notify all operators to switch back immediately
- Do not rewrite or delete Cosmos state during first-response rollback
- File an incident report referencing the `x-request-id` values from affected requests
- Resolve the root cause before re-attempting cutover
Post-Adoption Monitoring (First 24 Hours)
Watch the following immediately after cutting over:
Immediate (first 30 minutes)
- Platform auth failure rate is zero
- Token refresh failures are zero
- Backend `401`/`403` error rate is baseline (no spike)
- WebSocket connection error rate is baseline
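The "baseline (no spike)" check needs a concrete definition to be actionable. The sketch below is one possible reading, not the project's monitoring logic: `authErrorRate` and `isAuthErrorSpike` are hypothetical helpers, and the default `tolerance` of one percentage point is an assumption to be replaced with the team's own threshold.

```typescript
// Fraction of responses in the sample that were 401 or 403.
function authErrorRate(statusCodes: number[]): number {
  if (statusCodes.length === 0) return 0;
  const authErrors = statusCodes.filter((code) => code === 401 || code === 403).length;
  return authErrors / statusCodes.length;
}

// Assumption: a "spike" means the current rate exceeds the baseline rate
// by more than `tolerance` (an absolute fraction, 0.01 = one percentage point).
function isAuthErrorSpike(
  currentStatusCodes: number[],
  baselineRate: number,
  tolerance = 0.01,
): boolean {
  return authErrorRate(currentStatusCodes) > baselineRate + tolerance;
}
```

The status codes would come from backend access logs for the first 30 minutes after cutover, with `baselineRate` taken from a comparable window before it.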
First Hour
- Cosmos reads and writes are completing successfully (check backend logs for Cosmos errors)
- Dynamic config refresh cycle completes without error (every `DYNAMIC_CONFIG_REFRESH_MS`)
- No tenant isolation anomalies in security guard logs
First 24 Hours
- Runtime control drift: Cosmos control-plane state matches in-memory trading control mode
- Kill-switch state matches platform-service state
- No stale session events (operators are not re-prompted to log in unexpectedly)
- No build or chunk-size regressions affecting web load time (check browser waterfall)
Post-Cutover Sign-Off
Complete after the first 24 hours with no rollback triggers:
- All operators confirm the new dashboard is working correctly
- Monitoring checks above are all green
- Incident response runbooks updated to reference the new dashboard URL
- Legacy web repo marked as archived (not deleted — kept as reference)
- ROADMAP.md: mark "Web internal adoption" as `[x]` Done
- OPERATIONS.md: update Staged Cutover section to reflect Stage 2 complete
Next Stage
After web internal adoption is confirmed:
Stage 3 — Mobile Internal Beta (see planned docs/CUTOVER_MOBILE.md)
- Release mobile app to internal testers
- Validate sign-in, session restore, live state, degraded-state handling
- Gate: backend/web contracts must be stable through at least one full backend deploy cycle