# Architecture Risk Analysis Date: 2026-02-16 Scope: `bytelyst-trading-bot-service` + `bytelyst-trading-dashboard-web` Mode: Risk analysis only (no redesign, no implementation proposal) ## Phase Roadmap Tracking ### Phase 1: Multi-Tenant Isolation Hardening - [x] Partition runtime REST state by authenticated `user_id` for: - `/api/status` - `/api/alerts` - `/api/symbol/:symbol` - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/a91e253 - [x] Remove global runtime websocket broadcast surfaces and emit tenant-scoped updates only. - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/a91e253 - [x] Prevent stale profile cache leakage on profile removal (`unregisterManualTrader` cleanup path). - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/a91e253 - [x] Enforce RLS coverage for `orders` and `trade_history` in schema + CI policy verification. - Commits: - https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/d9395b4 - [x] Add validation proving tenant isolation (`user A` cannot access `user B` profile/trade runtime data). - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/089af51 - [ ] Run `schema/009_tenant_rls_orders_trade_history.sql` on each target environment (dev/staging/prod) and verify policy existence in Supabase. ### Phase 2: Restart Durability & Snapshot - [x] Create durable `bot_state_snapshots` table with `auth.users` FK and RLS guard, then verify presence via policy gate. - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/ae339ba - [x] Replace file-only state restore/persistence with `SupabaseService.saveBotStateSnapshot`/`loadLatestBotStateSnapshot` backed by the new table and asynchronous `ApiServer` snapshot flow. - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/ae339ba - [x] Rehydrate `TradeExecutor` on startup by pulling open orders from Supabase/exchange, rebuilding lifecycle tracking, and validating duplicates via DB queries instead of in-memory maps. - Commit: https://github.com/saravanakumardb/bytelyst-trading-bot-service/commit/ae339ba - [ ] Apply the new snapshot migration and ensure each environment records a valid `snapshot_user_id` (or derived owner) before relying on DB restore. ## Architecture Risk Analysis - Mitigated in code (pending environment rollout): Multi-tenant state exposure risk. - Runtime REST responses and websocket emissions are now tenant-scoped by authenticated user. - Evidence (fix commits): `a91e253`, `089af51`. - Critical: Snapshot architecture inconsistency for restart durability. - Runtime writes snapshots with `user_id='system_backup'` while schema requires UUID FK. - Startup flow does not actually restore from DB snapshot path. - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:375`, `bytelyst-trading-bot-service/schema/008_schema_gap_backfill.sql:205`, `bytelyst-trading-bot-service/src/services/apiServer.ts:637`, `bytelyst-trading-bot-service/src/services/apiServer.ts:642`. - High: Non-transactional lifecycle persistence across orders/history/runtime maps can drift under partial failures. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:597`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:642`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:801`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:920`. - High: Single-process authority assumption. - In-memory maps are primary state authorities, with no distributed coordination boundary. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:68`, `bytelyst-trading-bot-service/src/services/apiServer.ts:191`. ## Missing Enterprise Components - Tenant-scoped state/event partitioning for WebSocket/REST runtime payloads. - RLS validation coverage for `orders` and `trade_history` is now present in migration + policy checks; environment rollout remains. - Evidence (fix commit): `d9395b4`. - Durable structured audit sink for control-plane actions (currently log-line only). - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:399`. - Metrics export wiring is not active in API/runtime integration. - Evidence: `bytelyst-trading-bot-service/src/services/MetricsService.ts:4`. ## Structural Integrity Risks - API profile state caches are write/update only, no explicit delete path in merge maps. - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:1269`, `bytelyst-trading-bot-service/src/services/apiServer.ts:1285`. - Lifecycle UI can become window-biased due to hard query limits, causing false orphan/mismatch interpretations. - Evidence: `bytelyst-trading-dashboard-web/src/tabs/PositionsTab.tsx:346`, `bytelyst-trading-dashboard-web/src/tabs/PositionsTab.tsx:397`, `bytelyst-trading-dashboard-web/src/tabs/PositionsTab.tsx:664`. - Client-side synthetic lifecycle IDs can diverge from backend trace truth. - Evidence: `bytelyst-trading-dashboard-web/src/tabs/PositionsTab.tsx:293`, `bytelyst-trading-dashboard-web/src/tabs/PositionsTab.tsx:297`. ## Lifecycle Integrity Risks - Finalization suppression when entry chain is missing protects integrity but can produce realized PnL visibility gaps. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:843`. - Mixed-side virtual position reconstruction collapses to dominant side, masking opposite residual slices. - Evidence: `bytelyst-trading-bot-service/src/services/SupabaseService.ts:1254`. - Exit lifecycle control is symbol-scoped, not trade-scoped, for in-flight transition locking. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:76`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:548`. ## Capital Isolation Risks - Capital checks rely on local in-memory open positions and exclude pending intent reservation. - Evidence: `bytelyst-trading-bot-service/src/index.ts:131`, `bytelyst-trading-bot-service/src/services/AutoTrader.ts:167`, `bytelyst-trading-bot-service/src/services/ManualTrader.ts:26`. - Concurrent profile evaluations per symbol can pass guard checks before committed state convergence. - Evidence: `bytelyst-trading-bot-service/src/index.ts:523`, `bytelyst-trading-bot-service/src/index.ts:530`. ## Exchange Reconciliation Risks - Reconciliation parity checks only recent subset (`limit=25` per profile), allowing long-tail drift persistence. - Evidence: `bytelyst-trading-bot-service/src/index.ts:845`. - Pending-order recovery scope (`pending_new`) is narrower than stale-order sync scope (`pending_new|pending|accepted|new`). - Evidence: `bytelyst-trading-bot-service/src/services/SupabaseService.ts:655`, `bytelyst-trading-bot-service/src/services/SupabaseService.ts:597`. - Runtime pending order broadcast normalizes to `pending_new`, which can transiently differ from DB/exchange truth. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:957`. ## Restart Recovery Risks - Critical idempotency and lifecycle coordination maps are process-local and reset on restart. - Evidence: `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:73`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:74`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:75`, `bytelyst-trading-bot-service/src/services/TradeExecutor.ts:76`. - Startup restore sequence prefers local file; DB snapshot restore path is intentionally not used. - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:621`, `bytelyst-trading-bot-service/src/services/apiServer.ts:642`. ## Concurrency and Locking Risks - Loop guards are local booleans only; no shared lock domain across services/instances. - Evidence: `bytelyst-trading-bot-service/src/index.ts:462`, `bytelyst-trading-bot-service/src/index.ts:463`, `bytelyst-trading-bot-service/src/services/tradeMonitor.ts:10`, `bytelyst-trading-bot-service/src/services/OrderStatusSyncService.ts:47`. - Profile-scale fan-out creates one monitor and one order-sync worker per profile plus orphan sync workers, expanding race and rate-pressure surfaces. - Evidence: `bytelyst-trading-bot-service/src/index.ts:196`, `bytelyst-trading-bot-service/src/index.ts:198`, `bytelyst-trading-bot-service/src/index.ts:207`. ## Observability Gaps - Readiness signal only tracks loop recency and exchange connectivity flag; lacks explicit monitor/order-sync/profile-sync liveness SLO dimensions. - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:352`. - No persistent operational audit table for lifecycle control actions. - Evidence: `bytelyst-trading-bot-service/src/services/apiServer.ts:399`. - Metrics service exists but is not exposed as runtime telemetry endpoint. - Evidence: `bytelyst-trading-bot-service/src/services/MetricsService.ts:114`. ## Production Readiness Score - Score: **58 / 100** - Rationale: Core lifecycle execution logic is materially hardened, but enterprise-grade readiness is constrained by tenant data exposure, restart durability gaps, local-process locking assumptions, and incomplete production observability integration.