Workspace Anti-Pattern Audit
Date: 2026-02-28
Scope: All 5 workspace repos (learning_ai_common_plat, learning_voice_ai_agent, learning_multimodal_memory_agents, learning_ai_clock, learning_ai_fastgap)
Method: Automated grep/scan across all repos + manual review
Summary
| Severity | Count | Category |
|---|---|---|
| P0 — Security / Data Loss | 4 | Auth gaps, secrets exposure, CORS wildcard |
| P1 — Reliability / Crashes | 6 | Missing error handling, no retries, no graceful shutdown |
| P2 — Maintainability / Debt | 8 | Code duplication, version mismatches, package divergence |
| P3 — Operational / DX | 6 | CI gaps, env sprawl, missing observability |
| Total | 24 | |
P0 — Security / Data Loss
1. Admin API routes missing auth guards — CRITICAL
28 of 53 admin-web API routes have no auth check at all. This includes sensitive endpoints:
| Route | Risk |
|---|---|
| /api/ops/secrets (GET/POST) | Lists and writes Azure Key Vault secrets |
| /api/ops/secrets/[name] (GET/DELETE) | Reads and deletes individual secrets |
| /api/telemetry/* (7 routes) | Queries/mutates telemetry data |
| /api/themes/* (4 routes) | Modifies platform themes |
| /api/tokens/* (2 routes) | API token management |
| /api/stripe/config | Stripe configuration |
The secrets routes are the most critical — they interact directly with Azure Key Vault with zero authentication. Anyone who can reach the admin dashboard can read/write/delete all production secrets.
Fix: Add Next.js edge middleware (middleware.ts) that validates JWT on all /api/* routes except /api/auth/login and /api/auth/forgot-password. This is a single file, ~30 lines, that protects all routes uniformly.
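As a rough illustration of the fix above, the route-matching and rejection logic that such a middleware.ts could delegate to might look like the sketch below. The names `decideApiAccess`, `AccessDecision`, and `TokenVerifier` are hypothetical, and the sketch assumes the JWT arrives in an `Authorization: Bearer` header; if the dashboards use a session cookie instead, only the token extraction would change. The JWT verification itself is injected rather than implemented here.

```typescript
// Routes that must stay reachable without a token (taken from the fix above).
const PUBLIC_API_PATHS = new Set(["/api/auth/login", "/api/auth/forgot-password"]);

type AccessDecision =
  | { allowed: true }
  | { allowed: false; status: 401; error: string };

// Hypothetical signature: resolves to true only for a valid, unexpired JWT.
type TokenVerifier = (token: string) => Promise<boolean>;

async function decideApiAccess(
  pathname: string,
  authorizationHeader: string | null,
  verifyToken: TokenVerifier,
): Promise<AccessDecision> {
  // Non-API pages and the explicit public routes pass through untouched.
  if (!pathname.startsWith("/api/") || PUBLIC_API_PATHS.has(pathname)) {
    return { allowed: true };
  }
  // Assumption: the token is sent as "Authorization: Bearer <jwt>".
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader ?? "");
  if (!match) {
    return { allowed: false, status: 401, error: "Missing bearer token" };
  }
  const valid = await verifyToken(match[1]);
  return valid
    ? { allowed: true }
    : { allowed: false, status: 401, error: "Invalid or expired token" };
}
```

In an actual middleware.ts, the exported `middleware` function would call this with `request.nextUrl.pathname` and the request's `authorization` header, then return either a pass-through response or a 401 JSON response. The check fails closed: any /api/* path not on the public list needs a valid token.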
2. User-dashboard API routes missing auth — HIGH
31 of 36 user-dashboard API routes lack explicit auth checks. Includes:
- /api/payments, /api/subscription: billing operations
- /api/sessions/*: user session data
- /api/stripe/portal, /api/stripe/config
- /api/transcripts: user transcript data
- /api/dashboard: dashboard aggregations
Fix: Same middleware.ts pattern. Protect all /api/* except /api/auth/*.
3. CORS defaults to wildcard (origin: true) — MEDIUM
In @bytelyst/fastify-core, when the CORS_ORIGIN env var is not set, CORS defaults to origin: true (allow all origins). If someone forgets to set this variable in production, any website can make authenticated requests to platform-service.
    // packages/fastify-core/src/create-app.ts:34
    const origin = corsOrigin ? corsOrigin.split(',').map(o => o.trim()) : true;
Fix: Default to false (deny all) when CORS_ORIGIN is unset. Require explicit opt-in.
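A minimal sketch of the deny-by-default behaviour described above, assuming the result is passed as the `origin` option of the CORS plugin (where `false` disables CORS and an array is treated as an allow-list). The helper name `resolveCorsOrigin` is hypothetical and not part of @bytelyst/fastify-core.

```typescript
// Hypothetical helper: turns the CORS_ORIGIN env value into an `origin` option.
// Unset, empty, or whitespace-only values deny all cross-origin requests.
function resolveCorsOrigin(corsOrigin: string | undefined): string[] | false {
  if (!corsOrigin) return false;
  const origins = corsOrigin
    .split(",")
    .map((o) => o.trim())
    .filter((o) => o.length > 0); // drop blanks from values like "a,,b"
  return origins.length > 0 ? origins : false;
}
```

With this shape, `resolveCorsOrigin(undefined)` yields `false`, and an explicit comma-separated list is the only way to allow cross-origin callers.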
4. No CSP / security headers on MindLyst-web and ChronoMind-web — MEDIUM
Admin-web, tracker-web, and user-dashboard all have security headers in next.config.ts. MindLyst-web and ChronoMind-web have zero security headers configured — no CSP, no X-Frame-Options, no HSTS.
Fix: Copy the security headers block from admin-web's next.config.ts to both apps.
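For orientation, a security headers block in next.config.ts typically has the shape sketched below. The header values here are assumptions chosen for illustration, not the contents of admin-web's file; the real block should be copied from admin-web as the fix says.

```typescript
// Assumed values for illustration only; copy the real list from admin-web.
const securityHeaders = [
  { key: "X-Frame-Options", value: "DENY" },
  { key: "X-Content-Type-Options", value: "nosniff" },
  { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
  { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains" },
  // Placeholder CSP: a real policy usually needs tuning per app.
  { key: "Content-Security-Policy", value: "frame-ancestors 'none'" },
];

const nextConfig = {
  // Apply the same headers to every route.
  async headers() {
    return [{ source: "/(.*)", headers: securityHeaders }];
  },
};

// In next.config.ts this object would be the default export.
```

Keeping the list in one constant makes it easy to diff against the admin-web version when the two drift.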
P1 — Reliability / Crashes
5. 21 API routes with no try/catch — crash on any DB/network error — HIGH
| Dashboard | Total Routes | Without try/catch | % Unprotected |
|---|---|---|---|
| user-dashboard | 36 | 12 | 33% |
| mindlyst-web | 33 | 6 | 18% |
| admin-web | 53 | 3 | 6% |
Unprotected routes include payments, subscriptions, sessions, transcripts, SSO callbacks. Any Cosmos timeout or network blip returns an unhandled 500 with a stack trace (information leak + poor UX).
Fix: Wrap each handler body in try/catch, or create a shared withErrorHandler() HOF that all API routes use.
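One possible shape for the shared wrapper mentioned above, assuming route handlers use the standard `Request`/`Response` types as Next.js route handlers do. The name `withErrorHandler` comes from the fix; the signature and the JSON error body are assumptions.

```typescript
type RouteHandler<Args extends unknown[]> = (...args: Args) => Promise<Response>;

// Wraps a handler so that any thrown error becomes a generic 500 response.
function withErrorHandler<Args extends unknown[]>(
  handler: RouteHandler<Args>,
): RouteHandler<Args> {
  return async (...args: Args): Promise<Response> => {
    try {
      return await handler(...args);
    } catch (error) {
      // Log server-side only; the stack trace never reaches the client.
      console.error("Unhandled API route error:", error);
      return new Response(JSON.stringify({ error: "Internal server error" }), {
        status: 500,
        headers: { "content-type": "application/json" },
      });
    }
  };
}

// Example: a handler that throws now yields a 500 JSON body instead of crashing.
const GET = withErrorHandler(async (_request: Request) => {
  throw new Error("Cosmos timeout");
});
```

Because the wrapper preserves the handler's argument list, it can be applied to existing routes without changing their signatures.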
6. No retry / timeout / circuit breaker in @bytelyst/api-client — HIGH
The shared createApiClient() has no timeout, no retry logic, and no circuit breaker. Every consumer (6 dashboards + mobile apps) inherits this:
- A single Cosmos slowdown cascades to all dashboards
- Network blips cause immediate failures with no recovery
- No AbortController timeout — requests can hang indefinitely
Fix: Add timeout option (default 10s via AbortController), retries option (default 2 for GET, 0 for mutations), and exponential backoff.
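The timeout and retry behaviour proposed above could be sketched as follows. This is not the @bytelyst/api-client implementation: `fetchWithRetry`, `RetryOptions`, and the injectable `fetchImpl` are hypothetical names, the 200 ms base delay is an assumed value, and retrying only on network errors and 5xx responses is a design assumption. A circuit breaker is out of scope for this sketch.

```typescript
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

interface RetryOptions {
  timeoutMs?: number;   // per-attempt timeout, default 10s
  retries?: number;     // default 2 for GET, 0 for mutations
  baseDelayMs?: number; // assumed backoff base, doubled each attempt
  fetchImpl?: FetchLike; // injectable for tests
}

async function fetchWithRetry(
  url: string,
  init: RequestInit = {},
  options: RetryOptions = {},
): Promise<Response> {
  const method = (init.method ?? "GET").toUpperCase();
  const {
    timeoutMs = 10_000,
    retries = method === "GET" ? 2 : 0,
    baseDelayMs = 200,
    fetchImpl = fetch,
  } = options;

  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const response = await fetchImpl(url, { ...init, signal: controller.signal });
      // Return anything below 500, or the last response once retries run out.
      if (response.status < 500 || attempt === retries) return response;
      lastError = new Error(`HTTP ${response.status}`);
    } catch (error) {
      lastError = error; // network failure or timeout abort
      if (attempt === retries) break;
    } finally {
      clearTimeout(timer);
    }
    // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
    await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
  }
  throw lastError;
}
```

Mutations default to zero retries so that a POST which reached the server but timed out on the response is not silently replayed.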
7. No graceful shutdown in Fastify services — MEDIUM
startService() in @bytelyst/fastify-core calls process.exit(1) on startup failure but has no SIGTERM/SIGINT handler. In Docker/K8s, this means:
- In-flight requests are dropped on deploy
- Database connections not cleaned up
- Potential data corruption on writes
Fix: Add to startService():
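    for (const signal of ['SIGTERM', 'SIGINT']) {
      process.on(signal, async () => {
        app.log.info(`Received ${signal}, shutting down gracefully`);
        try {
          // Stop accepting new connections and let in-flight requests finish.
          await app.close();
          process.exit(0);
        } catch (err) {
          // A failed close should not leave the process hanging.
          app.log.error(err);
          process.exit(1);
        }
      });
    }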
8. No error.tsx in any Next.js app — MEDIUM
Zero of the 5 Next.js apps have an error.tsx file. When a React component throws during render, users see a blank white page (or the browser's default error). This is the #1 source of "the app is broken" reports.
Fix: Add error.tsx to each app's root app/ directory — ~20 lines showing a "Something went wrong" UI with a retry button.
9. No not-found.tsx in 4 of 5 Next.js apps — LOW
Only MindLyst has a custom 404. The other 4 apps show Next.js's default 404 page.
Fix: Add not-found.tsx to each app.
10. Missing loading.tsx in 4 of 5 dashboards — LOW
Only admin-web has a loading.tsx. Other dashboards show no loading indicator during route transitions.
Fix: Add a skeleton loader loading.tsx to each app's layout group.
P2 — Maintainability / Code Duplication
11. MindLyst-web uses raw @azure/cosmos v3 instead of @bytelyst/cosmos — HIGH
MindLyst-web has its own 86-line cosmos.ts with a hand-rolled Cosmos client using v3.17.3 of the SDK. Every other dashboard uses @bytelyst/cosmos (which uses v4.x). This means:
- Different API surface (v3 vs v4 have breaking changes)
- No container registry (MindLyst manages containers ad-hoc)
- Hardcoded PRODUCT_ID = "mindlyst" instead of using @bytelyst/config
- Bug fixes to the shared package don't reach MindLyst
Fix: Migrate MindLyst-web to @bytelyst/cosmos + @bytelyst/config. Replace the 86-line file with ~40 lines matching user-dashboard pattern.
12. MindLyst billing-client uses raw fetch instead of @bytelyst/api-client — MEDIUM
MindLyst's billing-client.ts has its own billingFetch() wrapper with hardcoded headers, token management, and error handling. User-dashboard's billing-client.ts correctly uses createApiClient().
Fix: Rewrite MindLyst's billing-client to use @bytelyst/api-client like every other dashboard.
13. Duplicate feature-flags.ts across repos — MEDIUM
feature-flags.ts is nearly identical in user-dashboard and MindLyst-web (only differs by PRODUCT_ID fallback). Both have their own raw fetch() calls.
Fix: Either add a createFeatureFlagClient() to @bytelyst/api-client or create a thin @bytelyst/feature-flags package.
14. 5 copies of product-config.ts with identical boilerplate — LOW
Every service and dashboard has its own product-config.ts that wraps @bytelyst/config. The files are 5-10 lines of identical code.
Fix: Consider making @bytelyst/config export a ready-to-use PRODUCT_ID constant (lazy-loaded) to eliminate the wrapper files.
15. 4 copies of docker-prep.sh across repos — LOW
Each consumer repo has its own docker-prep.sh script (22-45 lines each) for packing @bytelyst/* tarballs. They diverge in package lists and paths.
Fix: Move the canonical script to learning_ai_common_plat/scripts/docker-prep.sh and have consumer repos call it, or use a shared Makefile target.
16. Duplicate error-boundary.tsx in admin + user dashboards — LOW
Nearly identical class component (differs by 3 whitespace lines). Should be in a shared UI package.
Fix: Move to @bytelyst/react-auth (or create @bytelyst/react-ui) and re-export.
17. Zod v4 vs v3 conflict — ChronoMind uses Zod 4 — MEDIUM
ChronoMind-web uses zod: "^4.3.6" while every other package and service uses Zod 3.x. The @bytelyst/* packages (config, events) all depend on Zod 3. This means:
- ChronoMind cannot use @bytelyst/config or any Zod-dependent shared package without runtime conflicts
- Schema types are incompatible between Zod 3 and Zod 4
Fix: Either downgrade ChronoMind to Zod 3 to match ecosystem, or upgrade the entire ecosystem to Zod 4 (breaking change for all services).
18. TypeScript version skew — MINOR
| Range | Repos |
|---|---|
| ^5 (loose) | admin-web, tracker-web, user-dashboard, chronomind-web |
| ^5.7.0 – ^5.7.3 | common-plat root, services |
| ~5.9.2 – 5.9.3 (pinned) | NomGap, MindLyst-web |
MindLyst pins 5.9.3 (exact) while NomGap uses ~5.9.2. The common-plat root uses ^5.7.0. This can cause type-checking discrepancies.
Fix: Standardize all repos to ^5.9.0 in a coordinated PR.
P3 — Operational / DX
19. 10 disabled CI workflows — no automated quality gate — HIGH
| Repo | Disabled Workflows |
|---|---|
| learning_voice_ai_agent | 7 (ci.yml, ci-python-backend, ci-admin-dashboard, ci-user-dashboard, ci-tracker-dashboard, churn-alert, release) |
| learning_ai_common_plat | 2 (ci.yml, trigger-consumers) |
| learning_multimodal_memory_agents | 1 (ci.yml) |
Only ChronoMind and NomGap have active CI. The three largest repos have zero automated CI on push/PR. Regressions go undetected until manual testing.
Fix: Re-enable CI workflows. Even a minimal typecheck + test workflow on PR catches most regressions.
20. Zero x-request-id propagation in dashboard API routes — MEDIUM
All 122 dashboard API routes (53 admin + 36 user + 33 mindlyst) lack x-request-id propagation. When a dashboard API route calls platform-service, there's no way to correlate the request across services in logs.
Fix: Add a shared middleware or utility that auto-generates and forwards x-request-id from incoming request to all outgoing fetch() / createApiClient() calls.
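A small sketch of the forwarding utility described above, assuming the standard `Headers` API is available in the route runtime. The function name `withRequestId` and the injectable ID generator are hypothetical.

```typescript
const REQUEST_ID_HEADER = "x-request-id";

// Hypothetical helper: builds outgoing headers that carry the request ID.
function withRequestId(
  incoming: Headers,
  outgoing: Record<string, string> = {},
  generateId: () => string = () => crypto.randomUUID(),
): Headers {
  const headers = new Headers(outgoing);
  // Reuse the caller's ID when present so logs correlate end to end;
  // otherwise start a new trace at this hop.
  const requestId = incoming.get(REQUEST_ID_HEADER) ?? generateId();
  headers.set(REQUEST_ID_HEADER, requestId);
  return headers;
}
```

A route handler would pass `request.headers` as `incoming` and use the result as the `headers` option of its outgoing call to platform-service.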
21. 80+ unique env vars with no central registry — MEDIUM
Across all services and dashboards, there are 80+ unique process.env.* references. There's no single document listing which vars each app needs, their valid values, and which are required vs optional.
Fix: Create an ENV_REGISTRY.md in common-plat docs, auto-generated by scanning all repos. Each entry: var name, required/optional, which apps use it, description.
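The scanning step for such a registry could start from something like the sketch below, which extracts variable names from a single file's source text. The function name `extractEnvVars` is hypothetical, and the regex only catches the literal forms `process.env.NAME` and `process.env["NAME"]`; destructuring and dynamic lookups would need separate handling.

```typescript
// Hypothetical scanner: returns the sorted, de-duplicated env var names
// referenced in one source file.
function extractEnvVars(source: string): string[] {
  const pattern =
    /process\.env\.([A-Za-z_][A-Za-z0-9_]*)|process\.env\[["']([A-Za-z_][A-Za-z0-9_]*)["']\]/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    // Group 1 is the dot form, group 2 the bracket form.
    found.add(match[1] ?? match[2]);
  }
  return Array.from(found).sort();
}
```

Running this over every file in each repo and merging the results per app would give the name and "which apps use it" columns; required/optional status and descriptions would still need manual input.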
22. No middleware.ts in any Next.js app — MEDIUM
None of the 5 Next.js apps have a middleware.ts file. This means:
- No edge-level auth protection (each API route must check auth individually — and most don't)
- No redirect logic for unauthenticated users
- No request logging at the edge
Fix: Add middleware.ts to admin-web, user-dashboard, and mindlyst-web. Tracker-web may not need it if it's mostly public.
23. No instrumentation.ts in ChronoMind-web — LOW
All other Next.js apps have instrumentation.ts for Azure Key Vault (AKV) secret resolution at startup. ChronoMind-web is missing it, so its secrets won't resolve from Key Vault.
Fix: Add instrumentation.ts following the pattern from user-dashboard-web.
24. Package manager split: pnpm (common-plat) vs npm (all consumers) — INFO
Common-plat uses pnpm workspace. All 4 consumer repos use npm with package-lock.json. This isn't a bug but creates friction:
- Contributors must know which tool to use where
- file: refs from npm to pnpm workspace packages require pnpm build first
- Lock file formats differ
Recommendation: Document this clearly. Long-term, consider migrating consumers to pnpm or publishing @bytelyst/* to a private registry.
Priority Action Plan
Sprint 1 — Security (1-2 days)
- Add middleware.ts to admin-web (blocks unauthenticated access to secrets, telemetry, themes, tokens)
- Add middleware.ts to user-dashboard (blocks unauthenticated access to payments, sessions, transcripts)
- Fix CORS default to deny-all when CORS_ORIGIN is unset
- Add security headers to MindLyst-web and ChronoMind-web next.config.ts
Sprint 2 — Reliability (2-3 days)
- Add error.tsx + not-found.tsx to all 5 Next.js apps
- Add try/catch to all 21 unprotected API routes (or create shared error handler HOF)
- Add timeout + retry to @bytelyst/api-client
- Add graceful shutdown to @bytelyst/fastify-core
Sprint 3 — Deduplication (2-3 days)
- Migrate MindLyst-web from raw @azure/cosmos v3 → @bytelyst/cosmos v4
- Migrate MindLyst billing-client to @bytelyst/api-client
- Consolidate feature-flags.ts into shared package
- Resolve Zod v3/v4 conflict (ChronoMind)
Sprint 4 — Ops & DX (1-2 days)
- Re-enable CI on the 3 largest repos (even minimal typecheck + test)
- Add x-request-id propagation to dashboard API layer
- Standardize TypeScript version across all repos
- Create ENV_REGISTRY.md with all env vars documented
Items Confirmed Correct (Not Anti-Patterns)
| # | Item | Status |
|---|---|---|
| A | Tracker-web has no direct Cosmos access (uses @bytelyst/api-client → platform-service) | Correct by design |
| B | MindLyst native (KMP) has no Azure wiring — all Azure goes through web API routes | Correct by design |
| C | ChronoMind/NomGap have no direct Azure SDK usage — REST API only | Correct by design |
| D | console.log in @bytelyst/logger | Intentional (it IS the logger) |
| E | console.log in design-tokens/generate.ts | Build script, not production code |
| F | print() in Python cli_output.py | Intentional CLI output (has noqa comment) |
| G | as any in api-client/client.ts:44 | Single occurrence, casting Headers; acceptable |