Closes the Phase 5 P1 doc-drift checkbox and REVIEW_ACTIONS #5. The 3000-vs-3049 confusion came from prose claims in three docs that each picked a different "right" answer. The truth is: the web container listens on :3000; docker-compose maps `127.0.0.1:3049:3000`; production is fronted by Traefik on `https://devops.bytelyst.com`. Encoding that explicitly so future readers don't have to dig through compose files: - DEPLOYMENT.md becomes canonical. Its content is now the (more accurate) old DEPLOYMENT_GUIDE.md merged with a "Ports — quick reference" table covering Local dev / Docker Compose / Production Traefik, plus a Local-development section for `pnpm dev`. - DEPLOYMENT_GUIDE.md → 5-line redirect stub pointing at DEPLOYMENT.md (kept for `deploy.sh` and any external links). - deploy.sh updated to point at DEPLOYMENT.md. - README.md "Web port: 3000" line rewritten to spell out container vs Compose-host vs dev-mode and link to the port table. - ENDPOINTS.md gets a top-of-file note: every `localhost:3000` URL in that file is the `pnpm dev` workflow; substitute `:3049` for the Dockerized stack. - REVIEW_ACTIONS.md #5 marked RESOLVED with the rationale. No code, behavior, lint, or test changes. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
8.0 KiB
Dashboard Repo Review — Top Actions
Reviewed: 2026-05-27. Scope: /opt/bytelyst/learning_ai_devops_tools/dashboard (the ByteLyst DevOps Dashboard pnpm workspace: backend/ Fastify 5 + web/ Next.js 16).
Baseline state (verified during review):
pnpm typecheck— passes for both backend and web.pnpm test:run— passes (backend 9 tests / 1 file, web 11 tests / 2 files).pnpm secret-scan— clean..envis gitignored; only.env.examplefiles are tracked.
The dashboard is functional and well-structured, but several issues block CI, hide regressions, and create operational risk. Actions are ordered by priority.
P0 — Broken / Urgent
1. CI workflow points at a non-existent path
.gitea/workflows/ci.yml runs everything from /opt/bytelyst/bytelyst-devops-tools/dashboard, but the actual checkout lives at /opt/bytelyst/learning_ai_devops_tools/dashboard. The same wrong path is hard-coded in DEPLOYMENT.md and scripts/deploy-hotcopy.sh.
- Action: replace the hard-coded path with
${{ gitea.workspace }}(or a singleWORKDIRenv var) in <ref_file file="/opt/bytelyst/learning_ai_devops_tools/dashboard/.gitea/workflows/ci.yml" />, then fix the two other references in <ref_file file="/opt/bytelyst/learning_ai_devops_tools/dashboard/DEPLOYMENT.md" /> and <ref_file file="/opt/bytelyst/learning_ai_devops_tools/dashboard/scripts/deploy-hotcopy.sh" />. - Verify: trigger a CI run on a throwaway branch and confirm green.
2. "Lint" steps are no-ops
Both backend/package.json and web/package.json define lint as echo 'No linting configured...'. The CI step "Lint" therefore always passes regardless of code quality. There is no ESLint, Biome, or equivalent configured anywhere in the workspace.
- Action: pick one tool (recommend ESLint +
@typescript-eslintfor backend, Next.js's built-in ESLint config for web, sincenextalready ships it). Wirenext lintintoweb/package.jsonand add a minimal.eslintrcto backend. - Verify:
pnpm lintreturns a non-zero exit on a deliberately bad change.
P1 — Important Gaps
3. Test coverage is extremely thin
Backend has 12 modules (services, deployments, health, audit, backup, system, env, azure-config, code-quality, cosmos-config, hermes-ops, vm) but only services has a test file. The deployment orchestrator (backend/src/modules/deployments/orchestrator.ts), CSRF (backend/src/lib/csrf.ts), and auth (backend/src/lib/auth.ts) — the highest-risk surfaces — have no tests at all.
- Action: add
*.test.tsfor at leastauth,csrf,deployments/orchestrator, andhealthrepository before adding more features. Mirror the style of <ref_file file="/opt/bytelyst/learning_ai_devops_tools/dashboard/backend/src/modules/services/services.test.ts" />. - Add
pnpm test:coverageto CI and fail under a threshold (start at 50 %, raise over time).
4. SSE deployment-log streaming is disabled — RESOLVED (removed)
The TODO has been resolved by removing the SSE claim, not by shipping it: the fastify-sse-v2 dependency is gone from backend/package.json, the commented-out import + plugin registration are gone from backend/src/server.ts, and the deployment-log endpoint is now documented as JSON-polled. The web client never used EventSource (web/src/lib/api.ts already polls /api/deployments/:id/logs via the normal apiRequest helper), so no UI change was required. README/DEPLOYMENT.md updated to match. If a real-time stream is wanted later, ship it explicitly via reply.raw and update the docs in the same change.
5. Documentation drift — RESOLVED
DEPLOYMENT.mdis now the single canonical deployment guide;DEPLOYMENT_GUIDE.mdis reduced to a one-line redirect for backwards compat withdeploy.shand external links.deploy.shupdated to referenceDEPLOYMENT.md.DEPLOYMENT.mdcarries an explicit Ports — quick reference table that distinguishes container port (:3000), Compose host port (:3049), and the Traefik production URL — so the 3000-vs-3049 question has one truthful answer per deployment mode rather than three contradictory prose claims.README.md"Web port: 3000" rewritten to call out container vs Compose host vs dev-mode explicitly and link to the port table.ENDPOINTS.mdgot a top-of-file note: everylocalhost:3000URL in the file refers topnpm dev; substitute:3049for the Dockerized stack. Thehttps://api.bytelyst.com/api/devopsvs/devopsambiguity was already resolved by the existing "URL Note" section (kept).
6. Docker socket + host log mounts are very privileged
docker-compose.yml mounts /var/run/docker.sock, the host scripts directory, and three host log paths into devops-backend. This is the same risk profile as Portainer but with custom code reading/writing those mounts. There is no documentation of which backend module talks to the docker socket or what commands it issues.
- Action: document the privilege surface (which routes shell out, which call
docker), and consider a thin allow-list wrapper instead of mounting the raw socket. At minimum, add a section toDEPLOYMENT.mdenumerating these mounts and their purpose so reviewers know the blast radius.
P2 — Hygiene
7. Backend module structure isn't enforced
Most modules follow the routes.ts / repository.ts / types.ts triple, but a few have extras (deployments/orchestrator.ts). There is no architectural test, README, or generator. New contributors will diverge.
- Action: add a short
backend/src/modules/README.mddescribing the convention, and (optionally) an architectural test usingdependency-cruiseror a custom vitest.
8. README is unfocused
README.md mixes "Recent Improvements" (a changelog), feature list, setup, env vars, and full API docs into one 219-line file. The first cat of the file even shows it begins with two blank lines after the title — easy to miss content.
- Action: trim README to: what / quickstart / pointers. Move "Recent Improvements" into
CHANGELOG.mdand keep API docs only inENDPOINTS.md/ Swagger.
9. .pnpmfile.cjs dual-mode install is undocumented in CI
pnpm install:common-plat vs pnpm install:gitea is only mentioned in the README. The CI workflow uses install:common-plat, which only works if the runner has the sibling learning_ai_common_plat checkout available. That assumption isn't asserted anywhere.
- Action: add a pre-install check that fails fast with a clear message if the expected workspace path is missing, and document the runner prerequisites in the CI file.
10. No production logging / metrics story
backend/src/server.ts uses Fastify's default logger only. There is a web/src/lib/telemetry.ts file but nothing wires it to a backend. The dashboard advertises "monitoring" but doesn't emit its own structured telemetry.
- Action: decide on a target (pino transports → stdout for container logs is enough for now) and write down the choice. If Prometheus / OpenTelemetry is in scope, file a tracked issue rather than leaving it implied.
11. E2E tests aren't wired into local workflow
web/e2e/dashboard.spec.ts and web/e2e/hermes.spec.ts exist and pnpm test:e2e is defined, but nothing documents how to start the backend+web before running them, and CI's E2E step (visible in .gitea/workflows/ci.yml) is cut off in the file — need to confirm it actually launches the stack.
- Action: read the bottom half of
ci.ymland confirm the E2E job sets up backend+web; if not, fix it. Add apnpm test:e2erecipe to README that explicitly says "runpnpm devfirst" or use Playwright'swebServerconfig.
Suggested execution order
- Fix the CI path (#1) — unblocks everything else.
- Reconcile the SSE TODO (#4) — either remove the claim or ship the feature.
- Add real linting (#2) and tighten test coverage on auth/csrf/orchestrator (#3).
- Documentation pass: ports, deployment docs, README trim (#5, #8).
- Privilege/operational hardening (#6, #10).
- Convention + DX polish (#7, #9, #11).
Each item above is small enough to land as a single PR.