Closes the three Phase 5 P2 follow-ups from the DEPLOYMENT.md
mitigation roadmap that don't need infra changes. Two P2 items remain
(non-root container, docker-proxy daemon) — both genuinely need
container/orchestration work and stay queued.
1. Allow-list shell wrapper (P1)
New `lib/shell.ts`:
- `execAllowed(cmd, args, opts)` — `execFile`-only, no shell, no
interpolation. Single escape hatch for ad-hoc invocations.
- `dockerRestart(name)` — name validated against
`[a-zA-Z0-9][a-zA-Z0-9._-]{0,127}`; throws InvalidShellArgError
on anything else (including non-strings, shell metacharacters,
command-substitution attempts). Tests cover all of these.
- `dockerPrune(kind, {all?})` — kind constrained to
{container,image,volume,builder}; `--all` only valid for image.
- `runBashScript(path, args, {allowedRoots})` — script path AND
cwd both checked against allowed roots; rejects `..` escapes
and prefix-matching siblings (`/opt/projects-evil` vs
`/opt/projects`).
- `runNpmScript(script, {cwd, allowedRoots})` — script ∈
{typecheck,lint,build,test,test:run,start}; cwd inside roots.
17 unit tests cover every rejection path. Module added to the
coverage gate (≥95% lines).
Migrated highest-risk callers off template-literal `exec`:
- `vm/repository.ts:restartContainer` → `dockerRestart`. Was
previously `await execAsync(\`docker restart "${name}"\`)`
with only a regex check; now goes through the wrapper.
- `system/repository.ts:dockerCleanup` → `dockerPrune` per kind
+ `execAllowed` for `docker system df`. Drops the array of
template-literal command strings entirely.
- `code-quality/repository.ts` → `runNpmScript` for every
lifecycle invocation. cwd is now the resolved (normalised,
`..`-collapsed) path, not the raw input.
2. projectPath validation for /code-quality/check (P1)
`runCodeQualityCheck` now calls
`assertPathInAllowedRoots(projectPath, getAllowedRoots())` before
any subprocess spawns. `getAllowedRoots()` reads
`CODE_QUALITY_ALLOWED_ROOTS` (colon-separated env, defaults to
`/opt/bytelyst`). Rejection happens with a clear error message
listing the configured roots so operators know what to allow.
3. Audit-log every privileged shell-out (P2)
`audit/types.ts` extended: `action` now includes `'shell-exec'`,
`entityType` includes `'host'`. The migration is additive — old
audit rows still validate.
Three privileged routes now write a `shell-exec` audit row with
actor (authUserId / authRole), entity id, and a sanitized details
payload before responding:
- `POST /docker/cleanup` — `entityId: docker-cleanup:<type>`,
details include {type, force, freedSpace}.
- `POST /vm/cleanup` — `entityId: vm-cleanup:<mode>`.
- `POST /vm/containers/:name/restart` — `entityId:
container-restart:<name>`, details include {success, message}.
Audited even on failure so attempted privileged actions are
still recorded.
Audit writes are best-effort — a Cosmos hiccup logs a warn but
never fails the request the operator was running.
Verified: backend typecheck ✅, 74/74 unit tests ✅ (17 new for
shell.ts + audit changes), 7/7 E2E ✅, lint 0 errors, coverage gate
≥95% lines on every gated file (which now includes shell.ts).
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Two threads, one commit because they're both about closing dashboard-
side roadmap items that don't need their own slice.
Phase 7 — auth coverage on hermes routes:
- `/api/hermes/ops` was the last unauthenticated Hermes endpoint —
despite revealing instance / gateway / Tailscale-IP / backup-repo /
warnings state. Now gated on `requireAdmin`, matching the new
`/api/hermes/telemetry/:instance` from the previous slice and
every other privileged route in this backend.
- Privilege-surface table in `dashboard/DEPLOYMENT.md` updated to
show `requireAdmin` for both Hermes routes; the previous
"no auth, read-only ops snapshot" carve-out is gone.
- Roadmap Phase 7 ticks for "require auth on hermes routes" + "keep
hermes data private-only" with verification notes.
Phase 4 — Bheem/Uma parity (delegation brief):
- Phase 4 is **VM ops, not codebase work** — it requires sudo on the
Hostinger VM, Uma-owned GitHub credentials, and Telegram bot
tokens. None of it is editable in this repo. Wrote
`docs/prompts/phase4-bheem-uma-parity.md` as a self-contained
delegation brief covering: Uma persistent-backup repo + timer,
Uma health watchdog, first restore rehearsal, quarterly drill
reminder, and the dashboard-side verification (the /hermes/ops +
/hermes/telemetry/bheem outputs that confirm the gap is closed).
- Phase 4 section header in the roadmap now points at the brief
and explains why the checkboxes stay open in this repo.
Verified: backend 57/57 unit tests ✅, web 7/7 E2E ✅ (Playwright
mocks bypass requireAdmin since they fulfill before the request
reaches Fastify; real auth'd users get the same flow as every other
admin route). Lint 0 errors, build green.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Closes the final Phase 5 P1 checkbox and REVIEW_ACTIONS #6.
The backend container has root-equivalent host access via the docker
socket, host log mounts, and the VM scripts mount, but until now the
"who can do what to the host?" answer was scattered across compose
files and route handlers. This commit centralizes it.
DEPLOYMENT.md gains a "Privilege Surface" section that lists:
- every host mount + container path + mode + purpose
- every shell-outing route, the actual commands it runs, and the
auth gate on each
- what an admin token can do today (≈ host shell)
- five known sharp edges (un-allow-listed container names, unvalidated
projectPath, no per-route audit-log on shell-outs, container runs
as root, global rate-limit only)
- a P1 → P3 mitigation roadmap (allow-list wrapper around shell-outs,
projectPath validation, audit-logging shell-outs, drop root in
container, replace docker.sock with a verb-restricted proxy)
Concurrent code fix: `POST /code-quality/check` was reachable
**unauthenticated** despite shelling out to `npm run typecheck/lint/
build/test:run` in a caller-supplied `projectPath`. Added
`preHandler: requireAdmin` to bring it in line with every other
shell-outing route in the dashboard. Same commit because the
documentation table promises this gate exists.
REVIEW_ACTIONS #6 marked RESOLVED with the rationale; roadmap checkbox
ticked. Tests, typecheck, lint (0 errors), build, and coverage gate
(≥95% lines on every gated file) all stay green.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Closes the Phase 5 P1 doc-drift checkbox and REVIEW_ACTIONS #5.
The 3000-vs-3049 confusion came from prose claims in three docs that
each picked a different "right" answer. The truth is: the web container
listens on :3000; docker-compose maps `127.0.0.1:3049:3000`; production
is fronted by Traefik on `https://devops.bytelyst.com`. Encoding that
explicitly so future readers don't have to dig through compose files:
- DEPLOYMENT.md becomes canonical. Its content is now the (more
accurate) old DEPLOYMENT_GUIDE.md merged with a "Ports — quick
reference" table covering Local dev / Docker Compose / Production
Traefik, plus a Local-development section for `pnpm dev`.
- DEPLOYMENT_GUIDE.md → 5-line redirect stub pointing at
DEPLOYMENT.md (kept for `deploy.sh` and any external links).
- deploy.sh updated to point at DEPLOYMENT.md.
- README.md "Web port: 3000" line rewritten to spell out container
vs Compose-host vs dev-mode and link to the port table.
- ENDPOINTS.md gets a top-of-file note: every `localhost:3000` URL
in that file is the `pnpm dev` workflow; substitute `:3049` for
the Dockerized stack.
- REVIEW_ACTIONS.md #5 marked RESOLVED with the rationale.
No code, behavior, lint, or test changes.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Closes the long-standing SSE TODO. The previous attempt with
`fastify-sse-v2 ^4` was incompatible with Fastify 5 and was never wired
in; the README/DEPLOYMENT.md kept advertising "real-time log streaming"
that didn't exist. The web client never used EventSource — `web/src/
lib/api.ts` already polls `/deployments/:id/logs` via the normal
`apiRequest` helper.
Resolution: remove the claim, not ship the feature.
- drop `fastify-sse-v2` dep from `backend/package.json` + lockfile
- delete the commented-out plugin import + register in `server.ts`,
replace with a NOTE explaining the JSON-polling decision and how
to add a stream later (`reply.raw`)
- remove the `TODO: Re-enable SSE` comment in `deployments/routes.ts`;
the endpoint already returns JSON, document that explicitly
- rewrite the README "Deployment Log Streaming" section as
"Deployment Logs" (JSON-polled, no SSE); fix the endpoint table
- flip the DEPLOYMENT.md bullet from "Real-time log streaming (SSE)"
to "Deployment log retrieval (JSON polling — no SSE)"
- mark REVIEW_ACTIONS #4 RESOLVED with the reasoning
- tick the roadmap checkbox
If a real-time stream is wanted later, ship it explicitly via
`reply.raw` and update README/DEPLOYMENT.md/the route comment in the
same change. Don't reintroduce a half-disabled plugin.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- ci.yml: actions/checkout into the runner workspace instead of cd-ing into a
hard-coded host path and `git reset --hard origin/main` on the live checkout;
install via `pnpm install:gitea` (self-contained, no sibling common-plat
checkout); E2E step left as a TODO pointer (ci-e2e-hardening, Phase 5 P2).
- Fix the same stale /opt/bytelyst/bytelyst-devops-tools path in deploy.sh,
scripts/deploy-hotcopy.sh, DEPLOYMENT.md, DEPLOYMENT_GUIDE.md.
- Replace the no-op `lint` echoes with real ESLint 9 flat configs (js +
typescript-eslint recommended) for backend and web; add a root `pnpm lint`.
- Fix the 10 errors lint surfaced, incl. require('os') in an ESM backend
(system/repository.ts -> import * as os), prefer-const x4, and a ternary
expression-statement in web vm/page.tsx.
Verified locally: secret-scan, lint (0 errors; correctly fails on bad code),
typecheck, unit tests (backend 9 / web 11), and build all green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add docker-compose.yml following trading web pattern
- Update web Dockerfile to use multi-stage build with metadata
- Add build metadata (commit SHA, branch, timestamp, author, message)
- Rewrite deploy.sh to use docker compose with build metadata
- Add hotcopy deployment script for quick updates
- Add comprehensive backend API with deployment orchestration
- Add health checks, service management, and monitoring endpoints
- Add CI/CD workflow configuration
- Add deployment documentation and guides
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>