Closes Phase 2. Every entity in `web/src/lib/hermes.ts` now carries an
`instanceId: 'vijay' | 'bheem'` (with `'all'` allowed for cross-cutting
agents such as Hermes Core and the GitHub link), and a global instance
switcher above every Mission Control pane filters them.
Library changes (`web/src/lib/hermes.ts`):
- New `HermesInstanceId` / `HermesInstanceFilter` types + `HERMES_INSTANCES`
metadata array.
- `instanceId` added to `HermesProduct`, `HermesTask`, `HermesEvent`,
`HermesRun`, `HermesAgentStatus`. Seed data deterministically split
~50/50 across instances; agents tagged per-scope (Local VM runner →
bheem, CLI runner / Scheduler → vijay, Hermes Core / GitHub /
OpenClaw / deployment / notifications → all).
- `getHermesTasks({instance})`, `getHermesProducts(view, instance)`,
`getHermesAgents(instance)`, `getHermesHistory(instance)`,
`getHermesOverview(instance)` all accept the filter; helper
`instanceMatches(scope, filter)` keeps the semantics consistent
(always-match for `'all'` on either side).
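The filter semantics described above can be sketched as follows. This is a minimal illustration, not the code in `hermes.ts`: the `HermesInstanceScope` alias and the sample rows are assumptions made for the example.

```typescript
type HermesInstanceId = "vijay" | "bheem";
// Assumed alias: an entity is scoped to one instance or to "all".
type HermesInstanceScope = HermesInstanceId | "all";
// The switcher selects one instance or "all".
type HermesInstanceFilter = HermesInstanceId | "all";

// "all" on either side always matches; otherwise the ids must be equal.
function instanceMatches(
  scope: HermesInstanceScope,
  filter: HermesInstanceFilter,
): boolean {
  return scope === "all" || filter === "all" || scope === filter;
}

// Hypothetical rows, for illustration only.
const agents: { name: string; instanceId: HermesInstanceScope }[] = [
  { name: "Local VM runner", instanceId: "bheem" },
  { name: "CLI runner", instanceId: "vijay" },
  { name: "Hermes Core", instanceId: "all" },
];

const visibleForBheem = agents
  .filter((a) => instanceMatches(a.instanceId, "bheem"))
  .map((a) => a.name);
```

With the `bheem` filter active, the cross-cutting agent stays visible next to the bheem-scoped one, while the vijay-scoped runner is hidden.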
UI changes:
- New `HermesInstanceProvider` (React context, localStorage-backed
under `hermes.instanceFilter.v1`, SSR-safe default to avoid
hydration mismatch) mounted in `app/hermes/layout.tsx`.
- New `HermesInstanceSwitcher` segmented control (radiogroup with
aria-checked) rendered in the layout header above every pane.
- New `HermesInstanceBadge` shown on task rows (Active Missions +
Task Ledger), product cards (overview minicards + portfolio
cards), and agent cards.
- `/hermes` overview gains a "Per-instance roll-up" section that
always shows Vijay vs Bheem side by side, regardless of the active
filter. The eight metric cards above it are still filtered by the
switcher.
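One way the localStorage-backed, SSR-safe default could work is sketched below. Only the storage key comes from this commit; the helper name `readStoredFilter`, the `StorageLike` interface, and the `'all'` default are assumptions for illustration.

```typescript
type HermesInstanceFilter = "all" | "vijay" | "bheem";

const STORAGE_KEY = "hermes.instanceFilter.v1";
const DEFAULT_FILTER: HermesInstanceFilter = "all"; // assumed default

// Minimal subset of the Web Storage API, so the helper runs without a browser.
interface StorageLike {
  getItem(key: string): string | null;
}

// Returns the persisted filter, or the default when there is no storage
// (server render) or the stored value is not a known filter.
function readStoredFilter(storage: StorageLike | undefined): HermesInstanceFilter {
  if (!storage) return DEFAULT_FILTER;
  const raw = storage.getItem(STORAGE_KEY);
  return raw === "all" || raw === "vijay" || raw === "bheem" ? raw : DEFAULT_FILTER;
}
```

A provider built this way would render with `DEFAULT_FILTER` on both server and first client render, then call the helper after mount, so the hydrated markup matches the server output.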
Tests:
- 2 new unit tests in `lib/hermes.test.ts` (instance tagging on seed
data + filter semantics across tasks/products/agents/overview).
- 1 new E2E test asserting the switcher's radiogroup, default
selection, and persistence-friendly state change.
- All green: 13/13 web unit tests, 7/7 E2E.
`web/test-results/` and `web/playwright-report/` added to `.gitignore`
since they're regenerated per run.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Closes the Phase 5 P2 checkbox (second half — first half: pino logging
in 1e64d75). Phase 5 is now fully green.
Two changes:
1. `web/e2e/hermes.spec.ts` now intercepts `/api/hermes/ops` with a
fixture snapshot. The backend's hermes-ops endpoint shells out to
`systemctl` / `git` / `ps` / `du` on the live VM and is therefore
neither available nor deterministic in CI. Mocking it lets the
suite run against the web stack alone (no backend, no live VM).
Fixture shape mirrors the Zod schema in
`backend/src/modules/hermes-ops/types.ts`.
2. `.gitea/workflows/ci.yml` re-enables the previously-commented-out
E2E step. Adds a preceding `playwright install --with-deps
chromium` step so the runner pulls the browser fresh per run.
The web suite starts its own Next dev server via Playwright's
`webServer` config (`pnpm exec next dev -p 3200`), so we do NOT
start the backend in CI — every backend route used by the suite
is mocked via `page.route` (auth, csrf, services, deployments,
health/cache, seed, hermes-ops).
Verified locally: `pnpm exec playwright test` → 6 passed in 19.5s
(2 hermes specs + 4 dashboard/login specs across desktop + mobile).
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- ci.yml: actions/checkout into the runner workspace instead of cd-ing into a
hard-coded host path and `git reset --hard origin/main` on the live checkout;
install via `pnpm install:gitea` (self-contained, no sibling common-plat
checkout); E2E step left as a TODO pointer (ci-e2e-hardening, Phase 5 P2).
- Fix the same stale /opt/bytelyst/bytelyst-devops-tools path in deploy.sh,
scripts/deploy-hotcopy.sh, DEPLOYMENT.md, DEPLOYMENT_GUIDE.md.
- Replace the no-op `lint` echoes with real ESLint 9 flat configs (js +
typescript-eslint recommended) for backend and web; add a root `pnpm lint`.
- Fix the 10 errors lint surfaced, incl. require('os') in an ESM backend
(system/repository.ts -> import * as os), prefer-const x4, and a ternary
expression-statement in web vm/page.tsx.
Verified locally: secret-scan, lint (0 errors; correctly fails on bad code),
typecheck, unit tests (backend 9 / web 11), and build all green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
vm-health-check.sh:
- check_gpu(): probes with nvidia-smi; reports "CPU-only" as OK because
this VM has no GPU
- check_image_freshness(): flag containers running images >30d old.
Skips third-party images (gitea, grafana, prom, mcr.microsoft, axllent,
caddy, traefik, valkey, cadvisor) — they have their own rebuild cadence.
Currently flags 19 stale product images (~60d old).
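The freshness rule above, restated as a TypeScript sketch (the real check lives in the shell script). The 30-day limit and the skip list come from this commit; the `RunningImage` shape and substring matching on the image name are assumptions.

```typescript
const MAX_AGE_DAYS = 30;
// Skip list from the commit message; substring matching is an assumption.
const THIRD_PARTY = [
  "gitea", "grafana", "prom", "mcr.microsoft", "axllent",
  "caddy", "traefik", "valkey", "cadvisor",
];

// Hypothetical shape for a running container and its image build time.
interface RunningImage {
  container: string;
  image: string;
  createdAt: Date;
}

// Returns first-party images older than MAX_AGE_DAYS at time `now`.
function staleImages(images: RunningImage[], now: Date): RunningImage[] {
  const dayMs = 24 * 60 * 60 * 1000;
  return images.filter((img) => {
    if (THIRD_PARTY.some((name) => img.image.includes(name))) return false;
    const ageDays = (now.getTime() - img.createdAt.getTime()) / dayMs;
    return ageDays > MAX_AGE_DAYS;
  });
}
```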
chaos-validation.sh:
- Monthly chaos test: kill PID 1 in chronomind-web, wait up to 35 min
for docker-health-watchdog to detect + restart. Telegram pass/fail.
- Refuses to run if target not healthy. systemd timer fires 1st of month
at 10:00 UTC (after 08:00 weekly digest).
vm-io-anomaly-check.sh:
- 6h avg sda write rate; transition alerts at WARN (1 GB/hr) /
CRIT (2.5 GB/hr). De-dupes via /var/log/vm-io-anomaly-state so the
alert fires once per transition, not every 6h. Current baseline:
~1.94 GB/hr (orphan-container state-file writes; see Phase 0.3).
- Reports recovery to OK when rate drops back.
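The transition-only alerting can be sketched like this. The thresholds are the ones listed above; treating them as inclusive, the function names, and the message wording are assumptions, and the real script keeps the previous level in the state file rather than in a parameter.

```typescript
type Level = "OK" | "WARN" | "CRIT";

const WARN_GB_PER_HR = 1;
const CRIT_GB_PER_HR = 2.5;

// Maps a 6h average write rate (GB/hr) to a level. Inclusive bounds are assumed.
function classify(rateGbPerHr: number): Level {
  if (rateGbPerHr >= CRIT_GB_PER_HR) return "CRIT";
  if (rateGbPerHr >= WARN_GB_PER_HR) return "WARN";
  return "OK";
}

// Produces an alert only when the level differs from the previous run.
function nextAlert(
  previous: Level,
  rateGbPerHr: number,
): { level: Level; alert: string | null } {
  const level = classify(rateGbPerHr);
  if (level === previous) return { level, alert: null };
  const rate = rateGbPerHr.toFixed(2);
  const alert =
    level === "OK"
      ? `sda writes recovered to OK (${rate} GB/hr)`
      : `sda writes ${previous} -> ${level} (${rate} GB/hr)`;
  return { level, alert };
}
```

At the current baseline of about 1.94 GB/hr, the first run after an OK state raises one WARN alert and later runs at the same level stay silent.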
vm/page.tsx: gpu + image_freshness added to CHECK_META so they render
with proper icon/label and slot into CHECK_ORDER.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- prometheus.ts: new Prometheus client with 7d/30d range queries for disk,
memory, swap, CPU steal, and disk I/O (GB/hr); getWeeklyDigestData()
aggregates all metrics for digest and API endpoint
- routes.ts: GET /api/vm/metrics/trend?metric=…&range=… and
GET /api/vm/weekly-digest endpoints
- api.ts: TrendPoint/TrendSeries types; getTrend() and getMemoryTrend()
added to vmApi
- vm/page.tsx: Sparkline (pure SVG polyline+fill), TrendCard with
latest/avg/peak and threshold colouring, TrendsPanel with lazy load
on first open; Promise.allSettled() isolation for all 5 data panels
- vm-weekly-digest.sh: weekly Telegram digest via docker exec into
devops-backend to reach Prometheus; emoji severity indicators; cron
summary from /var/log/vm-cleanup.log
- systemd timer: Mon 08:00 UTC, Persistent=true (fires on next boot
if missed); first trigger 2026-06-02
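For the pure-SVG Sparkline, the `points` attribute of a polyline can be derived from a series roughly as below. This is a sketch with an assumed helper name (`sparklinePoints`) and min/max scaling; the component in `vm/page.tsx` may scale differently.

```typescript
// Converts a numeric series into an SVG polyline "points" string that
// spans the given width and height. Values are scaled between min and max.
function sparklinePoints(values: number[], width: number, height: number): string {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  const span = max - min || 1; // flat series: avoid division by zero
  const stepX = values.length > 1 ? width / (values.length - 1) : 0;
  return values
    .map((v, i) => {
      const x = i * stepX;
      const y = height - ((v - min) / span) * height; // SVG y grows downward
      return `${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(" ");
}
```

The result can be passed directly as `<polyline points={...} />`; closing the shape along the bottom edge gives the fill area.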
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 3.1 — VM Score Card (0–100):
- 6 weighted dimensions: steal time, RAM, disk, service health,
maintenance hygiene, LLM readiness (matching roadmap scoring)
- Color-coded gauge + per-dimension progress bars with detail text
- Computed from health + cron + unhealthy data; degrades gracefully
when any source is unavailable
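A weighted score that degrades gracefully might be computed as in the sketch below. The `Dimension` shape and the renormalisation over available dimensions are assumptions; the actual weights follow the roadmap and are not reproduced here.

```typescript
// score is 0..1, or null when the backing data source is unavailable.
interface Dimension {
  name: string;
  weight: number;
  score: number | null;
}

// Weighted average over the dimensions that have data, scaled to 0..100.
// Returns null when no dimension is available.
function vmScore(dimensions: Dimension[]): number | null {
  let totalWeight = 0;
  let weighted = 0;
  for (const d of dimensions) {
    if (d.score === null) continue; // skip unavailable sources
    totalWeight += d.weight;
    weighted += d.weight * d.score;
  }
  if (totalWeight === 0) return null;
  return Math.round((weighted / totalWeight) * 100);
}
```

Renormalising keeps the gauge on a 0–100 scale when, for example, the cron source fails, instead of silently counting the missing dimension as zero.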
Phase 1.3 — Unhealthy Container Detail Panel:
- Loads independently from GET /api/vm/containers/unhealthy
- Per-container: name, unhealthy since, restart count, last health logs
- Expandable row for health check output
- One-click restart with spinner, feedback toast, auto-refresh after 3s
Phase 1.1 — Cron Status Panel:
- Loads from GET /api/vm/cron-status
- Table: 4 managed jobs × schedule | last run | freed MB | status | next
- Collapsible run history (last 10) with step-by-step log expansion
Phase 3.4 — Ollama/LLM Panel:
- Loads from GET /api/vm/ollama/models
- Currently-loaded section with RAM pressure warning (<4 GB free)
- RAM bar visualisation showing model footprint
- Model list with size + last-used time
- One-click unload button
Other improvements:
- All data fetched in parallel (Promise.allSettled), so a failure in
any one panel does not block the rest of the page
- Add steal, failed_units, cron_missing_paths to CHECK_META/CHECK_ORDER
- Refresh now reloads all 5 data sources in one pass
- web/package-lock.json regenerated (was stale, caused build failure)
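The per-panel isolation with `Promise.allSettled` follows roughly this pattern. The `loadPanels` helper and `PanelState` type are hypothetical names for the sketch, not the code in `vm/page.tsx`.

```typescript
type PanelState<T> =
  | { data: T; error: null }
  | { data: null; error: string };

// Runs every loader in parallel and records success or failure per panel,
// so one rejected request never discards the results of the others.
async function loadPanels<T>(
  loaders: Record<string, () => Promise<T>>,
): Promise<Record<string, PanelState<T>>> {
  const names = Object.keys(loaders);
  const results = await Promise.allSettled(names.map((name) => loaders[name]()));
  const panels: Record<string, PanelState<T>> = {};
  results.forEach((result, i) => {
    panels[names[i]] =
      result.status === "fulfilled"
        ? { data: result.value, error: null }
        : {
            data: null,
            error: result.reason instanceof Error ? result.reason.message : String(result.reason),
          };
  });
  return panels;
}
```

Each panel can then render its own data or its own error state independently.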
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- code-quality/repository.ts: fix tsErrorMatch[3] → [4] for type field (group 3 is column, 4 is error|warning)
- code-quality/repository.ts: fix ESLint regex to make rule brackets optional (not all formatters include them)
- code-quality/repository.ts: fix Vitest test count — parse 'Tests' line (individual tests) instead of 'Test Files' (file count); improve Jest regex to capture pass/fail independently
- env/repository.ts: replace raw process.env.ENCRYPTION_KEY with config.ENCRYPTION_KEY so the validated default flows through a single source of truth
- config.ts: add startup console.warn when CSRF_SECRET or ENCRYPTION_KEY are using insecure defaults
- deployments/orchestrator.ts: refactor runDeploymentScript to use try/catch/finally — deployment record is now always written in the finally block, preventing zombie 'running' states if updateDeployment itself throws
- auth.tsx: remove dead 'user &&' guard (user is always truthy after the !user check above); remove debug console.log calls, keep console.error
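The capture-group fix in the first bullet is easier to see against a concrete pattern. The regex below is an assumed reconstruction for non-pretty `tsc` output (`path(line,col): error TS1234: message`), chosen so that group 3 is the column and group 4 is the severity; the pattern in `code-quality/repository.ts` may differ.

```typescript
// Groups: 1 file, 2 line, 3 column, 4 error|warning, 5 code, 6 message.
const TS_LINE = /^(.+?)\((\d+),(\d+)\): (error|warning) (TS\d+): (.+)$/;

interface TsDiagnostic {
  file: string;
  line: number;
  column: number;
  type: "error" | "warning";
  code: string;
  message: string;
}

// Parses one line of compiler output; returns null for non-diagnostic lines.
function parseTsLine(text: string): TsDiagnostic | null {
  const m = TS_LINE.exec(text);
  if (!m) return null;
  return {
    file: m[1],
    line: Number(m[2]),
    column: Number(m[3]),
    type: m[4] as "error" | "warning", // group 4, not group 3
    code: m[5],
    message: m[6],
  };
}
```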
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Primitives.tsx (TS2339):
- The asChild branch read children.props.className before the cast was
applied, so props was typed as unknown. Extract typedChild first, then
read props.
hermes/page.tsx + agents/page.tsx + tasks/page.tsx + tasks/[id]/page.tsx (TS2322):
- Badge.variant accepts 'neutral'|'success'|'warning'|'error'|'info' but
callers were passing 'danger' (should be 'error') and 'default' (should
be 'neutral'). MetricCard.tone is a separate type and is correct as-is.
Changes:
- statusTone map in hermes/page.tsx: 'danger' → 'error', 'default' → 'neutral'
- getTaskTone fallback: 'default' → 'neutral'; explicit return type added
- levelTone in tasks/[id]/page.tsx: 'danger' → 'error'; explicit return type added
- Inline Badge variants: all remaining 'danger' → 'error' across 3 files
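The explicit return type is what lets the compiler catch a stray `'danger'` or `'default'`. A minimal sketch, with hypothetical status keys standing in for the real ones:

```typescript
type BadgeVariant = "neutral" | "success" | "warning" | "error" | "info";

// Hypothetical status keys; the real map lives in hermes/page.tsx.
const statusTone = new Map<string, BadgeVariant>([
  ["running", "info"],
  ["done", "success"],
  ["blocked", "warning"],
  ["failed", "error"], // 'danger' here would be a compile error
]);

// The explicit return type rejects any value outside BadgeVariant.
function getTaskTone(status: string): BadgeVariant {
  return statusTone.get(status) ?? "neutral";
}
```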
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Security (backend):
- env/routes: add requireAdmin to all 6 env endpoints — GET /env was
fully open, exposing all secret values to unauthenticated requests
- deployments/routes: add requireAdmin to all 4 GET endpoints (deployment
history and logs were publicly readable)
- health/routes: remove duplicate requireAdmin call from DELETE /health/cache
handler body (was already enforced via preHandler)
Frontend — auth/api:
- system/page: replace raw fetch + localStorage token with apiRequest
(mutations now go through CSRF flow)
- vm/page: same — replace raw fetch with vmApi from api.ts
- api.ts: add vmApi (getHealth, getCleanupLog, runCleanup) + shared
VmHealthResult / VmCheck / VmCheckLevel types
Shared utilities:
- utils.ts: add formatBytes() and getStatusColor() shared helpers
- system/page: remove duplicate formatBytes, import from utils
- health/page: remove duplicate getStatusColor, import from utils
- page.tsx (home): remove duplicate getStatusColor, import from utils
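For reference, a shared `formatBytes()` helper of this kind typically looks like the sketch below. The unit list, the 1024 base, and the `decimals` parameter are assumptions; the version in `utils.ts` may format differently.

```typescript
// Formats a byte count with binary (1024-based) units.
function formatBytes(bytes: number, decimals = 1): string {
  if (!Number.isFinite(bytes) || bytes <= 0) return "0 B";
  const units = ["B", "KB", "MB", "GB", "TB"];
  let value = bytes;
  let unit = 0;
  // Divide down until the value fits the current unit or units run out.
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit++;
  }
  return `${value.toFixed(unit === 0 ? 0 : decimals)} ${units[unit]}`;
}
```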
UX improvements:
- page.tsx: remove Seed Services button from normal header (debug tool)
- page.tsx: deploy button now always enabled; shows inline warning banner
when service is not 'up' instead of silently disabling the button
- metrics: fix bar chart — bars now grow from bottom (flex-col-reverse),
add empty state, fix date parsing timezone edge case
- sidebar-nav: theme toggle now functional — persists to localStorage and
toggles document.documentElement class 'dark'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add docker-compose.yml following trading web pattern
- Update web Dockerfile to use multi-stage build with metadata
- Add build metadata (commit SHA, branch, timestamp, author, message)
- Rewrite deploy.sh to use docker compose with build metadata
- Add hotcopy deployment script for quick updates
- Add comprehensive backend API with deployment orchestration
- Add health checks, service management, and monitoring endpoints
- Add CI/CD workflow configuration
- Add deployment documentation and guides
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Fix sidebar layout: use flexbox instead of margin-left approach
- Update sidebar to use responsive display (hidden on mobile, static on desktop)
- Fix mobile overlay z-index and positioning issues
- Add proper flex container structure to all pages
- Add new dashboard pages: health, metrics, system, env, code-quality, settings/cosmos
- Add comprehensive API client and type definitions
- Add error boundary and log viewer components
- Add test infrastructure with Vitest and Playwright
- Add Docker configuration and deployment scripts
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Add timeout to auth check to prevent hanging on API failures
- Add timeout to API requests to prevent infinite loading
- Add proper error state and error messages to dashboard
- Show empty states when no services/deployments are available
- Update E2E tests to handle authentication properly
- Improve user feedback when API is unavailable
This fixes the "Loading..." hang when backend APIs are unavailable and
gives users clear error messages and retry options.
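A request timeout of this kind can be expressed as a small wrapper around any promise. The `withTimeout` name and the error message are assumptions for the sketch; the commit itself may use a different mechanism, such as aborting the fetch.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
// The timer is always cleared so it cannot keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

A caller would catch the rejection and switch the page from its loading state to an error state with a retry option.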
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Copy design tokens CSS directly into repo for Docker compatibility
- Simplify Primitives.tsx to use local design tokens instead of @bytelyst/ui
- Remove @bytelyst/ui dependency to avoid Docker build issues
- Update globals.css to import local tokens.css
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Add aria-label to the logout button for better screen reader support
- Improve accessibility compliance without changing existing behavior
- Part of systematic UX improvements across ByteLyst applications
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Phase 1 of UX compliance implementation:
- Add .pnpmfile.cjs for local package resolution from common platform
- Install @bytelyst/ui for shared UI components
- Create Primitives.tsx product adapter for type-safe component extensions
- Integrate @bytelyst/design-tokens CSS variables
- Add lib/utils.ts with cn utility function
- Enable design token usage via CSS custom properties
This establishes the foundation for component normalization and
consistent styling across ByteLyst products, following the UX
implementation guide patterns.
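The `cn` utility listed above joins conditional class names. The commit does not say how it is implemented (many projects wrap `clsx` and `tailwind-merge`), so the following is a dependency-free sketch under that caveat, with an assumed `ClassValue` type.

```typescript
type ClassValue = string | false | null | undefined | Record<string, boolean>;

// Joins truthy class names; object keys are included when their value is true.
function cn(...inputs: ClassValue[]): string {
  const classes: string[] = [];
  for (const input of inputs) {
    if (!input) continue; // skips false, null, undefined, and ""
    if (typeof input === "string") {
      classes.push(input);
    } else {
      for (const [name, enabled] of Object.entries(input)) {
        if (enabled) classes.push(name);
      }
    }
  }
  return classes.join(" ");
}
```

Unlike a `tailwind-merge` based version, this sketch does not resolve conflicting utility classes; it only concatenates.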
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>