Two threads, one commit because they're both about closing dashboard-
side roadmap items that don't need their own slice.
Phase 7 — auth coverage on hermes routes:
- `/api/hermes/ops` was the last unauthenticated Hermes endpoint —
despite revealing instance / gateway / Tailscale-IP / backup-repo /
warnings state. Now gated on `requireAdmin`, matching the new
`/api/hermes/telemetry/:instance` from the previous slice and
every other privileged route in this backend.
- Privilege-surface table in `dashboard/DEPLOYMENT.md` updated to
show `requireAdmin` for both Hermes routes; the previous
"no auth, read-only ops snapshot" carve-out is gone.
- Roadmap Phase 7 ticks for "require auth on hermes routes" + "keep
hermes data private-only" with verification notes.
Phase 4 — Bheem/Uma parity (delegation brief):
- Phase 4 is **VM ops, not codebase work** — it requires sudo on the
Hostinger VM, Uma-owned GitHub credentials, and Telegram bot
tokens. None of it is editable in this repo. Wrote
`docs/prompts/phase4-bheem-uma-parity.md` as a self-contained
delegation brief covering: Uma persistent-backup repo + timer,
Uma health watchdog, first restore rehearsal, quarterly drill
reminder, and the dashboard-side verification (the /hermes/ops +
/hermes/telemetry/bheem outputs that confirm the gap is closed).
- Phase 4 section header in the roadmap now points at the brief
and explains why the checkboxes stay open in this repo.
Verified: backend 57/57 unit tests ✅, web 7/7 E2E ✅ (Playwright
mocks bypass requireAdmin since they fulfill before the request
reaches Fastify; real auth'd users get the same flow as every other
admin route). Lint 0 errors, build green.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Delegation Brief — Phase 4: Bheem/Uma parity
Self-contained task brief for a VM-side agent (Hermes `delegate_task`, a manual ops session, or a fresh remote Devin session that has SSH/console access to the Hostinger VM). This is VM operations work, not codebase work: none of the steps below is accomplished by editing files in this repo.

Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 4), `docs/hermes-operations.md`, `docs/hermes-disaster-recovery.md`, `scripts/hermes-health-watchdog.py`.
ROLE: Operator with sudo on the Hostinger VM and Telegram + Uma GitHub access.
OBJECTIVE: Bring the Bheem (Uma-user) Hermes instance up to parity with
Vijay (root) so the dashboard's getHermesOpsSnapshot() ops-panel stops
surfacing Bheem-only warnings (backup-timer-inactive, repo-not-readable,
google-token-missing). When this brief is done, "Healthy instances" should
read 2/2 and the per-instance roll-up cards on /hermes should show Bheem
green across the board.
CONTEXT (read first):
- VM: `bytelyst@hostinger-vm` (Tailscale only, no public ingress).
- Two Hermes instances colocated:
  - Vijay: root user, `/root/.hermes`, gateway = `hermes-gateway.service`, backup timer = `hermes-root-backup.timer`, backup repo = `bytelyst/bytelyst_hostinger_hermes_vm` on GitHub, watchdog = `scripts/hermes-health-watchdog.py` running under root systemd, alerts to root's Telegram chat.
  - Bheem: `uma` user, `/home/uma/.hermes`, gateway = `uma-hermes-gateway.service` (user systemd), no backup timer yet, no persistent backup repo, no watchdog. This is the gap.
- Decision #5 (in the v2 roadmap): Bheem self-pushes its own backup with a Uma-owned, repo-scoped GitHub PAT. Root must NOT push Uma's backup. Each instance owns its own credentials.
GUARDRAILS:
- Tailscale-only access; never expose any new port publicly.
- Sanitize before commit/push: `state.db`, SQLite WAL/SHM, secrets, OAuth tokens, `.env` files must all be gitignored in the persistent-backup repo. Use the same allowlist/denylist that root's backup uses (see `scripts/hermes-persistent-backup.sh` or equivalent).
- Don't commit credentials anywhere.
- Mirror root's design — don't invent a new pattern.
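
As a rough illustration of the sanitization guardrail, the sketch below flags paths that should never reach the persistent-backup repo. The `DENYLIST` patterns and the `find_unsafe_paths` helper are hypothetical, derived only from the file types named above; the authoritative allowlist/denylist is whatever root's backup script already uses.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical denylist inferred from the guardrail text; root's backup
# script remains the source of truth and may use different patterns.
DENYLIST = (
    "state.db",   # live SQLite state
    "*-wal",      # SQLite write-ahead log
    "*-shm",      # SQLite shared-memory file
    ".env",       # environment files
    ".env.*",
    "*token*",    # OAuth / API tokens (deliberately conservative)
    "*secret*",
)


def find_unsafe_paths(paths):
    """Return the paths whose file name matches any denylist pattern."""
    unsafe = []
    for raw in paths:
        name = PurePosixPath(raw).name.lower()
        if any(fnmatchcase(name, pattern) for pattern in DENYLIST):
            unsafe.append(raw)
    return unsafe


if __name__ == "__main__":
    staged = ["sessions/a.json", "state.db", "state.db-wal", "config/.env"]
    print(find_unsafe_paths(staged))
```

One way to use it is to feed it the output of `git diff --cached --name-only` in the backup repo and abort the push if the returned list is non-empty.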
TASKS (in order):
- Uma persistent-backup repo + timer.
  - Create `umadev0931/uma_hostinger_hermes_vm` on GitHub (private).
  - Generate a fine-grained PAT scoped to that repo only (`Contents: rw`, `Metadata: ro`). Store it in `~uma/.config/hermes/github_token`, mode `600`, owned by `uma:uma`.
  - Mirror `scripts/hermes-persistent-backup.sh` into a Uma-owned variant (could be the same script with `HERMES_HOME=/home/uma/.hermes HERMES_BACKUP_REPO=...`). Run it once manually to populate the repo and confirm sanitization.
  - Install `uma-hermes-backup.service` + `uma-hermes-backup.timer` as user systemd units (`~uma/.config/systemd/user/`). Enable the timer with `systemctl --user --machine=uma@.host enable --now uma-hermes-backup.timer`.
  - Verify: `runuser -u uma -- systemctl --user is-active uma-hermes-backup.timer` returns `active`. The dashboard's hermes-ops endpoint uses exactly this probe.
- Uma health watchdog.
  - Mirror `scripts/hermes-health-watchdog.py` into a Uma-owned variant: same checks (gateway active, dashboard reachable, backup repo freshness, disk, memory), but reading from `/home/uma/.hermes` and posting to Uma's Telegram chat (separate token + chat ID from root's).
  - Telegram credentials: store in `~uma/.config/hermes/telegram`, mode `600`. Format: two lines, `BOT_TOKEN=...` then `CHAT_ID=...`.
  - Silent on success: only post when something is wrong (mirror the root watchdog's behaviour). Verify by manually breaking a check (e.g. stop the gateway briefly) and confirming the alert lands in Uma's Telegram, not root's.
  - Install as a `uma-hermes-health-watchdog.timer` user-systemd unit, run every 5 minutes.
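
The credential format and the silent-on-success rule for the watchdog could look roughly like this. It is a minimal sketch, not the root watchdog's actual code: `parse_telegram_credentials` and `build_alert` are hypothetical helper names, and the check names in the example are placeholders.

```python
def parse_telegram_credentials(text):
    """Parse the two-line BOT_TOKEN=... / CHAT_ID=... format into a dict."""
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" or ":".
        key, sep, value = line.partition("=")
        if sep:
            creds[key.strip()] = value.strip()
    missing = {"BOT_TOKEN", "CHAT_ID"} - creds.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return creds


def build_alert(check_results):
    """Return an alert message, or None when every check passed."""
    failed = sorted(name for name, ok in check_results.items() if not ok)
    if not failed:
        return None  # silent on success: nothing is posted
    return "Bheem watchdog: failing checks: " + ", ".join(failed)


if __name__ == "__main__":
    print(build_alert({"gateway": False, "disk": True}))
```

A caller would post to Telegram only when `build_alert` returns a string, which keeps a healthy run from producing any message.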
- First Uma restore rehearsal.
  - Pick a temporary `HERMES_HOME=/tmp/uma-restore-rehearsal-<date>`.
  - Clone `umadev0931/uma_hostinger_hermes_vm` into it.
  - Verify the rehearsal Hermes starts cleanly (gateway probe + sessions load). Tear down the rehearsal dir.
  - Document the exact steps you ran in `docs/hermes-disaster-recovery.md` under a new "Bheem (Uma) restore" section, at the same depth as the existing root section.
- Quarterly restore-drill reminder.
  - Add a calendar reminder (or a Hermes cron entry on either instance) to repeat the restore rehearsal every 90 days. Document the cadence in `docs/hermes-operations.md`.
- Confirm the dashboard agrees.
  - Hit `GET /api/hermes/ops` (admin token, via Tailscale or SSH tunnel). For Bheem, `instances[].backup.timer.active` should be `true`, `instances[].backup.repo.status` should be `up`, and `instances[].google.workspaceToken` should be `true` (if Google workspace integration is part of Bheem's scope; confirm with the founder before scoping that in).
  - Hit `GET /api/hermes/telemetry/bheem`. `backupHistory.status` should be `up` (the new Uma backup repo is readable), `watchdog.status` should be `up` (the new watchdog log exists and parses).
  - Open the Mission Control dashboard. The "Per-instance roll-up" section should show Bheem with no warnings; the ops panel should read "Healthy instances 2/2".
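
The dashboard check in the last task can be scripted against a saved copy of the `/api/hermes/ops` response. The sketch below assumes the field paths quoted above and, as a guess, that each entry in `instances` carries a `name` field identifying it; that field name and the `bheem_parity_problems` helper are hypothetical.

```python
def bheem_parity_problems(snapshot, instance_name="bheem", require_google=False):
    """List the parity conditions that the ops snapshot does not yet meet."""
    instance = next(
        (i for i in snapshot.get("instances", []) if i.get("name") == instance_name),
        None,
    )
    if instance is None:
        return [f"instance {instance_name!r} not found"]

    problems = []
    backup = instance.get("backup", {})
    if backup.get("timer", {}).get("active") is not True:
        problems.append("backup.timer.active is not true")
    if backup.get("repo", {}).get("status") != "up":
        problems.append("backup.repo.status is not 'up'")
    # Only enforced when Google workspace integration is in scope for Bheem.
    if require_google and instance.get("google", {}).get("workspaceToken") is not True:
        problems.append("google.workspaceToken is not true")
    return problems


if __name__ == "__main__":
    sample = {
        "instances": [
            {
                "name": "bheem",
                "backup": {"timer": {"active": True}, "repo": {"status": "up"}},
                "google": {"workspaceToken": False},
            }
        ]
    }
    print(bheem_parity_problems(sample))
```

An empty list means the snapshot agrees with the expectations above; `require_google` stays off by default because the Google workspace scope is still to be confirmed.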
REPORTING: When finished, report (commit-style summary):
- Repo URL of the new Uma backup repo + sample commit list.
- Paths of the new systemd unit files.
- Output of `runuser -u uma -- systemctl --user list-timers`.
- Output of `GET /api/hermes/ops` (sanitized).
- Output of `GET /api/hermes/telemetry/bheem` (sanitized).
- A summary diff of `docs/hermes-disaster-recovery.md` and `docs/hermes-operations.md`.
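
For the "(sanitized)" outputs above, one possible approach is to mask secret-looking values and IPv4 addresses before pasting the JSON into the report. This is only a sketch: the `redact` helper, the key pattern, and the `tailscaleIp` / `githubToken` field names in the example are assumptions, not the endpoints' actual schema.

```python
import re

# Keys whose values are treated as secrets (an assumed, conservative pattern).
SENSITIVE_KEY = re.compile(r"token|secret|password|chat_id|authorization", re.IGNORECASE)
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def redact(value):
    """Recursively mask secret values and IPv4 addresses in parsed JSON."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            # Boolean presence flags (e.g. workspaceToken: true) are kept as-is.
            if SENSITIVE_KEY.search(key) and not isinstance(item, bool):
                cleaned[key] = "[REDACTED]"
            else:
                cleaned[key] = redact(item)
        return cleaned
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return IPV4.sub("[IP]", value)
    return value


if __name__ == "__main__":
    print(redact({"tailscaleIp": "100.101.102.103", "githubToken": "ghp_example"}))
```

The output should still be read through by hand before it is shared, since key-based masking cannot catch secrets stored under unexpected names.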
DEFINITION OF DONE:
- All five Bheem-only warnings closed in `getHermesOpsSnapshot()`.
- Telemetry endpoint reports `up` for backup-history + watchdog on bheem.
- Restore drill is documented and the next-drill reminder is scheduled.
- Operator (you) signs off in the corresponding roadmap checkboxes.