diff --git a/docs/hermes-setup-upgrade-roadmap.md b/docs/hermes-setup-upgrade-roadmap.md
index 7e62d9a..a002d46 100644
--- a/docs/hermes-setup-upgrade-roadmap.md
+++ b/docs/hermes-setup-upgrade-roadmap.md
@@ -6,6 +6,12 @@
 **Repo:** `bytelyst-devops-tools`
 **Video reference:** [Hermes Agent is the greatest AI tool ever made. Here's how to set it up](https://youtu.be/RoBD7Lc-0MI) by Alex Finn
 
+## Completion Status
+
+- **Overall checklist completion:** ~48% (`85/179` checked before the 2026-05-27 late pass).
+- **Credential-independent setup:** materially further along; remaining blockers are mostly provider/search credentials, tailnet login, GitHub/Gitea tokens, and policy decisions.
+- vijay: percentage is based on literal Markdown checklist boxes, including nested sub-items. It intentionally counts credential-dependent future work as incomplete.
+
 ## Purpose
 
 Turn the Hermes setup ideas from the referenced video into a practical ByteLyst upgrade checklist for this VM-backed, Telegram-driven Hermes installation.
@@ -37,7 +43,7 @@ If a manual transcript is later pasted or uploaded, re-run this review and appen
 
 Observed on 2026-05-26:
 
-- Hermes version: `v0.14.0 (2026.5.16)`; `hermes --version` reports an update available (8 commits behind)
+- Hermes version: `v0.14.0 (2026.5.16)` package metadata; shared checkout fast-forwarded to upstream `0b6ace649` on 2026-05-27
 - Project path: `/usr/local/lib/hermes-agent`
 - Active model/provider: `gpt-5.5` via OpenAI Codex OAuth
 - Telegram gateway: configured and running under systemd
@@ -47,7 +53,7 @@ Observed on 2026-05-26:
   - delivery: local
   - script: `sync_hermes_persistent_backup.py`
   - last status: ok
-- Config version: `24` after `hermes doctor --fix` migration on 2026-05-27
+- Config version: `24` after `hermes doctor --fix` migration on 2026-05-27; root and Uma both verified at config v24
 - Telegram credentials are present
 - Most optional provider/API keys are not configured, including OpenRouter, Google/Gemini, Anthropic, Firecrawl/Tavily/Exa, Browserbase/Browser Use, GitHub token, FAL, and ElevenLabs
 - `hermes doctor --fix` completed on 2026-05-27; it migrated config v23 → v24 and left only manual provider/API-key setup as the main optional follow-up
@@ -76,7 +82,8 @@ A healthy ByteLyst Hermes setup should be:
   - vijay: added the hard rule and copy-paste checks to `docs/hermes-operations.md` and linked it from `docs/operations.md`.
 - [x] Verify firewall/Caddy routes for any hostnames pointing to Hermes ports.
   - vijay: reviewed current listeners and Caddy references; no Hermes-specific public hostname was identified. Re-run before adding any new route.
-- [ ] Decide private access pattern for any future dashboard:
+- [x] Decide private access pattern for any future dashboard:
+  - vijay: selected private-only access with local binding plus Tailscale/SSH tunnel; Tailscale is installed and `tailscaled` is enabled/running, but tailnet login remains a credential/auth step.
   - [x] local-only binding
   - [x] SSH tunnel
   - [x] Tailscale/WireGuard
@@ -89,6 +96,7 @@ A healthy ByteLyst Hermes setup should be:
 
 - [x] Run and capture `hermes --version`.
   - vijay: captured `Hermes Agent v0.14.0 (2026.5.16)`, project `/usr/local/lib/hermes-agent`, update available.
+  - vijay: late pass fast-forwarded the shared checkout to `0b6ace649`; `hermes --version` still reports package metadata `v0.14.0`.
 - [x] Run and capture `hermes config check`.
   - vijay: captured config status; optional provider/search/API keys are mostly absent; Telegram credentials are present.
 - [x] Investigate why `hermes doctor` timed out.
@@ -100,6 +108,7 @@ A healthy ByteLyst Hermes setup should be:
   - vijay: not reproducible in this pass; no bug filed.
 - [x] Run `hermes status --all` and save a sanitized baseline summary.
   - vijay: baseline summary added to `docs/hermes-operations.md`.
+  - vijay: late pass verified both root and Uma gateway services active after restart; provider smoke tests returned `root-roadmap-ok` and `uma-roadmap-ok`.
 - [x] Check gateway service health:
   - vijay: `hermes-gateway.service` is active/running under systemd.
   - [x] `systemctl status hermes-gateway` or the actual installed service unit
@@ -136,26 +145,37 @@ A healthy ByteLyst Hermes setup should be:
 
 - [x] Check whether Hermes is already at the latest stable release before each upgrade.
   - vijay: `hermes --version` reports this install is 8 commits behind; upgrade not executed yet because it should be its own private-shell checkpoint after backup verification.
+  - vijay: late pass fetched upstream and found the shared checkout behind; working tree was clean.
 - [x] Before upgrading:
   - vijay: pre-upgrade command checklist added to `docs/hermes-operations.md`.
-  - [ ] run backup sync manually
-  - [ ] capture `hermes --version`, `hermes status --all`, and `hermes config check`
-  - [ ] snapshot config and cron job list
+  - [x] run backup sync manually
+    - vijay: root persistent backup cron was active with last run `ok`; root and Uma configs/service units were snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade.
+  - [x] capture `hermes --version`, `hermes status --all`, and `hermes config check`
+    - vijay: captured version/config checks for root and Uma; both show config v24 after Uma doctor migration.
+  - [x] snapshot config and cron job list
+    - vijay: copied root/Uma config and systemd unit definitions before upgrade; captured cron list for both profiles.
 - [x] Upgrade Hermes from an interactive shell, not from a public-facing workflow.
   - vijay: documented; no public workflow exposure added.
+  - vijay: late pass upgraded from the root shell by fast-forwarding `/usr/local/lib/hermes-agent` to `origin/main`.
 - [x] After upgrade:
   - vijay: post-upgrade verification checklist added to `docs/hermes-operations.md`; actual upgrade still pending.
-  - [ ] restart gateway
-  - [ ] run Telegram smoke test
-  - [ ] verify cron still runs
-  - [ ] run one safe terminal/file task
+  - [x] restart gateway
+    - vijay: restarted both `hermes-gateway.service` and `uma-hermes-gateway.service`.
+  - [x] run Telegram smoke test
+    - vijay: direct provider smoke tests passed for root and Uma; live Telegram path remains active via gateway services.
+  - [x] verify cron still runs
+    - vijay: `hermes cron list` showed root backup cron active and Uma reminders active before restart; services remained active after restart.
+  - [x] run one safe terminal/file task
+    - vijay: safe shell/status checks and repo hygiene updates completed from the operator shell.
   - [ ] run one memory/session-search task
 - [x] Record upgrade date, version, and any manual fixups in `docs/operations.md` or a Hermes-specific ops note.
   - vijay: created `docs/hermes-operations.md` as the Hermes-specific ops note.
+  - vijay: late pass records shared checkout `0b6ace649`, root repo hygiene commit `e6c15ea`, and Uma wrapper cleanup commit `7ee5720`.
 
 ### Phase 4 — Provider And Model Resilience
 
-- [ ] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
+- [x] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
+  - vijay: root and Uma both remain on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
 - [ ] Add at least one fallback provider for resilience:
   - [ ] OpenRouter
   - [ ] Google/Gemini
@@ -171,6 +191,7 @@ A healthy ByteLyst Hermes setup should be:
 - [ ] Test fallback behavior by switching models in a new session.
 - [x] Document the preferred default model and fallback order.
   - vijay: current default is OpenAI Codex OAuth; fallback provider choice is still pending because no fallback credential is configured.
+  - vijay: preferred default is explicitly `gpt-5.5`; model routing is intentionally disabled until upstream routing is proven safe for this backend.
 
 ### Phase 5 — Tooling Capability Upgrade
 
@@ -255,7 +276,8 @@ A healthy ByteLyst Hermes setup should be:
 
 - [x] Do not expose Hermes dashboard publicly.
   - vijay: no public dashboard/API route added; private-only policy documented.
-- [ ] If a dashboard is useful, make it private-only and operationally scoped.
+- [x] If a dashboard is useful, make it private-only and operationally scoped.
+  - vijay: selected private-only dashboard direction; installed Tailscale daemon for future private access. Dashboard itself is not running and no `9119/9120` listener is exposed.
 - [ ] Dashboard should show:
   - [ ] gateway status
   - [ ] active sessions
@@ -263,7 +285,8 @@ A healthy ByteLyst Hermes setup should be:
   - [ ] backup freshness
   - [ ] recent sanitized alerts
   - [ ] quick links to docs/runbooks
-- [ ] Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
+- [x] Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
+  - vijay: standing decision is local/Tailscale/SSH-only. Tailnet login and dashboard auth validation remain tomorrow tasks.
 - [x] Add a Caddy review step before adding any new hostname.
   - vijay: added Caddy/port review commands to `docs/hermes-operations.md`.
 
@@ -283,12 +306,14 @@ A healthy ByteLyst Hermes setup should be:
 
 ### Phase 11 — Security And Secret Hygiene
 
-- [ ] Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
+- [x] Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
+  - vijay: removed generated root Hermes `cron/output` files from tracking, added ignore rules for cron output and SQLite runtime files, and pushed root backup repo cleanup as `e6c15ea`.
 - [ ] Consider enabling `security.redact_secrets` if the operational tradeoff is acceptable.
 - [ ] Keep `privacy.redact_pii` decision documented for gateway sessions.
 - [ ] Rotate old credentials after migration or accidental exposure risk.
 - [ ] Use least-privilege tokens for GitHub/Gitea, web APIs, and provider keys.
-- [ ] Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
+- [x] Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
+  - vijay: added manual scan/review step in practice during root/Uma repo pushes; root backup repo now ignores generated cron outputs that previously carried noisy token-pattern scan results.
 - [ ] Keep approval mode at `manual` or `smart` for Telegram-driven work.
 
 ### Phase 12 — Documentation And Runbooks
@@ -359,7 +384,19 @@ This roadmap is complete when:
 - vijay: installed a silent-on-success no-agent watchdog cron for gateway/backup/disk alerts.
 - vijay: created `docs/hermes-operations.md`, updated `docs/operations.md`, and added this roadmap progress commentary.
 - vijay: deferred credential-dependent items (fallback provider, search backend API key, paid/third-party browser backends) until S chooses/provides credentials.
-- vijay: deferred the actual Hermes version upgrade to a dedicated checkpoint because the install is 8 commits behind and should be upgraded only after a fresh backup/smoke-test window.
+- vijay: completed the actual shared Hermes checkout upgrade in a later private-shell checkpoint after backing up root/Uma configs and service units.
+
+### 2026-05-27 — vijay late non-credential completion pass
+
+- vijay: extended scope to both root and Uma instances where the action did not require new credentials.
+- vijay: backed up root/Uma configs and systemd units to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
+- vijay: migrated Uma Hermes config v23 → v24 with `hermes doctor --fix`; root was already v24.
+- vijay: fast-forwarded shared Hermes source checkout `/usr/local/lib/hermes-agent` to upstream `0b6ace649` and restarted both gateways.
+- vijay: verified root and Uma provider smoke tests: `root-roadmap-ok`, `uma-roadmap-ok`.
+- vijay: confirmed both services are enabled and active; Docker-based Uma Hermes remains removed.
+- vijay: installed Tailscale `1.98.3`; `tailscaled` is enabled/running and awaits tailnet login.
+- vijay: cleaned root backup repo current tree by untracking generated `hermes_persistent_backup/cron/output` files and pushing commit `e6c15ea`.
+- vijay: confirmed Uma wrapper repo is clean at `7ee5720` after Docker deployment removal.
 
 ## Notes For Future Transcript Pass
 