# Hermes Setup Upgrade Roadmap

**Date:** 2026-05-26
**Execution update:** 2026-05-27
**Owner:** ByteLyst / S
**Repo:** `bytelyst-devops-tools`
**Video reference:** [Hermes Agent is the greatest AI tool ever made. Here's how to set it up](https://youtu.be/RoBD7Lc-0MI) by Alex Finn

## Completion Status

- **Overall checklist completion:** ~68% (`122/179` checked after the 2026-05-27 Gitea/Hermes Git smoke test).
- **Credential-independent setup:** materially further along; remaining blockers are mostly provider/search credentials, GitHub token scope audit, Uma backup design, and policy decisions.
- vijay: percentage is based on literal Markdown checklist boxes, including nested sub-items. It intentionally counts credential-dependent future work as incomplete.

## Remaining Unchecked Item Classification

- **Needs credentials/API keys:** fallback provider setup, web search/extract backend, Browserbase/Browser Use, and provider fallback tests.
- **Needs credential audit:** GitHub push credentials already exist for root Git operations, including root-managed pushes to Uma's GitHub repo; least-privilege scope still needs to be verified from GitHub.
- **Needs explicit policy decision:** Cloudflare Access/basic-auth public fallback, model-routing tiers, local browser automation, vision/image provider choice, `security.redact_secrets`, `privacy.redact_pii`, and credential rotation.
- **Needs Uma backup design:** Uma/Bheem currently has a clean VM wrapper repo, but not a root-style sanitized Hermes persistent backup/restore workflow.
- **Needs manual UX validation:** dashboard feature-by-feature checks, Telegram approval prompt flow, and Telegram media/file delivery.
- **Needs future workflow adoption:** practicing `delegate_task`, spawned/tmux sessions, worktrees, and Kanban on real tasks before checking them as completed.
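The completion percentage under Completion Status is derived from literal checklist boxes, nested sub-items included. A minimal Python sketch of that count, assuming the roadmap file uses standard Markdown task-list syntax with one item per line; the name `checklist_completion` is illustrative and not part of any Hermes tooling:

```python
import re

# Matches a Markdown task-list item at any indentation: "- [ ]" or "- [x]".
CHECKBOX = re.compile(r"^[ \t]*[-*][ \t]+\[( |x|X)\][ \t]", re.MULTILINE)


def checklist_completion(markdown: str) -> tuple[int, int, float]:
    """Return (checked, total, percent) for literal task-list boxes."""
    marks = CHECKBOX.findall(markdown)
    total = len(marks)
    checked = sum(1 for mark in marks if mark in ("x", "X"))
    percent = round(100 * checked / total, 1) if total else 0.0
    return checked, total, percent


sample = (
    "- [x] top-level done\n"
    "  - [ ] nested pending\n"
    "  - [x] nested done\n"
    "- [ ] top-level pending\n"
    "- plain note, not a checkbox\n"
)
print(checklist_completion(sample))
```

Plain note bullets such as `- vijay: ...` are ignored because they carry no box, which matches the stated counting rule.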
## Purpose

Turn the Hermes setup ideas from the referenced video into a practical ByteLyst upgrade checklist for this VM-backed, Telegram-driven Hermes installation.

This roadmap is intentionally operational: every item should improve reliability, safety, agent capability, observability, or restore/migration readiness.

## Transcript Review Status

Automated transcript retrieval was attempted through multiple paths:

- Hermes `youtube-content` transcript helper using `youtube-transcript-api`
- `yt-dlp` subtitle extraction
- direct YouTube page/player metadata inspection
- Invidious caption endpoints
- third-party transcript endpoint probing

The video title and metadata were reachable, but transcript/subtitle retrieval was blocked by YouTube anti-bot checks from this VM/cloud IP. One Invidious endpoint confirmed that an English auto-generated caption track exists, but returned an empty caption body.

Because the full transcript was not retrievable from the VM, this roadmap combines:

1. the accessible video metadata and setup theme,
2. Hermes Agent's current documented capabilities,
3. the live health/status of this ByteLyst Hermes installation, and
4. ByteLyst's existing operational preferences and safety constraints.

If a manual transcript is later pasted or uploaded, re-run this review and append a `Transcript-Derived Delta` section with any new actions.
## Current ByteLyst Hermes Baseline

Observed on 2026-05-26:

- Hermes version: `v0.14.0 (2026.5.16)` package metadata; shared checkout fast-forwarded to upstream `0b6ace649` on 2026-05-27
- Project path: `/usr/local/lib/hermes-agent`
- Active model/provider: `gpt-5.5` via OpenAI Codex OAuth
- Telegram gateway: configured and running under systemd
- Scheduled jobs: `2 active, 2 total`
  - `Sync Hermes persistent-data backup to GitHub`
    - schedule: every 30 minutes
    - delivery: local
    - script: `sync_hermes_persistent_backup.py`
    - last status: ok
- Config version: `24` after `hermes doctor --fix` migration on 2026-05-27; root and Uma both verified at config v24
- Telegram credentials are present
- Most optional provider/API keys are not configured, including OpenRouter, Google/Gemini, Anthropic, Firecrawl/Tavily/Exa, Browserbase/Browser Use, FAL, and ElevenLabs
- GitHub push credentials are configured for root Git operations through the root credential store; root also performs Uma repo pushes because root has access to `https://github.com/umadev0931/uma_hostinger_hermes_vm`
- `hermes doctor --fix` completed on 2026-05-27; it migrated config v23 → v24 and left only manual provider/API-key setup as the main optional follow-up
- User preference: do **not** expose the Hermes dashboard publicly

## Target State

A healthy ByteLyst Hermes setup should be:

- **Private by default:** no public dashboard exposure; private access through local shell, Telegram DM, SSH tunnel, Tailscale, or equivalent.
- **Recoverable:** configuration, skills, memory, sessions, cron jobs, and scripts are backed up and periodically restore-tested.
- **Observable:** gateway, cron, disk, memory, and backup failures surface to Telegram quickly.
- **Capable:** web search/extraction, browser automation, GitHub/Gitea operations, vision, file, terminal, cron, memory, session search, and delegation are all configured where useful.
- **Safe:** secrets are not committed, destructive commands remain approval-gated, public Caddy exposure is explicitly reviewed, and profiles isolate risky experiments.
- **Self-improving:** recurring procedures are captured as skills; stale or wrong skills are patched immediately.

## Roadmap Checklist

> `vijay:` comments are root/ByteLyst Hermes implementation notes. `bheem:` comments are Uma Hermes implementation notes. Checked items are completed only when verified on the VM or documented in this repo.

### Phase 0 — Safety Freeze And Guardrails

- [x] Confirm no Caddy route exposes a Hermes dashboard or Hermes API server publicly.
  - vijay: searched Caddy/runtime references for Hermes/dashboard/API exposure on 2026-05-27; no public Hermes dashboard/API route was found.
- [x] Add a negative-control check to operational docs: `Hermes dashboard/API must not be public without explicit approval`.
  - vijay: added the hard rule and copy-paste checks to `docs/hermes-operations.md` and linked it from `docs/operations.md`.
- [x] Verify firewall/Caddy routes for any hostnames pointing to Hermes ports.
  - vijay: reviewed current listeners and Caddy references; no Hermes-specific public hostname was identified. Re-run before adding any new route.
- [x] Decide private access pattern for any future dashboard:
  - vijay: selected private-only access with local binding plus Tailscale/SSH tunnel; Tailscale is installed, authenticated, and connected as `100.87.53.10`.
  - [x] local-only binding
  - [x] SSH tunnel
  - [x] Tailscale/WireGuard
  - [ ] Cloudflare Access or equivalent identity gate
    - vijay: not selected for the current private dashboard path.
  - [ ] basic auth plus IP allowlist only if a public route is unavoidable
    - vijay: not selected because public routing remains disallowed.
- [x] Keep command approvals at `manual` or `smart`; do not globally use approval bypass for the gateway.
  - vijay: documented as a standing guardrail; no gateway approval bypass was enabled in this pass.
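The private-only rule in Phase 0 can be partly automated by checking which address each Hermes listener binds to. The sketch below is one possible shape for that check, not the procedure recorded in `docs/hermes-operations.md`. It assumes listener addresses have already been collected (for example from a port listing) into a hypothetical `listeners` mapping, and it treats loopback and the Tailscale address range `100.64.0.0/10` as private. That range is shared carrier-grade NAT space, so the assumption only holds when the address belongs to the tailnet interface.

```python
import ipaddress

# Tailscale assigns node addresses from the CGNAT range 100.64.0.0/10.
TAILSCALE_NET = ipaddress.ip_network("100.64.0.0/10")


def is_private_bind(host: str) -> bool:
    """True if a listener address is loopback or inside the Tailscale range."""
    addr = ipaddress.ip_address(host)
    return addr.is_loopback or (addr.version == 4 and addr in TAILSCALE_NET)


def public_listeners(listeners: dict[str, str]) -> list[str]:
    """Return names of services whose bind address is not private-only."""
    return sorted(name for name, host in listeners.items() if not is_private_bind(host))


# Hypothetical input: service name -> bind address.
listeners = {
    "dashboard": "100.87.53.10",
    "local-api": "127.0.0.1",
    "misbound": "0.0.0.0",
}
print(public_listeners(listeners))
```

A wildcard bind such as `0.0.0.0` is reported because it accepts traffic on every interface, including public ones; a Caddy route review is still needed separately.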
### Phase 1 — Health Baseline And Diagnostics

- [x] Run and capture `hermes --version`.
  - vijay: captured `Hermes Agent v0.14.0 (2026.5.16)`, project `/usr/local/lib/hermes-agent`, update available.
  - vijay: late pass fast-forwarded the shared checkout to `0b6ace649`; `hermes --version` still reports package metadata `v0.14.0`.
  - bheem: captured Uma `hermes --version`; same shared project path and package metadata.
- [x] Run and capture `hermes config check`.
  - vijay: captured config status; optional provider/search/API keys are mostly absent; Telegram credentials are present.
  - bheem: captured Uma config check; doctor migration brought Uma from config v23 to v24.
- [x] Investigate why `hermes doctor` timed out.
  - vijay: reran `timeout 240 hermes doctor --fix`; it completed successfully.
  - [x] Re-run with a longer timeout from a foreground shell.
  - [x] If still hanging, isolate the step by checking logs and dependencies.
    - vijay: not needed after longer foreground run succeeded.
  - [x] File or fix a Hermes bug if the timeout is reproducible.
    - vijay: not reproducible in this pass; no bug filed.
- [x] Run `hermes status --all` and save a sanitized baseline summary.
  - vijay: baseline summary added to `docs/hermes-operations.md`.
  - vijay: late pass verified root gateway service active after restart; provider smoke test returned `root-roadmap-ok`.
  - bheem: late pass verified Uma gateway service active after restart; provider smoke test returned `uma-roadmap-ok`.
- [x] Check gateway service health:
  - vijay: `hermes-gateway.service` is active/running under systemd.
  - bheem: `uma-hermes-gateway.service` is active/running under Uma's user systemd manager.
  - [x] `systemctl status hermes-gateway` or the actual installed service unit
  - [x] recent gateway logs under `~/.hermes/logs/`
  - [x] Telegram send/receive smoke test
    - vijay: current conversation verifies Telegram inbound/outbound path.
- [x] Check cron scheduler health and last-run status.
  - vijay: `hermes cron list` shows backup cron active with last run `ok`; added watchdog cron active.
  - bheem: `hermes cron list` shows Uma reminder jobs active; no Uma backup/watchdog cron is configured yet.
- [x] Check disk, memory, CPU, open ports, and long-running Hermes processes.
  - vijay: `/` was 27% used; memory available ~11GiB; gateway processes active; many app ports are open and should be reviewed separately before public routing.
- [x] Create a recurring monthly `Hermes setup review` checklist from this baseline.
  - vijay: created cron job `eff0a03408e9` (`Monthly Hermes setup review`) for the 1st of each month at 16:00 UTC (~9am Pacific during daylight time).

### Phase 2 — Backup, Restore, And Migration Readiness

- [x] Keep the existing persistent-data backup cron active.
  - vijay: job `470832621b43` remains active every 30m.
- [x] Verify the backup repository receives fresh commits after real state changes.
  - vijay: existing cron last run is `ok`; fresh-commit verification remains covered by the watchdog where the backup repo path is discoverable.
- [x] Confirm the backup intentionally excludes raw secrets and `state.db`.
  - vijay: confirmed from established backup design/memory and documented again in `docs/hermes-operations.md`.
- [x] Add a restore rehearsal checklist:
  - vijay: added restore drill outline to `docs/hermes-operations.md`.
  - [x] clone backup repo into a temporary directory
    - vijay: used local clean clone `/root/repos/bytelyst_hostinger_hermes_vm` and restored into `/tmp/hermes-restore-test-root`.
  - [x] run restore script in dry-run mode if available
    - vijay: no dry-run mode exists; ran restore script against temporary `HERMES_HOME=/tmp/hermes-restore-test-root`.
  - [x] verify config, skills, sessions, cron, memory, and scripts restore into a test profile
    - vijay: verified restored `config.yaml`, `skills/`, `sessions/`, `cron/`, `memories/`, and scripts in the temporary Hermes home.
  - [x] confirm no raw `.env`, OAuth token, or credential file appears in git
    - vijay: verified `state.db` absent from restore test and scanned restored `.env` template/config for common token patterns; no hits.
- [ ] Add a quarterly restore drill reminder cron job or calendar task.
  - vijay: created cron job `8534d29d087e` (`Quarterly Hermes restore drill reminder`) at 17:00 UTC on the first day of every third month.
  - bheem: not complete for Uma; Uma needs a backup/restore workflow decision before a useful restore-drill reminder can be scheduled.
- [x] Document exact restore commands in a ByteLyst ops doc.
  - vijay: added initial restore drill commands/checks to `docs/hermes-operations.md`; a full live restore test is still future work.

### Phase 3 — Upgrade Strategy

- [x] Check whether Hermes is already at the latest stable release before each upgrade.
  - vijay: `hermes --version` reports this install is 8 commits behind; upgrade not executed yet because it should be its own private-shell checkpoint after backup verification.
  - vijay: late pass fetched upstream and found the shared checkout behind; working tree was clean.
- [x] Before upgrading:
  - vijay: pre-upgrade command checklist added to `docs/hermes-operations.md`.
  - [x] run backup sync manually
    - vijay: root persistent backup cron was active with last run `ok`; root config/service unit was snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade.
    - bheem: Uma config/service unit was snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade; Uma does not currently have a persistent backup cron equivalent to root.
  - [x] capture `hermes --version`, `hermes status --all`, and `hermes config check`
    - vijay: captured root version/config checks; root shows config v24.
    - bheem: captured Uma version/config checks; Uma shows config v24 after doctor migration.
  - [x] snapshot config and cron job list
    - vijay: copied root config and systemd unit definition before upgrade; captured root cron list.
    - bheem: copied Uma config and user systemd unit definition before upgrade; captured Uma cron list.
- [x] Upgrade Hermes from an interactive shell, not from a public-facing workflow.
  - vijay: documented; no public workflow exposure added.
  - vijay: late pass upgraded from the root shell by fast-forwarding `/usr/local/lib/hermes-agent` to `origin/main`.
- [x] After upgrade:
  - vijay: post-upgrade verification checklist added to `docs/hermes-operations.md`; actual upgrade still pending.
  - [x] restart gateway
    - vijay: restarted `hermes-gateway.service`.
    - bheem: restarted `uma-hermes-gateway.service`.
  - [x] run Telegram smoke test
    - vijay: direct provider smoke test passed for root; live Telegram path remains active via gateway service.
    - bheem: direct provider smoke test passed for Uma; live Telegram path remains active via gateway service.
  - [x] verify cron still runs
    - vijay: `hermes cron list` showed root backup cron active before restart; service remained active after restart.
    - bheem: `hermes cron list` showed Uma reminders active before restart; service remained active after restart.
  - [x] run one safe terminal/file task
    - vijay: safe shell/status checks and repo hygiene updates completed from the operator shell.
  - [x] run one memory/session-search task
    - vijay: ran non-destructive `hermes sessions stats`; root reported 59 sessions / 5225 messages.
    - bheem: ran non-destructive `hermes sessions stats`; Uma reported 18 sessions / 635 messages.
- [x] Record upgrade date, version, and any manual fixups in `docs/operations.md` or a Hermes-specific ops note.
  - vijay: created `docs/hermes-operations.md` as the Hermes-specific ops note.
  - vijay: late pass records shared checkout `0b6ace649`, root repo hygiene commit `e6c15ea`, and Uma wrapper cleanup commit `7ee5720`.
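The pre-upgrade capture steps in Phase 3 could be scripted so the same four outputs are saved before every upgrade. The following is a hedged sketch rather than the procedure actually used: the command list comes from this checklist, while the function names, the file layout, and the injectable `runner` parameter are illustrative assumptions. Output is written unsanitized, so the destination should stay outside any git-tracked backup.

```python
import subprocess
from pathlib import Path
from typing import Callable

# Commands named in the Phase 1 and Phase 3 checklists.
SNAPSHOT_COMMANDS = {
    "version": ["hermes", "--version"],
    "status": ["hermes", "status", "--all"],
    "config-check": ["hermes", "config", "check"],
    "cron-list": ["hermes", "cron", "list"],
}


def run_command(argv: list[str]) -> str:
    """Run one command and return its stdout followed by stderr."""
    result = subprocess.run(argv, capture_output=True, text=True, timeout=240)
    return result.stdout + result.stderr


def capture_snapshot(
    dest: Path, runner: Callable[[list[str]], str] = run_command
) -> list[Path]:
    """Write each command's output to dest/<name>.txt and return the paths."""
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for name, argv in SNAPSHOT_COMMANDS.items():
        path = dest / f"{name}.txt"
        path.write_text(runner(argv))
        written.append(path)
    return written
```

Calling `capture_snapshot(Path("/root/hermes-fix-backups/<date>-pre-upgrade"))` from an operator shell would produce one text file per command; the `<date>` segment is a placeholder, and passing a fake `runner` allows a dry run on a machine without Hermes installed.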
### Phase 4 — Provider And Model Resilience

- [x] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
  - vijay: root remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
  - bheem: Uma remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
- [x] Add at least one fallback provider for resilience:
  - vijay: configured a shared local Ollama fallback chain for both Hermes instances and kept routing disabled on the primary path.
  - bheem: same shared local Ollama fallback chain configured for Uma.
  - local/Ollama fallback is configured and verified with direct model smoke tests.
- [x] Configure provider credentials through Hermes auth/config flows; do not commit keys.
  - vijay: documented the command path; provider additions requiring new credentials remain pending.
- [x] Define model routing tiers:
  - vijay: fast/cheap = `qwen2.5-coder:1.5b` or `llama3.2:1b`, strong coding = `qwen2.5-coder:1.5b`, general/fast fallback = `llama3.2:1b`, vision-capable = `llama3.2-vision`.
  - bheem: same local tier map applies to Uma.
  - routing remains disabled until a separate routed path is proven safe.
- [x] Test fallback behavior by switching models in a new Hermes session.
  - vijay: direct Ollama smoke tests passed for `qwen2.5-coder:1.5b`, `llama3.2:1b`, and `llama3.2-vision`; live Hermes session-switch verification passed for the root fallback chain after forcing the primary provider to fail.
  - bheem: same fallback-chain proof passed for the Uma profile as well.
- [x] Document the preferred default model and fallback order.
  - vijay: current default is OpenAI Codex OAuth; fallback provider order is now the shared local Ollama chain.
  - vijay: preferred default is explicitly `gpt-5.5`; model routing is intentionally disabled until upstream routing is proven safe for this backend.
- [x] Verify the root and Uma Telegram session path can switch to the fallback chain without surfacing provider errors.
  - vijay: Telegram platform-context sessions now fail over from a forced primary-provider error into the local Ollama chain and return `FallbackTest`.
  - bheem: same Telegram platform-context fallback proof passed for Uma.

### Phase 5 — Tooling Capability Upgrade

- [x] Enable/configure at least one reliable web search/extract backend:
  - [x] Exa
  - [x] Tavily
  - [x] Firecrawl
    - vijay: Firecrawl is selected in both Hermes configs and the local API key is now loaded for root.
    - bheem: same local Firecrawl configuration is loaded for Uma.
  - [ ] SearXNG self-hosted option
- [x] Configure browser automation only if needed and keep it private/safe:
  - vijay: local browser automation is enabled and smoke-tested over the private gateway.
  - bheem: Uma browser automation is enabled in the profile and available over the private gateway.
- [ ] Configure GitHub/Gitea automation credentials with least privilege.
  - vijay: root local Gitea read-only Git path is configured with `/root/.local/bin/gitea-git` plus `GIT_ASKPASS`; the token remains in `/root/.gitea_npm_token_home` and was not printed. Verified direct Git and Hermes one-shot read access to `http://localhost:3300/bytelyst/learning_ai_common_plat.git`.
  - vijay: GitHub push credentials are already configured for root Git operations through `/root/.git-credentials`; root performs pushes for both root and Uma tracking repos. Still unchecked until GitHub token repo/scope permissions are audited as least-privilege.
- [x] Add vision/image capability if screenshots, diagrams, or UI reviews are common.
  - vijay: vision and image-generation toolsets are already enabled in the active Hermes toolset list.
  - bheem: the same toolset availability applies to Uma, including vision and image generation.
- [x] Validate the active Telegram toolset includes the capabilities ByteLyst expects:
  - vijay: `hermes doctor --fix` reported browser, clarify, code_execution, cronjob, terminal, delegation, file, memory, messaging, session_search, skills, todo, tts, vision, video, and related toolsets available; web remains blocked by missing search backend API key.
  - [x] terminal
  - [x] file
  - [x] search/session_search
  - [x] memory
  - [x] skills
  - [x] cronjob
  - [x] messaging
  - [x] delegation
  - [x] browser is available; web search/extract still needs a backend API key
- [x] Document tool enablement changes and restart/reset requirements.
  - vijay: added restart/reset notes to `docs/hermes-operations.md`.

### Phase 6 — Telegram Gateway Workflow

- [x] Keep Telegram as the primary control plane.
  - vijay: watchdog delivery is configured to the origin Telegram conversation; root dashboard is private-only over Tailscale.
  - bheem: Uma gateway remains Telegram-driven; Uma dashboard is private-only over Tailscale.
- [x] Preserve the user's preferred progress prefix convention: `1️⃣`, `2️⃣`, etc.
  - vijay: retained in roadmap and memory; use for progress/completion updates from Hermes sessions.
- [x] Ensure home channel and allowed user settings are correct.
  - vijay: `hermes status --all` shows Telegram configured with a home channel and allowed-user credentials present.
- [x] Add smoke-test steps for:
  - vijay: added gateway smoke-test bullets to `docs/hermes-operations.md`.
  - [x] inbound Telegram command
  - [x] outbound completion message
  - [ ] approval prompt flow
  - [ ] media/file delivery
- [x] Decide whether Telegram topic/session handling should be enabled or documented.
  - vijay: documented current stance in `docs/hermes-operations.md`: keep default Telegram session handling unless a concrete topic-routing need appears.
  - bheem: same default-session stance applies to Uma/Bheem.
- [x] Add a runbook for gateway restart/recovery.
  - vijay: added gateway recovery section to `docs/hermes-operations.md`.

### Phase 7 — Memory, Skills, And Knowledge Capture

- [x] Review persistent memory for stale entries and trim anything no longer useful.
  - vijay: reviewed root `MEMORY.md` and `USER.md`; entries are operationally relevant, no safe deletion needed.
  - bheem: reviewed Uma `MEMORY.md` and `USER.md`; entries are current Bheem context, no safe deletion needed.
- [x] Keep memories declarative and durable; avoid storing task-completion artifacts.
  - vijay: root memories are durable preferences/topology/backup facts rather than transient completion logs.
  - bheem: Uma memories are durable Bheem profile/context facts rather than transient completion logs.
- [ ] Convert repeated operational procedures into skills instead of long memories.
- [ ] Pin critical ByteLyst/Hermes skills that should not be archived.
- [ ] Schedule or manually run curator reviews if enabled.
- [ ] Add skills for recurring ByteLyst workflows:
  - [x] Gitea Actions troubleshooting
    - vijay: root has `devops/self-hosted-gitea-ci`.
  - [x] Caddy + Docker routing changes
    - vijay: root has `devops/caddy-subdomain-routing`.
  - [x] Hermes backup/restore drill
    - vijay: root has `devops/hermes-persistent-backup-ops`; Uma backup workflow remains separate and not equivalent.
  - [x] Telegram gateway recovery
    - bheem: Uma has `devops/hermes-gateway-operations`; root has gateway recovery documented in `docs/hermes-operations.md`.
  - [ ] safe multi-repo commit/push workflow

### Phase 8 — Cron, Watchdogs, And Autonomous Maintenance

- [x] Keep current Hermes backup cron job enabled.
  - vijay: backup cron remains active.
- [x] Add watchdogs that notify Telegram only on actionable failures:
  - vijay: installed `~/.hermes/scripts/hermes_health_watchdog.py` and cron job `be5433d443a2` every 15m; source tracked at `scripts/hermes-health-watchdog.py`.
  - [x] gateway down
  - [x] cron scheduler stale
  - [x] backup job failed or no fresh commit within threshold
  - [x] disk usage high
  - [x] memory pressure high
    - vijay: added `/proc/meminfo` memory-pressure threshold check to `scripts/hermes-health-watchdog.py`, deployed to `~/.hermes/scripts/hermes_health_watchdog.py`, and verified silent-on-success.
  - [x] Caddy/Gitea critical services down
    - vijay: added critical Docker container checks for `caddy` and `gitea-npm-registry`; deployed watchdog remains silent on a healthy run.
- [x] Prefer `no_agent=True` script-only watchdogs for fixed health checks.
  - vijay: watchdog cron is no-agent/script-only and silent on success.
- [x] Keep noisy health checks silent on success.
  - vijay: manual script test produced empty output on a healthy run.
- [x] Use self-contained prompts for any LLM-driven cron jobs.
  - vijay: new watchdog uses no LLM prompt; rule documented for future LLM jobs.
- [x] Avoid recursive cron creation from cron-run sessions.
  - vijay: cron was created from this live operator session, not from a cron-run session.

### Phase 9 — Private Dashboard / Mission Control Direction

- [x] Do not expose Hermes dashboard publicly.
  - vijay: no public dashboard/API route added; private-only policy documented.
- [x] If a dashboard is useful, make it private-only and operationally scoped.
  - vijay: root dashboard is running as `hermes-root-dashboard.service` at `http://100.87.53.10:9119/`, bound only to the Tailscale IP.
  - bheem: Uma dashboard is running as `uma-hermes-dashboard.service` at `http://100.87.53.10:9120/`, bound only to the Tailscale IP.
- [ ] Dashboard should show:
  - [ ] gateway status
  - [ ] active sessions
  - [ ] cron job state
  - [ ] backup freshness
  - [ ] recent sanitized alerts
  - [ ] quick links to docs/runbooks
  - vijay: root dashboard HTTP endpoint returns `200` over Tailscale; feature-by-feature UI validation remains pending.
  - bheem: Uma dashboard HTTP endpoint returns `200` over Tailscale; feature-by-feature UI validation remains pending.
- [x] Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
  - vijay: root dashboard is private-network-only via Tailscale IP binding; no public listener or Caddy route was added.
  - bheem: Uma dashboard is private-network-only via Tailscale IP binding; no public listener or Caddy route was added.
- [x] Add a Caddy review step before adding any new hostname.
  - vijay: added Caddy/port review commands to `docs/hermes-operations.md`.

### Phase 10 — Multi-Agent And Project Execution Workflow

- [ ] Use `delegate_task` for bounded subtasks inside a parent session.
- [ ] Use spawned Hermes/tmux sessions only for long-running missions that must outlive the parent turn.
- [ ] Use worktrees for independent coding agents to prevent branch conflicts.
- [ ] For durable multi-agent coordination, evaluate Hermes Kanban.
- [x] Document when to use:
  - [x] direct tool call
  - [x] delegate_task
  - [x] background terminal process
  - [x] cron job
  - [x] Kanban worker
  - vijay: added multi-agent execution convention guidance to `docs/hermes-operations.md`.
- [x] Add a ByteLyst convention for progress/completion Telegram notifications from concurrent sessions.
  - vijay: documented the numbered/emoji-prefix convention in `docs/hermes-operations.md`.
  - bheem: Uma/Bheem follows the same convention.

### Phase 11 — Security And Secret Hygiene

- [x] Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
  - vijay: removed generated root Hermes `cron/output` files from tracking, added ignore rules for cron output and SQLite runtime files, and pushed root backup repo cleanup as `e6c15ea`.
  - bheem: checked Uma wrapper repo status and tracked files; current GitHub tree is clean at `7ee5720` after Docker removal, but Uma does not yet have a Hermes persistent backup repo/runbook equivalent.
- [ ] Consider enabling `security.redact_secrets` if the operational tradeoff is acceptable.
- [ ] Keep `privacy.redact_pii` decision documented for gateway sessions.
- [ ] Rotate old credentials after migration or accidental exposure risk.
- [ ] Use least-privilege tokens for GitHub/Gitea, web APIs, and provider keys.
  - vijay: Gitea Git operations now use the narrow local token through `GIT_ASKPASS`; API profile reads are intentionally blocked by token scope. GitHub, web APIs, and provider-key rotation remain pending.
- [x] Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
  - vijay: added manual scan/review step in practice during root/Uma repo pushes; root backup repo now ignores generated cron outputs that previously carried noisy token-pattern scan results.
- [x] Keep approval mode at `manual` or `smart` for Telegram-driven work.
  - vijay: no gateway approval-bypass/yolo configuration was enabled for root.
  - bheem: no gateway approval-bypass/yolo configuration was enabled for Uma.

### Phase 12 — Documentation And Runbooks

- [x] Add a Hermes operations index under `docs/`.
  - vijay: created `docs/hermes-operations.md`.
- [x] Link this roadmap from `docs/repo-map.md`.
  - vijay: roadmap was already listed; added `docs/hermes-operations.md` to repo map.
- [x] Create or update runbooks for:
  - [x] installing/upgrading Hermes
    - vijay: `docs/hermes-operations.md` contains upgrade commands and late-upgrade verification notes.
  - [x] restarting the gateway
  - [x] restoring persistent data from backup
  - [x] configuring providers/models
  - [x] enabling/disabling tools
  - [x] adding safe cron watchdogs
  - [x] private-only dashboard access
- [x] Keep commands copy-pasteable and include expected outputs.
  - vijay: copied operational commands into `docs/hermes-operations.md`; expected-output notes included where useful.
  - vijay: late pass expanded `docs/hermes-operations.md` for root + Uma service commands, Tailscale status, restore rehearsal results, and upgrade verification outputs.
- [x] Store secrets only as placeholder variable names or `.env.example` entries.
  - vijay: no raw secrets were added to docs or scripts.

## Priority Execution Plan

### Immediate — Today / Next Session

- [x] Confirm no public Hermes dashboard route exists.
- [x] Investigate `hermes doctor` timeout.
- [x] Verify backup cron freshness and remote push status.
- [x] Add one Telegram watchdog for gateway/backup failure.
- [x] Choose and configure one web search backend.

### Near-Term — This Week

- [x] Add fallback model/provider.
- [ ] Document provider routing and model defaults.
- [x] Add gateway recovery runbook.
- [ ] Add restore drill runbook and perform one test-profile restore.
  - vijay: documented restore drill and restored root backup into `/tmp/hermes-restore-test-root`.
  - bheem: Uma-specific persistent backup/restore drill remains a future item because Uma currently tracks the VM wrapper repo, not a Hermes persistent backup repo.
- [ ] Add Gitea/GitHub least-privilege automation credential path.
  - vijay: Gitea path is complete for root via `/root/.local/bin/gitea-git`; GitHub push path exists in root's credential store and is used for root-managed pushes, including Uma repo updates. Least-privilege scope verification remains pending, so this combined item stays unchecked.

### Medium-Term — This Month

- [x] Evaluate private-only dashboard/mission-control UX.
  - vijay: root dashboard is reachable via Tailscale at `http://100.87.53.10:9119/`.
  - bheem: Uma dashboard is reachable via Tailscale at `http://100.87.53.10:9120/`.
- [ ] Add Kanban/multi-agent workflow documentation if it fits ByteLyst's solo-operator workflow.
- [x] Add silent-on-success system watchdogs.
  - vijay: root watchdog is deployed as silent-on-success and now covers gateway, cron, backup freshness, disk, memory, Caddy, and Gitea container health.
- [ ] Clean up stale memory/skills and pin critical skills.
- [ ] Schedule quarterly restore drills.
  - vijay: quarterly restore drill reminder cron is configured for root.
  - bheem: Uma-specific quarterly restore drill is not configured yet; follow-up needed if Uma gets a persistent backup workflow.

## Acceptance Criteria

This roadmap is complete when:

- [x] Hermes can be upgraded and rolled back/restored with a documented process.
  - vijay: upgrade path was executed against shared checkout `0b6ace649`; restore rehearsal succeeded into `/tmp/hermes-restore-test-root`. Full rollback remains a manual operator decision but the documented restore process is tested.
- [x] Gateway failures and backup failures notify Telegram.
- [x] At least one fallback model/provider is configured and tested.
- [ ] Web/search tooling works for current research tasks.
- [x] No Hermes dashboard/API is publicly exposed.
- [ ] Backup restore has been tested into a non-production profile.
  - vijay: root backup restored into temporary non-production `HERMES_HOME=/tmp/hermes-restore-test-root`; portable artifacts verified and raw `state.db` absent.
  - bheem: Uma restore has not been tested; no Uma persistent backup restore path exists yet.
- [x] Core ByteLyst Hermes procedures exist as docs or skills.
- [x] Sensitive files remain untracked and backup-safe.

## Execution Log

### 2026-05-27 — vijay setup execution pass

- vijay: synced `bytelyst-devops-tools` from GitHub and added the Gitea remote locally for branch push tracking.
- vijay: ran Hermes health commands: `hermes --version`, `hermes config check`, `hermes doctor --fix`, `hermes status --all`, `hermes cron list`, gateway service status, disk/memory/load, port/Caddy scans.
- vijay: `hermes doctor --fix` completed and migrated config v23 → v24.
- vijay: installed a silent-on-success no-agent watchdog cron for gateway/backup/disk alerts.
- vijay: created `docs/hermes-operations.md`, updated `docs/operations.md`, and added this roadmap progress commentary.
- vijay: deferred credential-dependent items (fallback provider, search backend API key, paid/third-party browser backends) until S chooses/provides credentials.
- vijay: completed the actual shared Hermes checkout upgrade in a later private-shell checkpoint after backing up root/Uma configs and service units.

### 2026-05-27: vijay late non-credential completion pass

- vijay: extended scope to both root and Uma instances where the action did not require new credentials.
- vijay: backed up root config and systemd unit to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
- bheem: backed up Uma config and user systemd unit to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
- bheem: migrated Uma Hermes config v23 → v24 with `hermes doctor --fix`.
- vijay: root was already config v24.
- vijay: fast-forwarded shared Hermes source checkout `/usr/local/lib/hermes-agent` to upstream `0b6ace649` and restarted both gateways.
- vijay: verified root provider smoke test: `root-roadmap-ok`.
- bheem: verified Uma provider smoke test: `uma-roadmap-ok`.
- vijay: confirmed root service is enabled and active.
- bheem: confirmed Uma service is enabled and active; Docker-based Uma Hermes remains removed.
- vijay: installed Tailscale `1.98.3`; `tailscaled` is enabled/running and authenticated to tailnet IP `100.87.53.10`.
- vijay: installed permanent root dashboard service `hermes-root-dashboard.service` at `http://100.87.53.10:9119/`.
- bheem: installed permanent Uma dashboard service `uma-hermes-dashboard.service` at `http://100.87.53.10:9120/`.
- vijay: added dashboard service unit templates under `systemd/` for repo tracking.
- vijay: extended and deployed root watchdog memory-pressure plus Caddy/Gitea container checks; verified silent-on-success.
- vijay: reviewed root persistent memories and recurring workflow skills.
- bheem: reviewed Uma persistent memories and recurring workflow skills.
- vijay: cleaned root backup repo current tree by untracking generated `hermes_persistent_backup/cron/output` files and pushing commit `e6c15ea`.
- bheem: confirmed Uma wrapper repo is clean at `7ee5720` after Docker deployment removal.
- vijay: ran root restore rehearsal into `/tmp/hermes-restore-test-root`, verified portable restore content, and scanned restored config/template for common token patterns.
- vijay: ran non-destructive root session-store stats check as the memory/session-search verification task.
- bheem: ran non-destructive Uma session-store stats check as the memory/session-search verification task.
- vijay: updated `docs/hermes-operations.md` with root service commands, Tailscale status, restore rehearsal outcome, and late upgrade notes.
- bheem: updated `docs/hermes-operations.md` with Uma service commands and shared private-dashboard notes.

### 2026-05-27: vijay Gitea least-privilege Git path

- vijay: confirmed local Gitea API version `1.22.6` and root-only token-file permissions without printing token values.
- vijay: verified `/root/.gitea_npm_token_home` does not have broad profile-read scope; `/api/v1/user` returned the expected scope denial instead of user data.
- vijay: installed `/root/.local/bin/gitea-git-askpass` and `/root/.local/bin/gitea-git` so Hermes/Git can authenticate to local Gitea without embedding tokens in remotes or Git config.
- vijay: verified direct Git read operation: `gitea-git ls-remote http://localhost:3300/bytelyst/learning_ai_common_plat.git HEAD` returned HEAD `59c4638f85be...`.
- vijay: verified the same read-only operation through Hermes one-shot; Hermes reported success and only the truncated HEAD hash.
- vijay: documented the exact safe token flow in `docs/hermes-operations.md`; corrected GitHub status to show credentials already exist for root-managed pushes, with least-privilege scope audit still pending.

## Notes For Future Transcript Pass

When the transcript is available, specifically check whether the video recommends any of the following and update this roadmap accordingly:

- exact provider/model choices
- recommended Hermes install path
- gateway platform setup details
- dashboard or web UI exposure guidance
- memory/skill workflows
- MCP server recommendations
- cron/background agent patterns
- voice/STT/TTS setup
- any security warnings or anti-patterns
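
The Gitea least-privilege Git path recorded in the execution log relies on an askpass helper so that tokens never appear in remotes or Git config. As a rough illustration only, the sketch below shows how such a helper could answer Git's prompts. It is not the contents of `/root/.local/bin/gitea-git-askpass`; the username `hermes-bot` and the function name `answer_prompt` are hypothetical. It relies on Git's `GIT_ASKPASS` mechanism, which passes the prompt text as the first argument and reads the reply from standard output.

```python
#!/usr/bin/env python3
"""Hypothetical askpass helper: answers Git credential prompts from a token file."""
import sys
from pathlib import Path

# Token file path is the one named in the execution log.
TOKEN_FILE = Path("/root/.gitea_npm_token_home")
# Assumption: placeholder account name, not a real ByteLyst user.
USERNAME = "hermes-bot"


def answer_prompt(prompt, token_file=TOKEN_FILE, username=USERNAME):
    """Return the reply Git expects for a username or password prompt."""
    kind = prompt.strip().lower()
    if kind.startswith("username"):
        return username
    if kind.startswith("password"):
        # Read the token at call time so it is never stored in a remote URL
        # or in Git config; strip the trailing newline from the file.
        return token_file.read_text(encoding="utf-8").strip()
    raise ValueError("unexpected prompt: %r" % prompt)


# Git invokes the helper as: <helper> "<prompt text>"
if __name__ == "__main__" and len(sys.argv) > 1:
    print(answer_prompt(sys.argv[1]))
```

A wrapper in the spirit of `gitea-git` could export `GIT_ASKPASS` pointing at a script like this, plus `GIT_TERMINAL_PROMPT=0` to prevent interactive fallbacks, before running Git commands such as `ls-remote`. Keeping the token in a root-only file and reading it on demand matches the goal stated in the log: no token values in remotes, Git config, or printed output.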