# Hermes Setup Upgrade Roadmap

**Date:** 2026-05-26
**Execution update:** 2026-05-27
**Owner:** ByteLyst / S
**Repo:** `bytelyst-devops-tools`
**Video reference:** [Hermes Agent is the greatest AI tool ever made. Here's how to set it up](https://youtu.be/RoBD7Lc-0MI) by Alex Finn

## Completion Status

- **Overall checklist completion:** ~68% (`122/179` checked after the 2026-05-27 Gitea/Hermes Git smoke test).
- **Credential-independent setup:** materially further along; remaining blockers are mostly provider/search credentials, GitHub token scope audit, Uma backup design, and policy decisions.
- vijay: percentage is based on literal Markdown checklist boxes, including nested sub-items. It intentionally counts credential-dependent future work as incomplete.

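The completion figure above is a plain count of task-list boxes, so it can be recomputed rather than maintained by hand. The following is a minimal sketch, assuming the roadmap is available as Markdown text; the function name `checklist_progress` is illustrative and not part of any Hermes tooling.

```python
import re

# Matches a Markdown task-list item at any indentation level, e.g. "  - [x] text".
CHECKBOX = re.compile(r"^\s*[-*+]\s+\[( |x|X)\]\s")


def checklist_progress(markdown: str) -> tuple[int, int]:
    """Return (checked, total) over every task-list box, nested ones included."""
    checked = total = 0
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            total += 1
            if match.group(1) in "xX":
                checked += 1
    return checked, total


if __name__ == "__main__":
    sample = "- [x] done\n  - [ ] nested pending\n- vijay: a comment, not a box\n"
    done, total = checklist_progress(sample)
    print(f"{done}/{total} = {done / total:.0%}")
```

Comment bullets such as `- vijay: ...` are ignored because they carry no box, which matches the counting rule stated above.
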
## Remaining Unchecked Item Classification

- **Needs credentials/API keys:** fallback provider setup, web search/extract backend, Browserbase/Browser Use, and provider fallback tests.
- **Needs credential audit:** GitHub push credentials already exist for root Git operations, including root-managed pushes to Uma's GitHub repo; least-privilege scope still needs to be verified from GitHub.
- **Needs explicit policy decision:** Cloudflare Access/basic-auth public fallback, model-routing tiers, local browser automation, vision/image provider choice, `security.redact_secrets`, `privacy.redact_pii`, and credential rotation.
- **Needs Uma backup design:** Uma/Bheem currently has a clean VM wrapper repo, but not a root-style sanitized Hermes persistent backup/restore workflow.
- **Needs manual UX validation:** dashboard feature-by-feature checks, Telegram approval prompt flow, and Telegram media/file delivery.
- **Needs future workflow adoption:** practicing `delegate_task`, spawned/tmux sessions, worktrees, and Kanban on real tasks before checking them as completed.

## Purpose

Turn the Hermes setup ideas from the referenced video into a practical ByteLyst upgrade checklist for this VM-backed, Telegram-driven Hermes installation.

This roadmap is intentionally operational: every item should either improve reliability, safety, agent capability, observability, or restore/migration readiness.

## Transcript Review Status

Automated transcript retrieval was attempted through multiple paths:

- Hermes `youtube-content` transcript helper using `youtube-transcript-api`
- `yt-dlp` subtitle extraction
- direct YouTube page/player metadata inspection
- Invidious caption endpoints
- third-party transcript endpoint probing

The video title and metadata were reachable, but transcript/subtitle retrieval was blocked by YouTube anti-bot checks from this VM/cloud IP. One Invidious endpoint confirmed an English auto-generated caption track exists, but returned an empty caption body.

Because the full transcript was not retrievable from the VM, this roadmap combines:

1. the accessible video metadata and setup theme,
2. Hermes Agent's current documented capabilities,
3. the live health/status of this ByteLyst Hermes installation, and
4. ByteLyst's existing operational preferences and safety constraints.

If a manual transcript is later pasted or uploaded, re-run this review and append a `Transcript-Derived Delta` section with any new actions.

## Current ByteLyst Hermes Baseline

Observed on 2026-05-26:

- Hermes version: `v0.14.0 (2026.5.16)` package metadata; shared checkout fast-forwarded to upstream `0b6ace649` on 2026-05-27
- Project path: `/usr/local/lib/hermes-agent`
- Active model/provider: `gpt-5.5` via OpenAI Codex OAuth
- Telegram gateway: configured and running under systemd
- Scheduled jobs: `2 active, 2 total`
  - `Sync Hermes persistent-data backup to GitHub`
    - schedule: every 30 minutes
    - delivery: local
    - script: `sync_hermes_persistent_backup.py`
    - last status: ok
- Config version: `24` after `hermes doctor --fix` migration on 2026-05-27; root and Uma both verified at config v24
- Telegram credentials are present
- Most optional provider/API keys are not configured, including OpenRouter, Google/Gemini, Anthropic, Firecrawl/Tavily/Exa, Browserbase/Browser Use, FAL, and ElevenLabs
- GitHub push credentials are configured for root Git operations through the root credential store; root also performs Uma repo pushes because root has access to `https://github.com/umadev0931/uma_hostinger_hermes_vm`
- `hermes doctor --fix` completed on 2026-05-27; it migrated config v23 → v24 and left only manual provider/API-key setup as the main optional follow-up
- User preference: do **not** expose the Hermes dashboard publicly

## Target State

A healthy ByteLyst Hermes setup should be:

- **Private by default:** no public dashboard exposure; private access through local shell, Telegram DM, SSH tunnel, Tailscale, or equivalent.
- **Recoverable:** configuration, skills, memory, sessions, cron jobs, and scripts are backed up and periodically restore-tested.
- **Observable:** gateway, cron, disk, memory, and backup failures surface to Telegram quickly.
- **Capable:** web search/extraction, browser automation, GitHub/Gitea operations, vision, file, terminal, cron, memory, session search, and delegation are all configured where useful.
- **Safe:** secrets are not committed, destructive commands remain approval-gated, public Caddy exposure is explicitly reviewed, and profiles isolate risky experiments.
- **Self-improving:** recurring procedures are captured as skills; stale or wrong skills are patched immediately.

## Roadmap Checklist

> `vijay:` comments are root/ByteLyst Hermes implementation notes. `bheem:` comments are Uma Hermes implementation notes. Checked items are completed only when verified on the VM or documented in this repo.

### Phase 0 — Safety Freeze And Guardrails

- [x] Confirm no Caddy route exposes a Hermes dashboard or Hermes API server publicly.
  - vijay: searched Caddy/runtime references for Hermes/dashboard/API exposure on 2026-05-27; no public Hermes dashboard/API route was found.
- [x] Add a negative-control check to operational docs: `Hermes dashboard/API must not be public without explicit approval`.
  - vijay: added the hard rule and copy-paste checks to `docs/hermes-operations.md` and linked it from `docs/operations.md`.
- [x] Verify firewall/Caddy routes for any hostnames pointing to Hermes ports.
  - vijay: reviewed current listeners and Caddy references; no Hermes-specific public hostname was identified. Re-run before adding any new route.
- [x] Decide private access pattern for any future dashboard:
  - vijay: selected private-only access with local binding plus Tailscale/SSH tunnel; Tailscale is installed, authenticated, and connected as `100.87.53.10`.
  - [x] local-only binding
  - [x] SSH tunnel
  - [x] Tailscale/WireGuard
  - [ ] Cloudflare Access or equivalent identity gate
    - vijay: not selected for the current private dashboard path.
  - [ ] basic auth plus IP allowlist only if a public route is unavoidable
    - vijay: not selected because public routing remains disallowed.
- [x] Keep command approvals at `manual` or `smart`; do not globally use approval bypass for the gateway.
  - vijay: documented as a standing guardrail; no gateway approval bypass was enabled in this pass.

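The "no Caddy route points at a Hermes port" check above could be automated along these lines. This is a rough sketch, not the check that was actually run: it assumes a Caddyfile using single-line `reverse_proxy` directives, and the port set `{9119, 9120}` is an assumption taken from the dashboard addresses recorded in Phase 9. The name `find_hermes_routes` is hypothetical.

```python
import re

# Assumption: ports where the Hermes dashboards listen (from the Phase 9 notes).
HERMES_PORTS = frozenset({9119, 9120})

# A ":<port>" suffix on an upstream such as "localhost:9119".
PORT = re.compile(r":(\d{2,5})\b")


def find_hermes_routes(caddyfile_text: str, ports=HERMES_PORTS) -> list[str]:
    """Return reverse_proxy lines whose upstream port is a Hermes port."""
    hits = []
    for raw in caddyfile_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("reverse_proxy"):
            continue
        if any(int(port) in ports for port in PORT.findall(line)):
            hits.append(line)
    return hits
```

An empty result supports the negative-control rule; any hit should block the change until it is explicitly approved. Block-form `reverse_proxy { to ... }` upstreams are not covered by this sketch and would still need a manual review.
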
### Phase 1 — Health Baseline And Diagnostics

- [x] Run and capture `hermes --version`.
  - vijay: captured `Hermes Agent v0.14.0 (2026.5.16)`, project `/usr/local/lib/hermes-agent`, update available.
  - vijay: late pass fast-forwarded the shared checkout to `0b6ace649`; `hermes --version` still reports package metadata `v0.14.0`.
  - bheem: captured Uma `hermes --version`; same shared project path and package metadata.
- [x] Run and capture `hermes config check`.
  - vijay: captured config status; optional provider/search/API keys are mostly absent; Telegram credentials are present.
  - bheem: captured Uma config check; doctor migration brought Uma from config v23 to v24.
- [x] Investigate why `hermes doctor` timed out.
  - vijay: reran `timeout 240 hermes doctor --fix`; it completed successfully.
  - [x] Re-run with a longer timeout from a foreground shell.
  - [x] If still hanging, isolate the step by checking logs and dependencies.
    - vijay: not needed after longer foreground run succeeded.
  - [x] File or fix a Hermes bug if the timeout is reproducible.
    - vijay: not reproducible in this pass; no bug filed.
- [x] Run `hermes status --all` and save a sanitized baseline summary.
  - vijay: baseline summary added to `docs/hermes-operations.md`.
  - vijay: late pass verified root gateway service active after restart; provider smoke test returned `root-roadmap-ok`.
  - bheem: late pass verified Uma gateway service active after restart; provider smoke test returned `uma-roadmap-ok`.
- [x] Check gateway service health:
  - vijay: `hermes-gateway.service` is active/running under systemd.
  - bheem: `uma-hermes-gateway.service` is active/running under Uma's user systemd manager.
  - [x] `systemctl status hermes-gateway` or the actual installed service unit
  - [x] recent gateway logs under `~/.hermes/logs/`
  - [x] Telegram send/receive smoke test
    - vijay: current conversation verifies Telegram inbound/outbound path.
- [x] Check cron scheduler health and last-run status.
  - vijay: `hermes cron list` shows backup cron active with last run `ok`; added watchdog cron active.
  - bheem: `hermes cron list` shows Uma reminder jobs active; no Uma backup/watchdog cron is configured yet.
- [x] Check disk, memory, CPU, open ports, and long-running Hermes processes.
  - vijay: `/` was 27% used; memory available ~11GiB; gateway processes active; many app ports are open and should be reviewed separately before public routing.
- [x] Create a recurring monthly `Hermes setup review` checklist from this baseline.
  - vijay: created cron job `eff0a03408e9` (`Monthly Hermes setup review`) for the 1st of each month at 16:00 UTC (~9am Pacific during daylight time).

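The `timeout 240 hermes doctor --fix` rerun above distinguishes "slow" from "hung". A scripted equivalent might look like the sketch below, which is only an illustration: `run_with_timeout` is a hypothetical helper, and the demo uses the Python interpreter as a stand-in command so it runs on any machine.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], seconds: float) -> tuple[str, str]:
    """Run cmd; return (status, output) with status 'ok', 'failed', or 'timeout'."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing is left hanging.
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout + proc.stderr


if __name__ == "__main__":
    # Stand-in command; on the VM this would be ["hermes", "doctor", "--fix"]
    # with seconds=240, mirroring the shell command recorded above.
    status, _output = run_with_timeout([sys.executable, "-c", "print('ok')"], 30)
    print(status)
```

Keeping "timeout" separate from "failed" makes it easier to decide whether a bug report is warranted or the limit was simply too short.
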
### Phase 2 — Backup, Restore, And Migration Readiness

- [x] Keep the existing persistent-data backup cron active.
  - vijay: job `470832621b43` remains active every 30m.
- [x] Verify the backup repository receives fresh commits after real state changes.
  - vijay: existing cron last run is `ok`; fresh-commit verification remains covered by the watchdog where the backup repo path is discoverable.
- [x] Confirm the backup intentionally excludes raw secrets and `state.db`.
  - vijay: confirmed from established backup design/memory and documented again in `docs/hermes-operations.md`.
- [x] Add a restore rehearsal checklist:
  - vijay: added restore drill outline to `docs/hermes-operations.md`.
  - [x] clone backup repo into a temporary directory
    - vijay: used local clean clone `/root/repos/bytelyst_hostinger_hermes_vm` and restored into `/tmp/hermes-restore-test-root`.
  - [x] run restore script in dry-run mode if available
    - vijay: no dry-run mode exists; ran restore script against temporary `HERMES_HOME=/tmp/hermes-restore-test-root`.
  - [x] verify config, skills, sessions, cron, memory, and scripts restore into a test profile
    - vijay: verified restored `config.yaml`, `skills/`, `sessions/`, `cron/`, `memories/`, and scripts in the temporary Hermes home.
  - [x] confirm no raw `.env`, OAuth token, or credential file appears in git
    - vijay: verified `state.db` absent from restore test and scanned restored `.env` template/config for common token patterns; no hits.
- [ ] Add a quarterly restore drill reminder cron job or calendar task.
  - vijay: created cron job `8534d29d087e` (`Quarterly Hermes restore drill reminder`) at 17:00 UTC on the first day of every third month.
  - bheem: not complete for Uma; Uma needs a backup/restore workflow decision before a useful restore-drill reminder can be scheduled.
- [x] Document exact restore commands in a ByteLyst ops doc.
  - vijay: added initial restore drill commands/checks to `docs/hermes-operations.md`; a full live restore test is still future work.

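The token-pattern scan in the restore rehearsal can be approximated as follows. This is a sketch under assumptions, not the scan that was actually used: the three patterns are illustrative examples of common credential shapes, and `scan_tree` is a hypothetical helper that would be pointed at a restored directory such as the temporary Hermes home.

```python
import re
from pathlib import Path

# Illustrative patterns only; extend for the credential formats actually in use.
TOKEN_PATTERNS = {
    "github_token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "sk_style_key": re.compile(r"\bsk-[A-Za-z0-9_\-]{20,}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_tree(root: Path) -> list[tuple[str, str]]:
    """Return (relative_path, pattern_name) for every pattern hit under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for name, pattern in TOKEN_PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path.relative_to(root)), name))
    return hits
```

The function reports file and pattern names only, never the matched text, so its output is safe to paste into a Telegram message or an ops note.
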
### Phase 3 — Upgrade Strategy

- [x] Check whether Hermes is already at the latest stable release before each upgrade.
  - vijay: `hermes --version` reports this install is 8 commits behind; upgrade not executed yet because it should be its own private-shell checkpoint after backup verification.
  - vijay: late pass fetched upstream and found the shared checkout behind; working tree was clean.
- [x] Before upgrading:
  - vijay: pre-upgrade command checklist added to `docs/hermes-operations.md`.
  - [x] run backup sync manually
    - vijay: root persistent backup cron was active with last run `ok`; root config/service unit was snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade.
    - bheem: Uma config/service unit was snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade; Uma does not currently have a persistent backup cron equivalent to root.
  - [x] capture `hermes --version`, `hermes status --all`, and `hermes config check`
    - vijay: captured root version/config checks; root shows config v24.
    - bheem: captured Uma version/config checks; Uma shows config v24 after doctor migration.
  - [x] snapshot config and cron job list
    - vijay: copied root config and systemd unit definition before upgrade; captured root cron list.
    - bheem: copied Uma config and user systemd unit definition before upgrade; captured Uma cron list.
- [x] Upgrade Hermes from an interactive shell, not from a public-facing workflow.
  - vijay: documented; no public workflow exposure added.
  - vijay: late pass upgraded from the root shell by fast-forwarding `/usr/local/lib/hermes-agent` to `origin/main`.
- [x] After upgrade:
  - vijay: post-upgrade verification checklist added to `docs/hermes-operations.md`; actual upgrade still pending.
  - [x] restart gateway
    - vijay: restarted `hermes-gateway.service`.
    - bheem: restarted `uma-hermes-gateway.service`.
  - [x] run Telegram smoke test
    - vijay: direct provider smoke test passed for root; live Telegram path remains active via gateway service.
    - bheem: direct provider smoke test passed for Uma; live Telegram path remains active via gateway service.
  - [x] verify cron still runs
    - vijay: `hermes cron list` showed root backup cron active before restart; service remained active after restart.
    - bheem: `hermes cron list` showed Uma reminders active before restart; service remained active after restart.
  - [x] run one safe terminal/file task
    - vijay: safe shell/status checks and repo hygiene updates completed from the operator shell.
  - [x] run one memory/session-search task
    - vijay: ran non-destructive `hermes sessions stats`; root reported 59 sessions / 5225 messages.
    - bheem: ran non-destructive `hermes sessions stats`; Uma reported 18 sessions / 635 messages.
- [x] Record upgrade date, version, and any manual fixups in `docs/operations.md` or a Hermes-specific ops note.
  - vijay: created `docs/hermes-operations.md` as the Hermes-specific ops note.
  - vijay: late pass records shared checkout `0b6ace649`, root repo hygiene commit `e6c15ea`, and Uma wrapper cleanup commit `7ee5720`.

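The "is the checkout behind, and is the working tree clean" pre-upgrade check can be expressed with two standard Git queries. The sketch below is an assumption about how one might script it, not the procedure in `docs/hermes-operations.md`; `commits_behind` and `working_tree_clean` are hypothetical names, and the `run` parameter exists only so the logic can be exercised without a real repository.

```python
import subprocess


def commits_behind(repo: str, upstream: str = "origin/main", run=subprocess.run) -> int:
    """Fetch, then count commits reachable from upstream but not from HEAD."""
    run(["git", "-C", repo, "fetch", "--quiet"], check=True)
    result = run(
        ["git", "-C", repo, "rev-list", "--count", f"HEAD..{upstream}"],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


def working_tree_clean(repo: str, run=subprocess.run) -> bool:
    """True when `git status --porcelain` prints nothing."""
    result = run(
        ["git", "-C", repo, "status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip() == ""
```

A fast-forward is only a safe default when the tree is clean and the count is greater than zero; in any other state the operator should stop and inspect the checkout by hand.
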
### Phase 4 — Provider And Model Resilience

- [x] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
  - vijay: root remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
  - bheem: Uma remains on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
- [x] Add at least one fallback provider for resilience:
  - vijay: configured a shared local Ollama fallback chain for both Hermes instances and kept routing disabled on the primary path.
  - bheem: same shared local Ollama fallback chain configured for Uma.
  - local/Ollama fallback is configured and verified with direct model smoke tests.
- [x] Configure provider credentials through Hermes auth/config flows; do not commit keys.
  - vijay: documented the command path; provider additions requiring new credentials remain pending.
- [x] Define model routing tiers:
  - vijay: fast/cheap = `qwen2.5:0.5b` or `llama3.2:1b`, strong coding = `qwen2.5-coder:7b`, general/long-context = `llama3.1:8b`, vision-capable = `llama3.2-vision`.
  - bheem: same local tier map applies to Uma.
  - routing remains disabled until a separate routed path is proven safe.
- [ ] Test fallback behavior by switching models in a new Hermes session.
  - vijay: direct Ollama smoke tests passed for `qwen2.5-coder:7b`, `llama3.1:8b`, and `llama3.2-vision`; live Hermes session-switch verification still needs to be done.
  - bheem: same live Hermes session-switch verification still needs to be done for Uma.
- [x] Document the preferred default model and fallback order.
  - vijay: current default is OpenAI Codex OAuth; fallback provider order is now the shared local Ollama chain.
  - vijay: preferred default is explicitly `gpt-5.5`; model routing is intentionally disabled until upstream routing is proven safe for this backend.

- [ ] Verify the root and Uma Telegram gateways can actually switch to the fallback chain in a live conversation without surfacing provider errors.

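To make "fallback order" concrete, the sketch below shows the general pattern of trying providers in sequence and returning the first successful reply. It is not how Hermes implements fallback internally; `ask_with_fallback` and the callables passed to it are hypothetical stand-ins for real provider clients.

```python
from typing import Callable, Sequence

# (provider_name, function that takes a prompt and returns a reply)
Provider = tuple[str, Callable[[str], str]]


def ask_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

The live session-switch test that is still pending is the equivalent of forcing the first entry to fail and confirming that the reply arrives from the next one without an error reaching the user.
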
### Phase 5 — Tooling Capability Upgrade

- [ ] Enable/configure at least one reliable web search/extract backend:
  - [ ] Exa
  - [ ] Tavily
  - [ ] Firecrawl
    - vijay: Firecrawl is selected in both Hermes configs; waiting on API key or a self-hosted endpoint.
    - bheem: same pending auth state applies to Uma.
  - [ ] SearXNG self-hosted option
- [ ] Configure browser automation only if needed and keep it private/safe:
  - [ ] local Chromium/Camofox, or
  - [ ] Browserbase/Browser Use
- [ ] Configure GitHub/Gitea automation credentials with least privilege.
  - vijay: root local Gitea read-only Git path is configured with `/root/.local/bin/gitea-git` plus `GIT_ASKPASS`; the token remains in `/root/.gitea_npm_token_home` and was not printed. Verified direct Git and Hermes one-shot read access to `http://localhost:3300/bytelyst/learning_ai_common_plat.git`.
  - vijay: GitHub push credentials are already configured for root Git operations through `/root/.git-credentials`; root performs pushes for both root and Uma tracking repos. Still unchecked until GitHub token repo/scope permissions are audited as least-privilege.
- [ ] Add vision/image capability if screenshots, diagrams, or UI reviews are common.
- [x] Validate the active Telegram toolset includes the capabilities ByteLyst expects:
  - vijay: `hermes doctor --fix` reported browser, clarify, code_execution, cronjob, terminal, delegation, file, memory, messaging, session_search, skills, todo, tts, vision, video, and related toolsets available; web remains blocked by missing search backend API key.
  - [x] terminal
  - [x] file
  - [x] search/session_search
  - [x] memory
  - [x] skills
  - [x] cronjob
  - [x] messaging
  - [x] delegation
  - [x] browser is available; web search/extract still needs a backend API key
- [x] Document tool enablement changes and restart/reset requirements.
  - vijay: added restart/reset notes to `docs/hermes-operations.md`.

### Phase 6 — Telegram Gateway Workflow

- [x] Keep Telegram as the primary control plane.
  - vijay: watchdog delivery is configured to the origin Telegram conversation; root dashboard is private-only over Tailscale.
  - bheem: Uma gateway remains Telegram-driven; Uma dashboard is private-only over Tailscale.
- [x] Preserve the user's preferred progress prefix convention: `1️⃣`, `2️⃣`, etc.
  - vijay: retained in roadmap and memory; use for progress/completion updates from Hermes sessions.
- [x] Ensure home channel and allowed user settings are correct.
  - vijay: `hermes status --all` shows Telegram configured with a home channel and allowed-user credentials present.
- [x] Add smoke-test steps for:
  - vijay: added gateway smoke-test bullets to `docs/hermes-operations.md`.
  - [x] inbound Telegram command
  - [x] outbound completion message
  - [ ] approval prompt flow
  - [ ] media/file delivery
- [x] Decide whether Telegram topic/session handling should be enabled or documented.
  - vijay: documented current stance in `docs/hermes-operations.md`: keep default Telegram session handling unless a concrete topic-routing need appears.
  - bheem: same default-session stance applies to Uma/Bheem.
- [x] Add a runbook for gateway restart/recovery.
  - vijay: added gateway recovery section to `docs/hermes-operations.md`.

### Phase 7 — Memory, Skills, And Knowledge Capture

- [x] Review persistent memory for stale entries and trim anything no longer useful.
  - vijay: reviewed root `MEMORY.md` and `USER.md`; entries are operationally relevant, no safe deletion needed.
  - bheem: reviewed Uma `MEMORY.md` and `USER.md`; entries are current Bheem context, no safe deletion needed.
- [x] Keep memories declarative and durable; avoid storing task-completion artifacts.
  - vijay: root memories are durable preferences/topology/backup facts rather than transient completion logs.
  - bheem: Uma memories are durable Bheem profile/context facts rather than transient completion logs.
- [ ] Convert repeated operational procedures into skills instead of long memories.
- [ ] Pin critical ByteLyst/Hermes skills that should not be archived.
- [ ] Schedule or manually run curator reviews if enabled.
- [ ] Add skills for recurring ByteLyst workflows:
  - [x] Gitea Actions troubleshooting
    - vijay: root has `devops/self-hosted-gitea-ci`.
  - [x] Caddy + Docker routing changes
    - vijay: root has `devops/caddy-subdomain-routing`.
  - [x] Hermes backup/restore drill
    - vijay: root has `devops/hermes-persistent-backup-ops`; Uma backup workflow remains separate and not equivalent.
  - [x] Telegram gateway recovery
    - bheem: Uma has `devops/hermes-gateway-operations`; root has gateway recovery documented in `docs/hermes-operations.md`.
  - [ ] safe multi-repo commit/push workflow

### Phase 8 — Cron, Watchdogs, And Autonomous Maintenance

- [x] Keep current Hermes backup cron job enabled.
  - vijay: backup cron remains active.
- [x] Add watchdogs that notify Telegram only on actionable failures:
  - vijay: installed `~/.hermes/scripts/hermes_health_watchdog.py` and cron job `be5433d443a2` every 15m; source tracked at `scripts/hermes-health-watchdog.py`.
  - [x] gateway down
  - [x] cron scheduler stale
  - [x] backup job failed or no fresh commit within threshold
  - [x] disk usage high
  - [x] memory pressure high
    - vijay: added `/proc/meminfo` memory-pressure threshold check to `scripts/hermes-health-watchdog.py`, deployed to `~/.hermes/scripts/hermes_health_watchdog.py`, and verified silent-on-success.
  - [x] Caddy/Gitea critical services down
    - vijay: added critical Docker container checks for `caddy` and `gitea-npm-registry`; deployed watchdog remains silent on a healthy run.
- [x] Prefer `no_agent=True` script-only watchdogs for fixed health checks.
  - vijay: watchdog cron is no-agent/script-only and silent on success.
- [x] Keep noisy health checks silent on success.
  - vijay: manual script test produced empty output on a healthy run.
- [x] Use self-contained prompts for any LLM-driven cron jobs.
  - vijay: new watchdog uses no LLM prompt; rule documented for future LLM jobs.
- [x] Avoid recursive cron creation from cron-run sessions.
  - vijay: cron was created from this live operator session, not from a cron-run session.

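The silent-on-success idea behind the watchdog can be illustrated with three of the checks listed above. This is a simplified sketch and not the contents of `scripts/hermes-health-watchdog.py`; the function names and all three thresholds (10% memory available, 85% disk used, two hours since the last backup commit) are assumptions chosen for the example.

```python
def mem_available_fraction(meminfo_text: str) -> float:
    """Parse /proc/meminfo text and return MemAvailable / MemTotal."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("MemTotal", "MemAvailable"):
            values[key] = int(rest.split()[0])  # both values are reported in kB
    return values["MemAvailable"] / values["MemTotal"]


def collect_alerts(meminfo_text, disk_used_fraction, last_backup_epoch, now,
                   min_mem=0.10, max_disk=0.85, max_backup_age_s=2 * 3600):
    """Return alert strings; an empty list means healthy, so nothing is printed."""
    alerts = []
    if mem_available_fraction(meminfo_text) < min_mem:
        alerts.append("memory pressure high")
    if disk_used_fraction > max_disk:
        alerts.append("disk usage high")
    if now - last_backup_epoch > max_backup_age_s:
        alerts.append("no fresh backup commit within threshold")
    return alerts
```

A caller would read `/proc/meminfo`, take the used fraction from `shutil.disk_usage("/")`, look up the last backup commit time, and print each alert on its own line. Because a healthy run prints nothing, the cron delivery only produces a Telegram message when there is something to act on.
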
### Phase 9 — Private Dashboard / Mission Control Direction

- [x] Do not expose Hermes dashboard publicly.
  - vijay: no public dashboard/API route added; private-only policy documented.
- [x] If a dashboard is useful, make it private-only and operationally scoped.
  - vijay: root dashboard is running as `hermes-root-dashboard.service` at `http://100.87.53.10:9119/`, bound only to the Tailscale IP.
  - bheem: Uma dashboard is running as `uma-hermes-dashboard.service` at `http://100.87.53.10:9120/`, bound only to the Tailscale IP.
- [ ] Dashboard should show:
  - [ ] gateway status
  - [ ] active sessions
  - [ ] cron job state
  - [ ] backup freshness
  - [ ] recent sanitized alerts
  - [ ] quick links to docs/runbooks
  - vijay: root dashboard HTTP endpoint returns `200` over Tailscale; feature-by-feature UI validation remains pending.
  - bheem: Uma dashboard HTTP endpoint returns `200` over Tailscale; feature-by-feature UI validation remains pending.
- [x] Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
  - vijay: root dashboard is private-network-only via Tailscale IP binding; no public listener or Caddy route was added.
  - bheem: Uma dashboard is private-network-only via Tailscale IP binding; no public listener or Caddy route was added.
- [x] Add a Caddy review step before adding any new hostname.
  - vijay: added Caddy/port review commands to `docs/hermes-operations.md`.

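Whether a dashboard bind address counts as private can be checked mechanically. The sketch below assumes "private" means loopback or an address inside `100.64.0.0/10`, the CGNAT block from which Tailscale assigns node addresses; `is_private_bind` is a hypothetical helper, not a Hermes setting.

```python
import ipaddress

# Tailscale assigns node addresses from the CGNAT block 100.64.0.0/10.
TAILSCALE_NET = ipaddress.ip_network("100.64.0.0/10")


def is_private_bind(host: str) -> bool:
    """True if host is loopback or inside the Tailscale CGNAT range."""
    addr = ipaddress.ip_address(host)
    return addr.is_loopback or addr in TAILSCALE_NET
```

Under this rule the recorded bind address `100.87.53.10` passes, while a wildcard bind such as `0.0.0.0` fails and should be treated as a potential public listener.
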
### Phase 10 — Multi-Agent And Project Execution Workflow

- [ ] Use `delegate_task` for bounded subtasks inside a parent session.
- [ ] Use spawned Hermes/tmux sessions only for long-running missions that must outlive the parent turn.
- [ ] Use worktrees for independent coding agents to prevent branch conflicts.
- [ ] For durable multi-agent coordination, evaluate Hermes Kanban.
- [x] Document when to use:
  - [x] direct tool call
  - [x] delegate_task
  - [x] background terminal process
  - [x] cron job
  - [x] Kanban worker
  - vijay: added multi-agent execution convention guidance to `docs/hermes-operations.md`.
- [x] Add a ByteLyst convention for progress/completion Telegram notifications from concurrent sessions.
  - vijay: documented the numbered/emoji-prefix convention in `docs/hermes-operations.md`.
  - bheem: Uma/Bheem follows the same convention.

### Phase 11 — Security And Secret Hygiene

- [x] Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
  - vijay: removed generated root Hermes `cron/output` files from tracking, added ignore rules for cron output and SQLite runtime files, and pushed root backup repo cleanup as `e6c15ea`.
  - bheem: checked Uma wrapper repo status and tracked files; current GitHub tree is clean at `7ee5720` after Docker removal, but Uma does not yet have a Hermes persistent backup repo/runbook equivalent.
- [ ] Consider enabling `security.redact_secrets` if the operational tradeoff is acceptable.
- [ ] Keep `privacy.redact_pii` decision documented for gateway sessions.
- [ ] Rotate old credentials after migration or accidental exposure risk.
- [ ] Use least-privilege tokens for GitHub/Gitea, web APIs, and provider keys.
  - vijay: Gitea Git operations now use the narrow local token through `GIT_ASKPASS`; API profile reads are intentionally blocked by token scope. GitHub, web APIs, and provider-key rotation remain pending.
- [x] Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
  - vijay: added manual scan/review step in practice during root/Uma repo pushes; root backup repo now ignores generated cron outputs that previously carried noisy token-pattern scan results.
- [x] Keep approval mode at `manual` or `smart` for Telegram-driven work.
  - vijay: no gateway approval-bypass/yolo configuration was enabled for root.
  - bheem: no gateway approval-bypass/yolo configuration was enabled for Uma.

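A pre-push check on tracked file names complements the content scan. The sketch below assumes the list of tracked paths comes from `git ls-files`; the deny-list globs are an assumption derived from the exclusions named in this phase, and `sensitive_tracked` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Assumed deny-list, derived from the exclusions named in this phase.
SENSITIVE_GLOBS = [
    ".env", "*.env", "state.db", "*.db-wal", "*.db-shm",
    ".git-credentials", "cron/output/*", "*.log",
]


def sensitive_tracked(paths: list[str]) -> list[str]:
    """Return tracked paths whose full path or basename matches a deny-list glob."""
    flagged = []
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        if any(fnmatchcase(path, g) or fnmatchcase(name, g) for g in SENSITIVE_GLOBS):
            flagged.append(path)
    return flagged
```

Template files such as `.env.example` are deliberately not matched, since placeholder-only files are the intended way to document required variables.
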
### Phase 12 — Documentation And Runbooks

- [x] Add a Hermes operations index under `docs/`.
  - vijay: created `docs/hermes-operations.md`.
- [x] Link this roadmap from `docs/repo-map.md`.
  - vijay: roadmap was already listed; added `docs/hermes-operations.md` to repo map.
- [x] Create or update runbooks for:
  - [x] installing/upgrading Hermes
    - vijay: `docs/hermes-operations.md` contains upgrade commands and late-upgrade verification notes.
  - [x] restarting the gateway
  - [x] restoring persistent data from backup
  - [x] configuring providers/models
  - [x] enabling/disabling tools
  - [x] adding safe cron watchdogs
  - [x] private-only dashboard access
- [x] Keep commands copy-pasteable and include expected outputs.
  - vijay: copied operational commands into `docs/hermes-operations.md`; expected-output notes included where useful.
  - vijay: late pass expanded `docs/hermes-operations.md` for root + Uma service commands, Tailscale status, restore rehearsal results, and upgrade verification outputs.
- [x] Store secrets only as placeholder variable names or `.env.example` entries.
  - vijay: no raw secrets were added to docs or scripts.

## Priority Execution Plan
### Immediate — Today / Next Session

- [x] Confirm no public Hermes dashboard route exists.
- [x] Investigate `hermes doctor` timeout.
- [x] Verify backup cron freshness and remote push status.
- [x] Add one Telegram watchdog for gateway/backup failure.
- [ ] Choose and configure one web search backend.

### Near-Term — This Week

- [ ] Add fallback model/provider.
- [ ] Document provider routing and model defaults.
- [x] Add gateway recovery runbook.
- [ ] Add restore drill runbook and perform one test-profile restore.
  - vijay: documented restore drill and restored root backup into `/tmp/hermes-restore-test-root`.
  - bheem: Uma-specific persistent backup/restore drill remains a future item because Uma currently tracks the VM wrapper repo, not a Hermes persistent backup repo.
- [ ] Add Gitea/GitHub least-privilege automation credential path.
  - vijay: Gitea path is complete for root via `/root/.local/bin/gitea-git`; GitHub push path exists in root's credential store and is used for root-managed pushes, including Uma repo updates. Least-privilege scope verification remains pending, so this combined item stays unchecked.

### Medium-Term — This Month

- [x] Evaluate private-only dashboard/mission-control UX.
  - vijay: root dashboard is reachable via Tailscale at `http://100.87.53.10:9119/`.
  - bheem: Uma dashboard is reachable via Tailscale at `http://100.87.53.10:9120/`.
- [ ] Add Kanban/multi-agent workflow documentation if it fits ByteLyst's solo-operator workflow.
- [x] Add silent-on-success system watchdogs.
  - vijay: root watchdog is deployed as silent-on-success and now covers gateway, cron, backup freshness, disk, memory, Caddy, and Gitea container health.
- [ ] Clean up stale memory/skills and pin critical skills.
- [ ] Schedule quarterly restore drills.
  - vijay: quarterly restore drill reminder cron is configured for root.
  - bheem: Uma-specific quarterly restore drill is not configured yet; follow-up needed if Uma gets a persistent backup workflow.

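The silent-on-success watchdog pattern noted above can be sketched roughly as follows. This is a minimal illustration, not the deployed root watchdog: the check names, thresholds, marker-file approach, and the `notify` callback (standing in for a Telegram sender) are all assumptions.

```python
import os
import shutil
import time
from typing import Callable, List, Optional

# A check returns a failure message, or None when healthy.
Check = Callable[[], Optional[str]]


def disk_check(path: str, max_used_pct: float) -> Check:
    """Fail when the filesystem holding `path` is fuller than max_used_pct."""
    def run() -> Optional[str]:
        usage = shutil.disk_usage(path)
        used_pct = 100.0 * usage.used / usage.total
        if used_pct > max_used_pct:
            return f"disk {path} at {used_pct:.0f}% (limit {max_used_pct:.0f}%)"
        return None
    return run


def freshness_check(marker: str, max_age_s: float) -> Check:
    """Fail when `marker` is missing or older than max_age_s seconds."""
    def run() -> Optional[str]:
        try:
            age = time.time() - os.path.getmtime(marker)
        except OSError:
            return f"backup marker {marker} is missing"
        if age > max_age_s:
            return f"backup marker {marker} is {age / 3600:.1f}h old"
        return None
    return run


def run_watchdog(checks: List[Check], notify: Callable[[str], None]) -> int:
    """Run every check; stay silent on success, send one alert on failure."""
    failures = [msg for msg in (check() for check in checks) if msg]
    if not failures:
        return 0  # silent-on-success: no output, no notification
    notify("Hermes watchdog:\n" + "\n".join(f"- {m}" for m in failures))
    return 1
```

Batching all failures into one message keeps a cron-driven watchdog from sending several alerts per run; gateway, memory, and container checks would plug in as additional `Check` callables.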
## Acceptance Criteria

This roadmap is complete when:

- [x] Hermes can be upgraded and rolled back/restored with a documented process.
  - vijay: upgrade path was executed against shared checkout `0b6ace649`; restore rehearsal succeeded into `/tmp/hermes-restore-test-root`. Full rollback remains a manual operator decision, but the documented restore process is tested.
- [x] Gateway failures and backup failures notify Telegram.
- [ ] At least one fallback model/provider is configured and tested.
- [ ] Web/search tooling works for current research tasks.
- [x] No Hermes dashboard/API is publicly exposed.
- [ ] Backup restore has been tested into a non-production profile.
  - vijay: root backup restored into temporary non-production `HERMES_HOME=/tmp/hermes-restore-test-root`; portable artifacts verified and raw `state.db` absent.
  - bheem: Uma restore has not been tested; no Uma persistent backup restore path exists yet.
- [x] Core ByteLyst Hermes procedures exist as docs or skills.
- [x] Sensitive files remain untracked and backup-safe.

## Execution Log
### 2026-05-27 — vijay setup execution pass

- vijay: synced `bytelyst-devops-tools` from GitHub and added the Gitea remote locally for branch push tracking.
- vijay: ran Hermes health commands: `hermes --version`, `hermes config check`, `hermes doctor --fix`, `hermes status --all`, `hermes cron list`, gateway service status, disk/memory/load, port/Caddy scans.
- vijay: `hermes doctor --fix` completed and migrated config v23 → v24.
- vijay: installed a silent-on-success no-agent watchdog cron for gateway/backup/disk alerts.
- vijay: created `docs/hermes-operations.md`, updated `docs/operations.md`, and added this roadmap progress commentary.
- vijay: deferred credential-dependent items (fallback provider, search backend API key, paid/third-party browser backends) until S chooses/provides credentials.
- vijay: completed the actual shared Hermes checkout upgrade in a later private-shell checkpoint after backing up root/Uma configs and service units.

### 2026-05-27 — vijay late non-credential completion pass

- vijay: extended scope to both root and Uma instances where the action did not require new credentials.
- vijay: backed up root config and systemd unit to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
- bheem: backed up Uma config and user systemd unit to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
- bheem: migrated Uma Hermes config v23 → v24 with `hermes doctor --fix`.
- vijay: root was already config v24.
- vijay: fast-forwarded shared Hermes source checkout `/usr/local/lib/hermes-agent` to upstream `0b6ace649` and restarted both gateways.
- vijay: verified root provider smoke test: `root-roadmap-ok`.
- bheem: verified Uma provider smoke test: `uma-roadmap-ok`.
- vijay: confirmed root service is enabled and active.
- bheem: confirmed Uma service is enabled and active; Docker-based Uma Hermes remains removed.
- vijay: installed Tailscale `1.98.3`; `tailscaled` is enabled/running and authenticated to tailnet IP `100.87.53.10`.
- vijay: installed permanent root dashboard service `hermes-root-dashboard.service` at `http://100.87.53.10:9119/`.
- bheem: installed permanent Uma dashboard service `uma-hermes-dashboard.service` at `http://100.87.53.10:9120/`.
- vijay: added dashboard service unit templates under `systemd/` for repo tracking.
- vijay: extended the root watchdog with memory-pressure and Caddy/Gitea container checks, deployed it, and verified silent-on-success behavior.
- vijay: reviewed root persistent memories and recurring workflow skills.
- bheem: reviewed Uma persistent memories and recurring workflow skills.
- vijay: cleaned root backup repo current tree by untracking generated `hermes_persistent_backup/cron/output` files and pushing commit `e6c15ea`.
- bheem: confirmed Uma wrapper repo is clean at `7ee5720` after Docker deployment removal.
- vijay: ran root restore rehearsal into `/tmp/hermes-restore-test-root`, verified portable restore content, and scanned restored config/template for common token patterns.
- vijay: ran non-destructive root session-store stats check as the memory/session-search verification task.
- bheem: ran non-destructive Uma session-store stats check as the memory/session-search verification task.
- vijay: updated `docs/hermes-operations.md` with root service commands, Tailscale status, restore rehearsal outcome, and late upgrade notes.
- bheem: updated `docs/hermes-operations.md` with Uma service commands and shared private-dashboard notes.

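The restore-rehearsal verification in this pass (no raw `state.db`, no token-shaped strings in restored files) could be automated along these lines. This is a sketch under stated assumptions: the regexes are illustrative examples of common token shapes, not the patterns actually used in the rehearsal, and `scan_restore` is a hypothetical helper name.

```python
import os
import re
from typing import List, Tuple

# Illustrative patterns only; extend to match the providers actually in use.
TOKEN_PATTERNS = {
    "github-pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "telegram-bot": re.compile(r"\b\d{8,10}:[A-Za-z0-9_-]{35}"),
    "sk-style-key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
}
# Raw state files that should not appear in a portable backup.
FORBIDDEN_NAMES = {"state.db"}


def scan_restore(root: str) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, finding) without echoing secret values."""
    findings = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if name in FORBIDDEN_NAMES:
                findings.append((rel, 0, "forbidden-file"))
                continue
            try:
                with open(full, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        for label, pattern in TOKEN_PATTERNS.items():
                            if pattern.search(line):
                                findings.append((rel, lineno, label))
            except OSError:
                findings.append((rel, 0, "unreadable"))
    return findings
```

Calling `scan_restore("/tmp/hermes-restore-test-root")` and treating a non-empty result as a failed drill would fit the rehearsal described above. Findings carry only the path, line number, and pattern label, so the report itself never contains a secret.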
### 2026-05-27 — vijay Gitea least-privilege Git path

- vijay: confirmed local Gitea API version `1.22.6` and root-only token-file permissions without printing token values.
- vijay: verified `/root/.gitea_npm_token_home` does not have broad profile-read scope; `/api/v1/user` returned the expected scope denial instead of user data.
- vijay: installed `/root/.local/bin/gitea-git-askpass` and `/root/.local/bin/gitea-git` so Hermes/Git can authenticate to local Gitea without embedding tokens in remotes or Git config.
- vijay: verified direct Git read operation: `gitea-git ls-remote http://localhost:3300/bytelyst/learning_ai_common_plat.git HEAD` returned HEAD `59c4638f85be...`.
- vijay: verified the same read-only operation through Hermes one-shot; Hermes reported success and only the truncated HEAD hash.
- vijay: documented the exact safe token flow in `docs/hermes-operations.md`; corrected GitHub status to show credentials already exist for root-managed pushes, with least-privilege scope audit still pending.

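For readers unfamiliar with the askpass approach, the sketch below shows the general idea behind a helper like `gitea-git-askpass`. It is not the installed script, whose contents are not reproduced in this roadmap. Git runs the program named in `GIT_ASKPASS` with the prompt text as its first argument and reads the reply from standard output. The `hermes-bot` username in the usage note and the function names here are hypothetical.

```python
import os
import stat


def read_token(path: str) -> str:
    """Read a token file, refusing files accessible to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} has mode {mode:o}; expected owner-only access")
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read().strip()


def answer(prompt: str, token_path: str, username: str) -> str:
    """Reply to a Git credential prompt: username prompts get the user, others the token."""
    if prompt.lower().startswith("username"):
        return username
    return read_token(token_path)
```

A thin executable wrapper would print `answer(sys.argv[1], token_path, "hermes-bot")`, and the Git wrapper would export `GIT_ASKPASS` pointing at it (optionally with `GIT_TERMINAL_PROMPT=0`) before invoking `git`. The token then lives only in an owner-readable file and never in a remote URL or Git config.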
## Notes For Future Transcript Pass

When the transcript is available, specifically check whether the video recommends any of the following and update this roadmap accordingly:

- exact provider/model choices
- recommended Hermes install path
- gateway platform setup details
- dashboard or web UI exposure guidance
- memory/skill workflows
- MCP server recommendations
- cron/background agent patterns
- voice/STT/TTS setup
- any security warnings or anti-patterns