Hermes Setup Upgrade Roadmap
Date: 2026-05-26
Execution update: 2026-05-27
Owner: ByteLyst / S
Repo: bytelyst-devops-tools
Video reference: "Hermes Agent is the greatest AI tool ever made. Here's how to set it up" by Alex Finn
Completion Status
- Overall checklist completion: ~48% (`85/179` checked before the 2026-05-27 late pass).
- Credential-independent setup: materially further along; remaining blockers are mostly provider/search credentials, tailnet login, GitHub/Gitea tokens, and policy decisions.
- vijay: percentage is based on literal Markdown checklist boxes, including nested sub-items. It intentionally counts credential-dependent future work as incomplete.
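The completion figure above counts literal Markdown checklist boxes. A minimal sketch of that count, assuming standard task-list syntax (`- [ ]` and `- [x]`) at any indentation; the helper name `checklist_completion` and the sample text are illustrative, not part of this repo:

```python
import re

# Matches task-list items such as "- [ ]", "- [x]", or "* [X]" at any
# indentation, so nested sub-items are counted too.
CHECKBOX = re.compile(r"^\s*[-*+]\s+\[([ xX])\]\s")


def checklist_completion(markdown: str) -> tuple[int, int, float]:
    """Return (checked, total, percent) for literal Markdown checklist boxes."""
    checked = 0
    total = 0
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match is None:
            continue  # plain bullets and notes are not checklist items
        total += 1
        if match.group(1) in "xX":
            checked += 1
    percent = 100.0 * checked / total if total else 0.0
    return checked, total, percent


# Illustrative sample only; not copied from the real roadmap file.
sample = """\
- [x] Confirm no public dashboard route
  - [x] nested verification step
- [ ] Add fallback provider
- vijay: plain note, not a checkbox
"""
print(checklist_completion(sample))
```

Because unchecked credential-dependent items stay in the denominator, the percentage understates credential-independent progress, which matches the note above.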
Purpose
Turn the Hermes setup ideas from the referenced video into a practical ByteLyst upgrade checklist for this VM-backed, Telegram-driven Hermes installation.
This roadmap is intentionally operational: every item should either improve reliability, safety, agent capability, observability, or restore/migration readiness.
Transcript Review Status
Automated transcript retrieval was attempted through multiple paths:
- Hermes `youtube-content` transcript helper using `youtube-transcript-api`
- `yt-dlp` subtitle extraction
- direct YouTube page/player metadata inspection
- Invidious caption endpoints
- third-party transcript endpoint probing
The video title and metadata were reachable, but transcript/subtitle retrieval was blocked by YouTube anti-bot checks from this VM/cloud IP. One Invidious endpoint confirmed an English auto-generated caption track exists, but returned an empty caption body.
Because the full transcript was not retrievable from the VM, this roadmap combines:
- the accessible video metadata and setup theme,
- Hermes Agent's current documented capabilities,
- the live health/status of this ByteLyst Hermes installation, and
- ByteLyst's existing operational preferences and safety constraints.
If a manual transcript is later pasted or uploaded, re-run this review and append a Transcript-Derived Delta section with any new actions.
Current ByteLyst Hermes Baseline
Observed on 2026-05-26:
- Hermes version: `v0.14.0 (2026.5.16)` package metadata; shared checkout fast-forwarded to upstream `0b6ace649` on 2026-05-27
- Project path: `/usr/local/lib/hermes-agent`
- Active model/provider: `gpt-5.5` via OpenAI Codex OAuth
- Telegram gateway: configured and running under systemd
- Scheduled jobs: `2 active, 2 total`
- `Sync Hermes persistent-data backup to GitHub`
- schedule: every 30 minutes
- delivery: local
- script: `sync_hermes_persistent_backup.py`
- last status: ok
- Config version: `24` after `hermes doctor --fix` migration on 2026-05-27; root and Uma both verified at config v24
- Telegram credentials are present
- Most optional provider/API keys are not configured, including OpenRouter, Google/Gemini, Anthropic, Firecrawl/Tavily/Exa, Browserbase/Browser Use, GitHub token, FAL, and ElevenLabs
- `hermes doctor --fix` completed on 2026-05-27; it migrated config v23 → v24 and left only manual provider/API-key setup as the main optional follow-up
- User preference: do not expose the Hermes dashboard publicly
Target State
A healthy ByteLyst Hermes setup should be:
- Private by default: no public dashboard exposure; private access through local shell, Telegram DM, SSH tunnel, Tailscale, or equivalent.
- Recoverable: configuration, skills, memory, sessions, cron jobs, and scripts are backed up and periodically restore-tested.
- Observable: gateway, cron, disk, memory, and backup failures surface to Telegram quickly.
- Capable: web search/extraction, browser automation, GitHub/Gitea operations, vision, file, terminal, cron, memory, session search, and delegation are all configured where useful.
- Safe: secrets are not committed, destructive commands remain approval-gated, public Caddy exposure is explicitly reviewed, and profiles isolate risky experiments.
- Self-improving: recurring procedures are captured as skills; stale or wrong skills are patched immediately.
Roadmap Checklist
`vijay:` comments are live implementation notes from the 2026-05-27 setup execution pass. Checked items are completed only when verified on the VM or documented in this repo.
Phase 0 — Safety Freeze And Guardrails
- Confirm no Caddy route exposes a Hermes dashboard or Hermes API server publicly.
- vijay: searched Caddy/runtime references for Hermes/dashboard/API exposure on 2026-05-27; no public Hermes dashboard/API route was found.
- Add a negative-control check to operational docs: `Hermes dashboard/API must not be public without explicit approval.`
- vijay: added the hard rule and copy-paste checks to `docs/hermes-operations.md` and linked it from `docs/operations.md`.
- Verify firewall/Caddy routes for any hostnames pointing to Hermes ports.
- vijay: reviewed current listeners and Caddy references; no Hermes-specific public hostname was identified. Re-run before adding any new route.
- Decide private access pattern for any future dashboard:
- vijay: selected private-only access with local binding plus Tailscale/SSH tunnel; Tailscale is installed and `tailscaled` is enabled/running, but tailnet login remains a credential/auth step.
- local-only binding
- SSH tunnel
- Tailscale/WireGuard
- Cloudflare Access or equivalent identity gate
- basic auth plus IP allowlist only if a public route is unavoidable
- Keep command approvals at `manual` or `smart`; do not globally use approval bypass for the gateway.
- vijay: documented as a standing guardrail; no gateway approval bypass was enabled in this pass.
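The Caddy route review in this phase can be partly automated. The sketch below flags `reverse_proxy` upstreams in Caddyfile text that point at Hermes ports. It assumes ports 9119 and 9120 (the dashboard listener ports mentioned under Phase 9); the helper name `find_hermes_routes` and the sample hostnames are hypothetical. It only reads Caddyfile text, so firewall rules and any JSON-based Caddy config still need a manual look.

```python
import re

# Assumption: Hermes dashboard/API listeners use these ports (see Phase 9 note).
HERMES_PORTS = {9119, 9120}


def find_hermes_routes(caddyfile_text, ports=HERMES_PORTS):
    """Return (line_number, line) pairs whose reverse_proxy upstream uses a Hermes port."""
    hits = []
    for number, line in enumerate(caddyfile_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore trailing comments
        if "reverse_proxy" not in code:
            continue
        for port in re.findall(r":(\d{2,5})\b", code):
            if int(port) in ports:
                hits.append((number, line.strip()))
                break  # one hit per line is enough
    return hits


# Illustrative Caddyfile; hostnames are placeholders.
sample = """\
example.internal {
    reverse_proxy localhost:3000
}
hermes.example.com {
    reverse_proxy 127.0.0.1:9119
}
"""
for number, line in find_hermes_routes(sample):
    print(f"line {number}: {line}")
```

An empty result is the expected healthy state; any hit should block a route change until it is explicitly approved.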
Phase 1 — Health Baseline And Diagnostics
- Run and capture `hermes --version`.
- vijay: captured `Hermes Agent v0.14.0 (2026.5.16)`, project `/usr/local/lib/hermes-agent`, update available.
- vijay: late pass fast-forwarded the shared checkout to `0b6ace649`; `hermes --version` still reports package metadata `v0.14.0`.
- Run and capture `hermes config check`.
- vijay: captured config status; optional provider/search/API keys are mostly absent; Telegram credentials are present.
- Investigate why `hermes doctor` timed out.
- vijay: reran `timeout 240 hermes doctor --fix`; it completed successfully.
- Re-run with a longer timeout from a foreground shell.
- If still hanging, isolate the step by checking logs and dependencies.
- vijay: not needed after longer foreground run succeeded.
- File or fix a Hermes bug if the timeout is reproducible.
- vijay: not reproducible in this pass; no bug filed.
- Run `hermes status --all` and save a sanitized baseline summary.
- vijay: baseline summary added to `docs/hermes-operations.md`.
- vijay: late pass verified both root and Uma gateway services active after restart; provider smoke tests returned `root-roadmap-ok` and `uma-roadmap-ok`.
- Check gateway service health:
- vijay: `hermes-gateway.service` is active/running under systemd.
- `systemctl status hermes-gateway` or the actual installed service unit
- recent gateway logs under `~/.hermes/logs/`
- Telegram send/receive smoke test
- vijay: current conversation verifies Telegram inbound/outbound path.
- Check cron scheduler health and last-run status.
- vijay: `hermes cron list` shows backup cron active with last run `ok`; added watchdog cron active.
- Check disk, memory, CPU, open ports, and long-running Hermes processes.
- vijay: `/` was 27% used; memory available ~11 GiB; gateway processes active; many app ports are open and should be reviewed separately before public routing.
- Create a recurring monthly `Hermes setup review` checklist from this baseline.
- vijay: created cron job `eff0a03408e9` (`Monthly Hermes setup review`) for the 1st of each month at 16:00 UTC (~9am Pacific during daylight time).
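The `hermes doctor` timeout investigation above came down to re-running the command with a longer, explicit time limit. A minimal sketch of capturing a diagnostic command the same way from Python, using only `subprocess.run`; the helper name `run_with_timeout` is hypothetical, and the demo runs a harmless Python one-liner instead of `hermes` so it works on any machine:

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a command; return (status, output) where status is 'ok', 'failed', or 'timeout'."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep stdout and stderr in one capture
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired as exc:
        partial = exc.stdout or ""
        if isinstance(partial, bytes):  # partial output may arrive undecoded
            partial = partial.decode(errors="replace")
        return "timeout", partial
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout


# Stand-in command; on the VM this could be ["hermes", "doctor", "--fix"] with 240 seconds.
status, output = run_with_timeout([sys.executable, "-c", "print('doctor-ok')"], 30)
print(status, output.strip())
```

Saving the status next to the captured output makes it easier to tell a reproducible hang from a one-off slow run before deciding whether to file a bug.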
Phase 2 — Backup, Restore, And Migration Readiness
- Keep the existing persistent-data backup cron active.
- vijay: job `470832621b43` remains active every 30m.
- Verify the backup repository receives fresh commits after real state changes.
- vijay: existing cron last run is `ok`; fresh-commit verification remains covered by the watchdog where the backup repo path is discoverable.
- Confirm the backup intentionally excludes raw secrets and `state.db`.
- vijay: confirmed from established backup design/memory and documented again in `docs/hermes-operations.md`.
- Add a restore rehearsal checklist:
- vijay: added restore drill outline to `docs/hermes-operations.md`.
- clone backup repo into a temporary directory
- run restore script in dry-run mode if available
- verify config, skills, sessions, cron, memory, and scripts restore into a test profile
- confirm no raw `.env`, OAuth token, or credential file appears in git
- Add a quarterly restore drill reminder cron job or calendar task.
- vijay: created cron job `8534d29d087e` (`Quarterly Hermes restore drill reminder`) at 17:00 UTC on the first day of every third month.
- Document exact restore commands in a ByteLyst ops doc.
- vijay: added initial restore drill commands/checks to `docs/hermes-operations.md`; a full live restore test is still future work.
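The fresh-commit verification in this phase reduces to comparing the backup repo's last commit time against a staleness threshold. A minimal sketch, assuming the backup repo is a local git checkout; the two-hour threshold and the helper names are assumptions, and since commits only appear after real state changes the threshold should be tuned to how often state actually changes:

```python
import subprocess


def last_commit_epoch(repo_path):
    """Return HEAD's committer timestamp as Unix seconds via `git log -1 --format=%ct`."""
    completed = subprocess.run(
        ["git", "-C", repo_path, "log", "-1", "--format=%ct"],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(completed.stdout.strip())


def backup_is_stale(last_commit, now, max_age_seconds=2 * 60 * 60):
    """True when the newest backup commit is older than the allowed age."""
    return (now - last_commit) > max_age_seconds


# Pure-logic demo with made-up timestamps: a commit three hours old is stale
# under the assumed two-hour threshold.
print(backup_is_stale(last_commit=1_000, now=1_000 + 3 * 3600))
```

On the VM the two pieces would be combined as `backup_is_stale(last_commit_epoch(path), time.time())`, with `path` pointing at wherever the backup checkout lives.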
Phase 3 — Upgrade Strategy
- Check whether Hermes is already at the latest stable release before each upgrade.
- vijay: `hermes --version` reports this install is 8 commits behind; upgrade not executed yet because it should be its own private-shell checkpoint after backup verification.
- vijay: late pass fetched upstream and found the shared checkout behind; working tree was clean.
- Before upgrading:
- vijay: pre-upgrade command checklist added to `docs/hermes-operations.md`.
- run backup sync manually
- vijay: root persistent backup cron was active with last run `ok`; root and Uma configs/service units were snapshotted under `/root/hermes-fix-backups/20260527-roadmap-noncreds/` before upgrade.
- capture `hermes --version`, `hermes status --all`, and `hermes config check`
- vijay: captured version/config checks for root and Uma; both show config v24 after Uma doctor migration.
- snapshot config and cron job list
- vijay: copied root/Uma config and systemd unit definitions before upgrade; captured cron list for both profiles.
- Upgrade Hermes from an interactive shell, not from a public-facing workflow.
- vijay: documented; no public workflow exposure added.
- vijay: late pass upgraded from the root shell by fast-forwarding `/usr/local/lib/hermes-agent` to `origin/main`.
- After upgrade:
- vijay: post-upgrade verification checklist added to `docs/hermes-operations.md`; actual upgrade still pending.
- restart gateway
- vijay: restarted both `hermes-gateway.service` and `uma-hermes-gateway.service`.
- run Telegram smoke test
- vijay: direct provider smoke tests passed for root and Uma; live Telegram path remains active via gateway services.
- verify cron still runs
- vijay: `hermes cron list` showed root backup cron active and Uma reminders active before restart; services remained active after restart.
- run one safe terminal/file task
- vijay: safe shell/status checks and repo hygiene updates completed from the operator shell.
- run one memory/session-search task
- Record upgrade date, version, and any manual fixups in `docs/operations.md` or a Hermes-specific ops note.
- vijay: created `docs/hermes-operations.md` as the Hermes-specific ops note.
- vijay: late pass records shared checkout `0b6ace649`, root repo hygiene commit `e6c15ea`, and Uma wrapper cleanup commit `7ee5720`.
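The pre-upgrade snapshot step above (copying root and Uma configs and unit files into a dated directory) can be sketched as follows. The function name, the per-profile subdirectory layout, and the file names in the demo are assumptions for illustration; the `YYYYMMDD-label` directory name mirrors the `20260527-roadmap-noncreds` backup path used in this pass.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def snapshot_files(profiles, backup_root, label):
    """Copy each profile's files into <backup_root>/<YYYYMMDD>-<label>/<profile>/."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d")
    target_root = Path(backup_root) / f"{stamp}-{label}"
    copied = []
    for profile, sources in profiles.items():
        profile_dir = target_root / profile  # keeps same-named files from colliding
        profile_dir.mkdir(parents=True, exist_ok=True)
        for source in map(Path, sources):
            if not source.is_file():
                continue  # skip missing files instead of aborting the snapshot
            destination = profile_dir / source.name
            shutil.copy2(source, destination)  # copy2 preserves timestamps
            copied.append(destination)
    return copied


# Demo against a throwaway directory; real paths would be the Hermes config
# and systemd unit files for each profile.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.yaml"
    config.write_text("version: 24\n")
    result = snapshot_files({"root": [config]}, Path(tmp) / "backups", "roadmap-noncreds")
    print([str(path.relative_to(tmp)) for path in result])
```

Keeping one subdirectory per profile matters here because root and Uma are likely to have identically named config files.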
Phase 4 — Provider And Model Resilience
- Keep OpenAI Codex OAuth as the primary provider if it remains stable.
- vijay: root and Uma both remain on `openai-codex` with `gpt-5.5`; routing stays disabled after the earlier `gpt-5.4-mini` failure path.
- Add at least one fallback provider for resilience:
- OpenRouter
- Google/Gemini
- Anthropic
- local/Ollama if useful for low-risk offline tasks
- Configure provider credentials through Hermes auth/config flows; do not commit keys.
- vijay: documented the command path; provider additions requiring new credentials remain pending.
- Define model routing tiers:
- fast/cheap model for routine summaries and simple ops
- strong coding model for repo work
- vision-capable model for screenshots/images
- long-context model for large transcripts and audits
- Test fallback behavior by switching models in a new session.
- Document the preferred default model and fallback order.
- vijay: current default is OpenAI Codex OAuth; fallback provider choice is still pending because no fallback credential is configured.
- vijay: preferred default is explicitly `gpt-5.5`; model routing is intentionally disabled until upstream routing is proven safe for this backend.
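Once fallback credentials exist, the documented fallback order can be checked mechanically. The sketch below lists which candidate providers have a credential present, in a fixed preference order. The order simply follows the candidate list in this phase, and the environment variable names are assumptions; Hermes may store provider credentials differently, so treat this as a documentation aid rather than how Hermes selects providers.

```python
import os

# Assumed preference order and assumed environment variable names.
FALLBACK_ORDER = [
    ("openrouter", "OPENROUTER_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
]


def configured_fallbacks(env=None):
    """Return fallback provider names, in preference order, that have a non-empty credential."""
    env = os.environ if env is None else env
    return [name for name, variable in FALLBACK_ORDER if env.get(variable)]


# Demo with an explicit mapping so no real credentials are read or printed.
print(configured_fallbacks({"ANTHROPIC_API_KEY": "placeholder", "GEMINI_API_KEY": ""}))
```

An empty list corresponds to the current state described above: a primary provider with no fallback configured.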
Phase 5 — Tooling Capability Upgrade
- Enable/configure at least one reliable web search/extract backend:
- Exa
- Tavily
- Firecrawl
- SearXNG self-hosted option
- Configure browser automation only if needed and keep it private/safe:
- local Chromium/Camofox, or
- Browserbase/Browser Use
- Configure GitHub/Gitea automation credentials with least privilege.
- Add vision/image capability if screenshots, diagrams, or UI reviews are common.
- Validate the active Telegram toolset includes the capabilities ByteLyst expects:
- vijay: `hermes doctor --fix` reported browser, clarify, code_execution, cronjob, terminal, delegation, file, memory, messaging, session_search, skills, todo, tts, vision, video, and related toolsets available; web remains blocked by missing search backend API key.
- terminal
- file
- search/session_search
- memory
- skills
- cronjob
- messaging
- delegation
- browser is available; web search/extract still needs a backend API key
- Document tool enablement changes and restart/reset requirements.
- vijay: added restart/reset notes to `docs/hermes-operations.md`.
Phase 6 — Telegram Gateway Workflow
- Keep Telegram as the primary control plane.
- vijay: watchdog delivery is configured to the origin Telegram conversation; dashboard remains private-only/pending.
- Preserve the user's preferred progress prefix convention: `1️⃣`, `2️⃣`, etc.
- vijay: retained in roadmap and memory; use for progress/completion updates from Hermes sessions.
- Ensure home channel and allowed user settings are correct.
- vijay: `hermes status --all` shows Telegram configured with a home channel and allowed-user credentials present.
- Add smoke-test steps for:
- vijay: added gateway smoke-test bullets to `docs/hermes-operations.md`.
- inbound Telegram command
- outbound completion message
- approval prompt flow
- media/file delivery
- Decide whether Telegram topic/session handling should be enabled or documented.
- Add a runbook for gateway restart/recovery.
- vijay: added gateway recovery section to `docs/hermes-operations.md`.
Phase 7 — Memory, Skills, And Knowledge Capture
- Review persistent memory for stale entries and trim anything no longer useful.
- Keep memories declarative and durable; avoid storing task-completion artifacts.
- Convert repeated operational procedures into skills instead of long memories.
- Pin critical ByteLyst/Hermes skills that should not be archived.
- Schedule or manually run curator reviews if enabled.
- Add skills for recurring ByteLyst workflows:
- Gitea Actions troubleshooting
- Caddy + Docker routing changes
- Hermes backup/restore drill
- Telegram gateway recovery
- safe multi-repo commit/push workflow
Phase 8 — Cron, Watchdogs, And Autonomous Maintenance
- Keep current Hermes backup cron job enabled.
- vijay: backup cron remains active.
- Add watchdogs that notify Telegram only on actionable failures:
- vijay: installed `~/.hermes/scripts/hermes_health_watchdog.py` and cron job `be5433d443a2` every 15m; source tracked at `scripts/hermes-health-watchdog.py`.
- gateway down
- cron scheduler stale
- backup job failed or no fresh commit within threshold
- disk usage high
- memory pressure high
- Caddy/Gitea critical services down
- Prefer `no_agent=True` script-only watchdogs for fixed health checks.
- vijay: watchdog cron is no-agent/script-only and silent on success.
- Keep noisy health checks silent on success.
- vijay: manual script test produced empty output on a healthy run.
- Use self-contained prompts for any LLM-driven cron jobs.
- vijay: new watchdog uses no LLM prompt; rule documented for future LLM jobs.
- Avoid recursive cron creation from cron-run sessions.
- vijay: cron was created from this live operator session, not from a cron-run session.
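The silent-on-success pattern described in this phase can be sketched as a script that prints only when a check fails. This is not the installed `hermes_health_watchdog.py`; the helper names and the 85% disk threshold are assumptions, and it presumes that the cron runner forwards non-empty output to Telegram while empty output produces no message.

```python
import shutil
import subprocess


def disk_alert(path="/", max_used_percent=85.0):
    """Return an alert string when disk usage exceeds the threshold, else None."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    if used_percent > max_used_percent:
        return f"disk {path} at {used_percent:.0f}% (limit {max_used_percent:.0f}%)"
    return None


def service_alert(unit):
    """Return an alert string when a systemd unit is not active, else None."""
    try:
        # `systemctl is-active --quiet UNIT` exits 0 only when the unit is active.
        result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    except FileNotFoundError:
        return f"systemctl not found; cannot check {unit}"
    return None if result.returncode == 0 else f"{unit} is not active"


def collect_alerts(checks):
    """Run each zero-argument check and keep only the failures."""
    return [message for check in checks if (message := check()) is not None]


if __name__ == "__main__":
    alerts = collect_alerts([
        disk_alert,
        lambda: service_alert("hermes-gateway.service"),
    ])
    if alerts:  # healthy runs print nothing, so nothing is delivered
        print("\n".join(alerts))
```

Each check is a plain function returning `None` on success, so adding the remaining watchdog items (backup freshness, memory pressure, Caddy/Gitea status) means appending one more callable without touching the reporting logic.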
Phase 9 — Private Dashboard / Mission Control Direction
- Do not expose Hermes dashboard publicly.
- vijay: no public dashboard/API route added; private-only policy documented.
- If a dashboard is useful, make it private-only and operationally scoped.
- vijay: selected private-only dashboard direction; installed Tailscale daemon for future private access. Dashboard itself is not running and no `9119/9120` listener is exposed.
- Dashboard should show:
- gateway status
- active sessions
- cron job state
- backup freshness
- recent sanitized alerts
- quick links to docs/runbooks
- Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
- vijay: standing decision is local/Tailscale/SSH-only. Tailnet login and dashboard auth validation remain tomorrow tasks.
- Add a Caddy review step before adding any new hostname.
- vijay: added Caddy/port review commands to `docs/hermes-operations.md`.
Phase 10 — Multi-Agent And Project Execution Workflow
- Use `delegate_task` for bounded subtasks inside a parent session.
- Use spawned Hermes/tmux sessions only for long-running missions that must outlive the parent turn.
- Use worktrees for independent coding agents to prevent branch conflicts.
- For durable multi-agent coordination, evaluate Hermes Kanban.
- Document when to use:
- direct tool call
- `delegate_task`
- background terminal process
- cron job
- Kanban worker
- Add a ByteLyst convention for progress/completion Telegram notifications from concurrent sessions.
Phase 11 — Security And Secret Hygiene
- Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
- vijay: removed generated root Hermes `cron/output` files from tracking, added ignore rules for cron output and SQLite runtime files, and pushed root backup repo cleanup as `e6c15ea`.
- Consider enabling `security.redact_secrets` if the operational tradeoff is acceptable.
- Keep `privacy.redact_pii` decision documented for gateway sessions.
- Rotate old credentials after migration or accidental exposure risk.
- Use least-privilege tokens for GitHub/Gitea, web APIs, and provider keys.
- Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
- vijay: added manual scan/review step in practice during root/Uma repo pushes; root backup repo now ignores generated cron outputs that previously carried noisy token-pattern scan results.
- Keep approval mode at `manual` or `smart` for Telegram-driven work.
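The manual scan step before pushing backup/config changes could be backed by a small pattern check. The sketch below is illustrative only: the three patterns are a starting assumption (private key headers, GitHub-style token prefixes, and env-style assignments with a non-trivial value), not an exhaustive secret detector, and the sample values are fake.

```python
import re

# Assumed starter patterns; extend for the providers actually in use.
PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "GitHub token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
    "env-style secret": re.compile(
        r"(?i)\b[A-Z0-9_]*(?:API_KEY|TOKEN|SECRET|PASSWORD)\s*=\s*\S{8,}"
    ),
}


def scan_text(text):
    """Return (line_number, label) pairs for lines that look like raw secrets."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


# Fake sample: an empty placeholder (allowed), a filled-in value (flagged),
# and an ordinary note (ignored).
sample = (
    "OPENROUTER_API_KEY=\n"
    "TELEGRAM_TOKEN=1234567890abcdef\n"
    "note: rotate keys quarterly\n"
)
print(scan_text(sample))
```

Empty placeholders such as `.env.example` entries pass, which fits the rule of storing secrets only as placeholder variable names.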
Phase 12 — Documentation And Runbooks
- Add a Hermes operations index under `docs/`.
- vijay: created `docs/hermes-operations.md`.
- Link this roadmap from `docs/repo-map.md`.
- vijay: roadmap was already listed; added `docs/hermes-operations.md` to repo map.
- Create or update runbooks for:
- installing/upgrading Hermes
- restarting the gateway
- restoring persistent data from backup
- configuring providers/models
- enabling/disabling tools
- adding safe cron watchdogs
- private-only dashboard access
- Keep commands copy-pasteable and include expected outputs.
- vijay: copied operational commands into `docs/hermes-operations.md`; expected-output notes included where useful.
- Store secrets only as placeholder variable names or `.env.example` entries.
- vijay: no raw secrets were added to docs or scripts.
Priority Execution Plan
Immediate — Today / Next Session
- Confirm no public Hermes dashboard route exists.
- Investigate `hermes doctor` timeout.
- Verify backup cron freshness and remote push status.
- Add one Telegram watchdog for gateway/backup failure.
- Choose and configure one web search backend.
Near-Term — This Week
- Add fallback model/provider.
- Document provider routing and model defaults.
- Add gateway recovery runbook.
- Add restore drill runbook and perform one test-profile restore.
- Add Gitea/GitHub least-privilege automation credential path.
Medium-Term — This Month
- Evaluate private-only dashboard/mission-control UX.
- Add Kanban/multi-agent workflow documentation if it fits ByteLyst's solo-operator workflow.
- Add silent-on-success system watchdogs.
- Clean up stale memory/skills and pin critical skills.
- Schedule quarterly restore drills.
Acceptance Criteria
This roadmap is complete when:
- Hermes can be upgraded and rolled back/restored with a documented process.
- Gateway failures and backup failures notify Telegram.
- At least one fallback model/provider is configured and tested.
- Web/search tooling works for current research tasks.
- No Hermes dashboard/API is publicly exposed.
- Backup restore has been tested into a non-production profile.
- Core ByteLyst Hermes procedures exist as docs or skills.
- Sensitive files remain untracked and backup-safe.
Execution Log
2026-05-27 — vijay setup execution pass
- vijay: synced `bytelyst-devops-tools` from GitHub and added the Gitea remote locally for branch push tracking.
- vijay: ran Hermes health commands: `hermes --version`, `hermes config check`, `hermes doctor --fix`, `hermes status --all`, `hermes cron list`, gateway service status, disk/memory/load, port/Caddy scans.
- vijay: `hermes doctor --fix` completed and migrated config v23 → v24.
- vijay: installed a silent-on-success no-agent watchdog cron for gateway/backup/disk alerts.
- vijay: created `docs/hermes-operations.md`, updated `docs/operations.md`, and added this roadmap progress commentary.
- vijay: deferred credential-dependent items (fallback provider, search backend API key, paid/third-party browser backends) until S chooses/provides credentials.
- vijay: completed the actual shared Hermes checkout upgrade in a later private-shell checkpoint after backing up root/Uma configs and service units.
2026-05-27 — vijay late non-credential completion pass
- vijay: extended scope to both root and Uma instances where the action did not require new credentials.
- vijay: backed up root/Uma configs and systemd units to `/root/hermes-fix-backups/20260527-roadmap-noncreds/`.
- vijay: migrated Uma Hermes config v23 → v24 with `hermes doctor --fix`; root was already v24.
- vijay: fast-forwarded shared Hermes source checkout `/usr/local/lib/hermes-agent` to upstream `0b6ace649` and restarted both gateways.
- vijay: verified root and Uma provider smoke tests: `root-roadmap-ok`, `uma-roadmap-ok`.
- vijay: confirmed both services are enabled and active; Docker-based Uma Hermes remains removed.
- vijay: installed Tailscale `1.98.3`; `tailscaled` is enabled/running and awaits tailnet login.
- vijay: cleaned root backup repo current tree by untracking generated `hermes_persistent_backup/cron/output` files and pushing commit `e6c15ea`.
- vijay: confirmed Uma wrapper repo is clean at `7ee5720` after Docker deployment removal.
Notes For Future Transcript Pass
When the transcript is available, specifically check whether the video recommends any of the following and update this roadmap accordingly:
- exact provider/model choices
- recommended Hermes install path
- gateway platform setup details
- dashboard or web UI exposure guidance
- memory/skill workflows
- MCP server recommendations
- cron/background agent patterns
- voice/STT/TTS setup
- any security warnings or anti-patterns