# Hermes Setup Upgrade Roadmap
**Date:** 2026-05-26
**Owner:** ByteLyst / S
**Repo:** `bytelyst-devops-tools`
**Video reference:** [Hermes Agent is the greatest AI tool ever made. Here's how to set it up](https://youtu.be/RoBD7Lc-0MI) by Alex Finn
## Purpose
Turn the Hermes setup ideas from the referenced video into a practical ByteLyst upgrade checklist for this VM-backed, Telegram-driven Hermes installation.
This roadmap is intentionally operational: every item should improve at least one of reliability, safety, agent capability, observability, or restore/migration readiness.
## Transcript Review Status
Automated transcript retrieval was attempted through multiple paths:
- Hermes `youtube-content` transcript helper using `youtube-transcript-api`
- `yt-dlp` subtitle extraction
- direct YouTube page/player metadata inspection
- Invidious caption endpoints
- third-party transcript endpoint probing

The video title and metadata were reachable, but transcript/subtitle retrieval was blocked by YouTube anti-bot checks from this VM/cloud IP. One Invidious endpoint confirmed that an English auto-generated caption track exists, but it returned an empty caption body.

Because the full transcript was not retrievable from the VM, this roadmap combines:
1. the accessible video metadata and setup theme,
2. Hermes Agent's current documented capabilities,
3. the live health/status of this ByteLyst Hermes installation, and
4. ByteLyst's existing operational preferences and safety constraints.

If a manual transcript is later pasted or uploaded, re-run this review and append a `Transcript-Derived Delta` section with any new actions.
## Current ByteLyst Hermes Baseline
Observed on 2026-05-26:
- Hermes version: `v0.14.0 (2026.5.16)`
- Project path: `/usr/local/lib/hermes-agent`
- Active model/provider: `gpt-5.4` via OpenAI Codex OAuth
- Telegram gateway: configured and running under systemd
- Scheduled jobs: `1 active, 1 total`
  - `Sync Hermes persistent-data backup to GitHub`
    - schedule: every 30 minutes
    - delivery: local
    - script: `sync_hermes_persistent_backup.py`
    - last status: ok
- Config version: `23`
- Telegram credentials are present
- Most optional provider/API keys are not configured, including OpenRouter, Google/Gemini, Anthropic, Firecrawl/Tavily/Exa, Browserbase/Browser Use, GitHub token, FAL, and ElevenLabs
- `hermes doctor` timed out during this review and needs a dedicated diagnostic pass
- User preference: do **not** expose the Hermes dashboard publicly
## Target State
A healthy ByteLyst Hermes setup should be:
- **Private by default:** no public dashboard exposure; private access through local shell, Telegram DM, SSH tunnel, Tailscale, or equivalent.
- **Recoverable:** configuration, skills, memory, sessions, cron jobs, and scripts are backed up and periodically restore-tested.
- **Observable:** gateway, cron, disk, memory, and backup failures surface to Telegram quickly.
- **Capable:** web search/extraction, browser automation, GitHub/Gitea operations, vision, file, terminal, cron, memory, session search, and delegation are all configured where useful.
- **Safe:** secrets are not committed, destructive commands remain approval-gated, public Caddy exposure is explicitly reviewed, and profiles isolate risky experiments.
- **Self-improving:** recurring procedures are captured as skills; stale or wrong skills are patched immediately.
## Roadmap Checklist
### Phase 0 — Safety Freeze And Guardrails
- [ ] Confirm no Caddy route exposes a Hermes dashboard or Hermes API server publicly.
- [ ] Add a negative-control check to operational docs: `Hermes dashboard/API must not be public without explicit approval`.
- [ ] Verify firewall/Caddy routes for any hostnames pointing to Hermes ports.
- [ ] Decide private access pattern for any future dashboard:
  - [ ] local-only binding
  - [ ] SSH tunnel
  - [ ] Tailscale/WireGuard
  - [ ] Cloudflare Access or equivalent identity gate
  - [ ] basic auth plus IP allowlist only if a public route is unavoidable
- [ ] Keep command approvals at `manual` or `smart`; do not globally use approval bypass for the gateway.
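
The Caddy route review above could be partly automated. The following is a minimal sketch, assuming routes are declared with single-line `reverse_proxy` directives in a Caddyfile; the port numbers in `WATCHED_PORTS` are placeholders (not confirmed Hermes ports), and `find_exposed_routes` is a hypothetical helper name. It does not handle block-form upstreams or imported snippets, so it supplements a manual review rather than replacing it.

```python
import re

# Placeholder ports: replace with the ports this Hermes installation actually binds.
WATCHED_PORTS = {8642, 9119}

# Captures everything after a single-line `reverse_proxy` directive.
UPSTREAM_RE = re.compile(r"reverse_proxy\s+(.*)")


def find_exposed_routes(caddyfile_text, ports=WATCHED_PORTS):
    """Return (line_number, line) pairs whose reverse_proxy upstream uses a watched port."""
    hits = []
    for lineno, line in enumerate(caddyfile_text.splitlines(), start=1):
        # Drop trailing comments so commented-out routes are not reported.
        stripped = line.split("#", 1)[0].strip()
        match = UPSTREAM_RE.search(stripped)
        if not match:
            continue
        upstream_ports = {int(p) for p in re.findall(r":(\d{2,5})\b", match.group(1))}
        if upstream_ports & set(ports):
            hits.append((lineno, stripped))
    return hits
```

An empty result only means no matching single-line route was found; firewall rules and listening sockets still need their own check.
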
### Phase 1 — Health Baseline And Diagnostics
- [ ] Run and capture `hermes --version`.
- [ ] Run and capture `hermes config check`.
- [ ] Investigate why `hermes doctor` timed out.
  - [ ] Re-run with a longer timeout from a foreground shell.
  - [ ] If still hanging, isolate the step by checking logs and dependencies.
  - [ ] File or fix a Hermes bug if the timeout is reproducible.
- [ ] Run `hermes status --all` and save a sanitized baseline summary.
- [ ] Check gateway service health:
  - [ ] `systemctl status hermes-gateway` or the actual installed service unit
  - [ ] recent gateway logs under `~/.hermes/logs/`
  - [ ] Telegram send/receive smoke test
- [ ] Check cron scheduler health and last-run status.
- [ ] Check disk, memory, CPU, open ports, and long-running Hermes processes.
- [ ] Create a recurring monthly `Hermes setup review` checklist from this baseline.
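
The command captures in this phase can be run in one pass with an explicit timeout per command, which also makes the `hermes doctor` hang visible as a result instead of a stalled shell. This is a minimal sketch, assuming `hermes` is on `PATH`; the timeout values are illustrative guesses, and `run_with_timeout` and `capture_baseline` are hypothetical helper names, not part of Hermes.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments) and return a dict describing the outcome."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return {"cmd": cmd, "status": "timeout", "output": ""}
    except FileNotFoundError:
        return {"cmd": cmd, "status": "missing", "output": ""}
    status = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
    return {"cmd": cmd, "status": status, "output": proc.stdout + proc.stderr}


# Commands come from the checklist above; the timeouts are assumptions.
BASELINE_COMMANDS = [
    (["hermes", "--version"], 30),
    (["hermes", "config", "check"], 60),
    (["hermes", "status", "--all"], 60),
    (["hermes", "doctor"], 600),
]


def capture_baseline(commands=BASELINE_COMMANDS):
    """Run every baseline command and return the list of outcome dicts."""
    return [run_with_timeout(cmd, timeout_s) for cmd, timeout_s in commands]
```

The captured output should be sanitized before it is saved anywhere that is backed up or shared.
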
### Phase 2 — Backup, Restore, And Migration Readiness
- [ ] Keep the existing persistent-data backup cron active.
- [ ] Verify the backup repository receives fresh commits after real state changes.
- [ ] Confirm the backup intentionally excludes raw secrets and `state.db`.
- [ ] Add a restore rehearsal checklist:
  - [ ] clone backup repo into a temporary directory
  - [ ] run restore script in dry-run mode if available
  - [ ] verify config, skills, sessions, cron, memory, and scripts restore into a test profile
  - [ ] confirm no raw `.env`, OAuth token, or credential file appears in git
- [ ] Add a quarterly restore drill reminder cron job or calendar task.
- [ ] Document exact restore commands in a ByteLyst ops doc.
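
The "no raw secrets in git" step of the restore rehearsal can be checked mechanically against a fresh clone. Below is a minimal sketch; the filename patterns are assumptions derived from this checklist rather than a verified list of what Hermes writes, and `find_forbidden_files` is a hypothetical helper name.

```python
import fnmatch
from pathlib import Path

# Filename patterns that should never appear in the backup repo.
# Assumed from the checklist; adjust to the real Hermes data layout.
FORBIDDEN_PATTERNS = [".env", "*.env", "state.db", "*-wal", "*-shm", "*credentials*"]


def find_forbidden_files(root, patterns=FORBIDDEN_PATTERNS):
    """Return relative paths under root whose filename matches a forbidden pattern."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip git internals; only the working tree is inspected here.
        if ".git" in relative.parts:
            continue
        if path.is_file() and any(fnmatch.fnmatch(path.name.lower(), p) for p in patterns):
            hits.append(str(relative))
    return hits
```

Placeholder files such as `.env.example` are not matched by these patterns. The check covers only the current working tree, so a file committed and later deleted would still need a history scan.
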
### Phase 3 — Upgrade Strategy
- [ ] Check whether Hermes is already at the latest stable release before each upgrade.
- [ ] Before upgrading:
  - [ ] run backup sync manually
  - [ ] capture `hermes --version`, `hermes status --all`, and `hermes config check`
  - [ ] snapshot config and cron job list
- [ ] Upgrade Hermes from an interactive shell, not from a public-facing workflow.
- [ ] After upgrade:
  - [ ] restart gateway
  - [ ] run Telegram smoke test
  - [ ] verify cron still runs
  - [ ] run one safe terminal/file task
  - [ ] run one memory/session-search task
- [ ] Record upgrade date, version, and any manual fixups in `docs/operations.md` or a Hermes-specific ops note.
### Phase 4 — Provider And Model Resilience
- [ ] Keep OpenAI Codex OAuth as the primary provider if it remains stable.
- [ ] Add at least one fallback provider for resilience:
  - [ ] OpenRouter
  - [ ] Google/Gemini
  - [ ] Anthropic
  - [ ] local/Ollama if useful for low-risk offline tasks
- [ ] Configure provider credentials through Hermes auth/config flows; do not commit keys.
- [ ] Define model routing tiers:
  - [ ] fast/cheap model for routine summaries and simple ops
  - [ ] strong coding model for repo work
  - [ ] vision-capable model for screenshots/images
  - [ ] long-context model for large transcripts and audits
- [ ] Test fallback behavior by switching models in a new session.
- [ ] Document the preferred default model and fallback order.
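
One way to document the routing tiers and fallback order is as a small table that can be read and tested. The sketch below is illustrative only: the provider labels are placeholders for whatever ends up configured, the tier assignments are not recommendations, and `pick_provider` is a hypothetical helper that does not reflect how Hermes itself selects models.

```python
# Tier -> ordered provider preference. Labels and ordering are placeholders.
ROUTING = {
    "fast": ["openrouter", "ollama"],
    "coding": ["openai-codex", "anthropic", "openrouter"],
    "vision": ["gemini", "openai-codex"],
    "long_context": ["gemini", "anthropic"],
}


def pick_provider(tier, available, routing=ROUTING):
    """Return the first provider for the tier that is currently available, or None."""
    for provider in routing.get(tier, []):
        if provider in available:
            return provider
    return None
```

For example, `pick_provider("coding", {"anthropic", "openrouter"})` falls through the unavailable primary and returns `"anthropic"`, which is the behavior the fallback test in this phase is meant to confirm on the real system.
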
### Phase 5 — Tooling Capability Upgrade
- [ ] Enable/configure at least one reliable web search/extract backend:
  - [ ] Exa
  - [ ] Tavily
  - [ ] Firecrawl
  - [ ] SearXNG self-hosted option
- [ ] Configure browser automation only if needed and keep it private/safe:
  - [ ] local Chromium/Camofox, or
  - [ ] Browserbase/Browser Use
- [ ] Configure GitHub/Gitea automation credentials with least privilege.
- [ ] Add vision/image capability if screenshots, diagrams, or UI reviews are common.
- [ ] Validate the active Telegram toolset includes the capabilities ByteLyst expects:
  - [ ] terminal
  - [ ] file
  - [ ] search/session_search
  - [ ] memory
  - [ ] skills
  - [ ] cronjob
  - [ ] messaging
  - [ ] delegation
  - [ ] browser/web if configured
- [ ] Document tool enablement changes and restart/reset requirements.
### Phase 6 — Telegram Gateway Workflow
- [ ] Keep Telegram as the primary control plane.
- [ ] Preserve the user's preferred progress prefix convention: `1⃣`, `2⃣`, etc.
- [ ] Ensure home channel and allowed user settings are correct.
- [ ] Add smoke-test steps for:
  - [ ] inbound Telegram command
  - [ ] outbound completion message
  - [ ] approval prompt flow
  - [ ] media/file delivery
- [ ] Decide whether Telegram topic/session handling should be enabled or documented.
- [ ] Add a runbook for gateway restart/recovery.
### Phase 7 — Memory, Skills, And Knowledge Capture
- [ ] Review persistent memory for stale entries and trim anything no longer useful.
- [ ] Keep memories declarative and durable; avoid storing task-completion artifacts.
- [ ] Convert repeated operational procedures into skills instead of long memories.
- [ ] Pin critical ByteLyst/Hermes skills that should not be archived.
- [ ] Schedule or manually run curator reviews if enabled.
- [ ] Add skills for recurring ByteLyst workflows:
  - [ ] Gitea Actions troubleshooting
  - [ ] Caddy + Docker routing changes
  - [ ] Hermes backup/restore drill
  - [ ] Telegram gateway recovery
  - [ ] safe multi-repo commit/push workflow
### Phase 8 — Cron, Watchdogs, And Autonomous Maintenance
- [ ] Keep current Hermes backup cron job enabled.
- [ ] Add watchdogs that notify Telegram only on actionable failures:
  - [ ] gateway down
  - [ ] cron scheduler stale
  - [ ] backup job failed or no fresh commit within threshold
  - [ ] disk usage high
  - [ ] memory pressure high
  - [ ] Caddy/Gitea critical services down
- [ ] Prefer `no_agent=True` script-only watchdogs for fixed health checks.
- [ ] Keep noisy health checks silent on success.
- [ ] Use self-contained prompts for any LLM-driven cron jobs.
- [ ] Avoid recursive cron creation from cron-run sessions.
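
A script-only watchdog that stays silent on success can be structured as a set of checks that each return either `None` or a short alert string. The following is a minimal sketch under stated assumptions: the thresholds are illustrative, the function names are hypothetical, and how the resulting alerts reach Telegram (or how the script is registered as a `no_agent=True` job) is left to the existing Hermes cron setup.

```python
import shutil
import time


def check_disk(path="/", max_used_fraction=0.90):
    """Return an alert string if disk usage exceeds the limit, else None."""
    usage = shutil.disk_usage(path)
    used = usage.used / usage.total
    if used > max_used_fraction:
        return f"disk {path} at {used:.0%} (limit {max_used_fraction:.0%})"
    return None


def check_freshness(label, last_success_epoch, max_age_s, now=None):
    """Return an alert string if the last success is older than max_age_s seconds."""
    now = time.time() if now is None else now
    age = now - last_success_epoch
    if age > max_age_s:
        return f"{label} stale: last success {age / 60:.0f} min ago"
    return None


def collect_alerts(results):
    """Drop passing checks; an empty list means the watchdog should send nothing."""
    return [r for r in results if r is not None]
```

With the backup job scheduled every 30 minutes, a freshness threshold somewhat above that interval avoids alerting on a single slow run; the exact value is a judgment call.
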
### Phase 9 — Private Dashboard / Mission Control Direction
- [ ] Do not expose Hermes dashboard publicly.
- [ ] If a dashboard is useful, make it private-only and operationally scoped.
- [ ] Dashboard should show:
  - [ ] gateway status
  - [ ] active sessions
  - [ ] cron job state
  - [ ] backup freshness
  - [ ] recent sanitized alerts
  - [ ] quick links to docs/runbooks
- [ ] Any dashboard actions must require authentication and ideally remain reachable only over private network/tunnel.
- [ ] Add a Caddy review step before adding any new hostname.
### Phase 10 — Multi-Agent And Project Execution Workflow
- [ ] Use `delegate_task` for bounded subtasks inside a parent session.
- [ ] Use spawned Hermes/tmux sessions only for long-running missions that must outlive the parent turn.
- [ ] Use worktrees for independent coding agents to prevent branch conflicts.
- [ ] For durable multi-agent coordination, evaluate Hermes Kanban.
- [ ] Document when to use:
  - [ ] direct tool call
  - [ ] delegate_task
  - [ ] background terminal process
  - [ ] cron job
  - [ ] Kanban worker
- [ ] Add a ByteLyst convention for progress/completion Telegram notifications from concurrent sessions.
### Phase 11 — Security And Secret Hygiene
- [ ] Reconfirm raw `.env`, OAuth credentials, tokens, logs, and SQLite WAL/SHM files are excluded from git backups.
- [ ] Consider enabling `security.redact_secrets` if the operational tradeoff is acceptable.
- [ ] Keep `privacy.redact_pii` decision documented for gateway sessions.
- [ ] Rotate old credentials after migration or accidental exposure risk.
- [ ] Use least-privilege tokens for GitHub/Gitea, web APIs, and provider keys.
- [ ] Add a pre-commit or manual scan step before pushing Hermes backup/config changes.
- [ ] Keep approval mode at `manual` or `smart` for Telegram-driven work.
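
The manual scan step before pushing backup or config changes could start from a simple content heuristic. This is a rough sketch, not a substitute for a dedicated secret scanner: the key-name list and the 16-character minimum value length are assumptions chosen to reduce false positives (for example on settings like `max_tokens: 4096`), and `find_secret_like_lines` is a hypothetical helper name.

```python
import re

# Matches assignments such as OPENAI_API_KEY=<value> or "token": "<value>".
# Key names and the minimum value length are illustrative assumptions.
SECRET_ASSIGNMENT = re.compile(
    r"""\b([A-Z0-9_]*(?:API_KEY|TOKEN|SECRET|PASSWORD)[A-Z0-9_]*)\b["']?\s*[:=]\s*["']?([^\s"']{16,})""",
    re.IGNORECASE,
)


def find_secret_like_lines(text):
    """Return (line_number, key_name) pairs for lines that look like real secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SECRET_ASSIGNMENT.search(line)
        if not match:
            continue
        # Variable references such as ${OPENAI_API_KEY} are placeholders, not secrets.
        if match.group(2).startswith("${"):
            continue
        hits.append((lineno, match.group(1)))
    return hits
```

Only key names and line numbers are returned, so the scan output itself can be logged or sent to Telegram without repeating the secret value.
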
### Phase 12 — Documentation And Runbooks
- [ ] Add a Hermes operations index under `docs/`.
- [ ] Link this roadmap from `docs/repo-map.md`.
- [ ] Create or update runbooks for:
  - [ ] installing/upgrading Hermes
  - [ ] restarting the gateway
  - [ ] restoring persistent data from backup
  - [ ] configuring providers/models
  - [ ] enabling/disabling tools
  - [ ] adding safe cron watchdogs
  - [ ] private-only dashboard access
- [ ] Keep commands copy-pasteable and include expected outputs.
- [ ] Store secrets only as placeholder variable names or `.env.example` entries.
## Priority Execution Plan
### Immediate — Today / Next Session
- [ ] Confirm no public Hermes dashboard route exists.
- [ ] Investigate `hermes doctor` timeout.
- [ ] Verify backup cron freshness and remote push status.
- [ ] Add one Telegram watchdog for gateway/backup failure.
- [ ] Choose and configure one web search backend.
### Near-Term — This Week
- [ ] Add fallback model/provider.
- [ ] Document provider routing and model defaults.
- [ ] Add gateway recovery runbook.
- [ ] Add restore drill runbook and perform one test-profile restore.
- [ ] Add Gitea/GitHub least-privilege automation credential path.
### Medium-Term — This Month
- [ ] Evaluate private-only dashboard/mission-control UX.
- [ ] Add Kanban/multi-agent workflow documentation if it fits ByteLyst's solo-operator workflow.
- [ ] Add silent-on-success system watchdogs.
- [ ] Clean up stale memory/skills and pin critical skills.
- [ ] Schedule quarterly restore drills.
## Acceptance Criteria
This roadmap is complete when:
- [ ] Hermes can be upgraded and rolled back/restored with a documented process.
- [ ] Gateway failures and backup failures notify Telegram.
- [ ] At least one fallback model/provider is configured and tested.
- [ ] Web/search tooling works for current research tasks.
- [ ] No Hermes dashboard/API is publicly exposed.
- [ ] Backup restore has been tested into a non-production profile.
- [ ] Core ByteLyst Hermes procedures exist as docs or skills.
- [ ] Sensitive files remain untracked and backup-safe.
## Notes For Future Transcript Pass
When the transcript is available, specifically check whether the video recommends any of the following and update this roadmap accordingly:
- exact provider/model choices
- recommended Hermes install path
- gateway platform setup details
- dashboard or web UI exposure guidance
- memory/skill workflows
- MCP server recommendations
- cron/background agent patterns
- voice/STT/TTS setup
- any security warnings or anti-patterns