From a8cf61a281a33729770cd3e133a376b0eaa2603d Mon Sep 17 00:00:00 2001
From: Hermes VM
Date: Sat, 30 May 2026 08:05:52 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20Phase=208=20=E2=80=94=20Telegram=20conv?=
 =?UTF-8?q?ention=20+=20delegation=20brief?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes the Phase 8 line that's actually a docs/codebase change. The other
two Phase 8 items are VM-ops work (bot tokens + watchdog extensions) and
live as a delegation brief.

What's in this repo

- `docs/hermes-operations.md` gains a "Telegram Notification Convention"
  section codifying:
  * routing per instance (Vijay → root chat, Bheem → Uma chat,
    cross-cutting → root)
  * silent-on-healthy + post-on-recovery
  * the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) and why it
    survives Telegram client rendering
  * approval-prompt UI expectation
  * "don't paste secrets" pointer back to `lib/logger.ts`'s redaction
    path-list
- `docs/prompts/phase8-telegram-loop.md` — full delegation brief for the
  VM-side implementation. Design: dashboard backend writes new warnings
  (with `instance=` tag, deduped over 1h) to an append-only log; both
  watchdogs tail it and route through the existing Telegram delivery
  path. Avoids splitting the delivery code into two places that would
  each need rate-limit + token-rotation handling. Brief is gated on
  Phase 4 — Uma's watchdog must exist first.
- Roadmap Phase 8 ticked for "preserve numbered-emoji convention"
  (codified in operations doc); the other two items have notes pointing
  at the brief.

Phase 8 doesn't fully close in this repo because the delivery loop needs
real bot tokens and the Phase 4 Uma watchdog before it can be end-to-end
validated. The codebase's contribution is everything that doesn't need a
token: the convention, the design, and the delegation brief.

Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
---
 docs/hermes-operations.md            |  38 +++++++++
 docs/hermes_dashboard_v2_roadmap.md  |  10 ++-
 docs/prompts/phase8-telegram-loop.md | 121 +++++++++++++++++++++++++++
 3 files changed, 165 insertions(+), 4 deletions(-)
 create mode 100644 docs/prompts/phase8-telegram-loop.md

diff --git a/docs/hermes-operations.md b/docs/hermes-operations.md
index 85c9e8d..e96d42f 100644
--- a/docs/hermes-operations.md
+++ b/docs/hermes-operations.md
@@ -392,3 +392,41 @@ Safe sequence:
 5. Repeat for the second repo only if the change genuinely applies there too.
 
 Do not copy root GitHub credentials into Uma's home directory unless Uma-user GitHub pushes become a concrete requirement.
+
+## Telegram Notification Convention
+
+Phase 8 of the dashboard roadmap (and the watchdog scripts that ship Telegram
+alerts today) follow a small set of conventions worth keeping consistent.
+
+**Routing per instance**
+- Vijay (root) alerts go to the root Telegram chat.
+- Bheem (uma) alerts go to Uma's Telegram chat.
+- Cross-cutting alerts (e.g. "the dashboard itself is unreachable") go to the
+  root chat — root is the operator account.
+
+**Silent on healthy**
+- Watchdog scripts and (in future) the dashboard's own Telegram hook **only**
+  post when something is wrong. A green poll is a no-op.
+- Recoveries ARE a Telegram event (one line: "back to healthy") so the chat
+  history reflects the full incident lifecycle.
+
+**Numbered-emoji progress convention**
+- When a multi-step operation is being narrated to Telegram, prefix each step
+  with the corresponding numbered emoji: `1️⃣`, `2️⃣`, `3️⃣`, … up to `🔟`.
+- This survives copy-paste across clients (unlike `1.`, which Telegram tends
+  to render inconsistently in dark mode) and makes the chat scannable.
+- The watchdog scripts already emit completion updates this way; any
+  dashboard-originated message that runs through the same delivery path
+  should match.
+
+**Approval prompts**
+- Approval-required actions still land in Telegram with two inline buttons
+  (✅ approve / ❌ deny). The dashboard does not yet trigger these — see the
+  Phase 8 delegation brief in `docs/prompts/phase8-telegram-loop.md` for the
+  design that closes the loop end-to-end.
+
+**Don't paste secrets**
+- Bot tokens and chat IDs live in `~/.config/hermes/telegram` mode `600`,
+  never in repo files. The dashboard's `lib/logger.ts` redacts
+  `Authorization` / `Cookie` / `*.token` paths from any logged object so an
+  accidental `req.log.info({ tg })` won't dump credentials.
diff --git a/docs/hermes_dashboard_v2_roadmap.md b/docs/hermes_dashboard_v2_roadmap.md
index 05c71e4..3955977 100644
--- a/docs/hermes_dashboard_v2_roadmap.md
+++ b/docs/hermes_dashboard_v2_roadmap.md
@@ -148,9 +148,11 @@ This is the biggest operational asymmetry and the reason half the ops-panel warn
 
 ## Phase 8 — Notifications & Telegram loop (G9)
 
-- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy.
-- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items).
-- [ ] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates.
+> **Mostly VM ops + bot-token configuration**, with two small backend hooks. Full delegation brief in [`docs/prompts/phase8-telegram-loop.md`](./prompts/phase8-telegram-loop.md). The dashboard's documentation half is already done — see `docs/hermes-operations.md` "Telegram Notification Convention".
+
+- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy. *(Design captured in the brief: `lib/dashboard-alerts.ts` writes new warnings to a tag-prefixed log; both watchdogs tail it. Implementation gated on Phase 4 (Uma watchdog must exist first) and on bot tokens.)*
+- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items). *(Brief item 3.)*
+- [x] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates. *(Codified in `docs/hermes-operations.md` under a new "Telegram Notification Convention" section, alongside the routing-per-instance, silent-on-healthy, and never-paste-secrets rules. The brief references this as the source of truth so VM-side implementers stay consistent.)*
 
 ---
 
@@ -194,7 +196,7 @@ Update only with evidence (source review, tests, build output, or browser/VM ver
 - [x] Phase 5 — App/CI hardening (P0/P1/P2 done; P2 follow-ups in DEPLOYMENT.md mitigation roadmap remain)
 - [x] Phase 6 — UX polish (severity tags + deep links + per-instance actions; trend cards + theme toggle deferred)
 - [x] Phase 7 — Security & access (auth on hermes routes + privacy stance documented; redact_secrets/redact_pii decision deferred)
-- [ ] Phase 8 — Notifications & Telegram
+- [ ] Phase 8 — Notifications & Telegram (convention codified; delivery loop is VM ops, see brief)
 
 ## Decisions (resolved 2026-05-30)
 
diff --git a/docs/prompts/phase8-telegram-loop.md b/docs/prompts/phase8-telegram-loop.md
new file mode 100644
index 0000000..52f8a3d
--- /dev/null
+++ b/docs/prompts/phase8-telegram-loop.md
@@ -0,0 +1,121 @@
+# Delegation Brief — Phase 8: Telegram notification loop
+
+> Self-contained task brief. Mostly **VM ops + bot-token configuration**, with
+> two small backend-side hooks. The dashboard has already done its half of
+> Phase 8 — see `docs/hermes-operations.md` "Telegram Notification Convention".
+>
+> Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 8),
+> `docs/hermes-operations.md`, `scripts/hermes-health-watchdog.py`,
+> `docs/prompts/phase4-bheem-uma-parity.md` (Bheem watchdog needs to exist
+> first).
+
+---
+
+ROLE: Operator with sudo on the Hostinger VM and admin access to both Telegram
+bots (root + Uma).
+
+OBJECTIVE: Close the loop between dashboard-detected warnings and the
+existing watchdog Telegram delivery path so that:
+
+1. New warnings that the dashboard surfaces in `getHermesOpsSnapshot()` (and
+   in the per-instance telemetry endpoint) reach the right Telegram chat
+   (Vijay → root; Bheem → Uma).
+2. Approval-required actions (currently only the watchdog uses these) work
+   end-to-end including media/file delivery — these are the two unchecked
+   items left over from Hermes v1.
+3. The numbered-emoji progress convention is preserved.
+
+PREREQUISITES:
+- Phase 4 (Bheem/Uma parity) must be complete so Uma has its own watchdog +
+  bot. Without Uma's bot, Bheem warnings have nowhere to go. Don't start
+  Phase 8 until the Phase 4 brief signs off.
+- Root + Uma watchdog scripts already deliver to Telegram successfully on a
+  manually-broken probe. Confirm before proceeding.
+
+DESIGN (least-invasive, no new long-lived service):
+
+The dashboard does NOT open its own Telegram connection. Instead, the
+backend writes new dashboard-detected warnings to a small append-only log
+that the existing watchdog tails.
+
+- Path: `/var/log/hermes-dashboard-warnings.log` (root-writeable;
+  world-readable so both watchdogs can tail).
+- Format: one line per warning, RFC3339-ish timestamp + severity token +
+  message — same shape as `hermes-health-watchdog.log`. Reuse the parser in
+  `backend/src/modules/hermes-telemetry/repository.ts:WATCHDOG_LINE`.
+- Routing: each line carries an explicit `instance=` tag
+  so the watchdog knows which Telegram bot to use. `instance=all` posts to
+  both chats (cross-cutting).
+- De-dup: the dashboard backend keeps a 1h in-memory hash of recent
+  warnings and only appends each one once. Restart resets the hash — that's
+  fine; an alert reappearing post-restart is signal, not noise.
+
+TASKS:
+
+1. **Backend hook** (small):
+   - Add `lib/dashboard-alerts.ts` that exposes
+     `appendDashboardWarning({ severity, instance, message })`. Internals:
+     append + dedup hash. Tests should mock `fs.promises.appendFile`.
+   - Wire it into `getHermesOpsSnapshot()` so each new warning in
+     `snapshot.warnings` (only the ones not in the dedup hash) is written
+     out. Same wiring on `getHermesTelemetrySnapshot()` for `warnings` and
+     `watchdog.alerts` of severity `critical`.
+   - Gate the file-write behind an env flag `HERMES_DASHBOARD_ALERT_LOG`
+     pointing at the path so dev/CI doesn't try to write to `/var/log`.
+   - Unit-test: appendDashboardWarning de-dups within the window, expires,
+     and writes the right line format. Add to the coverage gate.
+
+2. **Watchdog tail-extension** (VM ops):
+   - Modify both watchdog scripts (root + Uma's mirror) to ALSO tail the
+     new dashboard-warnings log. Filter by `instance=` tag — root's
+     watchdog only acts on `instance=vijay` or `instance=all`; Uma's only
+     on `instance=bheem` or `instance=all`.
+   - Forward each parsed line into the existing Telegram delivery (same
+     format / same numbered-emoji convention). Silent on no-new-lines.
+
+3. **Approval-prompt + media validation** (the two unchecked v1 items):
+   - Pick a Telegram approval-required action that already exists in the
+     watchdog (e.g. "restart degraded gateway"). Confirm the inline ✅/❌
+     buttons land, the callback hits the watchdog, and the action runs.
+   - Confirm the watchdog can deliver a small file (e.g. last 200 lines of
+     a log) when an alert says "investigate", and that the file lands as
+     a Telegram document, not a truncated message.
+   - Document both flows in `docs/hermes-operations.md` under "Telegram
+     Notification Convention" so anyone reading it knows what's wired.
+
+4. **End-to-end test**:
+   - From the dashboard, trigger a transient warning (e.g. stop a
+     non-critical timer for 30 seconds).
+   - Confirm the right Telegram chat receives one alert (numbered-emoji
+     formatted) and one recovery message when the timer comes back. The
+     OTHER chat must stay silent.
+   - Repeat for Bheem.
+   - Repeat with `instance=all` (cross-cutting) and confirm BOTH chats
+     receive the alert.
+
+GUARDRAILS:
+- Bot tokens never go in repo files. They live in
+  `~/.config/hermes/telegram` mode `600`, owned by the right user.
+- The dashboard backend only WRITES to the alert log. It must NOT call
+  Telegram directly (that would split the delivery path and create two
+  places where rate limits / token rotation matter).
+- Don't emit chatty health pings — silent-on-success is the rule.
+- Numbered-emoji convention is mandatory for completion-update messages
+  (`1️⃣`, `2️⃣`, …); see `docs/hermes-operations.md`.
+
+REPORTING:
+When finished, report:
+- Diff of `lib/dashboard-alerts.ts` and the wiring sites.
+- Output of `tail -20 /var/log/hermes-dashboard-warnings.log` after the
+  end-to-end test.
+- Screenshots / chat-export of the test alerts in both Telegram chats
+  (sanitized).
+- Updated `docs/hermes-operations.md` "Telegram Notification Convention"
+  section with the wired approval + media flows.
+
+DEFINITION OF DONE:
+- Dashboard-detected warnings reach the right Telegram chat per instance.
+- Recoveries reach the same chat.
+- Approval prompt + file delivery validated end-to-end.
+- Numbered-emoji convention preserved.
+- Operator (you) ticks the corresponding Phase 8 roadmap checkboxes.
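
Task 1 of the brief describes `appendDashboardWarning` only in prose. A minimal TypeScript sketch of the dedup-then-append behaviour might look like the following. Only the function name, its `{ severity, instance, message }` argument, the 1h window, and the `vijay` / `bheem` / `all` tags come from the brief. The `createDashboardAlerter` factory, the injected `sink` and `now` parameters, the severity values other than `critical`, and the exact line layout are hypothetical, and the layout in particular would need to be aligned with the real `WATCHDOG_LINE` parser.

```typescript
// Hypothetical sketch of lib/dashboard-alerts.ts. The line layout and all
// helper names are assumptions, not the project's actual implementation.

type Severity = "info" | "warning" | "critical"; // only "critical" is named in the brief
type Instance = "vijay" | "bheem" | "all";

interface DashboardWarning {
  severity: Severity;
  instance: Instance;
  message: string;
}

const DEDUP_WINDOW_MS = 60 * 60 * 1000; // the 1h window from the brief

// Collapse whitespace so a warning always occupies exactly one log line.
function normalizeMessage(message: string): string {
  return message.replace(/\s+/g, " ").trim();
}

// Assumed layout: "<ISO timestamp> <SEVERITY> instance=<tag> <message>".
function formatWarningLine(warning: DashboardWarning, at: Date): string {
  const severity = warning.severity.toUpperCase();
  const message = normalizeMessage(warning.message);
  return `${at.toISOString()} ${severity} instance=${warning.instance} ${message}`;
}

// Returns an appendDashboardWarning function bound to a sink and a clock.
// Both are injected so tests can run without touching the filesystem.
function createDashboardAlerter(
  sink: (line: string) => void,
  now: () => number = Date.now,
  windowMs: number = DEDUP_WINDOW_MS,
): (warning: DashboardWarning) => boolean {
  // In-memory only: a restart clears it, which the brief accepts.
  const lastSeen = new Map<string, number>();

  return function appendDashboardWarning(warning: DashboardWarning): boolean {
    const t = now();

    // Drop entries older than the window so repeats can fire again.
    lastSeen.forEach((seenAt, key) => {
      if (t - seenAt >= windowMs) lastSeen.delete(key);
    });

    const key = `${warning.severity}|${warning.instance}|${normalizeMessage(warning.message)}`;
    if (lastSeen.has(key)) return false; // duplicate inside the window: stay silent

    lastSeen.set(key, t);
    sink(formatWarningLine(warning, new Date(t)) + "\n");
    return true;
  };
}
```

In the backend, `sink` would presumably wrap `fs.promises.appendFile` on the path from `HERMES_DASHBOARD_ALERT_LOG` (with rejection handling) and be skipped when the flag is unset. Injecting the sink is a deliberate departure from the brief's suggestion to mock `fs.promises.appendFile`: it keeps the dedup and formatting logic testable with a plain array and a fake clock.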