docs: Phase 8 — Telegram convention + delegation brief

Closes the Phase 8 line that's actually a docs/codebase change. The
other two Phase 8 items are VM-ops work (bot tokens + watchdog
extensions) and live as a delegation brief.

What's in this repo
  - `docs/hermes-operations.md` gains a "Telegram Notification
    Convention" section codifying:
      * routing per instance (Vijay → root chat, Bheem → Uma chat,
        cross-cutting → root)
      * silent-on-healthy + post-on-recovery
      * the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) and
        why it survives Telegram client rendering
      * approval-prompt UI expectation
      * "don't paste secrets" pointer back to `lib/logger.ts`'s
        redaction path-list
  - `docs/prompts/phase8-telegram-loop.md` — full delegation brief
    for the VM-side implementation. Design: dashboard backend writes
    new warnings (with `instance=<id>` tag, deduped over 1h) to an
    append-only log; both watchdogs tail it and route through the
    existing Telegram delivery path. Avoids splitting the delivery
    code into two places that would each need rate-limit + token-
    rotation handling. Brief is gated on Phase 4 — Uma's watchdog
    must exist first.
  - Roadmap Phase 8 ticked for "preserve numbered-emoji convention"
    (codified in operations doc); the other two items have notes
    pointing at the brief.

Phase 8 doesn't fully close in this repo because the delivery loop
needs real bot tokens and the Phase 4 Uma watchdog before it can be
end-to-end validated. The codebase's contribution is everything that
doesn't need a token: the convention, the design, and the delegation
brief.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Hermes VM 2026-05-30 08:05:52 +00:00
parent 14c7a8f59a
commit a8cf61a281
3 changed files with 165 additions and 4 deletions


@@ -392,3 +392,41 @@ Safe sequence:
5. Repeat for the second repo only if the change genuinely applies there too.
Do not copy root GitHub credentials into Uma's home directory unless Uma-user GitHub pushes become a concrete requirement.
## Telegram Notification Convention

Phase 8 of the dashboard roadmap and the watchdog scripts that ship Telegram
alerts today follow a small set of conventions that should stay consistent.
**Routing per instance**
- Vijay (root) alerts go to the root Telegram chat.
- Bheem (uma) alerts go to Uma's Telegram chat.
- Cross-cutting alerts (e.g. "the dashboard itself is unreachable") go to the
root chat — root is the operator account.
**Silent on healthy**
- Watchdog scripts and (in future) the dashboard's own Telegram hook **only**
post when something is wrong. A green poll is a no-op.
- Recoveries ARE a Telegram event (one line: "back to healthy") so the chat
history reflects the full incident lifecycle.
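
The silent-on-healthy rule above can be sketched as a small decision function. This is a minimal illustration, not the watchdog's actual code: the `next_message` name, the two-state health model, and the alert wording are assumptions. The document also does not say whether repeated unhealthy polls should be throttled, so this sketch simply posts on each one.

```python
from typing import Optional


def next_message(previous_healthy: bool, current_healthy: bool, detail: str = "") -> Optional[str]:
    """Return the Telegram text for one poll, or None when nothing should be posted.

    Hypothetical helper: names and message wording are illustrative only.
    """
    if current_healthy and previous_healthy:
        return None  # green poll: no-op
    if current_healthy and not previous_healthy:
        return "back to healthy"  # recovery is a one-line Telegram event
    # Unhealthy poll: always a Telegram event in this sketch (no throttling).
    return f"unhealthy: {detail}" if detail else "unhealthy"
```

A caller would keep the previous poll result per probe and send the returned text only when it is not `None`.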
**Numbered-emoji progress convention**
- When a multi-step operation is being narrated to Telegram, prefix each step
with the corresponding numbered emoji: `1️⃣`, `2️⃣`, `3️⃣`, … up to `🔟`.
- This survives copy-paste across clients (unlike `1.`, which Telegram tends
to render inconsistently in dark mode) and makes the chat scannable.
- The watchdog scripts already emit completion updates this way; any
dashboard-originated message that runs through the same delivery path
should match.
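
A minimal sketch of the numbered-emoji prefixing, assuming a hypothetical `number_steps` helper. The keycap strings are built from code points (digit, U+FE0F, U+20E3) so the source file cannot lose the variation selector. Raising on more than ten steps is an assumption; the convention only says the sequence runs up to `🔟`.

```python
from typing import List

# Keycap emoji 1-9 are "digit + U+FE0F + U+20E3"; keycap ten is the single code point U+1F51F.
KEYCAPS: List[str] = [f"{d}\ufe0f\u20e3" for d in range(1, 10)] + ["\U0001F51F"]


def number_steps(steps: List[str]) -> List[str]:
    """Prefix each step with its numbered emoji (hypothetical helper)."""
    if len(steps) > len(KEYCAPS):
        # Assumption: the convention stops at ten, so longer runs are rejected here.
        raise ValueError("numbered-emoji convention covers at most 10 steps")
    return [f"{KEYCAPS[i]} {step}" for i, step in enumerate(steps)]
```

For example, `number_steps(["pull repo", "restart gateway"])` yields two lines prefixed with the keycap-one and keycap-two emoji.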
**Approval prompts**
- Approval-required actions still land in Telegram with two inline buttons
(✅ approve / ❌ deny). The dashboard does not yet trigger these — see the
Phase 8 delegation brief in `docs/prompts/phase8-telegram-loop.md` for the
design that closes the loop end-to-end.
**Don't paste secrets**
- Bot tokens and chat IDs live in `~<user>/.config/hermes/telegram` with mode `600`,
never in repo files. The dashboard's `lib/logger.ts` redacts
`Authorization` / `Cookie` / `*.token` paths from any logged object so an
accidental `req.log.info({ tg })` won't dump credentials.
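
The mode `600` expectation can be checked before a script reads the credentials file. The sketch below uses only the standard library; the `is_owner_only` name is hypothetical, and whether the watchdog performs such a check today is not stated in this document.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True when the file's permission bits are exactly 0o600 (owner read/write only)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

A script could refuse to start when this returns `False` for its Telegram config path, which catches a credentials file accidentally left group- or world-readable.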


@@ -148,9 +148,11 @@ This is the biggest operational asymmetry and the reason half the ops-panel warn
 ## Phase 8 — Notifications & Telegram loop (G9)
+> **Mostly VM ops + bot-token configuration**, with two small backend hooks. Full delegation brief in [`docs/prompts/phase8-telegram-loop.md`](./prompts/phase8-telegram-loop.md). The dashboard's documentation half is already done — see `docs/hermes-operations.md` "Telegram Notification Convention".
-- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy.
-- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items).
-- [ ] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates.
+- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy. *(Design captured in the brief: `lib/dashboard-alerts.ts` writes new warnings to a tag-prefixed log; both watchdogs tail it. Implementation gated on Phase 4 (Uma watchdog must exist first) and on bot tokens.)*
+- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items). *(Brief item 3.)*
+- [x] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates. *(Codified in `docs/hermes-operations.md` under a new "Telegram Notification Convention" section, alongside the routing-per-instance, silent-on-healthy, and never-paste-secrets rules. The brief references this as the source of truth so VM-side implementers stay consistent.)*
 ---
@@ -194,7 +196,7 @@ Update only with evidence (source review, tests, build output, or browser/VM ver
 - [x] Phase 5 — App/CI hardening (P0/P1/P2 done; P2 follow-ups in DEPLOYMENT.md mitigation roadmap remain)
 - [x] Phase 6 — UX polish (severity tags + deep links + per-instance actions; trend cards + theme toggle deferred)
 - [x] Phase 7 — Security & access (auth on hermes routes + privacy stance documented; redact_secrets/redact_pii decision deferred)
-- [ ] Phase 8 — Notifications & Telegram
+- [ ] Phase 8 — Notifications & Telegram (convention codified; delivery loop is VM ops, see brief)
 ## Decisions (resolved 2026-05-30)


@@ -0,0 +1,121 @@
# Delegation Brief — Phase 8: Telegram notification loop
> Self-contained task brief. Mostly **VM ops + bot-token configuration**, with
> two small backend-side hooks. The dashboard has already done its half of
> Phase 8 — see `docs/hermes-operations.md` "Telegram Notification Convention".
>
> Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 8),
> `docs/hermes-operations.md`, `scripts/hermes-health-watchdog.py`,
> `docs/prompts/phase4-bheem-uma-parity.md` (Bheem watchdog needs to exist
> first).
---
ROLE: Operator with sudo on the Hostinger VM and admin access to both Telegram
bots (root + Uma).
OBJECTIVE: Close the loop between dashboard-detected warnings and the
existing watchdog Telegram delivery path so that:
1. New warnings that the dashboard surfaces in `getHermesOpsSnapshot()` (and
in the per-instance telemetry endpoint) reach the right Telegram chat
(Vijay → root; Bheem → Uma).
2. Approval-required actions (currently only the watchdog uses these) work
end-to-end including media/file delivery — these are the two unchecked
items left over from Hermes v1.
3. The numbered-emoji progress convention is preserved.
PREREQUISITES:
- Phase 4 (Bheem/Uma parity) must be complete so Uma has its own watchdog +
bot. Without Uma's bot, Bheem warnings have nowhere to go. Don't start
Phase 8 until the Phase 4 brief signs off.
- Root + Uma watchdog scripts already deliver to Telegram successfully on a
manually-broken probe. Confirm before proceeding.
DESIGN (least-invasive, no new long-lived service):
The dashboard does NOT open its own Telegram connection. Instead, the
backend writes new dashboard-detected warnings to a small append-only log
that the existing watchdogs tail.
- Path: `/var/log/hermes-dashboard-warnings.log` (root-writeable; world-
readable so both watchdogs can tail).
- Format: one line per warning, RFC3339-ish timestamp + severity token +
message — same shape as `hermes-health-watchdog.log`. Reuse the parser in
`backend/src/modules/hermes-telemetry/repository.ts:WATCHDOG_LINE`.
- Routing: each line carries an explicit `instance=<vijay|bheem|all>` tag
so the watchdog knows which Telegram bot to use. `instance=all` posts to
both chats (cross-cutting).
- De-dup: the dashboard backend keeps an in-memory set of hashes for warnings
seen in the last hour and appends each one only once. Restart resets the hash, and that's
fine; an alert reappearing post-restart is signal, not noise.
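
The dedup window and line shape described above can be illustrated as follows. This is a logic sketch only: the brief places the real hook in TypeScript (`lib/dashboard-alerts.ts`), and the `WarningDeduper` and `format_line` names, the hash key, and the exact field order of the log line are assumptions. The real line shape should match `hermes-health-watchdog.log`.

```python
import hashlib
import time
from typing import Callable, Dict

WINDOW_SECONDS = 3600.0  # the 1h dedup window from the design


class WarningDeduper:
    """In-memory dedup: remembers a hash of each warning for one window."""

    def __init__(self, window: float = WINDOW_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._window = window
        self._clock = clock  # injectable so tests can control time
        self._seen: Dict[str, float] = {}  # hash -> time first seen

    def should_append(self, severity: str, instance: str, message: str) -> bool:
        now = self._clock()
        # Drop entries older than the window so the same warning can alert again.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self._window}
        key = hashlib.sha256(f"{severity}|{instance}|{message}".encode()).hexdigest()
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


def format_line(timestamp: str, severity: str, instance: str, message: str) -> str:
    # Assumed layout: timestamp, severity token, instance tag, then the message.
    return f"{timestamp} {severity} instance={instance} {message}"
```

Because the state lives only in process memory, a restart starts with an empty map, which matches the restart behaviour the design accepts.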
TASKS:
1. **Backend hook** (small):
- Add `lib/dashboard-alerts.ts` that exposes
`appendDashboardWarning({ severity, instance, message })`. Internals:
append + dedup hash. Tests should mock `fs.promises.appendFile`.
- Wire it into `getHermesOpsSnapshot()` so each new warning in
`snapshot.warnings` (only the ones not in the dedup hash) is written
out. Same wiring on `getHermesTelemetrySnapshot()` for `warnings` and
`watchdog.alerts` of severity `critical`.
- Gate the file-write behind an env flag `HERMES_DASHBOARD_ALERT_LOG`
pointing at the path so dev/CI doesn't try to write to `/var/log`.
- Unit-test: `appendDashboardWarning` de-dups within the window, lets entries
expire once the window passes, and writes the right line format. Add to the coverage gate.
2. **Watchdog tail-extension** (VM ops):
- Modify both watchdog scripts (root + Uma's mirror) to ALSO tail the
new dashboard-warnings log. Filter by `instance=` tag — root's
watchdog only acts on `instance=vijay` or `instance=all`; Uma's only
on `instance=bheem` or `instance=all`.
- Forward each parsed line into the existing Telegram delivery (same
format / same numbered-emoji convention). Silent on no-new-lines.
3. **Approval-prompt + media validation** (the two unchecked v1 items):
- Pick a Telegram approval-required action that already exists in the
watchdog (e.g. "restart degraded gateway"). Confirm the inline ✅/❌
buttons land, the callback hits the watchdog, and the action runs.
- Confirm the watchdog can deliver a small file (e.g. last 200 lines of
a log) when an alert says "investigate", and that the file lands as
a Telegram document, not a truncated message.
- Document both flows in `docs/hermes-operations.md` under "Telegram
Notification Convention" so anyone reading it knows what's wired.
4. **End-to-end test**:
- From the dashboard, trigger a transient warning (e.g. stop a non-
critical timer for 30 seconds).
- Confirm the right Telegram chat receives one alert (numbered-emoji
formatted) and one recovery message when the timer comes back. The
OTHER chat must stay silent.
- Repeat for Bheem.
- Repeat with `instance=all` (cross-cutting) and confirm BOTH chats
receive the alert.
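
The per-watchdog filter from task 2 could look roughly like the sketch below, which is also what the end-to-end test in task 4 exercises. The `instance_of` and `should_forward` names are hypothetical, and the sketch assumes the tag appears as a literal `instance=<id>` token somewhere in the line; it takes the first match.

```python
import re
from typing import Optional

# Matches the routing tag described in the design: instance=<vijay|bheem|all>.
INSTANCE_TAG = re.compile(r"\binstance=(vijay|bheem|all)\b")


def instance_of(line: str) -> Optional[str]:
    """Return the instance tag on a log line, or None when the line has no tag."""
    match = INSTANCE_TAG.search(line)
    return match.group(1) if match else None


def should_forward(line: str, own_instance: str) -> bool:
    """True when this watchdog should forward the line to its Telegram chat."""
    tag = instance_of(line)
    return tag is not None and tag in (own_instance, "all")
```

Root's watchdog would call `should_forward(line, "vijay")` and Uma's `should_forward(line, "bheem")`, so an `instance=all` line is forwarded by both and an untagged line by neither.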
GUARDRAILS:
- Bot tokens never go in repo files. They live in
`~<user>/.config/hermes/telegram` with mode `600`, owned by the right user.
- The dashboard backend only WRITES to the alert log. It must NOT call
Telegram directly (that would split the delivery path and create two
places where rate limits / token rotation matter).
- Don't emit chatty health pings — silent-on-success is the rule.
- Numbered-emoji convention is mandatory for completion-update messages
(`1️⃣`, `2️⃣`, …); see `docs/hermes-operations.md`.
REPORTING:
When finished, report:
- Diff of `lib/dashboard-alerts.ts` and the wiring sites.
- Output of `tail -20 /var/log/hermes-dashboard-warnings.log` after the
end-to-end test.
- Screenshots / chat-export of the test alerts in both Telegram chats
(sanitized).
- Updated `docs/hermes-operations.md` "Telegram Notification Convention"
section with the wired approval + media flows.
DEFINITION OF DONE:
- Dashboard-detected warnings reach the right Telegram chat per instance.
- Recoveries reach the same chat.
- Approval prompt + file delivery validated end-to-end.
- Numbered-emoji convention preserved.
- Operator (you) ticks the corresponding Phase 8 roadmap checkboxes.