docs: Phase 8 — Telegram convention + delegation brief

Closes the Phase 8 line that's actually a docs/codebase change. The
other two Phase 8 items are VM-ops work (bot tokens + watchdog
extensions) and live as a delegation brief.

What's in this repo
  - `docs/hermes-operations.md` gains a "Telegram Notification
    Convention" section codifying:
      * routing per instance (Vijay → root chat, Bheem → Uma chat,
        cross-cutting → root)
      * silent-on-healthy + post-on-recovery
      * the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) and
        why it survives Telegram client rendering
      * approval-prompt UI expectation
      * "don't paste secrets" pointer back to `lib/logger.ts`'s
        redaction path-list
  - `docs/prompts/phase8-telegram-loop.md` — full delegation brief
    for the VM-side implementation. Design: dashboard backend writes
    new warnings (with `instance=<id>` tag, deduped over 1h) to an
    append-only log; both watchdogs tail it and route through the
    existing Telegram delivery path. Avoids splitting the delivery
    code into two places that would each need rate-limit + token-
    rotation handling. Brief is gated on Phase 4 — Uma's watchdog
    must exist first.
  - Roadmap Phase 8 ticked for "preserve numbered-emoji convention"
    (codified in operations doc); the other two items have notes
    pointing at the brief.

Phase 8 doesn't fully close in this repo because the delivery loop
needs real bot tokens and the Phase 4 Uma watchdog before it can be
end-to-end validated. The codebase's contribution is everything that
doesn't need a token: the convention, the design, and the delegation
brief.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Hermes VM 2026-05-30 08:05:52 +00:00
parent 14c7a8f59a
commit a8cf61a281
3 changed files with 165 additions and 4 deletions

@ -392,3 +392,41 @@ Safe sequence:
5. Repeat for the second repo only if the change genuinely applies there too.
Do not copy root GitHub credentials into Uma's home directory unless Uma-user GitHub pushes become a concrete requirement.
## Telegram Notification Convention
Phase 8 of the dashboard roadmap and the watchdog scripts that ship Telegram
alerts today follow a small set of conventions worth keeping consistent.
**Routing per instance**
- Vijay (root) alerts go to the root Telegram chat.
- Bheem (uma) alerts go to Uma's Telegram chat.
- Cross-cutting alerts (e.g. "the dashboard itself is unreachable") go to the
root chat — root is the operator account.
**Silent on healthy**
- Watchdog scripts and (in future) the dashboard's own Telegram hook **only**
post when something is wrong. A green poll is a no-op.
- Recoveries ARE a Telegram event (one line: "back to healthy") so the chat
history reflects the full incident lifecycle.
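A minimal sketch of the silent-on-healthy rule as a state transition, in TypeScript. The `Health` type, the `notificationFor` name, and the message wording are illustrative assumptions, as is the choice to stay silent while a probe remains unhealthy; the watchdog scripts themselves may behave differently.

```typescript
type Health = "healthy" | "unhealthy";

/**
 * Decide what, if anything, to post for one poll of a probe.
 * Returns null when the poll should stay silent.
 */
function notificationFor(previous: Health, current: Health, probe: string): string | null {
  if (previous === "healthy" && current === "unhealthy") {
    return `${probe} is unhealthy`; // incident opens
  }
  if (previous === "unhealthy" && current === "healthy") {
    return `${probe} is back to healthy`; // one-line recovery
  }
  return null; // green poll, or a failure that was already reported (assumption)
}
```

A green poll following a green poll returns `null`, so nothing is posted; only the two transitions produce a message.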
**Numbered-emoji progress convention**
- When a multi-step operation is being narrated to Telegram, prefix each step
with the corresponding numbered emoji: `1️⃣`, `2️⃣`, `3️⃣`, … up to `🔟`.
- This survives copy-paste across clients (unlike `1.`, which Telegram tends
to render inconsistently in dark mode) and makes the chat scannable.
- The watchdog scripts already emit completion updates this way; any
dashboard-originated message that runs through the same delivery path
should match.
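One way to generate the prefixes, sketched in TypeScript. The helper names `stepPrefix` and `formatSteps` are hypothetical, and rejecting step numbers above 10 is an assumption (the convention only defines prefixes up to `🔟`).

```typescript
// A keycap emoji is a digit followed by U+FE0F (variation selector-16)
// and U+20E3 (combining enclosing keycap).
const KEYCAP_SUFFIX = "\uFE0F\u20E3";
// U+1F51F (keycap ten), written as a UTF-16 surrogate pair.
const KEYCAP_TEN = "\uD83D\uDD1F";

function stepPrefix(step: number): string {
  if (!Number.isInteger(step) || step < 1 || step > 10) {
    throw new RangeError(`step must be an integer from 1 to 10, got ${step}`);
  }
  return step === 10 ? KEYCAP_TEN : `${step}${KEYCAP_SUFFIX}`;
}

// Prefix each step of a multi-step operation, one step per line.
function formatSteps(steps: readonly string[]): string {
  return steps.map((text, i) => `${stepPrefix(i + 1)} ${text}`).join("\n");
}
```

For example, `formatSteps(["pull image", "restart gateway"])` yields two lines prefixed with the keycap emoji for 1 and 2.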
**Approval prompts**
- Approval-required actions still land in Telegram with two inline buttons
(✅ approve / ❌ deny). The dashboard does not yet trigger these — see the
Phase 8 delegation brief in `docs/prompts/phase8-telegram-loop.md` for the
design that closes the loop end-to-end.
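For reference, a two-button prompt corresponds to a Telegram Bot API `sendMessage` payload with an `inline_keyboard` in `reply_markup`. The sketch below only builds that payload and sends nothing; the field names follow the Bot API, while `buildApprovalMessage`, the message text, and the `approve:<id>` / `deny:<id>` callback scheme are hypothetical and not taken from the watchdog.

```typescript
// Shapes follow the Bot API's InlineKeyboardMarkup / InlineKeyboardButton objects.
interface InlineKeyboardButton {
  text: string;
  callback_data: string; // Bot API limit: 1-64 bytes
}

interface ApprovalMessage {
  chat_id: string;
  text: string;
  reply_markup: { inline_keyboard: InlineKeyboardButton[][] };
}

function buildApprovalMessage(chatId: string, actionId: string, description: string): ApprovalMessage {
  return {
    chat_id: chatId,
    text: `Approval required: ${description}`,
    reply_markup: {
      // One row holding both buttons, so they render side by side.
      inline_keyboard: [[
        { text: "\u2705 approve", callback_data: `approve:${actionId}` }, // U+2705 check mark
        { text: "\u274C deny", callback_data: `deny:${actionId}` },       // U+274C cross mark
      ]],
    },
  };
}
```

Whoever receives the callback query can split `callback_data` on `:` to recover the decision and the action it applies to.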
**Don't paste secrets**
- Bot tokens and chat IDs live in `~<user>/.config/hermes/telegram` mode `600`,
never in repo files. The dashboard's `lib/logger.ts` redacts
`Authorization` / `Cookie` / `*.token` paths from any logged object so an
accidental `req.log.info({ tg })` won't dump credentials.
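To illustrate the effect of that redaction, here is a standalone sketch. It is not the code in `lib/logger.ts`; the `redact` helper, the `[REDACTED]` placeholder, and matching `token` at any depth (rather than exactly one level down) are assumptions made for the example.

```typescript
const REDACTED = "[REDACTED]";
// Key names compared case-insensitively.
const SENSITIVE_KEYS = ["authorization", "cookie", "token"];

// Return a deep copy of `value` with sensitive fields masked.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      const sensitive = SENSITIVE_KEYS.indexOf(key.toLowerCase()) !== -1;
      out[key] = sensitive ? REDACTED : redact(source[key]);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

With this sketch, logging `redact({ tg: { token: "123:abc", label: "root" } })` would show the label but not the token.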

@ -148,9 +148,11 @@ This is the biggest operational asymmetry and the reason half the ops-panel warn
## Phase 8 — Notifications & Telegram loop (G9)
- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy.
- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items).
- [ ] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates.
> **Mostly VM ops + bot-token configuration**, with two small backend hooks. Full delegation brief in [`docs/prompts/phase8-telegram-loop.md`](./prompts/phase8-telegram-loop.md). The dashboard's documentation half is already done — see `docs/hermes-operations.md` "Telegram Notification Convention".
- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy. *(Design captured in the brief: `lib/dashboard-alerts.ts` writes new warnings to a tag-prefixed log; both watchdogs tail it. Implementation gated on Phase 4 (Uma watchdog must exist first) and on bot tokens.)*
- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items). *(Brief item 3.)*
- [x] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates. *(Codified in `docs/hermes-operations.md` under a new "Telegram Notification Convention" section, alongside the routing-per-instance, silent-on-healthy, and never-paste-secrets rules. The brief references this as the source of truth so VM-side implementers stay consistent.)*
---
@ -194,7 +196,7 @@ Update only with evidence (source review, tests, build output, or browser/VM ver
- [x] Phase 5 — App/CI hardening (P0/P1/P2 done; P2 follow-ups in DEPLOYMENT.md mitigation roadmap remain)
- [x] Phase 6 — UX polish (severity tags + deep links + per-instance actions; trend cards + theme toggle deferred)
- [x] Phase 7 — Security & access (auth on hermes routes + privacy stance documented; redact_secrets/redact_pii decision deferred)
- [ ] Phase 8 — Notifications & Telegram
- [ ] Phase 8 — Notifications & Telegram (convention codified; delivery loop is VM ops, see brief)
## Decisions (resolved 2026-05-30)

@ -0,0 +1,121 @@
# Delegation Brief — Phase 8: Telegram notification loop
> Self-contained task brief. Mostly **VM ops + bot-token configuration**, with
> two small backend-side hooks. The dashboard has already done its half of
> Phase 8 — see `docs/hermes-operations.md` "Telegram Notification Convention".
>
> Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 8),
> `docs/hermes-operations.md`, `scripts/hermes-health-watchdog.py`,
> `docs/prompts/phase4-bheem-uma-parity.md` (Bheem watchdog needs to exist
> first).
---
ROLE: Operator with sudo on the Hostinger VM and admin access to both Telegram
bots (root + Uma).
OBJECTIVE: Close the loop between dashboard-detected warnings and the
existing watchdog Telegram delivery path so that:
1. New warnings that the dashboard surfaces in `getHermesOpsSnapshot()` (and
in the per-instance telemetry endpoint) reach the right Telegram chat
(Vijay → root; Bheem → Uma).
2. Approval-required actions (currently only the watchdog uses these) work
end-to-end including media/file delivery — these are the two unchecked
items left over from Hermes v1.
3. The numbered-emoji progress convention is preserved.
PREREQUISITES:
- Phase 4 (Bheem/Uma parity) must be complete so Uma has its own watchdog +
bot. Without Uma's bot, Bheem warnings have nowhere to go. Don't start
Phase 8 until the Phase 4 brief signs off.
- Root + Uma watchdog scripts already deliver to Telegram successfully on a
manually-broken probe. Confirm before proceeding.
DESIGN (least-invasive, no new long-lived service):
The dashboard does NOT open its own Telegram connection. Instead, the
backend writes new dashboard-detected warnings to a small append-only log
that the existing watchdog tails.
- Path: `/var/log/hermes-dashboard-warnings.log` (root-writeable; world-
readable so both watchdogs can tail).
- Format: one line per warning, RFC3339-ish timestamp + severity token +
message — same shape as `hermes-health-watchdog.log`. Reuse the parser in
`backend/src/modules/hermes-telemetry/repository.ts:WATCHDOG_LINE`.
- Routing: each line carries an explicit `instance=<vijay|bheem|all>` tag
so the watchdog knows which Telegram bot to use. `instance=all` posts to
both chats (cross-cutting).
- De-dup: the dashboard backend keeps a 1h in-memory hash of recent
warnings and only appends each one once. Restart resets the hash — that's
fine; an alert reappearing post-restart is signal, not noise.
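The format and routing rules above can be sketched as three small pure functions. The concrete line layout (`<timestamp> <SEVERITY> instance=<tag> <message>`) is an assumption, since the real shape is whatever `hermes-health-watchdog.log` and `WATCHDOG_LINE` already use; the names `formatWarningLine`, `parseInstance`, and `chatsFor` are hypothetical.

```typescript
type Instance = "vijay" | "bheem" | "all";
type Chat = "root" | "uma";

interface DashboardWarning {
  severity: string;
  instance: Instance;
  message: string;
}

// Assumed layout: "<RFC 3339 timestamp> <SEVERITY> instance=<tag> <message>".
function formatWarningLine(w: DashboardWarning, at: Date): string {
  const message = w.message.replace(/\s+/g, " ").trim(); // keep one warning per line
  return `${at.toISOString()} ${w.severity.toUpperCase()} instance=${w.instance} ${message}`;
}

// Extract the routing tag from a log line, or null if it is missing or unknown.
function parseInstance(line: string): Instance | null {
  const match = /(?:^|\s)instance=(vijay|bheem|all)(?=\s|$)/.exec(line);
  return match ? (match[1] as Instance) : null;
}

// Vijay routes to the root chat, Bheem to Uma's chat, "all" to both.
function chatsFor(instance: Instance): Chat[] {
  switch (instance) {
    case "vijay":
      return ["root"];
    case "bheem":
      return ["uma"];
    case "all":
      return ["root", "uma"];
  }
}
```

Under these assumptions, a watchdog would forward a line only when its own chat appears in `chatsFor(parseInstance(line))`, and would skip lines whose tag cannot be parsed.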
TASKS:
1. **Backend hook** (small):
- Add `lib/dashboard-alerts.ts` that exposes
`appendDashboardWarning({ severity, instance, message })`. Internals:
append + dedup hash. Tests should mock `fs.promises.appendFile`.
- Wire it into `getHermesOpsSnapshot()` so each new warning in
`snapshot.warnings` (only the ones not in the dedup hash) is written
out. Same wiring on `getHermesTelemetrySnapshot()` for `warnings` and
`watchdog.alerts` of severity `critical`.
- Gate the file-write behind an env flag `HERMES_DASHBOARD_ALERT_LOG`
pointing at the path so dev/CI doesn't try to write to `/var/log`.
- Unit-test: `appendDashboardWarning` de-dups within the window, expires
entries once the window has passed, and writes the right line format. Add
to the coverage gate.
2. **Watchdog tail-extension** (VM ops):
- Modify both watchdog scripts (root + Uma's mirror) to ALSO tail the
new dashboard-warnings log. Filter by `instance=` tag — root's
watchdog only acts on `instance=vijay` or `instance=all`; Uma's only
on `instance=bheem` or `instance=all`.
- Forward each parsed line into the existing Telegram delivery (same
format / same numbered-emoji convention). Silent on no-new-lines.
3. **Approval-prompt + media validation** (the two unchecked v1 items):
- Pick a Telegram approval-required action that already exists in the
watchdog (e.g. "restart degraded gateway"). Confirm the inline ✅/❌
buttons land, the callback hits the watchdog, and the action runs.
- Confirm the watchdog can deliver a small file (e.g. last 200 lines of
a log) when an alert says "investigate", and that the file lands as
a Telegram document, not a truncated message.
- Document both flows in `docs/hermes-operations.md` under "Telegram
Notification Convention" so anyone reading it knows what's wired.
4. **End-to-end test**:
- From the dashboard, trigger a transient warning (e.g. stop a non-
critical timer for 30 seconds).
- Confirm the right Telegram chat receives one alert (numbered-emoji
formatted) and one recovery message when the timer comes back. The
OTHER chat must stay silent.
- Repeat for Bheem.
- Repeat with `instance=all` (cross-cutting) and confirm BOTH chats
receive the alert.
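As a starting point for task 1, a minimal sketch of `appendDashboardWarning` with the 1h de-dup window and the env-flag gate. Only the function name and its `{ severity, instance, message }` argument come from this brief; the `createAlertLog` factory, the injected `append` and `now` parameters, the boolean return value, the de-dup key, and the line layout are assumptions chosen to keep the example testable without touching the filesystem.

```typescript
interface DashboardWarning {
  severity: string;
  instance: "vijay" | "bheem" | "all";
  message: string;
}

interface AlertLogOptions {
  logPath: string | undefined; // value of HERMES_DASHBOARD_ALERT_LOG; unset disables writes
  append: (path: string, data: string) => Promise<void>; // e.g. fs.promises.appendFile
  now?: () => number; // injectable clock for tests (ms since epoch)
  windowMs?: number; // de-dup window, 1h by default
}

function createAlertLog(options: AlertLogOptions) {
  const now = options.now ?? (() => Date.now());
  const windowMs = options.windowMs ?? 60 * 60 * 1000;
  const lastSeen = new Map<string, number>(); // de-dup key -> time of last write

  // Resolves to true when a line was appended, false when skipped.
  return async function appendDashboardWarning(w: DashboardWarning): Promise<boolean> {
    const path = options.logPath;
    if (!path) return false; // flag unset: dev/CI never writes to /var/log

    const t = now();
    // Drop expired entries so the map cannot grow without bound.
    lastSeen.forEach((seenAt, key) => {
      if (t - seenAt >= windowMs) lastSeen.delete(key);
    });

    const key = `${w.severity}|${w.instance}|${w.message}`;
    if (lastSeen.has(key)) return false; // already written inside the window

    // Record before awaiting so concurrent calls cannot both write.
    lastSeen.set(key, t);
    const line = `${new Date(t).toISOString()} ${w.severity.toUpperCase()} instance=${w.instance} ${w.message}\n`;
    try {
      await options.append(path, line);
    } catch (err) {
      lastSeen.delete(key); // a failed write should not suppress a retry
      throw err;
    }
    return true;
  };
}
```

In the backend, `append` would be `fs.promises.appendFile` and `logPath` would be read from `process.env.HERMES_DASHBOARD_ALERT_LOG`; tests can pass a stub `append` and a fake clock instead of mocking the module.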
GUARDRAILS:
- Bot tokens never go in repo files. They live in
`~<user>/.config/hermes/telegram` mode `600`, owned by the right user.
- The dashboard backend only WRITES to the alert log. It must NOT call
Telegram directly (that would split the delivery path and create two
places where rate limits / token rotation matter).
- Don't emit chatty health pings — silent-on-success is the rule.
- Numbered-emoji convention is mandatory for completion-update messages
(`1️⃣`, `2️⃣`, …); see `docs/hermes-operations.md`.
REPORTING:
When finished, report:
- Diff of `lib/dashboard-alerts.ts` and the wiring sites.
- Output of `tail -20 /var/log/hermes-dashboard-warnings.log` after the
end-to-end test.
- Screenshots / chat-export of the test alerts in both Telegram chats
(sanitized).
- Updated `docs/hermes-operations.md` "Telegram Notification Convention"
section with the wired approval + media flows.
DEFINITION OF DONE:
- Dashboard-detected warnings reach the right Telegram chat per instance.
- Recoveries reach the same chat.
- Approval prompt + file delivery validated end-to-end.
- Numbered-emoji convention preserved.
- Operator (you) ticks the corresponding Phase 8 roadmap checkboxes.