Closes the Phase 8 line that's actually a docs/codebase change. The
other two Phase 8 items are VM-ops work (bot tokens + watchdog
extensions) and live as a delegation brief.
What's in this repo
- `docs/hermes-operations.md` gains a "Telegram Notification
Convention" section codifying:
* routing per instance (Vijay → root chat, Bheem → Uma chat,
cross-cutting → root)
* silent-on-healthy + post-on-recovery
* the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) and
why it survives Telegram client rendering
* approval-prompt UI expectation
* "don't paste secrets" pointer back to `lib/logger.ts`'s
redaction path-list
- `docs/prompts/phase8-telegram-loop.md` — full delegation brief
for the VM-side implementation. Design: dashboard backend writes
new warnings (with `instance=<id>` tag, deduped over 1h) to an
append-only log; both watchdogs tail it and route through the
existing Telegram delivery path. Avoids splitting the delivery
code into two places that would each need rate-limit + token-
rotation handling. Brief is gated on Phase 4 — Uma's watchdog
must exist first.
- Roadmap Phase 8 ticked for "preserve numbered-emoji convention"
(codified in operations doc); the other two items have notes
pointing at the brief.
Phase 8 doesn't fully close in this repo because the delivery loop
needs real bot tokens and the Phase 4 Uma watchdog before it can be
end-to-end validated. The codebase's contribution is everything that
doesn't need a token: the convention, the design, and the delegation
brief.
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Delegation Brief — Phase 8: Telegram notification loop
Self-contained task brief. Mostly VM ops + bot-token configuration, with two small backend-side hooks. The dashboard has already done its half of Phase 8; see `docs/hermes-operations.md` "Telegram Notification Convention".
Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 8), `docs/hermes-operations.md`, `scripts/hermes-health-watchdog.py`, `docs/prompts/phase4-bheem-uma-parity.md` (Bheem watchdog needs to exist first).
ROLE: Operator with sudo on the Hostinger VM and admin access to both Telegram bots (root + Uma).
OBJECTIVE: Close the loop between dashboard-detected warnings and the existing watchdog Telegram delivery path so that:
- New warnings that the dashboard surfaces in `getHermesOpsSnapshot()` (and in the per-instance telemetry endpoint) reach the right Telegram chat (Vijay → root; Bheem → Uma).
- Approval-required actions (currently only the watchdog uses these) work end-to-end, including media/file delivery. These are the two unchecked items left over from Hermes v1.
- The numbered-emoji progress convention is preserved.
PREREQUISITES:
- Phase 4 (Bheem/Uma parity) must be complete so Uma has its own watchdog + bot. Without Uma's bot, Bheem warnings have nowhere to go. Don't start Phase 8 until the Phase 4 brief signs off.
- Root + Uma watchdog scripts already deliver to Telegram successfully on a manually-broken probe. Confirm before proceeding.
DESIGN (least-invasive, no new long-lived service):
The dashboard does NOT open its own Telegram connection. Instead, the backend writes new dashboard-detected warnings to a small append-only log that the existing watchdog tails.
- Path: `/var/log/hermes-dashboard-warnings.log` (root-writeable; world-readable so both watchdogs can tail).
- Format: one line per warning, RFC3339-ish timestamp + severity token + message, the same shape as `hermes-health-watchdog.log`. Reuse the parser in `backend/src/modules/hermes-telemetry/repository.ts:WATCHDOG_LINE`.
- Routing: each line carries an explicit `instance=<vijay|bheem|all>` tag so the watchdog knows which Telegram bot to use. `instance=all` posts to both chats (cross-cutting).
- De-dup: the dashboard backend keeps a 1h in-memory hash of recent warnings and only appends each one once. Restart resets the hash, which is fine: an alert reappearing post-restart is signal, not noise.
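To make the format and routing bullets concrete, here is a minimal TypeScript sketch. It assumes a line layout of `<timestamp> <SEVERITY> instance=<id> <message>`; the real layout is whatever `hermes-health-watchdog.log` uses, and the regex below is a stand-in, not the actual `WATCHDOG_LINE` parser. The names `formatWarningLine`, `parseWarningLine`, and `shouldForward` are hypothetical. The watchdogs themselves are Python, so the routing function only pins down the intended behaviour.

```typescript
type Instance = "vijay" | "bheem" | "all";
type Watchdog = "root" | "uma";

interface DashboardWarning {
  severity: string;
  instance: Instance;
  message: string;
}

// Assumed layout: "<ISO timestamp> <SEVERITY> instance=<id> <message>".
function formatWarningLine(w: DashboardWarning, now: Date = new Date()): string {
  // Collapse whitespace (including newlines) so one warning is exactly one line.
  const message = w.message.replace(/\s+/g, " ").trim();
  return `${now.toISOString()} ${w.severity.toUpperCase()} instance=${w.instance} ${message}`;
}

// Hypothetical stand-in for the shared parser; not the real WATCHDOG_LINE regex.
const WARNING_LINE = /^(\S+) ([A-Z]+) instance=(vijay|bheem|all) (.*)$/;

function parseWarningLine(line: string): (DashboardWarning & { timestamp: string }) | null {
  const m = WARNING_LINE.exec(line);
  if (!m) return null; // malformed or untagged lines are ignored
  return { timestamp: m[1], severity: m[2], instance: m[3] as Instance, message: m[4] };
}

// Which instance tags each watchdog acts on (Vijay -> root, Bheem -> Uma, all -> both).
const ACCEPTED: Record<Watchdog, Instance[]> = {
  root: ["vijay", "all"],
  uma: ["bheem", "all"],
};

function shouldForward(watchdog: Watchdog, line: string): boolean {
  const parsed = parseWarningLine(line);
  return parsed !== null && ACCEPTED[watchdog].indexOf(parsed.instance) !== -1;
}
```

Keeping the tag as a fixed `instance=` token makes the filter a plain string match, which is easy to mirror in the Python watchdogs.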
TASKS:
- Backend hook (small):
  - Add `lib/dashboard-alerts.ts` that exposes `appendDashboardWarning({ severity, instance, message })`. Internals: append + dedup hash. Tests should mock `fs.promises.appendFile`.
  - Wire it into `getHermesOpsSnapshot()` so each new warning in `snapshot.warnings` (only the ones not in the dedup hash) is written out. Same wiring on `getHermesTelemetrySnapshot()` for `warnings` and `watchdog.alerts` of severity `critical`.
  - Gate the file-write behind an env flag `HERMES_DASHBOARD_ALERT_LOG` pointing at the path so dev/CI doesn't try to write to `/var/log`.
  - Unit-test: `appendDashboardWarning` de-dups within the window, expires, and writes the right line format. Add to the coverage gate.
- Watchdog tail-extension (VM ops):
  - Modify both watchdog scripts (root + Uma's mirror) to ALSO tail the new dashboard-warnings log. Filter by `instance=` tag: root's watchdog only acts on `instance=vijay` or `instance=all`; Uma's only on `instance=bheem` or `instance=all`.
  - Forward each parsed line into the existing Telegram delivery (same format / same numbered-emoji convention). Silent on no-new-lines.
- Approval-prompt + media validation (the two unchecked v1 items):
  - Pick a Telegram approval-required action that already exists in the watchdog (e.g. "restart degraded gateway"). Confirm the inline ✅/❌ buttons land, the callback hits the watchdog, and the action runs.
  - Confirm the watchdog can deliver a small file (e.g. last 200 lines of a log) when an alert says "investigate", and that the file lands as a Telegram document, not a truncated message.
  - Document both flows in `docs/hermes-operations.md` under "Telegram Notification Convention" so anyone reading it knows what's wired.
- End-to-end test:
  - From the dashboard, trigger a transient warning (e.g. stop a non-critical timer for 30 seconds).
  - Confirm the right Telegram chat receives one alert (numbered-emoji formatted) and one recovery message when the timer comes back. The OTHER chat must stay silent.
  - Repeat for Bheem.
  - Repeat with `instance=all` (cross-cutting) and confirm BOTH chats receive the alert.
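The backend hook from the first task could be shaped roughly as follows. This is a sketch under stated assumptions, not the final `lib/dashboard-alerts.ts`: the factory `createDashboardAlerts`, the helper `claim`, the dedup key, and the line layout are all hypothetical. The append function and clock are injected so tests can replace `fs.promises.appendFile` and control expiry without touching `/var/log`.

```typescript
type Instance = "vijay" | "bheem" | "all";

interface DashboardWarning {
  severity: string;
  instance: Instance;
  message: string;
}

// Same shape as fs.promises.appendFile(path, data), so production code can pass
// that function in and tests can pass a fake.
type AppendFn = (path: string, data: string) => Promise<void>;

const DEDUP_WINDOW_MS = 60 * 60 * 1000; // the 1h window from the brief

function createDashboardAlerts(
  logPath: string | undefined, // intended to be the value of HERMES_DASHBOARD_ALERT_LOG
  append: AppendFn,
  now: () => number = Date.now,
) {
  const lastSeen = new Map<string, number>(); // dedup key -> time of last append

  // Assumed dedup key: the full (instance, severity, message) triple.
  const keyOf = (w: DashboardWarning): string =>
    `${w.instance}|${w.severity}|${w.message}`;

  // Returns true (and records the warning) only if it was not seen within the window.
  function claim(w: DashboardWarning): boolean {
    const t = now();
    // Evict expired entries so the map cannot grow without bound.
    lastSeen.forEach((seenAt, key) => {
      if (t - seenAt >= DEDUP_WINDOW_MS) lastSeen.delete(key);
    });
    const key = keyOf(w);
    if (lastSeen.has(key)) return false;
    lastSeen.set(key, t);
    return true;
  }

  // Resolves to true when a line was written, false when gated off or de-duplicated.
  async function appendDashboardWarning(w: DashboardWarning): Promise<boolean> {
    if (!logPath) return false; // flag unset (dev/CI): never touch the filesystem
    if (!claim(w)) return false; // duplicate inside the window
    const stamp = new Date(now()).toISOString();
    const line = `${stamp} ${w.severity.toUpperCase()} instance=${w.instance} ${w.message}\n`;
    try {
      await append(logPath, line);
    } catch (err) {
      lastSeen.delete(keyOf(w)); // a failed write must not suppress the next attempt
      throw err;
    }
    return true;
  }

  return { claim, appendDashboardWarning };
}
```

In production the factory would presumably be called once with `process.env.HERMES_DASHBOARD_ALERT_LOG` and `fs.promises.appendFile`. Because the map lives in process memory, a restart clears it, which matches the brief's stated behaviour.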
GUARDRAILS:
- Bot tokens never go in repo files. They live in `~<user>/.config/hermes/telegram`, mode `600`, owned by the right user.
- The dashboard backend only WRITES to the alert log. It must NOT call Telegram directly (that would split the delivery path and create two places where rate limits / token rotation matter).
- Don't emit chatty health pings; silent-on-success is the rule.
- Numbered-emoji convention is mandatory for completion-update messages (`1️⃣`, `2️⃣`, …); see `docs/hermes-operations.md`.
REPORTING: When finished, report:
- Diff of `lib/dashboard-alerts.ts` and the wiring sites.
- Output of `tail -20 /var/log/hermes-dashboard-warnings.log` after the end-to-end test.
- Screenshots / chat-export of the test alerts in both Telegram chats (sanitized).
- Updated `docs/hermes-operations.md` "Telegram Notification Convention" section with the wired approval + media flows.
DEFINITION OF DONE:
- Dashboard-detected warnings reach the right Telegram chat per instance.
- Recoveries reach the same chat.
- Approval prompt + file delivery validated end-to-end.
- Numbered-emoji convention preserved.
- Operator (you) ticks the corresponding Phase 8 roadmap checkboxes.