docs: Phase 8 — Telegram convention + delegation brief

Closes the Phase 8 line that's actually a docs/codebase change. The
other two Phase 8 items are VM-ops work (bot tokens + watchdog
extensions) and live as a delegation brief.

What's in this repo

- `docs/hermes-operations.md` gains a "Telegram Notification
  Convention" section codifying:
  * routing per instance (Vijay → root chat, Bheem → Uma chat,
    cross-cutting → root)
  * silent-on-healthy + post-on-recovery
  * the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) and
    why it survives Telegram client rendering
  * approval-prompt UI expectation
  * "don't paste secrets" pointer back to `lib/logger.ts`'s
    redaction path-list
- `docs/prompts/phase8-telegram-loop.md`: full delegation brief
  for the VM-side implementation. Design: dashboard backend writes
  new warnings (with `instance=<id>` tag, deduped over 1h) to an
  append-only log; both watchdogs tail it and route through the
  existing Telegram delivery path. Avoids splitting the delivery
  code into two places that would each need rate-limit +
  token-rotation handling. Brief is gated on Phase 4: Uma's watchdog
  must exist first.
- Roadmap Phase 8 ticked for "preserve numbered-emoji convention"
  (codified in operations doc); the other two items have notes
  pointing at the brief.

Phase 8 doesn't fully close in this repo because the delivery loop
needs real bot tokens and the Phase 4 Uma watchdog before it can be
end-to-end validated. The codebase's contribution is everything that
doesn't need a token: the convention, the design, and the delegation
brief.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
parent 14c7a8f59a
commit a8cf61a281
@@ -392,3 +392,41 @@ Safe sequence:
5. Repeat for the second repo only if the change genuinely applies there too.

Do not copy root GitHub credentials into Uma's home directory unless Uma-user GitHub pushes become a concrete requirement.

## Telegram Notification Convention

Phase 8 of the dashboard roadmap and the watchdog scripts that ship Telegram
alerts today follow a small set of conventions worth keeping consistent.

**Routing per instance**
- Vijay (root) alerts go to the root Telegram chat.
- Bheem (uma) alerts go to Uma's Telegram chat.
- Cross-cutting alerts (e.g. "the dashboard itself is unreachable") go to the
  root chat, because root is the operator account.

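As a rough illustration of the routing rule above, the mapping can be written as a small lookup. The names `AlertSource`, `ChatTarget`, and `chatFor` are hypothetical and are not taken from the watchdog scripts; the sketch only encodes the three bullets.

```typescript
// Hypothetical sketch of the routing table; all names are illustrative.
type AlertSource = "vijay" | "bheem" | "cross-cutting";
type ChatTarget = "root" | "uma";

// Vijay and cross-cutting alerts go to the root chat; Bheem goes to Uma's chat.
const CHAT_FOR: Record<AlertSource, ChatTarget> = {
  vijay: "root",
  bheem: "uma",
  "cross-cutting": "root", // root is the operator account
};

function chatFor(source: AlertSource): ChatTarget {
  return CHAT_FOR[source];
}
```

Keeping the mapping in one table means a future third instance would be a one-line change rather than a new branch.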
**Silent on healthy**
- Watchdog scripts and (in future) the dashboard's own Telegram hook **only**
  post when something is wrong. A green poll is a no-op.
- Recoveries ARE a Telegram event (one line: "back to healthy") so the chat
  history reflects the full incident lifecycle.

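One way to read the two bullets above is as a state-transition rule: a message is produced only when health changes. The sketch below is a minimal illustration under that reading. The `Health` type, the `messageForPoll` function, and the message wording are hypothetical, and suppressing repeat alerts while a probe stays unhealthy is an assumption rather than something this document specifies.

```typescript
// Hypothetical sketch: decide what to post after a poll, given the previous state.
type Health = "healthy" | "unhealthy";

// Returns the text to post, or null when the poll should stay silent.
function messageForPoll(previous: Health, current: Health, probe: string): string | null {
  if (previous === "healthy" && current === "unhealthy") {
    return `ALERT: ${probe} is unhealthy`; // something just went wrong
  }
  if (previous === "unhealthy" && current === "healthy") {
    return `${probe}: back to healthy`; // one-line recovery message
  }
  // healthy -> healthy is a no-op; unhealthy -> unhealthy is suppressed (assumption).
  return null;
}
```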
**Numbered-emoji progress convention**
- When a multi-step operation is being narrated to Telegram, prefix each step
  with the corresponding numbered emoji: `1️⃣`, `2️⃣`, `3️⃣`, … up to `🔟`.
- This survives copy-paste across clients (unlike `1.`, which Telegram tends
  to render inconsistently in dark mode) and makes the chat scannable.
- The watchdog scripts already emit completion updates this way; any
  dashboard-originated message that runs through the same delivery path
  should match.

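The prefixing rule above could be implemented along these lines. This is a sketch only: `stepEmoji` and `narrateSteps` are hypothetical names, and throwing for step numbers outside 1 to 10 is an assumption, since the convention does not say what happens past `🔟`. Keycap emoji for 1 to 9 are the digit followed by U+FE0F and U+20E3; ten is the single code point U+1F51F.

```typescript
// Hypothetical helpers for the numbered-emoji convention.
const KEYCAP_SUFFIX = "\uFE0F\u20E3"; // variation selector + combining enclosing keycap
const KEYCAP_TEN = "\u{1F51F}"; // the dedicated "keycap ten" emoji

function stepEmoji(step: number): string {
  if (!Number.isInteger(step) || step < 1 || step > 10) {
    // Assumption: the convention stops at ten, so anything else is rejected.
    throw new RangeError(`step must be an integer from 1 to 10, got ${step}`);
  }
  return step === 10 ? KEYCAP_TEN : `${step}${KEYCAP_SUFFIX}`;
}

// Prefix each step description with its numbered emoji, in order.
function narrateSteps(steps: string[]): string[] {
  return steps.map((text, index) => `${stepEmoji(index + 1)} ${text}`);
}
```

For example, `narrateSteps(["Stopping gateway", "Restarting gateway"])` yields two lines prefixed with the keycap emoji for one and two.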
**Approval prompts**
- Approval-required actions still land in Telegram with two inline buttons
  (✅ approve / ❌ deny). The dashboard does not yet trigger these; see the
  Phase 8 delegation brief in `docs/prompts/phase8-telegram-loop.md` for the
  design that closes the loop end-to-end.

**Don't paste secrets**
- Bot tokens and chat IDs live in `~<user>/.config/hermes/telegram` (mode `600`),
  never in repo files. The dashboard's `lib/logger.ts` redacts
  `Authorization` / `Cookie` / `*.token` paths from any logged object so an
  accidental `req.log.info({ tg })` won't dump credentials.

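To make the redaction idea above concrete, here is a standalone sketch. It is not the contents of `lib/logger.ts`, which may well rely on its logging library's built-in redaction; the `redact` function, the `[REDACTED]` placeholder, and matching the key `token` at any depth (a simplification of the `*.token` path) are all assumptions for illustration.

```typescript
// Hypothetical, simplified redaction: mask sensitive keys anywhere in an object.
const SENSITIVE_KEYS = new Set(["authorization", "cookie", "token"]);
const REDACTED = "[REDACTED]";

function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redact);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Key names are compared case-insensitively, so "Authorization" matches.
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? REDACTED : redact(inner);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```

With this sketch, logging `redact({ tg: { token: "123:abc", chatId: 42 } })` would keep `chatId` but mask `token`. The function returns a copy and leaves the original object untouched.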
@@ -148,9 +148,11 @@ This is the biggest operational asymmetry and the reason half the ops-panel warn

 ## Phase 8 — Notifications & Telegram loop (G9)

-- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy.
-- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items).
-- [ ] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates.
+> **Mostly VM ops + bot-token configuration**, with two small backend hooks. Full delegation brief in [`docs/prompts/phase8-telegram-loop.md`](./prompts/phase8-telegram-loop.md). The dashboard's documentation half is already done — see `docs/hermes-operations.md` "Telegram Notification Convention".
+
+- [ ] Push new dashboard-detected warnings to the correct Telegram (Vijay → root chat, Bheem → Uma chat), reusing the watchdog delivery path; silent on healthy. *(Design captured in the brief: `lib/dashboard-alerts.ts` writes new warnings to a tag-prefixed log; both watchdogs tail it. Implementation gated on Phase 4 (Uma watchdog must exist first) and on bot tokens.)*
+- [ ] Validate the Telegram approval-prompt flow and media/file delivery end-to-end (the two unchecked v1 items). *(Brief item 3.)*
+- [x] Preserve the numbered-emoji progress convention (`1️⃣`, `2️⃣`, …) for completion updates. *(Codified in `docs/hermes-operations.md` under a new "Telegram Notification Convention" section, alongside the routing-per-instance, silent-on-healthy, and never-paste-secrets rules. The brief references this as the source of truth so VM-side implementers stay consistent.)*

 ---

@@ -194,7 +196,7 @@ Update only with evidence (source review, tests, build output, or browser/VM ver
 - [x] Phase 5 — App/CI hardening (P0/P1/P2 done; P2 follow-ups in DEPLOYMENT.md mitigation roadmap remain)
 - [x] Phase 6 — UX polish (severity tags + deep links + per-instance actions; trend cards + theme toggle deferred)
 - [x] Phase 7 — Security & access (auth on hermes routes + privacy stance documented; redact_secrets/redact_pii decision deferred)
-- [ ] Phase 8 — Notifications & Telegram
+- [ ] Phase 8 — Notifications & Telegram (convention codified; delivery loop is VM ops, see brief)

 ## Decisions (resolved 2026-05-30)

docs/prompts/phase8-telegram-loop.md (new file, +121 lines)
@@ -0,0 +1,121 @@
# Delegation Brief — Phase 8: Telegram notification loop

> Self-contained task brief. Mostly **VM ops + bot-token configuration**, with
> two small backend-side hooks. The dashboard has already done its half of
> Phase 8; see `docs/hermes-operations.md` "Telegram Notification Convention".
>
> Related: `docs/hermes_dashboard_v2_roadmap.md` (Phase 8),
> `docs/hermes-operations.md`, `scripts/hermes-health-watchdog.py`,
> `docs/prompts/phase4-bheem-uma-parity.md` (Bheem watchdog needs to exist
> first).

---

ROLE: Operator with sudo on the Hostinger VM and admin access to both Telegram
bots (root + Uma).

OBJECTIVE: Close the loop between dashboard-detected warnings and the
existing watchdog Telegram delivery path so that:

1. New warnings that the dashboard surfaces in `getHermesOpsSnapshot()` (and
   in the per-instance telemetry endpoint) reach the right Telegram chat
   (Vijay → root; Bheem → Uma).
2. Approval-required actions (currently only the watchdog uses these) work
   end-to-end, including media/file delivery. These are the two unchecked
   items left over from Hermes v1.
3. The numbered-emoji progress convention is preserved.

PREREQUISITES:
- Phase 4 (Bheem/Uma parity) must be complete so Uma has its own watchdog +
  bot. Without Uma's bot, Bheem warnings have nowhere to go. Don't start
  Phase 8 until the Phase 4 brief signs off.
- The root and Uma watchdog scripts must already deliver to Telegram
  successfully on a manually broken probe. Confirm this before proceeding.

DESIGN (least-invasive, no new long-lived service):

The dashboard does NOT open its own Telegram connection. Instead, the
backend writes new dashboard-detected warnings to a small append-only log
that the existing watchdogs tail.

- Path: `/var/log/hermes-dashboard-warnings.log` (root-writeable;
  world-readable so both watchdogs can tail).
- Format: one line per warning, RFC3339-ish timestamp + severity token +
  message, the same shape as `hermes-health-watchdog.log`. Reuse the parser in
  `backend/src/modules/hermes-telemetry/repository.ts:WATCHDOG_LINE`.
- Routing: each line carries an explicit `instance=<vijay|bheem|all>` tag
  so the watchdog knows which Telegram bot to use. `instance=all` posts to
  both chats (cross-cutting).
- De-dup: the dashboard backend keeps a 1h in-memory hash of recent
  warnings and only appends each one once. Restart resets the hash, which is
  fine; an alert reappearing post-restart is signal, not noise.

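The format and routing bullets above can be sketched as a formatter plus a tag filter. Everything here is illustrative: the exact field order, the `formatWarningLine` and `watchdogAccepts` names, and the regular expression are assumptions, and the real line shape has to match whatever `WATCHDOG_LINE` already parses. The watchdogs themselves are scripts on the VM, so the filter is shown in TypeScript only to illustrate the rule.

```typescript
// Hypothetical sketch of the log line and the per-watchdog tag filter.
type Instance = "vijay" | "bheem" | "all";
type WatchdogOwner = "root" | "uma";

interface DashboardWarning {
  severity: string;
  instance: Instance;
  message: string;
}

// Assumed layout: "<timestamp> <severity> instance=<tag> <message>".
function formatWarningLine(w: DashboardWarning, at: Date): string {
  const message = w.message.replace(/\s+/g, " ").trim(); // keep one warning per line
  return `${at.toISOString()} ${w.severity} instance=${w.instance} ${message}`;
}

// Tags each watchdog acts on: its own instance plus the cross-cutting "all".
const ACCEPTED: Record<WatchdogOwner, readonly Instance[]> = {
  root: ["vijay", "all"],
  uma: ["bheem", "all"],
};

function watchdogAccepts(owner: WatchdogOwner, line: string): boolean {
  // The first instance= token is the tag, because it precedes the free-text message.
  const match = /(?:^|\s)instance=(vijay|bheem|all)(?:\s|$)/.exec(line);
  return match !== null && ACCEPTED[owner].includes(match[1] as Instance);
}
```

Lines without a recognizable tag are ignored by both watchdogs in this sketch, which keeps a malformed line from being broadcast to either chat.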
TASKS:

1. **Backend hook** (small):
   - Add `lib/dashboard-alerts.ts` that exposes
     `appendDashboardWarning({ severity, instance, message })`. Internals:
     append + dedup hash. Tests should mock `fs.promises.appendFile`.
   - Wire it into `getHermesOpsSnapshot()` so each new warning in
     `snapshot.warnings` (only the ones not in the dedup hash) is written
     out. Same wiring on `getHermesTelemetrySnapshot()` for `warnings` and
     `watchdog.alerts` of severity `critical`.
   - Gate the file-write behind an env flag `HERMES_DASHBOARD_ALERT_LOG`
     pointing at the path so dev/CI doesn't try to write to `/var/log`.
   - Unit-test: `appendDashboardWarning` de-dups within the window, expires,
     and writes the right line format. Add to the coverage gate.

2. **Watchdog tail-extension** (VM ops):
   - Modify both watchdog scripts (root + Uma's mirror) to ALSO tail the
     new dashboard-warnings log. Filter by `instance=` tag: root's
     watchdog only acts on `instance=vijay` or `instance=all`; Uma's only
     on `instance=bheem` or `instance=all`.
   - Forward each parsed line into the existing Telegram delivery (same
     format / same numbered-emoji convention). Silent on no-new-lines.

3. **Approval-prompt + media validation** (the two unchecked v1 items):
   - Pick a Telegram approval-required action that already exists in the
     watchdog (e.g. "restart degraded gateway"). Confirm the inline ✅/❌
     buttons land, the callback hits the watchdog, and the action runs.
   - Confirm the watchdog can deliver a small file (e.g. last 200 lines of
     a log) when an alert says "investigate", and that the file lands as
     a Telegram document, not a truncated message.
   - Document both flows in `docs/hermes-operations.md` under "Telegram
     Notification Convention" so anyone reading it knows what's wired.

4. **End-to-end test**:
   - From the dashboard, trigger a transient warning (e.g. stop a
     non-critical timer for 30 seconds).
   - Confirm the right Telegram chat receives one alert (numbered-emoji
     formatted) and one recovery message when the timer comes back. The
     OTHER chat must stay silent.
   - Repeat for Bheem.
   - Repeat with `instance=all` (cross-cutting) and confirm BOTH chats
     receive the alert.

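For Task 1 above, one possible shape of the backend hook is sketched below. It is not the contents of `lib/dashboard-alerts.ts`: the `createDashboardAlerts` factory, the injected `appendFile` and `now` parameters, the dedup key, and the line layout are assumptions chosen so the dedup window can be tested without touching the filesystem. In real wiring, `logPath` would come from `process.env.HERMES_DASHBOARD_ALERT_LOG` and `appendFile` from `fs.promises.appendFile`.

```typescript
// Hypothetical sketch of appendDashboardWarning with a 1h in-memory dedup window.
interface DashboardWarning {
  severity: string;
  instance: "vijay" | "bheem" | "all";
  message: string;
}

interface AlertLogOptions {
  logPath: string | undefined; // unset means writes are disabled (dev/CI)
  appendFile: (path: string, data: string) => Promise<void>; // injected for tests
  now?: () => number; // clock in milliseconds, injectable for tests
  windowMs?: number; // dedup window, one hour by default
}

function createDashboardAlerts(options: AlertLogOptions) {
  const { logPath, appendFile, now = Date.now, windowMs = 60 * 60 * 1000 } = options;
  const lastWritten = new Map<string, number>(); // dedup key -> time of last write

  // Resolves to true when a line was appended, false when skipped.
  return async function appendDashboardWarning(w: DashboardWarning): Promise<boolean> {
    if (!logPath) return false; // env flag unset: never write

    const t = now();
    // Forget entries older than the window so the map stays bounded.
    lastWritten.forEach((at, key) => {
      if (t - at >= windowMs) lastWritten.delete(key);
    });

    const key = `${w.severity}|${w.instance}|${w.message}`;
    if (lastWritten.has(key)) return false; // already reported within the window

    lastWritten.set(key, t); // reserve first so concurrent calls do not double-write
    const line = `${new Date(t).toISOString()} ${w.severity} instance=${w.instance} ${w.message}\n`;
    try {
      await appendFile(logPath, line);
    } catch (error) {
      lastWritten.delete(key); // a failed write should not suppress a retry
      throw error;
    }
    return true;
  };
}
```

Because the dedup state lives in the closure, a process restart clears it, which matches the design note that a reappearing alert after a restart is acceptable.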
GUARDRAILS:
- Bot tokens never go in repo files. They live in
  `~<user>/.config/hermes/telegram` (mode `600`), owned by the right user.
- The dashboard backend only WRITES to the alert log. It must NOT call
  Telegram directly (that would split the delivery path and create two
  places where rate limits / token rotation matter).
- Don't emit chatty health pings; silent-on-success is the rule.
- Numbered-emoji convention is mandatory for completion-update messages
  (`1️⃣`, `2️⃣`, …); see `docs/hermes-operations.md`.

REPORTING:
When finished, report:
- Diff of `lib/dashboard-alerts.ts` and the wiring sites.
- Output of `tail -20 /var/log/hermes-dashboard-warnings.log` after the
  end-to-end test.
- Screenshots / chat-export of the test alerts in both Telegram chats
  (sanitized).
- Updated `docs/hermes-operations.md` "Telegram Notification Convention"
  section with the wired approval + media flows.

DEFINITION OF DONE:
- Dashboard-detected warnings reach the right Telegram chat per instance.
- Recoveries reach the same chat.
- Approval prompt + file delivery validated end-to-end.
- Numbered-emoji convention preserved.
- Operator (you) ticks the corresponding Phase 8 roadmap checkboxes.