bytelyst-devops-tools/systemd
Hermes VM 76ef17f26b feat(vm): Phase 2.3 closure — OOM watchdog + orphan-container docs
OOM watchdog:
- vm-oom-watchdog.sh — scans journalctl -k since cursor for oom-kill,
  killed-process, and "out of memory ... killed" entries; maps cgroup
  hits back to container names via docker inspect; posts a single
  Telegram alert per scan window (no dedupe needed — cursor advances
  on every run). Cursor at /var/log/vm-oom-cursor, log at
  /var/log/vm-oom-watchdog.log.
- Systemd: OnBootSec=10min, OnUnitActiveSec=1h, Persistent=true.
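
A rough sketch of the scan described above, for orientation only. This is not the contents of vm-oom-watchdog.sh. The two /var/log paths come from the commit message. The grep patterns, the use of journalctl's --cursor-file (systemd 242 or later), the function names, and the TELEGRAM_BOT_TOKEN / TELEGRAM_CHAT_ID variables are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: paths are from the commit message, everything else is assumed.
CURSOR_FILE="${CURSOR_FILE:-/var/log/vm-oom-cursor}"
LOG_FILE="${LOG_FILE:-/var/log/vm-oom-watchdog.log}"

# Keep only kernel lines that look like OOM kills (reads stdin).
# The patterns are a guess at what "oom-kill, killed-process, out of memory" means.
filter_oom_lines() {
  grep -Ei 'oom-kill|killed process|out of memory.*killed' || true
}

# Pull unique 64-hex container IDs out of cgroup paths such as
# /docker/<id> (cgroup v1) or docker-<id>.scope (systemd, cgroup v2).
extract_container_ids() {
  grep -oE 'docker[/-][0-9a-f]{64}' | grep -oE '[0-9a-f]{64}' | sort -u
}

# Post one message through the Telegram Bot API sendMessage method.
# TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are hypothetical variable names.
send_alert() {
  curl -fsS --max-time 10 \
    --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=$1" \
    "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" >/dev/null
}

# One scan window: read kernel entries after the saved cursor, then alert once.
scan_once() {
  local hits names id name
  # --cursor-file resumes after the stored cursor and rewrites it on exit,
  # so entries reported in one run are not seen again in the next.
  hits=$(journalctl -k --no-pager -o short-iso --cursor-file="$CURSOR_FILE" | filter_oom_lines)
  [ -z "$hits" ] && return 0

  names=""
  for id in $(printf '%s\n' "$hits" | extract_container_ids); do
    # Fall back to the raw ID if the container no longer exists.
    name=$(docker inspect --format '{{.Name}}' "$id" 2>/dev/null) || name="$id"
    names="$names ${name#/}"
  done

  printf '%s OOM events:%s\n' "$(date -u +%FT%TZ)" "${names:- unmatched}" >> "$LOG_FILE"
  send_alert "OOM kill on $(hostname):${names:- no container match}"
}

# scan_once   # would be invoked by the timer-driven service
```

Keeping the filter and ID extraction as stdin-to-stdout functions means they can be exercised against saved kernel log samples without touching journald or Docker.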

Orphan containers (no compose file on disk):
- trading-backend → docker update --memory=768m (high-I/O bot)
- gitea-npm-registry → docker update --memory=512m
- orphan-containers.md captures canonical configs for recovery
  (env, mounts, networks, restart policy, memory limits).
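
A minimal sketch of applying and checking the limits above, plus dumping the fields that orphan-containers.md is said to capture. The container names and sizes are from this commit message. The function names are hypothetical, and the docker inspect field selection is an assumption about what "canonical config" covers.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not from the repository.

# Convert a size like "768m" to bytes, the unit docker inspect reports.
# Only the "m" suffix is handled, since that is all the commit uses.
mem_to_bytes() {
  local value=${1%[mM]}
  echo $(( value * 1024 * 1024 ))
}

# Apply a memory limit to a running container and confirm it took effect.
# Note: docker update can refuse the change if the container's existing
# memory-swap limit is lower than the new value; --memory-swap may then
# need to be set in the same call.
apply_limit() {
  local name=$1 limit=$2 actual
  docker update --memory="$limit" "$name" >/dev/null || return 1
  actual=$(docker inspect --format '{{.HostConfig.Memory}}' "$name")
  [ "$actual" = "$(mem_to_bytes "$limit")" ]
}

# Print the settings needed to recreate a container that has no compose file:
# env, mounts, networks, restart policy, memory limit (bytes).
dump_config() {
  docker inspect --format \
    '{{json .Config.Env}} {{json .Mounts}} {{json .NetworkSettings.Networks}} {{json .HostConfig.RestartPolicy}} {{.HostConfig.Memory}}' \
    "$1"
}

# apply_limit trading-backend 768m
# apply_limit gitea-npm-registry 512m
# dump_config trading-backend
```

Limits set with docker update live in the container's own configuration, so they are lost if the container is removed and recreated, which is why recording them alongside the rest of the config matters for recovery.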

Closes Phase 2.3 (post-monitoring) and Phase 3.3 (orphan limits).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 05:26:49 +00:00
bytelyst-gitea-backup.service feat: add gitea backup timer assets 2026-05-27 18:53:20 +00:00
bytelyst-gitea-backup.timer feat: add gitea backup timer assets 2026-05-27 18:53:20 +00:00
docker-health-watchdog.service feat(vm): Phases 1.2, 1.4, 2.1 — steal time, swap pressure, health watchdog 2026-05-27 21:31:09 +00:00
docker-health-watchdog.timer feat(vm): Phases 1.2, 1.4, 2.1 — steal time, swap pressure, health watchdog 2026-05-27 21:31:09 +00:00
hermes-emergency-drive-upload.service Add Google Drive emergency bundle upload 2026-05-27 12:08:41 +00:00
hermes-emergency-drive-upload.timer Add Google Drive emergency bundle upload 2026-05-27 12:08:41 +00:00
hermes-gateway.service Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
hermes-root-backup.service Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
hermes-root-backup.timer Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
hermes-root-dashboard.service Complete Hermes dashboard and watchdog roadmap audit 2026-05-27 10:45:29 +00:00
uma-hermes-backup.service Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
uma-hermes-backup.timer Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
uma-hermes-dashboard.service Complete Hermes dashboard and watchdog roadmap audit 2026-05-27 10:45:29 +00:00
uma-hermes-gateway.service Add Hermes disaster recovery runbook 2026-05-27 11:23:07 +00:00
vm-oom-watchdog.service feat(vm): Phase 2.3 closure — OOM watchdog + orphan-container docs 2026-05-30 05:26:49 +00:00
vm-oom-watchdog.timer feat(vm): Phase 2.3 closure — OOM watchdog + orphan-container docs 2026-05-30 05:26:49 +00:00
vm-weekly-digest.service feat(dashboard/vm): Phases 4.1-4.3 — Prometheus trends, sparklines, weekly digest 2026-05-30 05:26:49 +00:00
vm-weekly-digest.timer feat(dashboard/vm): Phases 4.1-4.3 — Prometheus trends, sparklines, weekly digest 2026-05-30 05:26:49 +00:00