OOM watchdog:
- vm-oom-watchdog.sh: scans journalctl -k since cursor for oom-kill,
  killed-process, and "out of memory ... killed" entries; maps cgroup
  hits back to container names via docker inspect; posts a single
  Telegram alert per scan window (no dedupe needed, since the cursor
  advances on every run). Cursor at /var/log/vm-oom-cursor, log at
  /var/log/vm-oom-watchdog.log.
- Systemd: OnBootSec=10min, OnUnitActiveSec=1h, Persistent=true.

Orphan containers (no compose file on disk):
- trading-backend → docker update --memory=768m (high-I/O bot)
- gitea-npm-registry → docker update --memory=512m
- orphan-containers.md captures canonical configs for recovery (env,
  mounts, networks, restart policy, memory limits).

Closes Phase 2.3 (post-monitoring) and Phase 3.3 (orphan limits).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
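
The orphan-container limits listed in the commit message could be checked after the fact with something like the sketch below. It is not part of the commit: the helper names `to_bytes` and `check_limit` are hypothetical, and it assumes the limits were applied with `docker update --memory=...` exactly as listed. `docker inspect` reports `HostConfig.Memory` in bytes, so the human-readable size is converted first.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for verifying memory limits on containers that have
# no compose file. Names are illustrative and do not come from the repo.

# Convert a Docker-style size (e.g. 768m, 1g, 4096) to bytes.
to_bytes() {
    local num=${1%[kKmMgG]}      # numeric part, one trailing unit letter stripped
    local unit=${1##*[0-9]}      # whatever follows the last digit
    case "$unit" in
        k|K) echo $((num * 1024)) ;;
        m|M) echo $((num * 1024 * 1024)) ;;
        g|G) echo $((num * 1024 * 1024 * 1024)) ;;
        '')  echo "$num" ;;
        *)   return 1 ;;
    esac
}

# Succeed only if the container's configured limit equals the expected size.
# Returns 2 when the size cannot be parsed or the container cannot be inspected.
check_limit() {
    local name=$1 want actual
    want=$(to_bytes "$2") || return 2
    actual=$(docker inspect --format '{{.HostConfig.Memory}}' "$name") || return 2
    [ "$actual" -eq "$want" ]
}
```

For example, `check_limit trading-backend 768m && check_limit gitea-npm-registry 512m` would confirm both limits from the list above, assuming the containers exist on the host.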
11 lines
244 B
Desktop File
[Unit]
Description=Detect kernel OOM-kill events and alert via Telegram
After=docker.service network-online.target

[Service]
Type=oneshot
User=root
Group=root
Environment="HERMES_HOME=/root/.hermes"
ExecStart=/usr/local/bin/vm-oom-watchdog.sh
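
The script that `ExecStart` points to is not shown here. As a rough sketch of the flow the commit message describes (scan the kernel journal from a cursor, pick out OOM lines, map cgroup hits to container names, send one alert), something like the following could work. Everything in it is an assumption rather than the actual script: the function names, the `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` variables, the use of `journalctl --cursor-file` for the cursor, and the alert text are all illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an OOM watchdog; not the real vm-oom-watchdog.sh.

CURSOR_FILE=${CURSOR_FILE:-/var/log/vm-oom-cursor}

# Read kernel messages after the saved cursor; journalctl rewrites the
# cursor file when it finishes, so each run sees only new entries.
scan_kernel_log() {
    journalctl -k --no-pager --cursor-file="$CURSOR_FILE"
}

# Keep only lines that look like OOM-kill events.
filter_oom_lines() {
    grep -Ei 'oom-kill|killed[- ]process|out of memory.*killed' || true
}

# Pull unique 64-hex container IDs out of cgroup paths such as
# /docker/<id> (cgroupfs driver) or docker-<id>.scope (systemd driver).
extract_container_ids() {
    grep -Eo 'docker[-/][0-9a-f]{64}' | grep -Eo '[0-9a-f]{64}' | sort -u || true
}

# Resolve an ID to a container name; fall back to the short ID if the
# container no longer exists.
container_name() {
    local name
    name=$(docker inspect --format '{{.Name}}' "$1" 2>/dev/null) || name=${1:0:12}
    printf '%s\n' "${name#/}"
}

# Send one message through the Telegram Bot API (assumed env variables).
send_alert() {
    curl -fsS -X POST \
        "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
        -d "chat_id=${TELEGRAM_CHAT_ID}" \
        --data-urlencode "text=$1" >/dev/null
}

# One scan window: at most one alert, listing every container that was hit.
main() {
    local lines ids id names=""
    lines=$(scan_kernel_log | filter_oom_lines)
    [ -z "$lines" ] && return 0
    ids=$(printf '%s\n' "$lines" | extract_container_ids)
    for id in $ids; do
        names+="$(container_name "$id") "
    done
    send_alert "OOM kill on $(hostname): ${names:-no container matched}"
}
```

A real script would end with a call to `main` and would probably append to /var/log/vm-oom-watchdog.log as well. Because the cursor moves forward on every run, the same kernel line is never read twice, which is why the sketch carries no dedupe state.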