bytelyst-devops-tools/scripts
Hermes VM d9618ba7b0 feat(vm): Phases 1.2, 1.4, 2.1 — steal time, swap pressure, health watchdog
Phase 1.2 — CPU steal time metric in vm-health-check.sh:
- Samples /proc/stat twice 1s apart for accurate current steal %
- Thresholds: >5% WARN, >15% CRIT (currently 0.8% on this host)
- Runs before the memory check so steal is visible alongside load
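
The two-sample approach described above could look roughly like the following. This is a minimal sketch, not the code in vm-health-check.sh: the function names and output strings are hypothetical, and only the 1s interval and the >5% / >15% thresholds come from the commit message.

```shell
#!/usr/bin/env bash
# Sketch only: function names and output format are hypothetical.

# Print "<steal> <total>" jiffies from an aggregate "cpu ..." line of /proc/stat.
# Fields after "cpu": user nice system idle iowait irq softirq steal guest guest_nice.
# guest and guest_nice are already counted in user and nice, so the total stops at steal.
cpu_steal_total() {
  awk '$1 == "cpu" { total = 0; for (i = 2; i <= 9; i++) total += $i; print $9, total; exit }' <<<"$1"
}

# Steal percentage between two samples: delta(steal) / delta(total) * 100.
steal_pct() {
  local s1 t1 s2 t2
  read -r s1 t1 < <(cpu_steal_total "$1")
  read -r s2 t2 < <(cpu_steal_total "$2")
  awk -v ds="$((s2 - s1))" -v dt="$((t2 - t1))" \
    'BEGIN { if (dt <= 0) print "0.0"; else printf "%.1f\n", 100 * ds / dt }'
}

# Map a percentage to a status using the thresholds from the commit message.
classify_steal() {
  awk -v p="$1" 'BEGIN { if (p > 15) print "CRIT"; else if (p > 5) print "WARN"; else print "OK" }'
}

# Take two samples 1s apart and print the current steal percentage.
sample_steal() {
  local first second
  first=$(grep '^cpu ' /proc/stat)
  sleep 1
  second=$(grep '^cpu ' /proc/stat)
  steal_pct "$first" "$second"
}
```

A caller might run `pct=$(sample_steal); echo "steal ${pct}% $(classify_steal "$pct")"`. Sampling twice matters because the counters in /proc/stat are cumulative since boot, so a single read would only give the long-run average rather than the current value.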

Phase 1.4 — Swap pressure indicator:
- Reads SwapCached from /proc/meminfo as secondary metric
- Raises SWAP_USED_WARN_GB 1→1.5 to reduce noise (current usage 0.6G)
- New WARN path: SwapCached > 200MB signals recent pressure even when
  current swap usage looks ok (catches post-spike state)
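
As an illustration of the two-tier swap check, here is a hedged sketch. SWAP_USED_WARN_GB is named in the commit message; SWAP_CACHED_WARN_MB, the function name, and the output strings are assumptions made for this example, and the real script may structure the check differently.

```shell
#!/usr/bin/env bash
# Sketch only: swap_pressure and SWAP_CACHED_WARN_MB are hypothetical names.

SWAP_USED_WARN_GB=1.5     # threshold named in the commit message
SWAP_CACHED_WARN_MB=200   # assumed variable for the 200MB SwapCached threshold

# Read swap figures (reported in kB) from a meminfo-style file and print one status line.
swap_pressure() {
  local meminfo="${1:-/proc/meminfo}"
  awk -v used_warn_gb="$SWAP_USED_WARN_GB" -v cached_warn_mb="$SWAP_CACHED_WARN_MB" '
    $1 == "SwapTotal:"  { total = $2 }
    $1 == "SwapFree:"   { free = $2 }
    $1 == "SwapCached:" { cached = $2 }
    END {
      used_gb = (total - free) / 1024 / 1024
      cached_mb = cached / 1024
      if (used_gb > used_warn_gb)
        printf "WARN swap used %.1fG\n", used_gb
      else if (cached_mb > cached_warn_mb)
        # Usage looks fine now, but a large SwapCached suggests a recent spike.
        printf "WARN SwapCached %dMB (recent pressure)\n", cached_mb
      else
        printf "OK swap used %.1fG, SwapCached %dMB\n", used_gb, cached_mb
    }' "$meminfo"
}
```

Calling `swap_pressure` with no argument reads /proc/meminfo; passing a file path makes the logic easy to exercise against a saved snapshot.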

Phase 2.1 — Docker health-check watchdog:
- docker-health-watchdog.sh: checks unhealthy containers every 10 min,
  restarts only after 3 consecutive failing health checks (30min grace)
- docker-health-watchdog.service + .timer: enabled, fires every 10 min
- Sends Telegram notification on each auto-restart
- Rollback: systemctl disable docker-health-watchdog.timer
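
The "restart only after 3 consecutive failures" rule can be sketched with a per-container counter file, as below. This is an assumption about how such a watchdog might work, not the contents of docker-health-watchdog.sh: STATE_DIR, the function names, and the TELEGRAM_BOT_TOKEN / TELEGRAM_CHAT_ID variables are all hypothetical. The `docker ps --filter health=unhealthy` call and the Telegram Bot API `sendMessage` endpoint are the only external interfaces used.

```shell
#!/usr/bin/env bash
# Sketch only: names, paths, and environment variables are hypothetical. Requires bash 4+.

STATE_DIR="${STATE_DIR:-/var/tmp/docker-health-watchdog}"
MAX_FAILURES=3   # 3 consecutive checks at a 10 min interval

# Reads currently unhealthy container names on stdin, updates per-container
# counters in STATE_DIR, and prints the names that reached MAX_FAILURES.
containers_to_restart() {
  local name count f
  local -A seen=()
  mkdir -p "$STATE_DIR"
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    seen["$name"]=1
    count=$(cat "$STATE_DIR/$name" 2>/dev/null || echo 0)
    count=$((count + 1))
    if [ "$count" -ge "$MAX_FAILURES" ]; then
      echo "$name"
      rm -f "$STATE_DIR/$name"   # start a fresh streak after the restart
    else
      echo "$count" > "$STATE_DIR/$name"
    fi
  done
  # A container that is no longer unhealthy breaks its streak: drop the counter.
  for f in "$STATE_DIR"/*; do
    [ -e "$f" ] || continue
    [ -n "${seen[$(basename "$f")]:-}" ] || rm -f "$f"
  done
}

# Send a message through the Telegram Bot API (token and chat id are assumed env vars).
notify_telegram() {
  curl -fsS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
    --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=$1" >/dev/null
}

# One watchdog pass, intended to be triggered by the systemd timer.
run_watchdog() {
  docker ps --filter health=unhealthy --format '{{.Names}}' \
    | containers_to_restart \
    | while IFS= read -r name; do
        docker restart "$name" && notify_telegram "Restarted unhealthy container: $name"
      done
}
```

Keeping the counting logic in `containers_to_restart` separate from the Docker and Telegram calls means the grace-period behavior can be checked without a Docker daemon, for example by piping a fixed list of names into it three times.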

Phase 2.2 is already complete: sync_hermes_persistent_backup.py handles
divergence gracefully with a rebase/reset-hard fallback and is running successfully.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 21:31:09 +00:00
Name | Last commit | Date
VMs | feat(vm): Phases 1.2, 1.4, 2.1 — steal time, swap pressure, health watchdog | 2026-05-27 21:31:09 +00:00
gitea-backup.sh | feat: add gitea backup timer assets | 2026-05-27 18:53:20 +00:00
gitea-git | Document Hermes Gitea token flow | 2026-05-27 11:06:15 +00:00
gitea-git-askpass | Document Hermes Gitea token flow | 2026-05-27 11:06:15 +00:00
google-drive-upload-file.py | Add Google Drive single file uploader | 2026-05-27 12:19:45 +00:00
google-drive-upload-file.sh | Add Google Drive single file uploader | 2026-05-27 12:19:45 +00:00
hermes-emergency-bundle-create.sh | Add encrypted Hermes emergency bundle scripts | 2026-05-27 11:31:58 +00:00
hermes-emergency-bundle-decrypt.sh | Add encrypted Hermes emergency bundle scripts | 2026-05-27 11:31:58 +00:00
hermes-emergency-bundle-upload-drive.py | Add Google Drive emergency bundle upload | 2026-05-27 12:08:41 +00:00
hermes-emergency-bundle-upload-drive.sh | Add Google Drive emergency bundle upload | 2026-05-27 12:08:41 +00:00
hermes-google-drive-oauth-login.py | Add Google Drive emergency bundle upload | 2026-05-27 12:08:41 +00:00
hermes-health-watchdog.py | Complete Hermes dashboard and watchdog roadmap audit | 2026-05-27 10:45:29 +00:00
monitor-lucky25-execution.sh | chore(devops): tighten deployment scripts | 2026-05-18 09:01:03 +00:00
README.md | feat: detect stale VM automation | 2026-05-27 21:00:43 +00:00
ubuntu-vm-security-update.sh | Harden Ubuntu VM update script readiness checks | 2026-05-05 03:09:57 +00:00

Scripts

This directory is the preferred home for self-contained operational scripts.

Current Scripts

  • ubuntu-vm-security-update.sh
    • Supported.
    • Purpose: update and harden Ubuntu VMs with unattended upgrades, UFW, and fail2ban.
    • Risk level: high, because it modifies packages, firewall rules, and reboot behavior.
  • VMs/HostingerVM/vm-health-check.sh
    • Supported.
    • Purpose: read-only VM health and drift check for disk, memory, swap, Docker health, failed systemd units, and stale root crontab script paths.
    • Risk level: low, because it is read-only apart from an optional local log write.

Conventions

  • New standalone operational scripts should go here instead of the repo root.
  • Each script should document:
    • prerequisites
    • required environment variables
    • destructive or privileged behavior
    • example usage
  • Scripts that change host state should support --help and a non-destructive preview mode when practical.
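
One way a new script could follow these conventions is sketched below. The script name, the EXAMPLE_TARGET variable, and the helper names are hypothetical and only illustrate the header documentation, --help, and preview-mode pattern; existing scripts in this directory may do this differently.

```shell
#!/usr/bin/env bash
# example-task.sh (hypothetical script name)
#
# Prerequisites:      bash, systemd
# Environment:        EXAMPLE_TARGET (required; hypothetical service name)
# Privileged actions: restarts a systemd service, so it needs root
# Example:            EXAMPLE_TARGET=nginx ./example-task.sh --dry-run

DRY_RUN=0

usage() {
  echo "Usage: example-task.sh [--dry-run] [--help]"
}

# Run a command, or only print it when --dry-run is active.
run() {
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Parse flags; --help prints usage and exits, unknown flags are an error.
parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --dry-run) DRY_RUN=1 ;;
      --help|-h) usage; exit 0 ;;
      *) echo "Unknown option: $1" >&2; usage >&2; exit 1 ;;
    esac
    shift
  done
}

# Entry point: every state-changing command goes through run().
main() {
  parse_args "$@"
  : "${EXAMPLE_TARGET:?EXAMPLE_TARGET must be set}"
  run systemctl restart "$EXAMPLE_TARGET"
}
```

A real script would end with `main "$@"`. Routing every state-changing command through `run` keeps the preview output and the real execution path in sync, since both come from the same argument list.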

Legacy Note

The repo root still contains older shell utilities. Those are not all deprecated, but new work should prefer scripts/ for clearer ownership and discoverability.