bytelyst-devops-tools/scripts
Hermes VM 13a105ba23 feat(vm): Phase 5 closure — GPU/freshness checks, chaos validation, I/O alert
vm-health-check.sh:
- check_gpu(): nvidia-smi probe; "CPU-only" OK on this VM (no GPU)
- check_image_freshness(): flag containers running images >30d old.
  Skips third-party images (gitea, grafana, prom, mcr.microsoft, axllent,
  caddy, traefik, valkey, cadvisor) — they have their own rebuild cadence.
  Currently flags 19 stale product images (~60d old).
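
The freshness rule above can be sketched as a pair of small decisions: skip third-party images, then flag anything older than 30 days. This is a minimal illustration, not the code in vm-health-check.sh; the function names and the substring matching are assumptions, and only the 30-day limit and the skip list come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the image freshness rule; all names are illustrative.

MAX_AGE_DAYS=30
# Substrings from the skip list in the commit message (matching is assumed).
SKIP_PATTERNS="gitea grafana prom mcr.microsoft axllent caddy traefik valkey cadvisor"

# Return 0 if the image reference contains a third-party pattern.
is_third_party() {
  local image="$1" pattern
  for pattern in $SKIP_PATTERNS; do
    case "$image" in
      *"$pattern"*) return 0 ;;
    esac
  done
  return 1
}

# Return 0 if an image created at $1 is more than MAX_AGE_DAYS old at $2.
# Both arguments are epoch seconds.
is_stale() {
  local created="$1" now="$2"
  local age_days=$(( (now - created) / 86400 ))
  [ "$age_days" -gt "$MAX_AGE_DAYS" ]
}

# Print "skip", "stale", or "fresh" for one container image.
classify_image() {
  local image="$1" created="$2" now="$3"
  if is_third_party "$image"; then
    echo "skip"
  elif is_stale "$created" "$now"; then
    echo "stale"
  else
    echo "fresh"
  fi
}
```

In a real check the creation time could come from `docker image inspect --format '{{.Created}}' IMAGE`, converted to epoch seconds with GNU `date -d`. Note that a plain substring match on `prom` would also skip any product image whose name happens to contain it.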

chaos-validation.sh:
- Monthly chaos test: kill PID 1 in chronomind-web, wait up to 35 min
  for docker-health-watchdog to detect + restart. Telegram pass/fail.
- Refuses to run if target not healthy. systemd timer fires 1st of month
  at 10:00 UTC (after 08:00 weekly digest).
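
The control flow described above (refuse unless healthy, inject the fault, poll until recovery or timeout) might look roughly like the sketch below. It is an assumption-laden outline, not the contents of chaos-validation.sh: the function names, the 30-second polling interval, and the callback style are hypothetical, and the fault injection itself is left to a caller-supplied command.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the chaos test flow; names and intervals are assumed.

TIMEOUT_SECS=$((35 * 60))   # 35-minute recovery window from the commit message
POLL_SECS=30                # assumed polling interval

# Print a container's Docker health status, or "missing" if inspect fails.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null || echo "missing"
}

# Poll a probe command until it prints "healthy" or the timeout elapses.
# Usage: wait_for_healthy <timeout_secs> <poll_secs> <probe command...>
wait_for_healthy() {
  local timeout="$1" poll="$2"; shift 2
  local waited=0
  while [ "$waited" -le "$timeout" ]; do
    if [ "$("$@")" = "healthy" ]; then
      return 0
    fi
    sleep "$poll"
    waited=$(( waited + poll ))
  done
  return 1
}

# Usage: run_chaos_test <probe_fn> <inject_fn> <timeout_secs> <poll_secs>
# Prints "refused", "pass", or "fail".
run_chaos_test() {
  local probe="$1" inject="$2" timeout="$3" poll="$4"
  # Refuse to run unless the target starts out healthy.
  if [ "$("$probe")" != "healthy" ]; then
    echo "refused"
    return 2
  fi
  # Inject the fault (the real script kills PID 1 in the container).
  "$inject"
  # A fuller version would first confirm the fault took effect, so that a
  # probe that still reports "healthy" does not count as a recovery.
  if wait_for_healthy "$timeout" "$poll" "$probe"; then
    echo "pass"
  else
    echo "fail"
    return 1
  fi
}
```

A caller would wrap `container_health chronomind-web` in a probe function, pass `"$TIMEOUT_SECS"` and `"$POLL_SECS"`, and forward the printed result to whatever notifier is in use.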

vm-io-anomaly-check.sh:
- 6h avg sda write rate; transition alerts at WARN (1 GB/hr) /
  CRIT (2.5 GB/hr). De-dupes via /var/log/vm-io-anomaly-state so the
  alert fires once per transition, not every 6h. Current baseline:
  ~1.94 GB/hr (orphan-container state-file writes; see Phase 0.3).
- Reports recovery to OK when rate drops back.
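
The once-per-transition behaviour can be illustrated with a small state machine that compares the current level against the one stored in the state file. This is a sketch under stated assumptions rather than the script itself: rates are taken as whole MB/hr, 1 GB is treated as 1024 MB, thresholds are inclusive, and the function names are hypothetical. Only the state file path and the WARN/CRIT values come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of transition-based alerting; names are illustrative.

STATE_FILE="${STATE_FILE:-/var/log/vm-io-anomaly-state}"
WARN_MB_PER_HR=1024   # 1 GB/hr, assuming 1 GB = 1024 MB
CRIT_MB_PER_HR=2560   # 2.5 GB/hr

# Map a write rate (integer MB/hr) to OK, WARN, or CRIT.
classify_rate() {
  local rate="$1"
  if [ "$rate" -ge "$CRIT_MB_PER_HR" ]; then
    echo "CRIT"
  elif [ "$rate" -ge "$WARN_MB_PER_HR" ]; then
    echo "WARN"
  else
    echo "OK"
  fi
}

# Print an alert line only when the level differs from the stored one,
# then persist the new level. A missing state file counts as OK.
check_transition() {
  local rate="$1" previous current
  previous="$(cat "$STATE_FILE" 2>/dev/null || echo OK)"
  previous="${previous:-OK}"
  current="$(classify_rate "$rate")"
  if [ "$current" != "$previous" ]; then
    echo "$previous -> $current (${rate} MB/hr)"
    echo "$current" > "$STATE_FILE"
  fi
}
```

With these thresholds the stated baseline of about 1.94 GB/hr (roughly 1987 MB/hr) would sit at WARN, so repeated runs at that rate would stay silent after the first alert. A drop back below 1 GB/hr would print a single `WARN -> OK` line.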

vm/page.tsx: gpu + image_freshness added to CHECK_META so they render
with proper icon/label and slot into CHECK_ORDER.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 05:26:49 +00:00
VMs feat(vm): Phase 5 closure — GPU/freshness checks, chaos validation, I/O alert 2026-05-30 05:26:49 +00:00
gitea-backup.sh feat: add gitea backup timer assets 2026-05-27 18:53:20 +00:00
gitea-git Document Hermes Gitea token flow 2026-05-27 11:06:15 +00:00
gitea-git-askpass Document Hermes Gitea token flow 2026-05-27 11:06:15 +00:00
google-drive-upload-file.py Add Google Drive single file uploader 2026-05-27 12:19:45 +00:00
google-drive-upload-file.sh Add Google Drive single file uploader 2026-05-27 12:19:45 +00:00
hermes-emergency-bundle-create.sh Add encrypted Hermes emergency bundle scripts 2026-05-27 11:31:58 +00:00
hermes-emergency-bundle-decrypt.sh Add encrypted Hermes emergency bundle scripts 2026-05-27 11:31:58 +00:00
hermes-emergency-bundle-upload-drive.py Add Google Drive emergency bundle upload 2026-05-27 12:08:41 +00:00
hermes-emergency-bundle-upload-drive.sh Add Google Drive emergency bundle upload 2026-05-27 12:08:41 +00:00
hermes-google-drive-oauth-login.py Add Google Drive emergency bundle upload 2026-05-27 12:08:41 +00:00
hermes-health-watchdog.py Complete Hermes dashboard and watchdog roadmap audit 2026-05-27 10:45:29 +00:00
monitor-lucky25-execution.sh chore(devops): tighten deployment scripts 2026-05-18 09:01:03 +00:00
README.md feat: detect stale VM automation 2026-05-27 21:00:43 +00:00
ubuntu-vm-security-update.sh Harden Ubuntu VM update script readiness checks 2026-05-05 03:09:57 +00:00

Scripts

This directory is the preferred home for self-contained operational scripts.

Current Scripts

  • ubuntu-vm-security-update.sh
    • Supported.
    • Purpose: update and harden Ubuntu VMs with unattended upgrades, UFW, and fail2ban.
    • Risk level: high, because it modifies packages, firewall rules, and reboot behavior.
  • VMs/HostingerVM/vm-health-check.sh
    • Supported.
    • Purpose: read-only VM health and drift check for disk, memory, swap, Docker health, failed systemd units, and stale root crontab script paths.
    • Risk level: low, because it is read-only apart from an optional local log write.

Conventions

  • New standalone operational scripts should go here instead of the repo root.
  • Each script should document:
    • prerequisites
    • required environment variables
    • destructive or privileged behavior
    • example usage
  • Scripts that change host state should support --help and a non-destructive preview mode when practical.
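
As one possible way to meet the last convention, a script can route every state-changing command through a wrapper that only prints it in preview mode. The skeleton below is a sketch: the `--dry-run` flag name, the `run` helper, and the script name in the usage text are assumptions, not an established interface in this repo.

```shell
#!/usr/bin/env bash
# Hypothetical skeleton for a script with --help and a preview mode.

DRY_RUN=0

usage() {
  cat <<'EOF'
Usage: example-script.sh [--dry-run] [--help]

  --dry-run   Print the commands that would run without executing them.
  --help      Show this message and exit.
EOF
}

# Execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Parse the flags above; unknown options print usage and return 2.
parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --dry-run) DRY_RUN=1 ;;
      --help|-h) usage; exit 0 ;;
      *) echo "unknown option: $1" >&2; usage >&2; return 2 ;;
    esac
    shift
  done
}
```

A script body would call `parse_args "$@"` first and then write each privileged step as, for example, `run systemctl restart some-unit`, so the same code path serves both the preview and the real run.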

Legacy Note

The repo root still contains older shell utilities. Not all of them are deprecated, but new work should go in scripts/ for clearer ownership and discoverability.