bytelyst-devops-tools/scripts/VMs/HostingerVM/CRON_SETUP.md
Hermes VM 0a2d303f93 add HostingerVM health-check and cleanup scripts
- vm-health-check.sh: read-only checks for disk, load, RAM, swap,
  Docker containers (crash-loops + healthchecks), build cache, journal.
  Flags: --quiet, --json, --notify (Telegram). Exit 0/1/2 = OK/WARN/CRIT.

- vm-cleanup.sh: safe periodic cleanup.
  Default (weekly): build cache, journal, apt, npm, .next/cache.
  --full (monthly): adds docker system prune, pnpm store, old logs, HOLD cleanup.
  --dry-run, --install-cron, --uninstall-cron.
  Logs to /var/log/vm-cleanup.log.

Related: docs/hostinger-vm-maintenance.md, scripts/VMs/HostingerVM/CRON_SETUP.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 18:53:20 +00:00

Hostinger VM — Cron Setup

Automated maintenance schedule for srv1491630. Scripts: vm-health-check.sh (read-only) + vm-cleanup.sh (safe cleanup).


Quick install

SSH into the VM and run:

bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --install-cron

This installs the full recommended schedule. To remove it:

bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --uninstall-cron
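To confirm what the install or uninstall step actually changed, it can help to filter the crontab down to the lines that call the two scripts. The helper below is a minimal sketch, not part of either script: the function name vm_cron_entries is hypothetical, and it assumes the entries live in the current user's crontab rather than in /etc/cron.d.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the active (uncommented) crontab lines
# that call vm-cleanup.sh or vm-health-check.sh.
# It reads crontab text on stdin, so it never modifies cron itself.
vm_cron_entries() {
  grep -E '^[^#]*vm-(cleanup|health-check)\.sh' || true
}

# Typical use on the VM (assumption: entries are in this user's crontab):
#   crontab -l 2>/dev/null | vm_cron_entries
```

After --install-cron this should list one line per scheduled job; after --uninstall-cron it should print nothing.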

What gets scheduled

| Schedule | Time (UTC) | Command | What it does |
|---|---|---|---|
| Daily | 07:00 | vm-health-check.sh | Read-only check; sends Telegram alert on WARNING/CRITICAL |
| Daily | 03:00 | vm-cleanup.sh | Prune Docker build cache only (always safe) |
| Weekly | Sun 02:00 | vm-cleanup.sh | Standard cleanup (see below) |
| Monthly | 1st 01:00 | vm-cleanup.sh --full | Full cleanup (see below) |

What each mode does

Standard weekly cleanup (vm-cleanup.sh)

All steps are labelled SAFE: they remove only regenerable caches and old journal entries, and the crash-loop check changes nothing.

| Step | What's removed | Risk |
|---|---|---|
| Docker build cache | Layer cache from docker build runs | Zero: rebuilds just take longer next time |
| Crash loop check | Detection only, no changes | Zero |
| Journal vacuum | Old journal entries beyond 200 MB / 7 days | Zero: logs are already captured in syslog |
| APT cache | /var/cache/apt/archives/ | Zero: packages can be re-downloaded |
| NPM cache | ~/.npm/_cacache/ | Zero: cache is re-populated on next npm install |
| .next/cache | Webpack/babel/TSC build cache dirs | Zero: rebuilt automatically on next next build |
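For orientation, the standard steps map onto well-known commands roughly as sketched below. This is an illustration under assumptions, not the contents of vm-cleanup.sh: the run and standard_cleanup functions, the DRY_RUN variable, and the REPO_ROOT default are all hypothetical, and the real script may use different commands or options. DRY_RUN defaults to 1 here, so commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. DRY_RUN and REPO_ROOT are hypothetical names.
DRY_RUN="${DRY_RUN:-1}"
REPO_ROOT="${REPO_ROOT:-/opt/bytelyst}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '[dry-run] %s\n' "$*"
  else
    "$@"
  fi
}

standard_cleanup() {
  run docker builder prune -f                          # Docker build cache
  run journalctl --vacuum-size=200M --vacuum-time=7d   # journal vacuum
  run apt-get clean                                    # /var/cache/apt/archives/
  run npm cache clean --force                          # ~/.npm/_cacache/
  # Remove .next/cache directories without descending into them first.
  run find "$REPO_ROOT" -type d -path '*/.next/cache' -prune -exec rm -rf {} +
}
```

Calling standard_cleanup with the default settings prints the five planned commands, which is a cheap way to review them before running anything for real.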

Monthly full cleanup (vm-cleanup.sh --full)

Adds these CAREFUL steps on top of the standard run:

| Step | What's removed | Risk |
|---|---|---|
| Docker system prune | Stopped containers, unused networks, dangling images | Low: does NOT remove images used by any container |
| pnpm store prune | Packages not referenced by any node_modules | Low: only removes truly orphaned packages |
| Old log files | .gz log rotations older than 30 days | Low: old compressed logs |
| HOLD node_modules | node_modules in /opt/bytelyst/HOLD archived projects | Low: code intact, can reinstall with pnpm install |
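Before the first monthly run, it may be worth previewing which archived node_modules directories the HOLD step would target. The read-only helper below is a sketch: list_hold_node_modules is a hypothetical name, and the script's own matching rules may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the outermost node_modules directories under an
# archive root. Read-only; nothing is deleted.
list_hold_node_modules() {
  local root="${1:-/opt/bytelyst/HOLD}"
  [ -d "$root" ] || return 0
  # -prune stops find from descending into nested node_modules trees.
  find "$root" -type d -name node_modules -prune -print
}

# Typical use on the VM:
#   list_hold_node_modules | xargs -r du -sh
```

The script's own --dry-run flag is the more direct preview; this helper only shows candidate paths and sizes.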

Never touched (by design)

  • /opt/bytelyst/*/node_modules (active repos)
  • /opt/bytelyst/*/src, /app, /backend, /web source code
  • .next/standalone (production Next.js builds)
  • Docker images used by currently configured containers
  • /usr/local/lib/hermes-agent/
  • /usr/share/ollama/ (models)
  • /swapfile
  • Any database volumes

Manual crontab (if you prefer not to use --install-cron)

# Health check daily 07:00 UTC
0 7 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --quiet --notify 2>&1 | logger -t vm-health

# Build cache prune daily 03:00 UTC
0 3 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup

# Standard weekly cleanup Sunday 02:00 UTC
0 2 * * 0 bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup

# Full monthly cleanup 1st of month 01:00 UTC
0 1 1 * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --full --quiet 2>&1 | logger -t vm-cleanup

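Edit with: crontab -e

Note that the daily 03:00 and weekly Sunday 02:00 entries above run the identical command, so as written the daily job performs the full standard cleanup rather than a build-cache-only prune. Check what --install-cron writes before relying on this manual version.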


Monitoring logs

# Tail cleanup log
tail -f /var/log/vm-cleanup.log

# Tail health check log
tail -f /var/log/vm-health-check.log

# See all cron output via syslog
grep vm-cleanup /var/log/syslog | tail -20
grep vm-health /var/log/syslog | tail -20

Telegram alerts

When run with --notify, the health check script sends a Telegram message if it detects WARNING or CRITICAL. It reads credentials from $HERMES_HOME/.env (usually /root/.hermes/.env).

Required keys in that file:

TELEGRAM_BOT_TOKEN=<your-bot-token>
TELEGRAM_CHAT_ID=<your-chat-id>

Both are already set if the Hermes gateway is configured. Test with:

bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --notify
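If no alert arrives, a quick first check is whether both keys are present and non-empty in the env file. The helper below is a minimal sketch: check_telegram_env is a hypothetical name, and it assumes plain KEY=value lines without an export prefix or surrounding quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the env file defines non-empty values for
# both Telegram keys; otherwise print each missing key and return 1.
check_telegram_env() {
  local env_file="${1:-${HERMES_HOME:-/root/.hermes}/.env}"
  local key missing=0
  for key in TELEGRAM_BOT_TOKEN TELEGRAM_CHAT_ID; do
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Typical use on the VM:
#   check_telegram_env && echo "Telegram keys present"
```

It only checks that the keys exist; it does not validate the token or contact Telegram.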

Health-check thresholds (from vm-health-check.sh)

| Metric | WARNING | CRITICAL |
|---|---|---|
| Disk used % | > 55% | > 70% |
| Load average | > 4.0 | > 8.0 |
| RAM available | < 3 GB | < 1 GB |
| Swap used | > 1 GB | > 3 GB |
| Container restarts | > 10 | > 50 |
| Build cache | > 5 GB | > 20 GB |

Thresholds are constants at the top of each script — easy to adjust.
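The comparisons in the table are strict (a disk at exactly 55% is still OK), and RAM is the one metric where lower is worse. The sketch below shows one way such a classification could work and how a label maps to the 0/1/2 = OK/WARN/CRIT exit codes. The function names are hypothetical and this is not the script's actual implementation.

```shell
#!/usr/bin/env bash
# classify_above VALUE WARN CRIT: prints OK, WARN, or CRIT (higher is worse).
# awk handles fractional values such as load averages.
classify_above() {
  awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (v + 0 > c + 0) print "CRIT"
    else if (v + 0 > w + 0) print "WARN"
    else print "OK"
  }'
}

# classify_below VALUE WARN CRIT: same, for metrics where lower is worse.
classify_below() {
  awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (v + 0 < c + 0) print "CRIT"
    else if (v + 0 < w + 0) print "WARN"
    else print "OK"
  }'
}

# Map a label to the exit code convention (0/1/2 = OK/WARN/CRIT).
status_exit_code() {
  case "$1" in
    OK)   echo 0 ;;
    WARN) echo 1 ;;
    CRIT) echo 2 ;;
    *)    return 1 ;;
  esac
}
```

For example, classify_above 60 55 70 classifies a disk at 60% used, and classify_below 2.5 3 1 classifies 2.5 GB of available RAM.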


What this schedule would have caught in the May 2026 incident

If this cron schedule had been running during the 26 May 2026 incident:

  • 07:00 daily health check → container_loops CRIT: admin-web(50x) → Telegram alert sent within hours of the loop starting
  • 03:00 daily build cache prune → would have kept build cache under 5 GB instead of growing to 84 GB
  • Monthly full cleanup → would have cleared the HOLD node_modules and old logs before they became a storage crisis