# Hostinger VM: Cron Setup

Automated maintenance schedule for `srv1491630`.
Scripts: `vm-health-check.sh` (read-only) + `vm-cleanup.sh` (safe cleanup).

---

## Quick install

SSH into the VM and run:

```bash
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --install-cron
```

This installs the full recommended schedule. To remove it:

```bash
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --uninstall-cron
```

---

## What gets scheduled

| Schedule | Time (UTC) | Command | What it does |
|---|---|---|---|
| Daily | 07:00 | `vm-health-check.sh` | Read-only check; sends Telegram alert on WARNING/CRITICAL |
| Daily | 03:00 | `vm-cleanup.sh` | Prune Docker build cache only (always safe) |
| Weekly | Sun 02:00 | `vm-cleanup.sh` | Standard cleanup (see below) |
| Monthly | 1st 01:00 | `vm-cleanup.sh --full` | Full cleanup (see below) |

---

## What each mode does

### Standard weekly cleanup (`vm-cleanup.sh`)

All steps are labelled **SAFE**: they only remove regenerable caches.

| Step | What's removed | Risk |
|---|---|---|
| Docker build cache | Layer cache from `docker build` runs | Zero: rebuilds just take longer next time |
| Crash loop check | Detection only, no changes | Zero |
| Journal vacuum | Old journal entries beyond 200MB / 7 days | Zero: logs are already captured in syslog |
| APT cache | `/var/cache/apt/archives/` | Zero: packages can be re-downloaded |
| NPM cache | `~/.npm/_cacache/` | Zero: cache is re-populated on next `npm install` |
| `.next/cache` | Webpack/babel/TSC build cache dirs | Zero: rebuilt automatically on next `next build` |

### Monthly full cleanup (`vm-cleanup.sh --full`)

Adds these **CAREFUL** steps on top of the standard run:

| Step | What's removed | Risk |
|---|---|---|
| Docker system prune | Stopped containers, unused networks, dangling images | Low: does NOT remove images used by any container |
| pnpm store prune | Packages not referenced by any `node_modules` | Low: only removes truly orphaned packages |
| Old log files | `.gz` log rotations older than 30 days | Low: old compressed logs |
| HOLD node_modules | `node_modules` in `/opt/bytelyst/HOLD` archived projects | Low: code intact, can reinstall with `pnpm install` |

### Never touched (by design)

- `/opt/bytelyst/*/node_modules` (active repos)
- `/opt/bytelyst/*/src`, `/app`, `/backend`, `/web` source code
- `.next/standalone` (production Next.js builds)
- Docker images used by currently configured containers
- `/usr/local/lib/hermes-agent/`
- `/usr/share/ollama/` (models)
- `/swapfile`
- Any database volumes

---

## Manual crontab (if you prefer not to use `--install-cron`)

```
# Health check daily 07:00 UTC
0 7 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --quiet --notify 2>&1 | logger -t vm-health

# Build cache prune daily 03:00 UTC
0 3 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup

# Standard weekly cleanup Sunday 02:00 UTC
0 2 * * 0 bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup

# Full monthly cleanup 1st of month 01:00 UTC
0 1 1 * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --full --quiet 2>&1 | logger -t vm-cleanup
```

Edit with: `crontab -e`

The daily and weekly entries above run the same command line. If the daily job should prune only the Docker build cache, compare against the entries that `--install-cron` actually writes (`crontab -l`) and mirror those.

---

## Monitoring logs

```bash
# Tail cleanup log
tail -f /var/log/vm-cleanup.log

# Tail health check log
tail -f /var/log/vm-health-check.log

# See all cron output via syslog
grep vm-cleanup /var/log/syslog | tail -20
grep vm-health /var/log/syslog | tail -20
```

---

## Telegram alerts

The health check script sends a Telegram message when it detects WARNING or CRITICAL.
It reads credentials from `$HERMES_HOME/.env` (usually `/root/.hermes/.env`).

Required keys in that file:

```
TELEGRAM_BOT_TOKEN=
TELEGRAM_CHAT_ID=
```

Both are already set if Hermes gateway is configured. Test with:

```bash
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --notify
```

---

## Alert thresholds (from `vm-health-check.sh`)

| Metric | WARNING | CRITICAL |
|---|---|---|
| Disk used | > 55% | > 70% |
| Load average | > 4.0 | > 8.0 |
| RAM available | < 3 GB | < 1 GB |
| Swap used | > 1 GB | > 3 GB |
| Container restarts | > 10 | > 50 |
| Build cache | > 5 GB | > 20 GB |

Thresholds are constants at the top of each script, so they are easy to adjust.

---

## What this schedule would have caught in the May 2026 incident

If this cron schedule had been running during the May 26 incident:

- **07:00 daily health check** → `container_loops CRIT: admin-web(50x)` → Telegram alert sent within hours of the loop starting
- **03:00 daily build cache prune** → would have kept build cache under 5 GB instead of growing to 84 GB
- **Monthly full cleanup** → would have cleared the HOLD node_modules and old logs before they became a storage crisis
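
## Sketch: how a threshold maps to WARNING/CRITICAL

When adjusting the thresholds table, it can help to see the comparison spelled out. The block below is a minimal sketch, not the code in `vm-health-check.sh`: the `classify` function and the `disk_pct` variable are hypothetical names introduced here for illustration. The threshold numbers are taken from the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; not copied from vm-health-check.sh.
# Usage: classify VALUE WARN CRIT
# Prints OK, WARNING, or CRITICAL for "higher is worse" metrics
# (disk %, load average, swap used, container restarts, build cache).
classify() {
  # awk compares numerically, so integers (70) and decimals (4.0) both work.
  awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
    if (v > c)      print "CRITICAL"
    else if (v > w) print "WARNING"
    else            print "OK"
  }'
}

# Disk thresholds from the table: WARNING > 55%, CRITICAL > 70%.
# df -P prints the use percentage of / in column 5 of the second line.
disk_pct=$(df -P / | awk 'NR == 2 { gsub("%", "", $5); print $5 }')
echo "disk ${disk_pct}%: $(classify "$disk_pct" 55 70)"

# Load average thresholds from the table: WARNING > 4.0, CRITICAL > 8.0.
classify 4.5 4.0 8.0   # prints WARNING
```

The comparisons are strict (`>`), matching the table, so a value exactly on a threshold stays at the lower level. "RAM available" runs the other way (lower is worse), so it would need the comparisons reversed.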