- vm-health-check.sh: read-only checks for disk, load, RAM, swap, Docker containers (crash-loops + healthchecks), build cache, journal. Flags: --quiet, --json, --notify (Telegram). Exit 0/1/2 = OK/WARN/CRIT.
- vm-cleanup.sh: safe periodic cleanup. Default (weekly): build cache, journal, apt, npm, .next/cache. --full (monthly): adds docker system prune, pnpm store, old logs, HOLD cleanup. --dry-run, --install-cron, --uninstall-cron. Logs to /var/log/vm-cleanup.log.

Related: docs/hostinger-vm-maintenance.md, scripts/VMs/HostingerVM/CRON_SETUP.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hostinger VM — Cron Setup
Automated maintenance schedule for srv1491630.
Scripts: vm-health-check.sh (read-only) + vm-cleanup.sh (safe cleanup).
Quick install
SSH into the VM and run:
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --install-cron
This installs the full recommended schedule. To remove it:
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --uninstall-cron
What gets scheduled
| Schedule | Time (UTC) | Command | What it does |
|---|---|---|---|
| Daily | 07:00 | vm-health-check.sh | Read-only check; sends Telegram alert on WARNING/CRITICAL |
| Daily | 03:00 | vm-cleanup.sh | Prune Docker build cache only (always safe) |
| Weekly | Sun 02:00 | vm-cleanup.sh | Standard cleanup (see below) |
| Monthly | 1st 01:00 | vm-cleanup.sh --full | Full cleanup (see below) |
What each mode does
Standard weekly cleanup (vm-cleanup.sh)
All steps are labelled SAFE — they only remove regenerable caches.
| Step | What's removed | Risk |
|---|---|---|
| Docker build cache | Layer cache from docker build runs | Zero: rebuilds just take longer next time |
| Crash loop check | Detection only, no changes | Zero |
| Journal vacuum | Old journal entries beyond 200 MB / 7 days | Zero: logs are already captured in syslog |
| APT cache | /var/cache/apt/archives/ | Zero: packages can be re-downloaded |
| NPM cache | ~/.npm/_cacache/ | Zero: cache is re-populated on next npm install |
| .next/cache | Webpack/babel/TSC build cache dirs | Zero: rebuilt automatically the next time next build runs |
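The standard steps map onto a handful of well-known commands. The following is a minimal sketch of that mapping, not the contents of vm-cleanup.sh: the `run` helper, the `DRY_RUN` variable, and the `REPO_ROOT` default are assumptions made for illustration, and the sketch only prints commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real vm-cleanup.sh may differ.

DRY_RUN="${DRY_RUN:-1}"                  # hypothetical switch: 1 = print, 0 = execute
REPO_ROOT="${REPO_ROOT:-/opt/bytelyst}"  # assumed search root for .next/cache dirs

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '[dry-run] %s\n' "$*"
  else
    "$@"
  fi
}

standard_cleanup() {
  run docker builder prune -f                          # Docker build cache
  run journalctl --vacuum-size=200M --vacuum-time=7d   # journal limits from the table
  run apt-get clean                                    # empties /var/cache/apt/archives/
  run npm cache clean --force                          # clears ~/.npm/_cacache/
  # Remove .next/cache directories without descending into them first.
  run find "$REPO_ROOT" -type d -path '*/.next/cache' -prune -exec rm -rf {} +
}
```

Calling `standard_cleanup` with the default settings lists the five commands without running them, which is a cheap way to review a cleanup before letting it touch the disk.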
Monthly full cleanup (vm-cleanup.sh --full)
Adds these CAREFUL steps on top of the standard run:
| Step | What's removed | Risk |
|---|---|---|
| Docker system prune | Stopped containers, unused networks, dangling images | Low: does NOT remove images used by any container |
| pnpm store prune | Packages not referenced by any node_modules | Low: only removes truly orphaned packages |
| Old log files | .gz log rotations older than 30 days | Low: old compressed logs |
| HOLD node_modules | node_modules in /opt/bytelyst/HOLD archived projects | Low: code intact, can reinstall with pnpm install |
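Two of the CAREFUL steps come down to selecting files by age or by name. A minimal sketch of that selection, assuming GNU find, is below. The function names are hypothetical, and both functions only list candidates so the result can be inspected before anything is deleted. The other two steps correspond to the standard `docker system prune -f` and `pnpm store prune` commands.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real vm-cleanup.sh may select files differently.

# List compressed log rotations older than N days (default 30) under a directory.
old_gz_logs() {
  local dir="$1" days="${2:-30}"
  find "$dir" -type f -name '*.gz' -mtime +"$days"
}

# List node_modules directories under an archive root.
# -prune stops find from descending into nested node_modules.
hold_node_modules() {
  local root="$1"
  find "$root" -type d -name node_modules -prune
}
```

For example, `old_gz_logs /var/log` and `hold_node_modules /opt/bytelyst/HOLD` print what a full run would be expected to target. Note that `-mtime +30` matches files whose age is more than 30 full 24-hour periods.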
Never touched (by design)
- /opt/bytelyst/*/node_modules (active repos)
- /opt/bytelyst/*/src, /app, /backend, /web source code
- .next/standalone (production Next.js builds)
- Docker images used by currently configured containers
- /usr/local/lib/hermes-agent/
- /usr/share/ollama/ (models)
- /swapfile
- Any database volumes
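One way to enforce a list like this is a guard that every delete has to pass first. The sketch below is an assumption about how such a guard could look, not a description of vm-cleanup.sh. The `is_protected` name is hypothetical, only the filesystem paths with an unambiguous location are covered, and the HOLD archive is excluded first because the full cleanup is allowed to remove node_modules there.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; is_protected is a hypothetical helper.

# Return 0 if the path is on the never-touch list, 1 otherwise.
is_protected() {
  local p="${1%/}"   # ignore a single trailing slash
  case "$p" in
    # Archived projects are fair game for the monthly full cleanup.
    /opt/bytelyst/HOLD|/opt/bytelyst/HOLD/*) return 1 ;;
    /opt/bytelyst/*/node_modules|/opt/bytelyst/*/node_modules/*) return 0 ;;
    /opt/bytelyst/*/src|/opt/bytelyst/*/src/*) return 0 ;;
    */.next/standalone|*/.next/standalone/*) return 0 ;;
    /usr/local/lib/hermes-agent|/usr/local/lib/hermes-agent/*) return 0 ;;
    /usr/share/ollama|/usr/share/ollama/*) return 0 ;;
    /swapfile) return 0 ;;
    *) return 1 ;;
  esac
}
```

In a `case` pattern `*` also matches `/`, so the patterns err on the side of protecting more than strictly necessary. A cleanup step would call it as `is_protected "$dir" || rm -rf "$dir"`.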
Manual crontab (if you prefer not to use --install-cron)
# Health check daily 07:00 UTC
0 7 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --quiet --notify 2>&1 | logger -t vm-health
# Build cache prune daily 03:00 UTC
0 3 * * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup
# Standard weekly cleanup Sunday 02:00 UTC
0 2 * * 0 bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup
# Full monthly cleanup 1st of month 01:00 UTC
0 1 1 * * bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-cleanup.sh --full --quiet 2>&1 | logger -t vm-cleanup
Edit with: crontab -e
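If the entries are installed from a script rather than by hand, it helps to make the install repeatable so that running it twice does not duplicate jobs. The following is a minimal sketch of one way to do that, not the behaviour of --install-cron: the `merge_vm_cron` function is hypothetical. It reads an existing crontab on stdin, drops any line that mentions either script, and appends the four entries shown above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; --install-cron may work differently.

SCRIPT_DIR=/opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM

# Read a crontab on stdin, write the merged crontab to stdout.
merge_vm_cron() {
  # Keep unrelated entries; grep exits 1 when nothing is left, which is fine here.
  grep -v -e 'vm-health-check\.sh' -e 'vm-cleanup\.sh' || true
  cat <<EOF
0 7 * * * bash $SCRIPT_DIR/vm-health-check.sh --quiet --notify 2>&1 | logger -t vm-health
0 3 * * * bash $SCRIPT_DIR/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup
0 2 * * 0 bash $SCRIPT_DIR/vm-cleanup.sh --quiet 2>&1 | logger -t vm-cleanup
0 1 1 * * bash $SCRIPT_DIR/vm-cleanup.sh --full --quiet 2>&1 | logger -t vm-cleanup
EOF
}
```

A typical invocation would be `crontab -l 2>/dev/null | merge_vm_cron | crontab -`, followed by `crontab -l` to confirm the result.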
Monitoring logs
# Tail cleanup log
tail -f /var/log/vm-cleanup.log
# Tail health check log
tail -f /var/log/vm-health-check.log
# See all cron output via syslog
grep vm-cleanup /var/log/syslog | tail -20
grep vm-health /var/log/syslog | tail -20
Telegram alerts
The health check script sends a Telegram message when it detects WARNING or CRITICAL.
It reads credentials from $HERMES_HOME/.env (usually /root/.hermes/.env).
Required keys in that file:
TELEGRAM_BOT_TOKEN=<your-bot-token>
TELEGRAM_CHAT_ID=<your-chat-id>
Both are already set if Hermes gateway is configured. Test with:
bash /opt/bytelyst/learning_ai_devops_tools/scripts/VMs/HostingerVM/vm-health-check.sh --notify
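To check the credentials independently of the health check, a message can be sent straight to the Telegram Bot API `sendMessage` method. This is a minimal sketch under the assumption that the .env file uses plain `KEY=value` lines without quotes. The helper names are hypothetical and the health check script may send its alerts differently.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; helper names are hypothetical.

# Build the Bot API sendMessage endpoint for a token.
telegram_url() {
  printf 'https://api.telegram.org/bot%s/sendMessage' "$1"
}

# Read one KEY=value entry from an env file without sourcing the file.
env_value() {
  local key="$1" file="$2"
  sed -n "s/^${key}=//p" "$file" | head -n 1
}

# Send a text message using the credentials in $HERMES_HOME/.env.
send_telegram() {
  local env_file="${HERMES_HOME:-/root/.hermes}/.env" token chat_id
  token="$(env_value TELEGRAM_BOT_TOKEN "$env_file")"
  chat_id="$(env_value TELEGRAM_CHAT_ID "$env_file")"
  if [ -z "$token" ] || [ -z "$chat_id" ]; then
    echo "missing Telegram credentials in $env_file" >&2
    return 1
  fi
  curl -fsS --max-time 10 \
    --data-urlencode "chat_id=${chat_id}" \
    --data-urlencode "text=$1" \
    "$(telegram_url "$token")" >/dev/null
}
```

Running `send_telegram "test from $(hostname)"` returns non-zero if either key is missing or the API call fails, which separates credential problems from problems in the health check itself.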
Health thresholds (from vm-health-check.sh)
| Metric | WARNING | CRITICAL |
|---|---|---|
| Disk used % | > 55% | > 70% |
| Load average | > 4.0 | > 8.0 |
| RAM available | < 3 GB | < 1 GB |
| Swap used | > 1 GB | > 3 GB |
| Container restarts | > 10 | > 50 |
| Build cache | > 5 GB | > 20 GB |
Thresholds are constants at the top of each script — easy to adjust.
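The table pairs naturally with the 0/1/2 exit codes (OK/WARN/CRIT). As a minimal sketch of how a value could be compared against a WARNING/CRITICAL pair, assuming nothing about how the script itself is written, the hypothetical `classify` function below uses awk so that fractional values such as load averages work. The optional `lower` mode covers metrics like available RAM, where smaller is worse.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; classify is a hypothetical helper.

# classify VALUE WARN CRIT [higher|lower]
# Exit status: 0 = OK, 1 = WARNING, 2 = CRITICAL.
classify() {
  local value="$1" warn="$2" crit="$3" mode="${4:-higher}"
  awk -v v="$value" -v w="$warn" -v c="$crit" -v m="$mode" 'BEGIN {
    v += 0; w += 0; c += 0                        # force numeric comparison
    if (m == "lower") { v = -v; w = -w; c = -c }  # flip so "too low" acts like "too high"
    if (v > c) exit 2
    if (v > w) exit 1
    exit 0
  }'
}
```

With the thresholds above, `classify 60 55 70` (disk at 60%) exits with 1, and `classify 0.8 3 1 lower` (0.8 GB RAM available) exits with 2. An overall status would be the highest exit code across all checks.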
What this schedule would have caught in the May 2026 incident
If this cron had been running during the May 26 incident:
- 07:00 daily health check → container_loops CRIT: admin-web(50x) → Telegram alert sent within hours of the loop starting
- 03:00 daily build cache prune → would have kept build cache under 5 GB instead of growing to 84 GB
- Monthly full cleanup → would have cleared the HOLD node_modules and old logs before they became a storage crisis