diff --git a/docs/devops/SINGLE_VM_DEPLOYMENT.md b/docs/devops/SINGLE_VM_DEPLOYMENT.md index 8f99c928..3fdafb4e 100644 --- a/docs/devops/SINGLE_VM_DEPLOYMENT.md +++ b/docs/devops/SINGLE_VM_DEPLOYMENT.md @@ -134,7 +134,9 @@ ### Phase 1: Docker Compose (after prerequisite work) -> **⚠️ Prerequisite:** product repos that still rely on `file:`-based `@bytelyst/*` Docker consumption must run `docker-prep.sh` before building images (see §12 Audit Findings). FlowMonk's current backend/web Docker path is the registry-backed exception and uses repo-root build context instead. All Dockerfiles and `output: 'standalone'` configs are now in place (completed 2026-03-22). During the package-manager transition, each repo's Docker build must follow that repo's declared package manager and lockfile semantics rather than assuming `npm` or `pnpm` globally. +> **✅ Prerequisite RESOLVED (2026-03-24):** All 10 repos now consume `@bytelyst/*` packages from the Gitea npm registry. `docker-prep.sh` has been deleted from all repos. Docker builds use pnpm + BuildKit secret mount pattern. See [`GITEA_NPM_REGISTRY_MIGRATION.md`](GITEA_NPM_REGISTRY_MIGRATION.md) §14-17 for details. +> +> **📋 Enhanced plan:** See [`SINGLE_VM_ENHANCED_PLAN.md`](SINGLE_VM_ENHANCED_PLAN.md) for the updated deployment plan with Coolify, Valkey, Uptime Kuma, and other open-source tooling additions. Create a **unified** `docker-compose.ecosystem.yml` that brings everything up. diff --git a/docs/devops/SINGLE_VM_ENHANCED_PLAN.md b/docs/devops/SINGLE_VM_ENHANCED_PLAN.md new file mode 100644 index 00000000..441b2d05 --- /dev/null +++ b/docs/devops/SINGLE_VM_ENHANCED_PLAN.md @@ -0,0 +1,451 @@ +# ByteLyst Ecosystem — Enhanced Single-VM Deployment Plan + +> Supersedes the stale sections of `SINGLE_VM_DEPLOYMENT.md`. Incorporates lessons from the Gitea registry migration (2026-03-24) and introduces open-source tooling to minimize setup time while maximizing robustness. + +--- + +## 0. 
What Changed Since the Original Plan + +The original `SINGLE_VM_DEPLOYMENT.md` was written during the `file:` → registry transition. These items are now **resolved**: + +- ✅ All 10 repos consume `@bytelyst/*` from Gitea npm registry (`^0.1.0`) +- ✅ `docker-prep.sh` deleted from all repos — no more tarball prep step +- ✅ All Dockerfiles use pnpm + BuildKit secret mount pattern +- ✅ 49 packages published, 1,591 backend tests green, 9/9 web typechecks clean +- ✅ Docker builds verified for MindLyst + LysnrAI (the two non-standard repos) + +**The prerequisite blocker in §4.1 of the original plan is gone.** We can now build any image with just `docker build` + registry auth. + +--- + +## 1. Recommended Open-Source Tooling Additions + +### Tier 1 — Game Changers (add these first) + +| Tool | Replaces | Why | +|------|----------|-----| +| **[Coolify](https://coolify.io)** | Manual compose orchestration + Traefik config + SSL + deploy scripts | Self-hosted PaaS. Git-push deploys, automatic SSL (Let's Encrypt), env var management UI, Docker Compose support, real-time logs, one-click rollbacks. **Eliminates ~60% of manual deployment work.** | +| **[Uptime Kuma](https://github.com/louislam/uptime-kuma)** | Custom health-check scripts + `prototype-self-test.sh` | Beautiful status page + monitoring for all 25+ endpoints. Slack/Discord/email alerts. Multi-protocol (HTTP, TCP, DNS, Docker). Setup: 2 minutes. | +| **[Valkey](https://valkey.io)** (Redis fork, BSD licensed) | In-memory caches scattered across services | Centralized session store, rate-limit counters, pub/sub for SSE fan-out, feature flag cache, job queue backend. Eliminates per-service in-memory state that dies on restart. | + +### Tier 2 — Operational Excellence + +| Tool | Replaces | Why | +|------|----------|-----| +| **[Dozzle](https://dozzle.dev)** | Loki+Grafana (for single-VM log viewing) | Lightweight real-time Docker log viewer. Zero config, 8MB image, web UI. 
Keep Loki+Grafana for structured queries but use Dozzle for quick debugging. | +| **[Portainer CE](https://portainer.io)** | CLI-only Docker management | Visual container management, resource monitoring, compose stack deployment, volume management. Good for when the AI agent isn't available. | +| **[Restic](https://restic.net)** + cron | No backup strategy | Encrypted, deduplicated backups of Docker volumes (Cosmos data, Gitea repos, Grafana dashboards) to Azure Blob or S3. Scheduled via the platform-service jobs module. | +| **[SOPS](https://github.com/getsops/sops)** + [age](https://github.com/FiloSottile/age) | Plain `.env` files (secrets in cleartext) | Encrypt secrets in git. `sops -e .env.production > .env.production.enc`. Decrypt at deploy time. No Key Vault dependency for single-VM. | + +### Tier 3 — Developer Experience + +| Tool | Replaces | Why | +|------|----------|-----| +| **[Ollama](https://ollama.ai)** | External LLM API calls | Local LLM inference for LocalMemGPT, extraction-service, and AI coaching features. Already referenced in compose. GPU optional (CPU works for small models). | +| **[Windmill](https://windmill.dev)** | Custom bash/cron scripts | Open-source workflow engine (like n8n but code-first). Schedule package publishes, backup jobs, health sweeps, dependency updates. TypeScript/Python scripts with UI. | +| **[Caddy](https://caddyserver.com)** | Traefik (if Coolify isn't used) | Automatic HTTPS with zero config. Simpler than Traefik for single-domain setups. If using Coolify, Coolify handles this internally. | + +### Decision Matrix: Coolify vs. 
Raw Docker Compose + +| Factor | Coolify | Raw Compose | +|--------|---------|-------------| +| **Setup time** | ~15 min (one script) | ~6-7 hours | +| **SSL/HTTPS** | Automatic (Let's Encrypt) | Manual (Caddy/Traefik + certs) | +| **Git push deploy** | Built-in | Custom webhook + script | +| **Env var management** | Web UI per service | `.env` files | +| **Rollback** | One-click | `git revert` + rebuild | +| **Log viewer** | Built-in | Dozzle or Loki | +| **Resource monitoring** | Built-in dashboard | Portainer or Grafana | +| **Learning curve** | Low (GUI-driven) | Medium (YAML wrangling) | +| **Flexibility** | High (supports compose, Dockerfile, Nixpacks) | Maximum | +| **K8s migration path** | Export to compose, then convert | Direct conversion | + +**Recommendation: Use Coolify for the VM deployment.** It's a mature, actively maintained project (36K+ GitHub stars) that handles the boring plumbing. Reserve raw compose/K3s for when you need multi-node or fine-grained control. + +--- + +## 2. 
Enhanced Architecture (Single VM) + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Azure VM (32 GB) │ +│ │ +│ ┌─── Coolify (PaaS layer) ──────────────────────────────┐ │ +│ │ • Git-push deploy for all repos │ │ +│ │ • Automatic SSL via Let's Encrypt │ │ +│ │ • Reverse proxy (built-in Traefik) │ │ +│ │ • Environment variable management │ │ +│ │ • Container lifecycle management │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Infrastructure ────────────────────────────────────┐ │ +│ │ Gitea (npm registry + git + CI) port 3300 │ │ +│ │ Cosmos DB Emulator port 8081 │ │ +│ │ Valkey (Redis-compatible) port 6379 │ │ +│ │ Azurite (blob storage) port 10000 │ │ +│ │ Mailpit (SMTP sandbox) port 1025 │ │ +│ │ Ollama (LLM inference) port 11434 │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Shared Services ───────────────────────────────────┐ │ +│ │ platform-service port 4003 (37 modules) │ │ +│ │ extraction-service port 4005 (+ Python) │ │ +│ │ mcp-server port 4007 (tool hub) │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Product Backends (10) ─────────────────────────────┐ │ +│ │ PeakPulse 4010 │ ChronoMind 4011 │ JarvisJr 4012 │ │ +│ │ NomGap 4013 │ MindLyst 4014 │ LysnrAI 4015 │ │ +│ │ NoteLett 4016 │ FlowMonk 4017 │ ActionTrail 4018 │ │ +│ │ LocalMemGPT 4019 │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Web Dashboards (11) ───────────────────────────────┐ │ +│ │ admin 3001 │ user 3002 │ tracker 3003 │ │ +│ │ NomGap 3040 │ MindLyst 3050 │ ChronoMind 3051 │ │ +│ │ JarvisJr 3052 │ FlowMonk 3053 │ NoteLett 3054 │ │ +│ │ ActionTrail 3060 │ LocalMemGPT 3070 │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Observability ─────────────────────────────────────┐ │ +│ │ Uptime Kuma (status page) port 3333 │ │ +│ │ Dozzle (live log viewer) port 9999 │ │ +│ │ Grafana + Loki (structured logs) port 
3000/3100 │ │ +│ │ Portainer CE (container mgmt) port 9443 │ │ +│ └───────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─── Automation ────────────────────────────────────────┐ │ +│ │ Gitea Actions (CI runner) │ │ +│ │ Restic (scheduled volume backups) │ │ +│ └───────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**Total containers: ~35** | **Estimated RAM: ~10 GB idle, ~18 GB under load** | **VM: 32 GB recommended** + +--- + +## 3. Docker Build Optimizations + +### 3.1 BuildKit Cache Mounts (already partially used) + +Add pnpm store caching to all Dockerfiles for 3-5× faster rebuilds: + +```dockerfile +# Current (good but slow on cache miss): +RUN --mount=type=secret,id=gitea_npm_token \ + export GITEA_NPM_TOKEN="$(cat /run/secrets/gitea_npm_token)" && \ + pnpm install --ignore-scripts --lockfile=false + +# Enhanced (cache pnpm store across builds): +RUN --mount=type=secret,id=gitea_npm_token \ + --mount=type=cache,id=pnpm-store,target=/root/.local/share/pnpm/store \ + export GITEA_NPM_TOKEN="$(cat /run/secrets/gitea_npm_token)" && \ + pnpm install --ignore-scripts --lockfile=false +``` + +### 3.2 Multi-Repo Parallel Build Script + +```bash +#!/bin/bash +# scripts/build-all-images.sh — parallel Docker builds for all services +export DOCKER_BUILDKIT=1 +export TOKEN="$(cat ~/.gitea-npm-token)" + +build_image() { + local repo=$1 dockerfile=$2 tag=$3 context=$4 + echo "Building $tag..." 
+ # Dockerfile paths are given relative to each repo, so resolve them against the build context + docker build \ + --add-host localhost:host-gateway \ + --build-arg GITEA_NPM_HOST=host.docker.internal \ + --secret id=gitea_npm_token,env=TOKEN \ + -f "$context/$dockerfile" -t "$tag" "$context" 2>&1 | tail -1 +} + +# Infrastructure builds (parallel) +build_image common-plat services/platform-service/Dockerfile bytelyst/platform-service:latest ./learning_ai_common_plat & +build_image common-plat services/extraction-service/Dockerfile bytelyst/extraction-service:latest ./learning_ai_common_plat & +build_image common-plat services/mcp-server/Dockerfile bytelyst/mcp-server:latest ./learning_ai_common_plat & +wait + +# Product backends (parallel batches of 5) +build_image flowmonk backend/Dockerfile bytelyst/flowmonk-backend:latest ./learning_ai_flowmonk & +build_image notelett backend/Dockerfile bytelyst/notelett-backend:latest ./learning_ai_notes & +build_image actiontrail backend/Dockerfile bytelyst/actiontrail-backend:latest ./learning_ai_trails & +build_image localmemgpt backend/Dockerfile bytelyst/localmemgpt-backend:latest ./learning_ai_local_memory_gpt & +build_image nomgap backend/Dockerfile bytelyst/nomgap-backend:latest ./learning_ai_fastgap & +wait + +build_image chronomind backend/Dockerfile bytelyst/chronomind-backend:latest ./learning_ai_clock & +build_image jarvisjr backend/Dockerfile bytelyst/jarvisjr-backend:latest ./learning_ai_jarvis_jr & +build_image peakpulse backend/Dockerfile bytelyst/peakpulse-backend:latest ./learning_ai_peakpulse & +build_image mindlyst backend/Dockerfile bytelyst/mindlyst-backend:latest ./learning_multimodal_memory_agents & +build_image lysnrai backend/Dockerfile bytelyst/lysnrai-backend:latest ./learning_voice_ai_agent & +wait + +echo "All images built." 
+``` + +### 3.3 Docker Compose Profiles (selective startup) + +```yaml +# In docker-compose.ecosystem.yml, add profiles: +services: + cosmos-emulator: + profiles: [infra, full] + platform-service: + profiles: [platform, full] + flowmonk-backend: + profiles: [products, full] + admin-web: + profiles: [web, full] + uptime-kuma: + profiles: [observability, full] +``` + +```bash +# Start only infra: +docker compose --profile infra up -d + +# Start infra + platform: +docker compose --profile infra --profile platform up -d + +# Start everything: +docker compose --profile full up -d +``` + +--- + +## 4. Valkey Integration Plan + +### Why Valkey over raw Redis + +- BSD-licensed fork (Redis switched to SSPL) +- Drop-in Redis-compatible (same protocol, same clients) +- Actively maintained by Linux Foundation + +### What moves to Valkey + +| Current | Moves to Valkey | Benefit | +|---------|----------------|---------| +| In-memory rate limit counters (`lib/rate-limiter.ts`) | `INCR` + `EXPIRE` | Survives restarts, shared across replicas | +| In-memory feature flag cache (`lib/feature-flags.ts`) | `GET`/`SET` with TTL | Instant cross-service flag propagation | +| In-memory TTL cache (`lib/cache.ts` in ActionTrail) | Redis `GET`/`SET` | Shared cache across service replicas | +| In-process event bus (`@bytelyst/events`) | Redis Pub/Sub | Cross-service event propagation | +| SSE hub connections | Redis Pub/Sub fan-out | Multi-replica SSE without sticky sessions | +| Session tokens (Cosmos queries) | Redis session store | Sub-ms session lookups | + +### Compose entry + +```yaml +valkey: + image: valkey/valkey:8-alpine + ports: ['6379:6379'] + volumes: [valkey-data:/data] + command: valkey-server --save 60 1 --loglevel warning + healthcheck: + test: ['CMD', 'valkey-cli', 'ping'] + interval: 10s + restart: unless-stopped +``` + +### Package: `@bytelyst/cache` + +New shared package wrapping `ioredis` with Valkey connection: + +```typescript +// packages/cache/src/index.ts +import 
Redis from 'ioredis'; + +export function createCacheClient(url = process.env.VALKEY_URL ?? 'redis://localhost:6379') { + return new Redis(url, { lazyConnect: true, maxRetriesPerRequest: 3 }); +} + +export function createPubSub(url = process.env.VALKEY_URL ?? 'redis://localhost:6379') { + return { publisher: new Redis(url), subscriber: new Redis(url) }; +} +``` + +--- + +## 5. Coolify Setup (15-minute path) + +### Prerequisites + +- Ubuntu 24.04 VM with Docker installed +- Domain pointing to VM IP (e.g., `*.bytelyst.dev`) + +### Install + +```bash +# One command — installs Coolify + all dependencies +curl -fsSL https://cdn.coollabs.io/coolify/install.sh | bash +``` + +### Configure for ByteLyst + +1. **Add Gitea as git source** — Coolify connects to Gitea via API token +2. **Add each repo as a "service"** — Coolify auto-detects Dockerfile, builds, deploys +3. **Set env vars per service** — Web UI, encrypted at rest +4. **Enable auto-deploy** — Push to main → build → deploy (via Gitea webhook) +5. **Configure domains** — `platform.bytelyst.dev`, `flowmonk.bytelyst.dev`, etc. + +### What Coolify handles automatically + +- Traefik reverse proxy with automatic SSL +- Docker image builds with BuildKit +- Container health checks and auto-restart +- Rolling deploys with zero-downtime +- Resource monitoring dashboard +- Persistent volume management +- Webhook-triggered deployments from Gitea + +### What still needs manual compose + +Coolify supports Docker Compose files natively, so the infra services (Cosmos emulator, Valkey, Ollama, etc.) can be deployed as a single compose stack through the Coolify UI. + +--- + +## 6. 
Observability Stack + +### Uptime Kuma — Status Page + Alerting + +```yaml +uptime-kuma: + image: louislam/uptime-kuma:1 + ports: ['3333:3001'] + volumes: [uptime-kuma-data:/app/data] + restart: unless-stopped +``` + +**Configure 25+ monitors:** +- All `/health` endpoints (backends + services) +- Web dashboard reachability +- Cosmos emulator connectivity +- Gitea API availability +- Valkey ping +- Ollama model availability + +### Dozzle — Live Container Logs + +```yaml +dozzle: + image: amir20/dozzle:latest + ports: ['9999:8080'] + volumes: ['/var/run/docker.sock:/var/run/docker.sock:ro'] + restart: unless-stopped +``` + +### Keep Loki + Grafana for + +- Structured log queries across time ranges +- Custom dashboards (request latency, error rates, Cosmos RU consumption) +- Alert rules based on log patterns + +--- + +## 7. Backup Strategy (Restic) + +```bash +# Install restic +apt install restic + +# Init backup repo (Azure Blob, S3, local, or SFTP) +restic -r azure:bytelyst-backups:/ init + +# Backup all Docker volumes (same repository as the init step) +restic -r azure:bytelyst-backups:/ backup \ + /var/lib/docker/volumes/gitea-data/ \ + /var/lib/docker/volumes/cosmos-data/ \ + /var/lib/docker/volumes/valkey-data/ \ + /var/lib/docker/volumes/grafana-data/ \ + /var/lib/docker/volumes/uptime-kuma-data/ \ + --tag daily + +# Cron: daily at 2 AM (appends to the existing crontab instead of overwriting it) +(crontab -l 2>/dev/null; echo "0 2 * * * restic backup ... --tag daily && restic forget --keep-daily 7 --keep-weekly 4 --prune") | crontab - +``` + +--- + +## 8. Secret Management (SOPS + age) + +```bash +# Generate age key pair (one-time); age-keygen does not create the directory +mkdir -p ~/.config/sops/age +age-keygen -o ~/.config/sops/age/keys.txt + +# Create .sops.yaml in repo root +# path_regex is matched against the file being encrypted (.env.production), not the output name +cat > .sops.yaml << 'EOF' +creation_rules: + - path_regex: \.env\..*$ + age: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +EOF + +# Encrypt secrets +sops -e .env.production > .env.production.enc + +# Decrypt at deploy time (on VM) +sops -d .env.production.enc > .env.production + +# Git-safe: .env.production.enc is committed, .env.production is gitignored +``` + +--- + +## 9. 
Revised Implementation Order + +| Step | Time | What | Tools | +|------|------|------|-------| +| **1** | 15 min | Install Coolify on VM | `curl` one-liner | +| **2** | 30 min | Deploy infra compose stack (Cosmos, Valkey, Gitea, Azurite, Mailpit, Ollama) via Coolify | Coolify UI | +| **3** | 30 min | Publish 49 `@bytelyst/*` packages to VM's Gitea | `scripts/publish-all.sh` | +| **4** | 1 hr | Add all 13 service repos to Coolify (auto-detect Dockerfile, set env vars) | Coolify UI | +| **5** | 30 min | Deploy Uptime Kuma + configure 25+ health monitors | Coolify + Uptime Kuma UI | +| **6** | 15 min | Deploy Dozzle for live log viewing | Coolify | +| **7** | 30 min | Configure SOPS + age for secrets, encrypt `.env.production` | CLI | +| **8** | 30 min | Configure Restic backups for all stateful volumes | CLI + cron | +| **9** | 30 min | Smoke test: hit all `/health` endpoints, verify Uptime Kuma green | Browser + curl | + +**Total: ~4.5 hours** (down from 6-7 hours without Coolify) + +--- + +## 10. VM Sizing (Updated) + +### Minimum (32 GB) — dev/staging, no Ollama + +| Component | Count | RAM | +|-----------|-------|-----| +| Cosmos DB Emulator | 1 | ~2 GB | +| Fastify backends | 13 | ~2 GB | +| Next.js web apps | 11 | ~2.2 GB | +| Valkey | 1 | ~100 MB | +| Infra (Traefik, Loki, Grafana, Azurite, Mailpit, Gitea) | 7 | ~1 GB | +| Observability (Uptime Kuma, Dozzle, Portainer) | 3 | ~300 MB | +| Coolify overhead | 1 | ~500 MB | +| **Subtotal** | **~37** | **~8.1 GB** | +| Headroom for builds + spikes | — | ~24 GB | + +### Recommended (64 GB) — with Ollama (7B models) + +Same as above + Ollama (~8 GB for llama3:8b) = ~16 GB active, ~48 GB headroom. 
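
The sizing arithmetic above can be double-checked with a short sketch. The RAM figures are copied from the table; the `Component` type and `summarize` helper are hypothetical names used only for illustration and are not part of any `@bytelyst/*` package.

```typescript
// RAM budget check for the VM sizing tables.
// Figures come from the "Minimum (32 GB)" table; names below are illustrative only.
interface Component {
  name: string;
  count: number; // number of containers
  ramGb: number; // total RAM for all instances of this component
}

const baseline: Component[] = [
  { name: "Cosmos DB Emulator", count: 1, ramGb: 2 },
  { name: "Fastify backends", count: 13, ramGb: 2 },
  { name: "Next.js web apps", count: 11, ramGb: 2.2 },
  { name: "Valkey", count: 1, ramGb: 0.1 },
  { name: "Infra", count: 7, ramGb: 1 },
  { name: "Observability", count: 3, ramGb: 0.3 },
  { name: "Coolify overhead", count: 1, ramGb: 0.5 },
];

// Round to one decimal place to hide floating-point noise.
const round1 = (x: number): number => Math.round(x * 10) / 10;

function summarize(components: Component[], vmGb: number) {
  const containers = components.reduce((n, c) => n + c.count, 0);
  const usedGb = components.reduce((gb, c) => gb + c.ramGb, 0);
  return {
    containers,
    usedGb: round1(usedGb),
    headroomGb: round1(vmGb - usedGb),
  };
}

// 32 GB VM without Ollama, and 64 GB VM with Ollama (~8 GB, per the text).
const minimum = summarize(baseline, 32);
const withOllama = summarize(
  [...baseline, { name: "Ollama", count: 1, ramGb: 8 }],
  64,
);

console.log("32 GB VM:", minimum);
console.log("64 GB VM + Ollama:", withOllama);
```

Run as written, this should reproduce the table's subtotal (37 containers, about 8.1 GB) and the rounded headroom figures of roughly 24 GB and 48 GB. Keeping the budget in code makes it easy to re-check headroom when a service is added or a per-component estimate changes.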
+ +### Cloud Pricing (updated) + +| Provider | Instance | vCPU | RAM | Price | +|----------|----------|------|-----|-------| +| **Hetzner** | CPX51 | 16 | 32 GB | **~€45/mo** ← best value | +| **Hetzner** | CCX33 | 8 | 32 GB | **~€55/mo** (dedicated) | +| **Azure** | Standard_D8s_v5 | 8 | 32 GB | ~$280/mo | +| **Home** | Mac Mini M4 Pro | 12 | 48 GB | One-time ~$1,600 | + +--- + +## 11. What This Plan Does NOT Cover (Future Work) + +- **Multi-node K3s** — Phase 2, same manifests, add workers with `k3s agent` +- **Managed Kubernetes** (AKS/EKS) — Phase 3, same manifests + Helm chart +- **CI/CD pipeline** for automated package publish → image build → deploy +- **Custom domain + DNS** — depends on registrar choice +- **WAF / DDoS protection** — Cloudflare free tier in front of Coolify +- **Mobile app distribution** — TestFlight / Play Console (separate from VM)