docs(devops): add single-VM deployment guide with audit findings

2026-03-22 00:18:17 -07:00 · 2026-03-22 00:18:17 -07:00 · ae2af43d71
commit ae2af43d71
parent 2f06aacc27
1 changed files with 956 additions and 0 deletions
--- a/docs/devops/SINGLE_VM_DEPLOYMENT.md
+++ b/docs/devops/SINGLE_VM_DEPLOYMENT.md
@@ -0,0 +1,956 @@
+# ByteLyst Ecosystem — Single-VM Deployment Guide
+
+> Deploy the **entire** ByteLyst ecosystem on one VM, fully Dockerized, with a K3s Kubernetes layer for production-readiness practice.
+
+---
+
+## 1. Service Inventory
+
+### Shared Infrastructure (common-plat)
+
+| Service                  | Port       | Image                                                                  | RAM Est. |
+| ------------------------ | ---------- | ---------------------------------------------------------------------- | -------- |
+| **platform-service**     | 4003       | Fastify 5 + TS                                                         | ~200 MB  |
+| **extraction-service**   | 4005       | Fastify 5 + Python sidecar                                             | ~350 MB  |
+| **mcp-server**           | 4007       | Fastify 5 + TS                                                         | ~150 MB  |
+| **Cosmos DB Emulator**   | 8081, 1234 | `mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview` | ~2 GB    |
+| **Azurite** (blob)       | 10000      | `mcr.microsoft.com/azure-storage/azurite`                              | ~100 MB  |
+| **Mailpit** (SMTP)       | 1025, 8025 | `axllent/mailpit`                                                      | ~50 MB   |
+| **Traefik** (gateway)    | 80, 8080   | `traefik:v3.3`                                                         | ~100 MB  |
+| **Loki** (logs)          | 3100       | `grafana/loki`                                                         | ~200 MB  |
+| **Grafana** (dashboards) | 3000       | `grafana/grafana`                                                      | ~200 MB  |
+
+### Product Backends (Fastify 5 + TypeScript)
+
+| Product                 | Port | RAM Est. |
+| ----------------------- | ---- | -------- |
+| **LysnrAI** backend     | 4015 | ~150 MB  |
+| **MindLyst** backend    | 4014 | ~150 MB  |
+| **ChronoMind** backend  | 4011 | ~150 MB  |
+| **JarvisJr** backend    | 4012 | ~150 MB  |
+| **NomGap** backend      | 4013 | ~150 MB  |
+| **PeakPulse** backend   | 4010 | ~150 MB  |
+| **FlowMonk** backend    | 4017 | ~150 MB  |
+| **NoteLett** backend    | 4016 | ~150 MB  |
+| **ActionTrail** backend | 4018 | ~150 MB  |
+| **LocalMemGPT** backend | 4019 | ~150 MB  |
+
+### Web Dashboards (Next.js 16)
+
+| Dashboard              | Default Port | Compose Port | RAM Est. | Notes                                             |
+| ---------------------- | ------------ | ------------ | -------- | ------------------------------------------------- |
+| **admin-web**          | 3000         | **3001**     | ~250 MB  | No port in package.json; must set `PORT=3001` env |
+| **user-dashboard-web** | 3002         | 3002         | ~250 MB  | Port set in package.json                          |
+| **tracker-web**        | 3003         | 3003         | ~200 MB  | Port set in package.json                          |
+| **NomGap** web         | 3040         | 3040         | ~200 MB  | Port set in Dockerfile                            |
+| **ChronoMind** web     | 3000         | **3051**     | ~200 MB  | No port override; must set `PORT` env             |
+| **JarvisJr** web       | 3000         | **3052**     | ~200 MB  | No port override; must set `PORT` env             |
+| **FlowMonk** web       | 3000         | **3053**     | ~200 MB  | No port override; must set `PORT` env             |
+| **NoteLett** web       | 3000         | **3054**     | ~200 MB  | Dockerfile EXPOSE 3000; remap in compose          |
+| **ActionTrail** web    | 3000         | **3060**     | ~200 MB  | Dockerfile EXPOSE 3000; remap in compose          |
+| **LocalMemGPT** web    | 3070         | 3070         | ~200 MB  | Port set in package.json + Dockerfile             |
+| **MindLyst** web       | 3050         | 3050         | ~200 MB  | Port set in package.json (`-p 3050`)              |
+
+> **Port conflict warning:** Grafana uses port 3000. admin-web, ChronoMind, JarvisJr, FlowMonk, NoteLett, and ActionTrail webs all default to 3000. The compose file **must** either set `PORT` env var or remap via `ports:` mapping.
+
+### Optional / AI
+
+| Service          | Port  | RAM Est.                  |
+| ---------------- | ----- | ------------------------- |
+| **Ollama** (LLM) | 11434 | 4–16 GB (model-dependent) |
+
+---
+
+## 2. VM Sizing
+
+### Minimum (dev/staging, no Ollama)
+
+| Spec      | Value            |
+| --------- | ---------------- |
+| **vCPUs** | 8                |
+| **RAM**   | 32 GB            |
+| **Disk**  | 100 GB SSD       |
+| **OS**    | Ubuntu 24.04 LTS |
+
+**Breakdown:**
+
+- Cosmos Emulator: ~2 GB
+- 10 Fastify backends × 150 MB = ~1.5 GB
+- 3 shared services (200 + 350 + 150 MB) = ~0.7 GB
+- 11 Next.js webs (2 × 250 MB + 9 × 200 MB) = ~2.3 GB
+- Infra (Traefik, Loki, Grafana, Azurite, Mailpit) = ~0.65 GB
+- K3s overhead = ~0.5 GB
+- **Subtotal: ~7.65 GB** → headroom for spikes + build cache = **32 GB**
+
+### Recommended (with Ollama, small models)
+
+| Spec      | Value                                         |
+| --------- | --------------------------------------------- |
+| **vCPUs** | 16                                            |
+| **RAM**   | 64 GB                                         |
+| **Disk**  | 200 GB NVMe SSD                               |
+| **GPU**   | Optional NVIDIA T4/A10 for fast LLM inference |
+| **OS**    | Ubuntu 24.04 LTS                              |
+
+### Cloud Equivalents
+
+| Provider    | Instance         | vCPU | RAM    | Price (approx)   |
+| ----------- | ---------------- | ---- | ------ | ---------------- |
+| **Azure**   | Standard_D8s_v5  | 8    | 32 GB  | ~$280/mo         |
+| **Azure**   | Standard_D16s_v5 | 16   | 64 GB  | ~$560/mo         |
+| **AWS**     | m6i.2xlarge      | 8    | 32 GB  | ~$280/mo         |
+| **AWS**     | m6i.4xlarge      | 16   | 64 GB  | ~$560/mo         |
+| **Hetzner** | CPX51            | 16   | 32 GB  | ~$45/mo          |
+| **Hetzner** | CCX63            | 48   | 192 GB | ~$230/mo         |
+| **Home**    | Mac Mini M4 Pro  | 12   | 48 GB  | One-time ~$1,600 |
+
+> **Cost tip:** Hetzner is 5–10× cheaper than Azure/AWS for dev/staging.
+
+---
+
+## 3. Architecture: Docker Compose → K3s Migration Path
+
+### Phase 1: Docker Compose (after prerequisite work)
+
+> **⚠️ Prerequisite:** 6 repos need Dockerfiles created, 4 webs (ChronoMind, JarvisJr, FlowMonk, MindLyst) need `output: 'standalone'` in next.config.ts, and ALL product repos must run `docker-prep.sh` before building (see §12 Audit Findings).
+
+Create a **unified** `docker-compose.ecosystem.yml` that brings everything up.
+
+### Phase 2: K3s (single-node Kubernetes)
+
+[K3s](https://k3s.io/) is a lightweight, certified Kubernetes distro that runs on a single node. It gives you **real** `kubectl`, Helm, Ingress, and CRDs — identical APIs to production EKS/AKS/GKE.
+
+**Why K3s over minikube/kind?**
+
+- Production-grade (CNCF-certified Kubernetes distribution, originally built by Rancher)
+- Single binary, ~70 MB, installs in 30 seconds
+- Built-in Traefik Ingress (you already use Traefik!)
+- Built-in local-path StorageClass
+- Runs as systemd service (survives reboot)
+- Can scale to multi-node later by just joining worker nodes
+
+---
+
+## 4. Implementation Plan
+
+### 4.1 Phase 1 — Unified Docker Compose
+
+Create `docker-compose.ecosystem.yml` at workspace root (`~/code/mygh/`) that composes all services:
+
+**⚠️ Critical prerequisite — run BEFORE `docker compose build`:**
+
+```bash
+# Pack @bytelyst/* file: dependencies into tarballs for each product repo.
+# Every product repo has file: refs to ../learning_ai_common_plat/packages/*
+# which don't resolve inside Docker build context. docker-prep.sh packs them.
+for repo in learning_ai_trails learning_ai_local_memory_gpt learning_ai_notes learning_ai_fastgap; do
+  (cd "$repo" && ./scripts/docker-prep.sh)
+done
+# Repos without docker-prep.sh yet need it created (see §12 Audit Findings)
+```
+
+```yaml
+# ~/code/mygh/docker-compose.ecosystem.yml
+# NOTE: All product backends/webs have file: deps to @bytelyst/* packages.
+# You MUST run docker-prep.sh for each repo first (see above).
+
+services:
+  # ══════════════════════════════════════════════════════
+  # INFRASTRUCTURE
+  # ══════════════════════════════════════════════════════
+  cosmos-emulator:
+    image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview
+    ports: ['8081:8081', '1234:1234']
+    environment:
+      PROTOCOL: http
+      ENABLE_EXPLORER: 'true'
+    restart: unless-stopped
+
+  azurite:
+    image: mcr.microsoft.com/azure-storage/azurite:3.35.0
+    command: azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --skipApiVersionCheck
+    ports: ['10000:10000']
+    volumes: [azurite-data:/data]
+    restart: unless-stopped
+
+  mailpit:
+    image: axllent/mailpit:v1.27.5
+    ports: ['1025:1025', '8025:8025']
+    restart: unless-stopped
+
+  traefik:
+    image: traefik:v3.3
+    command:
+      - '--api.insecure=true'
+      - '--providers.docker=true'
+      - '--providers.docker.exposedbydefault=false'
+      - '--entrypoints.web.address=:80'
+    ports: ['80:80', '8080:8080']
+    volumes: ['/var/run/docker.sock:/var/run/docker.sock:ro']
+    restart: unless-stopped
+
+  loki:
+    image: grafana/loki:3.3.2
+    ports: ['3100:3100']
+    volumes: [loki-data:/loki]
+    restart: unless-stopped
+
+  grafana:
+    image: grafana/grafana:11.4.0
+    ports: ['3000:3000'] # NOTE: many Next.js webs also default to 3000 — avoid conflicts
+    environment:
+      GF_SECURITY_ADMIN_USER: admin
+      GF_SECURITY_ADMIN_PASSWORD: lysnrai
+    volumes: [grafana-data:/var/lib/grafana]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # SHARED SERVICES (common-plat — no file: deps, pnpm workspace handles it)
+  # ══════════════════════════════════════════════════════
+  platform-service:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/platform-service/Dockerfile
+    ports: ['4003:4003']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4003
+      COSMOS_AUTO_INIT: 'true'
+    depends_on: [cosmos-emulator, azurite, mailpit]
+    labels:
+      - 'traefik.enable=true'
+      - 'traefik.http.routers.platform.rule=Host(`platform.local`)'
+      - 'traefik.http.services.platform.loadbalancer.server.port=4003'
+    restart: unless-stopped
+
+  extraction-service:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/extraction-service/Dockerfile
+    ports: ['4005:4005']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4005
+    depends_on: [cosmos-emulator]
+    restart: unless-stopped
+
+  mcp-server:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/mcp-server/Dockerfile
+    ports: ['4007:4007']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4007
+      PLATFORM_SERVICE_URL: http://platform-service:4003
+      EXTRACTION_SERVICE_URL: http://extraction-service:4005
+    depends_on: [platform-service, extraction-service]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # PRODUCT BACKENDS
+  # All have file: deps → must run docker-prep.sh first.
+  # ActionTrail + LocalMemGPT Dockerfiles use repo-root context.
+  # Others use backend/ subdir context.
+  # ══════════════════════════════════════════════════════
+  lysnrai-backend:
+    build: ./learning_voice_ai_agent/backend # Needs Dockerfile (missing)
+    ports: ['4015:4015']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4015', SERVICE_NAME: lysnrai-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  mindlyst-backend:
+    build: ./learning_multimodal_memory_agents/backend # Needs Dockerfile (missing)
+    ports: ['4014:4014']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4014', SERVICE_NAME: mindlyst-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  chronomind-backend:
+    build: ./learning_ai_clock/backend # Needs Dockerfile (missing)
+    ports: ['4011:4011']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4011', SERVICE_NAME: chronomind-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  jarvisjr-backend:
+    build: ./learning_ai_jarvis_jr/backend # Needs Dockerfile (missing)
+    ports: ['4012:4012']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4012', SERVICE_NAME: jarvisjr-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  nomgap-backend:
+    build: ./learning_ai_fastgap/backend
+    ports: ['4013:4013']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4013', SERVICE_NAME: nomgap-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  peakpulse-backend:
+    build: ./learning_ai_peakpulse/backend # Needs Dockerfile (missing)
+    ports: ['4010:4010']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4010', SERVICE_NAME: peakpulse-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  flowmonk-backend:
+    build: ./learning_ai_flowmonk/backend # Needs Dockerfile (missing)
+    ports: ['4017:4017']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4017', SERVICE_NAME: flowmonk-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  notelett-backend:
+    build: ./learning_ai_notes/backend
+    ports: ['4016:4016']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4016', SERVICE_NAME: notelett-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  actiontrail-backend:
+    build:
+      context: ./learning_ai_trails # Dockerfile expects repo-root context
+      dockerfile: backend/Dockerfile
+    ports: ['4018:4018']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4018', SERVICE_NAME: actiontrail-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  localmemgpt-backend:
+    build:
+      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
+      dockerfile: backend/Dockerfile
+    ports: ['4019:4019']
+    env_file: [.env.ecosystem]
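+    environment: { PORT: '4019', OLLAMA_URL: 'http://host.docker.internal:11434' }
+    # On Linux, host.docker.internal is not defined by default; map it to the host gateway
+    extra_hosts: ['host.docker.internal:host-gateway']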
+    volumes: [localmemgpt-data:/app/db]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # WEB DASHBOARDS
+  # IMPORTANT: Most webs default to port 3000 internally.
+  # Use PORT env var to override, or remap via host:container ports.
+  # ══════════════════════════════════════════════════════
+  admin-web:
+    build: ./learning_ai_common_plat/dashboards/admin-web
+    ports: ['3001:3001']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 3001 # admin-web has NO port override — defaults to 3000 without this!
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  user-dashboard:
+    build: ./learning_voice_ai_agent/user-dashboard-web
+    ports: ['3002:3002']
+    env_file: [.env.ecosystem]
+    depends_on: [lysnrai-backend]
+    restart: unless-stopped
+
+  tracker-web:
+    build: ./learning_ai_common_plat/dashboards/tracker-web
+    ports: ['3003:3003']
+    env_file: [.env.ecosystem]
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  nomgap-web:
+    build: ./learning_ai_fastgap/web
+    ports: ['3040:3040']
+    environment:
+      PORT: 3040
+      NEXT_PUBLIC_NOMGAP_API_URL: http://nomgap-backend:4013/api
+      NEXT_PUBLIC_PLATFORM_SERVICE_URL: http://platform-service:4003/api
+    depends_on: [nomgap-backend]
+    restart: unless-stopped
+
+  actiontrail-web:
+    build: ./learning_ai_trails/web
+    ports: ['3060:3000'] # Internal 3000 → external 3060
+    environment:
+      NEXT_PUBLIC_API_URL: http://actiontrail-backend:4018
+    depends_on: [actiontrail-backend]
+    restart: unless-stopped
+
+  localmemgpt-web:
+    build:
+      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
+      dockerfile: web/Dockerfile
+    ports: ['3070:3070']
+    environment:
+      NEXT_PUBLIC_BACKEND_URL: http://localmemgpt-backend:4019
+    depends_on: [localmemgpt-backend]
+    restart: unless-stopped
+
+  notelett-web:
+    build: ./learning_ai_notes/web
+    ports: ['3054:3000'] # Internal 3000 → external 3054
+    environment:
+      NEXT_PUBLIC_BACKEND_URL: http://notelett-backend:4016
+    depends_on: [notelett-backend]
+    restart: unless-stopped
+
+  # Remaining webs need Dockerfiles + output:'standalone' in next.config.ts:
+  # chronomind-web (3051), jarvisjr-web (3052), flowmonk-web (3053), mindlyst-web (3050)
+
+volumes:
+  azurite-data:
+  loki-data:
+  grafana-data:
+  localmemgpt-data:
+```
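+
+`depends_on` in the compose file above only controls start order; it does not wait for `platform-service` to accept requests. A minimal sketch of gating on readiness follows. It assumes each Fastify service answers `GET /health` (the path the K8s probes in §4.2 use) and that the image ships BusyBox `wget`, as Alpine-based Node images do. The timing values are illustrative, not measured.
+
+```yaml
+# Fragment to merge into docker-compose.ecosystem.yml (or keep in an override file)
+services:
+  platform-service:
+    healthcheck:
+      # Assumption: GET /health returns 2xx once the service is ready
+      test: ['CMD', 'wget', '-q', '--spider', 'http://127.0.0.1:4003/health']
+      interval: 10s
+      timeout: 3s
+      retries: 5
+      start_period: 15s
+
+  lysnrai-backend:
+    depends_on:
+      platform-service:
+        condition: service_healthy # wait for the healthcheck, not just container start
+```
+
+The same `depends_on` long form would apply to the other product backends if this pattern proves useful.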
+
+### 4.2 Phase 2 — K3s (Single-Node Kubernetes)
+
+#### Install K3s on the VM
+
+```bash
+# Install K3s (30 seconds, includes kubectl + containerd)
+curl -sfL https://get.k3s.io | sh -
+
+# Verify
+sudo kubectl get nodes
+# NAME       STATUS   ROLES                  AGE   VERSION
+# myvm       Ready    control-plane,master   30s   v1.30.x+k3s1
+
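+# Copy kubeconfig for non-root usage
+mkdir -p ~/.kube
+sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
+sudo chown "$(id -u):$(id -g)" ~/.kube/config
+chmod 600 ~/.kube/config
+# K3s's bundled kubectl reads /etc/rancher/k3s/k3s.yaml unless KUBECONFIG is set; Helm needs it too
+export KUBECONFIG=~/.kube/config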
+
+# Install Helm
+curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
+```
+
+#### Namespace Layout
+
+```bash
+kubectl create namespace bytelyst-infra      # Cosmos, Azurite, Mailpit, Loki, Grafana
+kubectl create namespace bytelyst-platform   # platform-service, extraction, mcp
+kubectl create namespace bytelyst-products   # 10 product backends
+kubectl create namespace bytelyst-web        # All Next.js dashboards
+```
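+
+The same four namespaces can also be declared as manifests so that `kubectl apply -k k8s/` creates them along with everything else. This is a sketch; the file name `k8s/namespaces.yaml` is an assumption and does not appear in the directory layout of §8.
+
+```yaml
+# k8s/namespaces.yaml (hypothetical file name)
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: bytelyst-infra
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: bytelyst-platform
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: bytelyst-products
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: bytelyst-web
+```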
+
+#### Example K8s Manifest (one backend)
+
+```yaml
+# k8s/products/lysnrai-backend.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: lysnrai-backend
+  namespace: bytelyst-products
+  labels:
+    app: lysnrai-backend
+    product: lysnrai
+spec:
+  replicas: 1 # Scale to 2+ when ready
+  selector:
+    matchLabels:
+      app: lysnrai-backend
+  template:
+    metadata:
+      labels:
+        app: lysnrai-backend
+    spec:
+      containers:
+        - name: lysnrai-backend
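+          image: bytelyst/lysnrai-backend:latest
+          # :latest defaults to imagePullPolicy: Always, which fails for images that
+          # exist only in the node's local containerd store (see §9)
+          imagePullPolicy: IfNotPresent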
+          ports:
+            - containerPort: 4015
+          envFrom:
+            - configMapRef:
+                name: bytelyst-common-config
+            - secretRef:
+                name: bytelyst-secrets
+          env:
+            - name: PORT
+              value: '4015'
+            - name: SERVICE_NAME
+              value: lysnrai-backend
+          resources:
+            requests:
+              memory: '128Mi'
+              cpu: '100m'
+            limits:
+              memory: '256Mi'
+              cpu: '500m'
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: 4015
+            initialDelaySeconds: 10
+            periodSeconds: 30
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: 4015
+            initialDelaySeconds: 5
+            periodSeconds: 10
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: lysnrai-backend
+  namespace: bytelyst-products
+spec:
+  selector:
+    app: lysnrai-backend
+  ports:
+    - port: 4015
+      targetPort: 4015
+```
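+
+The Deployment above reads its environment from a ConfigMap named `bytelyst-common-config`, which must exist in the same namespace. A minimal sketch of that object is shown here. The keys are illustrative and should mirror whatever `.env.ecosystem` actually contains; the Service names in `bytelyst-platform` are assumed to match the compose service names. Note that the short hostnames used in compose (`platform-service`) do not resolve across namespaces, so the namespace-qualified form is used.
+
+```yaml
+# k8s/config/common-config.yaml (hypothetical file name)
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: bytelyst-common-config
+  namespace: bytelyst-products
+data:
+  # Cross-namespace DNS: <service>.<namespace>.svc.cluster.local
+  PLATFORM_SERVICE_URL: http://platform-service.bytelyst-platform.svc.cluster.local:4003
+  EXTRACTION_SERVICE_URL: http://extraction-service.bytelyst-platform.svc.cluster.local:4005
+```
+
+ConfigMaps and Secrets are namespaced, so each namespace that references `bytelyst-common-config` or `bytelyst-secrets` needs its own copy.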
+
+#### Ingress (Traefik, built into K3s)
+
+```yaml
+# k8s/ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bytelyst-ingress
+  namespace: bytelyst-products
+  annotations:
+    traefik.ingress.kubernetes.io/router.entrypoints: web
+spec:
+  rules:
+    - host: lysnrai.local
+      http:
+        paths:
+          - path: /api
+            pathType: Prefix
+            backend:
+              service:
+                name: lysnrai-backend
+                port:
+                  number: 4015
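+    # ... repeat per product (backends must live in this Ingress's namespace)
+---
+# An Ingress can only route to Services in its own namespace,
+# so platform-service gets a separate Ingress in bytelyst-platform.
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bytelyst-platform-ingress
+  namespace: bytelyst-platform
+  annotations:
+    traefik.ingress.kubernetes.io/router.entrypoints: web
+spec:
+  rules:
+    - host: platform.local
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: platform-service
+                port:
+                  number: 4003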
+```
+
+---
+
+## 5. Docker Compose → K3s Migration Cheat Sheet
+
+| Docker Compose            | K3s Equivalent                             |
+| ------------------------- | ------------------------------------------ |
+| `services:`               | `Deployment` + `Service`                   |
+| `ports:`                  | `Service` (ClusterIP/NodePort)             |
+| `env_file:`               | `ConfigMap` + `Secret`                     |
+| `depends_on:`             | `initContainers` or readiness probes       |
+| `volumes:`                | `PersistentVolumeClaim` (local-path)       |
+| `restart: unless-stopped` | Built-in (K8s always restarts pods)        |
+| `labels: traefik.*`       | `Ingress` resource                         |
+| `docker compose up`       | `kubectl apply -k k8s/`                    |
+| `docker compose logs`     | `kubectl logs -f deploy/X` or Loki/Grafana |
+| `docker compose ps`       | `kubectl get pods -A`                      |
+| Scale: change nothing     | `kubectl scale deploy/X --replicas=3`      |
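+
+The `depends_on:` row is the least direct mapping in the table, so here is a sketch of the `initContainers` variant. It assumes `platform-service` exposes `GET /health` on port 4003 and runs in the `bytelyst-platform` namespace; the container name `wait-for-platform` and the `busybox:1.36` tag are illustrative choices.
+
+```yaml
+# Fragment of a product backend Deployment (merge into spec.template.spec)
+spec:
+  template:
+    spec:
+      initContainers:
+        - name: wait-for-platform
+          image: busybox:1.36
+          command:
+            - sh
+            - -c
+            - |
+              # Block pod startup until platform-service answers its health endpoint
+              until wget -q --spider http://platform-service.bytelyst-platform.svc.cluster.local:4003/health; do
+                echo "waiting for platform-service"
+                sleep 2
+              done
+```
+
+Readiness probes alone are often enough when the backend retries its upstream calls; the init container is mainly useful when startup itself fails without the dependency.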
+
+---
+
+## 6. K3s Practice Exercises (on single VM)
+
+These exercises simulate real production scenarios:
+
+### Exercise 1: Rolling Update
+
+```bash
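+# Build the new image, import it into K3s's containerd, then roll it out with zero downtime
+docker build -t bytelyst/lysnrai-backend:v2 ./learning_voice_ai_agent/backend
+docker save bytelyst/lysnrai-backend:v2 | sudo k3s ctr images import -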
+kubectl set image deploy/lysnrai-backend lysnrai-backend=bytelyst/lysnrai-backend:v2 -n bytelyst-products
+kubectl rollout status deploy/lysnrai-backend -n bytelyst-products
+```
+
+### Exercise 2: Scale Horizontally
+
+```bash
+kubectl scale deploy/platform-service --replicas=3 -n bytelyst-platform
+# Traefik auto-balances across all 3 pods
+```
+
+### Exercise 3: ConfigMap / Secret Rotation
+
+```bash
+kubectl create secret generic bytelyst-secrets \
+  --from-literal=JWT_SECRET=new-secret \
+  --from-literal=COSMOS_KEY=new-key \
+  -n bytelyst-platform --dry-run=client -o yaml | kubectl apply -f -
+kubectl rollout restart deploy -n bytelyst-platform
+```
+
+### Exercise 4: Resource Limits + HPA
+
+```yaml
+# Auto-scale platform-service 1→5 pods based on CPU
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: platform-service-hpa
+  namespace: bytelyst-platform
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: platform-service
+  minReplicas: 1
+  maxReplicas: 5
+  metrics:
+    - type: Resource
+      resource:
+        name: cpu
+        target:
+          type: Utilization
+          averageUtilization: 70
+```
+
+### Exercise 5: Helm Chart (packaged deploy)
+
+```bash
+# Create chart scaffold
+helm create bytelyst-ecosystem
+# Templatize all 25+ services into one chart
+# Deploy: helm install bytelyst ./bytelyst-ecosystem -n bytelyst --create-namespace
+```
+
+---
+
+## 7. Scaling Path: Single VM → Multi-Node
+
+```
+Phase 1: Docker Compose          Phase 2: K3s (1 node)
+┌─────────────────────┐          ┌──────────────────────┐
+│  Single VM           │    →     │  Single VM + K3s     │
+│  docker compose up   │          │  kubectl apply -k    │
+│  ~25 containers      │          │  ~25 pods            │
+└─────────────────────┘          └──────────────────────┘
+                                          │
+                                          ▼
+Phase 3: K3s (3 nodes)           Phase 4: Managed K8s
+┌──────────────────────┐         ┌──────────────────────┐
+│  1 server + 2 agents │    →    │  AKS / EKS / GKE     │
+│  Same manifests!     │         │  Same manifests!      │
+│  Real HA             │         │  Auto-scaling nodes   │
+└──────────────────────┘         └──────────────────────┘
+```
+
+**Adding a worker node to K3s is one command:**
+
+```bash
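+# On the server VM, read the join token:
+#   sudo cat /var/lib/rancher/k3s/server/node-token
+# On the worker VM (substitute the server's IP and the token):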
+curl -sfL https://get.k3s.io | K3S_URL=https://server-ip:6443 K3S_TOKEN=<token> sh -
+```
+
+---
+
+## 8. Recommended Directory Structure
+
+```
+~/code/mygh/
+├── docker-compose.ecosystem.yml     # Phase 1: all-in-one compose
+├── .env.ecosystem                   # Shared env vars
+├── k8s/                             # Phase 2: K3s manifests
+│   ├── kustomization.yaml           # Kustomize root
+│   ├── infra/                       # Cosmos emulator, Azurite, Mailpit, Loki, Grafana
+│   ├── platform/                    # platform-service, extraction, mcp
+│   ├── products/                    # 10 product backends
+│   ├── web/                         # 10+ Next.js dashboards
+│   ├── config/                      # ConfigMaps
+│   └── secrets/                     # Secrets (gitignored)
+├── helm/                            # Helm chart (§6, Exercise 5)
+│   └── bytelyst-ecosystem/
+│       ├── Chart.yaml
+│       ├── values.yaml
+│       └── templates/
+└── scripts/
+    ├── ecosystem-up.sh              # docker compose -f docker-compose.ecosystem.yml up -d
+    ├── ecosystem-k3s-deploy.sh      # kubectl apply -k k8s/
+    └── ecosystem-build-all.sh       # Build all Docker images
+```
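+
+`kubectl apply -k k8s/` expects a `kustomization.yaml` at the root of that directory. A minimal sketch follows, assuming each listed subdirectory carries its own `kustomization.yaml` and that the Ingress lives at `k8s/ingress.yaml` as in §4.2. The gitignored `secrets/` directory is deliberately left out here and would be applied separately.
+
+```yaml
+# k8s/kustomization.yaml
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - infra      # each directory needs its own kustomization.yaml
+  - platform
+  - products
+  - web
+  - config
+  - ingress.yaml
+```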
+
+---
+
+## 9. Quick Start Commands
+
+```bash
+# ── Phase 1: Docker Compose ───────────────────────────
+cd ~/code/mygh
+
+# Build all images (first time, ~15-20 min)
+docker compose -f docker-compose.ecosystem.yml build
+
+# Start everything
+docker compose -f docker-compose.ecosystem.yml up -d
+
+# Check status
+docker compose -f docker-compose.ecosystem.yml ps
+
+# View logs
+docker compose -f docker-compose.ecosystem.yml logs -f platform-service
+
+# Tear down
+docker compose -f docker-compose.ecosystem.yml down
+
+# ── Phase 2: K3s ──────────────────────────────────────
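+# Build + load images into K3s containerd
+# (same repo-root context and Dockerfile path as the compose file)
+docker build -t bytelyst/platform-service:latest \
+  -f learning_ai_common_plat/services/platform-service/Dockerfile ./learning_ai_common_plat
+docker save bytelyst/platform-service:latest | sudo k3s ctr images import -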
+
+# Deploy all
+kubectl apply -k k8s/
+
+# Check pods
+kubectl get pods -A
+
+# Port-forward for local access
+kubectl port-forward svc/platform-service 4003:4003 -n bytelyst-platform
+```
+
+---
+
+## 10. What's NOT Dockerized Yet (gaps)
+
+| Repo            | Backend Dockerfile | Web Dockerfile      | `docker-prep.sh` | `output:'standalone'` | Status                                                         |
+| --------------- | ------------------ | ------------------- | ---------------- | --------------------- | -------------------------------------------------------------- |
+| **LysnrAI**     | ❌                 | ✅ user-dashboard   | ❌               | ✅ (conditional)      | Need backend Dockerfile + docker-prep.sh                       |
+| **MindLyst**    | ❌                 | ❌                  | ❌               | ❌                    | Need all 4                                                     |
+| **ChronoMind**  | ❌                 | ❌                  | ❌               | ❌                    | Need all 4                                                     |
+| **JarvisJr**    | ❌                 | ❌                  | ❌               | ❌                    | Need all 4                                                     |
+| **PeakPulse**   | ❌                 | ❌                  | ❌               | ❌                    | Need all 4                                                     |
+| **FlowMonk**    | ❌                 | ❌                  | ❌               | ❌                    | Need all 4                                                     |
+| **NomGap**      | ✅ ⚠️              | ✅                  | ✅               | ✅                    | Backend Dockerfile ignores `file:` deps — see §12.F3           |
+| **NoteLett**    | ✅ ⚠️              | ✅                  | ✅               | ✅                    | Backend Dockerfile `COPY .` pulls broken symlinks — see §12.F4 |
+| **ActionTrail** | ✅                 | ✅                  | ✅               | ✅                    | Ready (uses `.tarballs/` pattern)                              |
+| **LocalMemGPT** | ✅                 | ✅                  | ✅               | ✅                    | Ready (repo-root build context)                                |
+| **admin-web**   | —                  | ✅ (in common-plat) | N/A (pnpm)       | ✅ (conditional)      | Ready                                                          |
+| **tracker-web** | —                  | ✅ (in common-plat) | N/A (pnpm)       | ✅ (conditional)      | Ready                                                          |
+
+**6 repos need Dockerfiles** + `docker-prep.sh` + `output:'standalone'`. 2 existing Dockerfiles have issues.
+
+---
+
+## 11. Dockerfile Template (for missing repos)
+
+> **Critical:** These templates assume you run `docker-prep.sh` first to pack `@bytelyst/*` file: deps into `.tarballs/`. Without this, `npm ci` will fail because `file:../../learning_ai_common_plat/packages/*` doesn't exist inside the Docker build context.
+
+### Backend (Fastify 5 + TypeScript)
+
+```dockerfile
+# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
+FROM node:22-alpine AS builder
+WORKDIR /app
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci --ignore-scripts
+
+COPY tsconfig.json ./
+COPY src/ ./src/
+RUN npx tsc
+
+# Production stage
+FROM node:22-alpine
+WORKDIR /app
+ENV NODE_ENV=production
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci --omit=dev --ignore-scripts
+
+COPY --from=builder /app/dist ./dist
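+# Copy shared/product.json only if the backend reads it at runtime.
+# COPY cannot be made optional with shell syntax (`2>/dev/null || true` is parsed as
+# extra source paths and fails the build), so leave this commented out otherwise.
+# COPY shared/ ./shared/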
+
+EXPOSE ${PORT:-4010}
+CMD ["node", "dist/server.js"]
+```
+
+### Web (Next.js 16)
+
+> **Prerequisite:** `next.config.ts` MUST have `output: 'standalone'` for the standalone Dockerfile pattern to work. Without it, `.next/standalone/` won't be generated and the COPY will fail.
+
+```dockerfile
+# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
+FROM node:22-alpine AS builder
+WORKDIR /app
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci
+
+COPY . .
+
+# Dummy env vars for Next.js build-time static page collection
+ENV NEXT_PUBLIC_BACKEND_URL=http://localhost:4010
+ENV NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003
+
+RUN npm run build
+
+FROM node:22-alpine
+WORKDIR /app
+ENV NODE_ENV=production
+
+COPY --from=builder /app/.next/standalone ./
+COPY --from=builder /app/.next/static ./.next/static
+# Requires public/ to exist in the repo; COPY cannot be made optional with shell syntax
+COPY --from=builder /app/public ./public
+
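+# The standalone server binds to $HOSTNAME; set it explicitly so it listens on all interfaces
+ENV HOSTNAME=0.0.0.0
+EXPOSE 3000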
+CMD ["node", "server.js"]
+```
+
+### docker-prep.sh (for repos that don't have one yet)
+
+Copy from `learning_ai_trails/scripts/docker-prep.sh` — it handles both `backend/` and `web/` targets, packs all `file:` refs into `.tarballs/`, and rewrites `package.json` to point at them.
+
+```bash
+cp learning_ai_trails/scripts/docker-prep.sh <target-repo>/scripts/docker-prep.sh
+chmod +x <target-repo>/scripts/docker-prep.sh
+```
+
+---
+
+## 12. Audit Findings (Review 2026-03-22)
+
+Systematic code review of all claims in this document against the actual codebase.
+
+### F1. Port Conflicts (CRITICAL)
+
+**Grafana** uses port 3000. The following webs also default to 3000:
+
+- admin-web (no port in package.json)
+- ChronoMind web (no port override)
+- JarvisJr web (no port override)
+- FlowMonk web (no port override)
+- NoteLett web (Dockerfile EXPOSE 3000)
+- ActionTrail web (Dockerfile EXPOSE 3000)
+
+**Fix:** Set `PORT` env var in compose for each, or use host:container port remapping.
+
+### F2. `file:` Dependencies Break Docker Builds (CRITICAL)
+
+**Every** product backend and web has `file:../../learning_ai_common_plat/packages/*` dependencies in package.json. These resolve locally via symlinks but **fail inside Docker** because the sibling repo isn't in the build context.
+
+**Pattern:** Each repo needs a `docker-prep.sh` that:
+
+1. Runs `pnpm build` in common-plat
+2. Packs each `@bytelyst/*` package into a `.tarballs/*.tgz`
+3. Rewrites package.json `file:` refs → `file:.tarballs/bytelyst-*.tgz`
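
The three steps above could be sketched roughly as follows. This is an illustrative outline, not the contents of the existing `docker-prep.sh`: the function names, the `packages/<name>/` layout, and the version-less `bytelyst-<name>.tgz` naming are all assumptions. Step 1 (`pnpm build` in common-plat) is expected to have run already.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of docker-prep.sh steps 2 and 3. The real script in
# learning_ai_trails may be structured differently.
set -euo pipefail

# Step 2: pack every package under <common-plat>/packages into <out_dir>.
# ASSUMPTION: each package lives in packages/<name>/ and is published as
# @bytelyst/<name>; tarballs are renamed to a version-less bytelyst-<name>.tgz.
pack_packages() {
  local common_plat="$1" out_dir="$2" pkg name tgz
  mkdir -p "$out_dir"
  out_dir="$(cd "$out_dir" && pwd)"   # absolute path, because we cd below
  for pkg in "$common_plat"/packages/*/; do
    name="$(basename "$pkg")"
    # npm pack prints the generated tarball filename as its last stdout line.
    tgz="$(cd "$pkg" && npm pack --pack-destination "$out_dir" | tail -n 1)"
    mv "$out_dir/$tgz" "$out_dir/bytelyst-$name.tgz"
  done
}

# Step 3: point every file:.../packages/<name> ref at the matching tarball.
# Other dependencies (plain semver ranges) are left untouched.
rewrite_file_refs() {
  local pkg_json="$1"
  sed -E -i.bak \
    's#"file:[^"]*/packages/([^"/]+)/?"#"file:.tarballs/bytelyst-\1.tgz"#g' \
    "$pkg_json"
  rm -f "$pkg_json.bak"
}
```

A repo-level script would call `pack_packages <path-to-common-plat> .tarballs` and then `rewrite_file_refs package.json`. Restoring the original `package.json` after the image is built keeps local `file:` links working.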
+
+**Repos with `docker-prep.sh`:** ActionTrail ✅, LocalMemGPT ✅, NoteLett ✅, NomGap ✅
+**Repos missing `docker-prep.sh`:** LysnrAI, MindLyst, ChronoMind, JarvisJr, PeakPulse, FlowMonk
+
+### F3. NomGap Backend Dockerfile Ignores `file:` Deps (BUG)
+
+`@/learning_ai_fastgap/backend/Dockerfile` runs `COPY package.json` followed by `npm ci` but never copies `.tarballs/`, so the rewritten `file:` refs cannot resolve and `npm ci` fails. Add a `COPY .tarballs/ ./.tarballs/` step before `npm ci`, as in the template above.
+
+### F4. NoteLett Backend Dockerfile Copies Everything (BUG)
+
+`@/learning_ai_notes/backend/Dockerfile` does `COPY . .` in the build stage, which includes broken `node_modules` symlinks from `file:` deps. Should use explicit `COPY` of `src/`, `tsconfig.json`, and `.tarballs/` instead.
+
+### F5. Missing `output: 'standalone'` in next.config.ts (CRITICAL)
+
+The Dockerfile template copies from `.next/standalone/` — this directory only exists when `output: 'standalone'` is set in `next.config.ts`.
+
+| Web            | Has `output: 'standalone'`? | Notes                                                             |
+| -------------- | --------------------------- | ----------------------------------------------------------------- |
+| NomGap         | ✅                          | Set directly                                                      |
+| NoteLett       | ✅                          | Set directly                                                      |
+| ActionTrail    | ✅                          | Set directly                                                      |
+| LocalMemGPT    | ✅                          | Set directly                                                      |
+| admin-web      | ✅                          | Conditional: `process.env.VERCEL ? {} : { output: 'standalone' }` |
+| tracker-web    | ✅                          | Conditional (same)                                                |
+| user-dashboard | ✅                          | Conditional (same)                                                |
+| ChronoMind     | ❌                          | **Must add**                                                      |
+| JarvisJr       | ❌                          | **Must add**                                                      |
+| FlowMonk       | ❌                          | **Must add**                                                      |
+| MindLyst       | ❓                          | Unknown; needs check                                              |
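
The table can be re-checked mechanically. A minimal sketch, assuming each web app keeps its config in a `next.config.ts` whose path is passed in. `check_standalone` is a hypothetical helper that only does a text match, so a commented-out setting would still be reported as present.

```bash
#!/usr/bin/env bash
# Prints OK or MISSING for each Next.js config path given as an argument.
# Matches both the direct form and the conditional form
# (process.env.VERCEL ? {} : { output: 'standalone' }).
check_standalone() {
  local cfg
  for cfg in "$@"; do
    if grep -Eq "output:[[:space:]]*['\"]standalone['\"]" "$cfg" 2>/dev/null; then
      echo "OK      $cfg"
    else
      echo "MISSING $cfg"
    fi
  done
}
```

A missing or unreadable file is reported as `MISSING`, which is the safe default for an audit like this one.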
+
+### F6. Build Context Mismatch for ActionTrail + LocalMemGPT
+
+Their Dockerfiles expect repo-root as build context (they `COPY backend/...` and `COPY shared/...`). The compose `build:` must use `context: ./repo-name` + `dockerfile: backend/Dockerfile`, not `build: ./repo-name/backend`.
+
+**Already correct in the compose above.** Calling it out so future editors don't "simplify" it.
+
+### F7. Node.js Version Inconsistency
+
+Existing Dockerfiles use mixed Node versions:
+
+- NomGap, NoteLett: `node:20-alpine`
+- ActionTrail, LocalMemGPT: `node:22-alpine` / `node:22-slim`
+
+**Recommendation:** Standardize on `node:22-alpine` for all new Dockerfiles. Existing ones work but should be updated for consistency.
+
+### F8. Missing `--webpack` Flag for Next.js Builds
+
+Several web apps require the `--webpack` flag for builds, either because the Serwist PWA plugin is incompatible with Turbopack or because `@bytelyst/*` `file:` refs need transpilation. The Dockerfile template runs `npm run build`, so in those repos the `build` script in package.json must map to `next build --webpack`. Verify each repo's `build` script.
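
One way to verify a repo's `build` script without opening each file is a text match on package.json. This is a rough sketch: `build_uses_webpack` is a hypothetical helper, and it assumes the script sits on a single line and contains no escaped quotes.

```bash
#!/usr/bin/env bash
# Succeeds (exit 0) if the "build" script in the given package.json
# passes --webpack; fails otherwise.
build_uses_webpack() {
  grep -Eq '"build"[[:space:]]*:[[:space:]]*"[^"]*--webpack' "$1"
}
```

For example, `build_uses_webpack web/package.json || echo "build script lacks --webpack"`, where `web/package.json` stands in for the actual path in each repo.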
+
+### F9. Missing `.env.ecosystem` Template
+
+The compose references `.env.ecosystem` but the doc doesn't define its contents. Key vars needed:
+
+```env
+# .env.ecosystem — shared env for all services
+COSMOS_ENDPOINT=https://cosmos-emulator:8081
+COSMOS_KEY=<emulator-key>
+COSMOS_DATABASE=bytelyst
+JWT_SECRET=dev-ecosystem-secret-change-me
+AZURE_BLOB_CONNECTION_STRING=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=...;BlobEndpoint=http://azurite:10000/devstoreaccount1;
+PLATFORM_SERVICE_URL=http://platform-service:4003
+EXTRACTION_SERVICE_URL=http://extraction-service:4005
+DB_PROVIDER=memory
+NODE_ENV=production
+CORS_ORIGIN=*
+SMTP_HOST=mailpit
+SMTP_PORT=1025
+```
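
Once the template exists, a small check can catch variables that were never filled in. The helper below is a hypothetical sketch, not part of the repo. It treats a value as unset when it is absent, empty, or still an angle-bracket placeholder such as `<emulator-key>`. It cannot detect elided values like `AccountKey=...`.

```bash
#!/usr/bin/env bash
# Usage: missing_env_vars <env-file> VAR [VAR...]
# Prints every listed variable that is absent, empty, or still an
# angle-bracket placeholder in the given env file.
missing_env_vars() {
  local env_file="$1" var
  shift
  for var in "$@"; do
    grep -Eq "^${var}=[^<[:space:]]" "$env_file" || echo "$var"
  done
}
```

Running `missing_env_vars .env.ecosystem COSMOS_KEY JWT_SECRET SMTP_HOST` against the template as written would print `COSMOS_KEY`, because its value is still a placeholder.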
+
+### F10. `host.docker.internal` Only Works on Docker Desktop (Mac/Windows)
+
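+LocalMemGPT uses `OLLAMA_URL: 'http://host.docker.internal:11434'`. That hostname resolves automatically on Docker Desktop but **not on plain Docker Engine on a Linux VM** (the likely deployment target).
+
+**Fix on Linux:** Add `extra_hosts: ['host.docker.internal:host-gateway']` to the service. Ollama listens on `127.0.0.1` by default, so the host may also need `OLLAMA_HOST=0.0.0.0` for the container to reach it. Alternatively, use `network_mode: host` and point `OLLAMA_URL` at `http://localhost:11434`.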
+
+### Summary of Required Work Before Compose Works
+
+| Priority | Item                                                     | Count         |
+| -------- | -------------------------------------------------------- | ------------- |
+| **P0**   | Create missing `docker-prep.sh`                          | 6 repos       |
+| **P0**   | Create missing backend Dockerfiles                       | 6 repos       |
+| **P0**   | Create missing web Dockerfiles                           | 5 repos       |
+| **P0**   | Add `output: 'standalone'` to next.config.ts             | 3 webs        |
+| **P1**   | Fix NomGap backend Dockerfile (add `.tarballs/` COPY)    | 1 file        |
+| **P1**   | Fix NoteLett backend Dockerfile (explicit COPY, not `.`) | 1 file        |
+| **P1**   | Create `.env.ecosystem` template                         | 1 file        |
+| **P2**   | Standardize Node.js version to 22-alpine                 | 4 Dockerfiles |
+| **P2**   | Add `extra_hosts` for Linux VM Ollama access             | 1 service     |
+
+---
+
+## Summary
+
+| Question                       | Answer                                                                                                         |
+| ------------------------------ | -------------------------------------------------------------------------------------------------------------- |
+| **Can deploy on single VM?**   | **Yes.** All ~25 services fit in 32 GB RAM.                                                                    |
+| **All Dockerized?**            | 4/10 product repos fully Dockerized. 6 need Dockerfiles (copy-paste template).                                 |
+| **K8s practice on single VM?** | **K3s** — certified K8s, single binary, same manifests scale to multi-node or AKS/EKS/GKE.                     |
+| **Recommended VM?**            | 8 vCPU / 32 GB (min) or 16 vCPU / 64 GB (with Ollama). Hetzner ~$45/mo for dev.                                |
+| **Time to production K8s?**    | Phase 1 (compose) → Phase 2 (K3s single) → Phase 3 (K3s multi) → Phase 4 (managed). Same manifests throughout. |