From ae2af43d71d04f31894843e339f27add03296304 Mon Sep 17 00:00:00 2001
From: saravanakumardb1
Date: Sun, 22 Mar 2026 00:18:17 -0700
Subject: [PATCH] docs(devops): add single-VM deployment guide with audit findings

---
 docs/devops/SINGLE_VM_DEPLOYMENT.md | 956 ++++++++++++++++++++++++++++
 1 file changed, 956 insertions(+)
 create mode 100644 docs/devops/SINGLE_VM_DEPLOYMENT.md

diff --git a/docs/devops/SINGLE_VM_DEPLOYMENT.md b/docs/devops/SINGLE_VM_DEPLOYMENT.md
new file mode 100644
index 00000000..16e12d0d
--- /dev/null
+++ b/docs/devops/SINGLE_VM_DEPLOYMENT.md
@@ -0,0 +1,956 @@
+# ByteLyst Ecosystem: Single-VM Deployment Guide
+
+> Deploy the **entire** ByteLyst ecosystem on one VM, fully Dockerized, with a K3s Kubernetes layer for production-readiness practice.
+
+---
+
+## 1. Service Inventory
+
+### Shared Infrastructure (common-plat)
+
+| Service | Port | Image / Stack | RAM Est. |
+| --- | --- | --- | --- |
+| **platform-service** | 4003 | Fastify 5 + TS | ~200 MB |
+| **extraction-service** | 4005 | Fastify 5 + Python sidecar | ~350 MB |
+| **mcp-server** | 4007 | Fastify 5 + TS | ~150 MB |
+| **Cosmos DB Emulator** | 8081, 1234 | `mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview` | ~2 GB |
+| **Azurite** (blob) | 10000 | `mcr.microsoft.com/azure-storage/azurite` | ~100 MB |
+| **Mailpit** (SMTP) | 1025, 8025 | `axllent/mailpit` | ~50 MB |
+| **Traefik** (gateway) | 80, 8080 | `traefik:v3.3` | ~100 MB |
+| **Loki** (logs) | 3100 | `grafana/loki` | ~200 MB |
+| **Grafana** (dashboards) | 3000 | `grafana/grafana` | ~200 MB |
+
+### Product Backends (Fastify 5 + TypeScript)
+
+| Product | Port | RAM Est. |
+| --- | --- | --- |
+| **LysnrAI** backend | 4015 | ~150 MB |
+| **MindLyst** backend | 4014 | ~150 MB |
+| **ChronoMind** backend | 4011 | ~150 MB |
+| **JarvisJr** backend | 4012 | ~150 MB |
+| **NomGap** backend | 4013 | ~150 MB |
+| **PeakPulse** backend | 4010 | ~150 MB |
+| **FlowMonk** backend | 4017 | ~150 MB |
+| **NoteLett** backend | 4016 | ~150 MB |
+| **ActionTrail** backend | 4018 | ~150 MB |
+| **LocalMemGPT** backend | 4019 | ~150 MB |
+
+### Web Dashboards (Next.js 16)
+
+| Dashboard | Default Port | Compose Port | RAM Est. | Notes |
+| --- | --- | --- | --- | --- |
+| **admin-web** | 3000 | **3001** | ~250 MB | No port in package.json; must set `PORT=3001` env |
+| **user-dashboard-web** | 3002 | 3002 | ~250 MB | Port set in package.json |
+| **tracker-web** | 3003 | 3003 | ~200 MB | Port set in package.json |
+| **NomGap** web | 3040 | 3040 | ~200 MB | Port set in Dockerfile |
+| **ChronoMind** web | 3000 | **3051** | ~200 MB | No port override; must set `PORT` env |
+| **JarvisJr** web | 3000 | **3052** | ~200 MB | No port override; must set `PORT` env |
+| **FlowMonk** web | 3000 | **3053** | ~200 MB | No port override; must set `PORT` env |
+| **NoteLett** web | 3000 | **3054** | ~200 MB | Dockerfile EXPOSE 3000; remap in compose |
+| **ActionTrail** web | 3000 | **3060** | ~200 MB | Dockerfile EXPOSE 3000; remap in compose |
+| **LocalMemGPT** web | 3070 | 3070 | ~200 MB | Port set in package.json + Dockerfile |
+| **MindLyst** web | 3050 | 3050 | ~200 MB | Port set in package.json (`-p 3050`) |
+
+> **Port conflict warning:** Grafana uses port 3000. admin-web, ChronoMind, JarvisJr, FlowMonk, NoteLett, and ActionTrail webs all default to 3000. For each of them, the compose file **must** either set the `PORT` env var or remap the port via a `ports:` mapping.
+
+### Optional / AI
+
+| Service | Port | RAM Est. |
+| --- | --- | --- |
+| **Ollama** (LLM) | 11434 | 4–16 GB (model-dependent) |
+
+---
+
+## 2. VM Sizing
+
+### Minimum (dev/staging, no Ollama)
+
+| Spec | Value |
+| --- | --- |
+| **vCPUs** | 8 |
+| **RAM** | 32 GB |
+| **Disk** | 100 GB SSD |
+| **OS** | Ubuntu 24.04 LTS |
+
+**Breakdown:**
+
+- Cosmos Emulator: ~2 GB
+- 10 Fastify backends × 150 MB = ~1.5 GB
+- 3 shared services × 250 MB = ~0.75 GB
+- 11 Next.js webs × 200 MB = ~2.2 GB
+- Infra (Traefik, Loki, Grafana, Azurite, Mailpit) = ~0.65 GB
+- K3s overhead = ~0.5 GB
+- **Subtotal: ~7.6 GB** → headroom for spikes + build cache = **32 GB**
+
+### Recommended (with Ollama, small models)
+
+| Spec | Value |
+| --- | --- |
+| **vCPUs** | 16 |
+| **RAM** | 64 GB |
+| **Disk** | 200 GB NVMe SSD |
+| **GPU** | Optional NVIDIA T4/A10 for fast LLM inference |
+| **OS** | Ubuntu 24.04 LTS |
+
+### Cloud Equivalents
+
+| Provider | Instance | vCPU | RAM | Price (approx) |
+| --- | --- | --- | --- | --- |
+| **Azure** | Standard_D8s_v5 | 8 | 32 GB | ~$280/mo |
+| **Azure** | Standard_D16s_v5 | 16 | 64 GB | ~$560/mo |
+| **AWS** | m6i.2xlarge | 8 | 32 GB | ~$280/mo |
+| **AWS** | m6i.4xlarge | 16 | 64 GB | ~$560/mo |
+| **Hetzner** | CPX51 | 16 | 32 GB | ~$45/mo |
+| **Hetzner** | CCX63 | 48 | 192 GB | ~$230/mo |
+| **Home** | Mac Mini M4 Pro | 12 | 48 GB | One-time ~$1,600 |
+
+> **Cost tip:** Hetzner is 5–10× cheaper than Azure/AWS for dev/staging.
+
+---
+
+## 3. Architecture: Docker Compose → K3s Migration Path
+
+### Phase 1: Docker Compose (after prerequisite work)
+
+> **⚠️ Prerequisite:** 6 repos need Dockerfiles created, 3 webs need `output: 'standalone'` in next.config.ts, and ALL product repos must run `docker-prep.sh` before building (see §12 Audit Findings).
+
+Create a **unified** `docker-compose.ecosystem.yml` that brings everything up.
+
+### Phase 2: K3s (single-node Kubernetes)
+
+[K3s](https://k3s.io/) is a lightweight, certified Kubernetes distro that runs on a single node. It gives you **real** `kubectl`, Helm, Ingress, and CRDs, with the same APIs as production EKS/AKS/GKE.
+
+**Why K3s over minikube/kind?**
+
+- Production-grade (CNCF certified, used by Rancher)
+- Single binary, ~70 MB, installs in 30 seconds
+- Built-in Traefik Ingress (you already use Traefik!)
+- Built-in local-path StorageClass
+- Runs as systemd service (survives reboot)
+- Can scale to multi-node later by just joining worker nodes
+
+---
+
+## 4. Implementation Plan
+
+### 4.1 Phase 1: Unified Docker Compose
+
+Create `docker-compose.ecosystem.yml` at the workspace root (`~/code/mygh/`) to compose all services.
+
+**⚠️ Critical prerequisite: run this BEFORE `docker compose build`:**
+
+```bash
+# Run from the workspace root (~/code/mygh).
+# Pack @bytelyst/* file: dependencies into tarballs for each product repo.
+# Every product repo has file: refs to ../learning_ai_common_plat/packages/*
+# which don't resolve inside Docker build context. docker-prep.sh packs them.
+for repo in learning_ai_trails learning_ai_local_memory_gpt learning_ai_notes learning_ai_fastgap; do
+  (cd "$repo" && ./scripts/docker-prep.sh)
+done
+# Repos without docker-prep.sh yet need it created (see §12 Audit Findings)
+```
+
+```yaml
+# ~/code/mygh/docker-compose.ecosystem.yml
+# NOTE: All product backends/webs have file: deps to @bytelyst/* packages.
+# You MUST run docker-prep.sh for each repo first (see above).
+
+services:
+  # ══════════════════════════════════════════════════════
+  # INFRASTRUCTURE
+  # ══════════════════════════════════════════════════════
+  cosmos-emulator:
+    image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview
+    ports: ['8081:8081', '1234:1234']
+    environment:
+      PROTOCOL: http
+      ENABLE_EXPLORER: 'true'
+    restart: unless-stopped
+
+  azurite:
+    image: mcr.microsoft.com/azure-storage/azurite:3.35.0
+    command: azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --skipApiVersionCheck
+    ports: ['10000:10000']
+    volumes: [azurite-data:/data]
+    restart: unless-stopped
+
+  mailpit:
+    image: axllent/mailpit:v1.27.5
+    ports: ['1025:1025', '8025:8025']
+    restart: unless-stopped
+
+  traefik:
+    image: traefik:v3.3
+    command:
+      - '--api.insecure=true'
+      - '--providers.docker=true'
+      - '--providers.docker.exposedbydefault=false'
+      - '--entrypoints.web.address=:80'
+    ports: ['80:80', '8080:8080']
+    volumes: ['/var/run/docker.sock:/var/run/docker.sock:ro']
+    restart: unless-stopped
+
+  loki:
+    image: grafana/loki:3.3.2
+    ports: ['3100:3100']
+    volumes: [loki-data:/loki]
+    restart: unless-stopped
+
+  grafana:
+    image: grafana/grafana:11.4.0
+    ports: ['3000:3000'] # NOTE: many Next.js webs also default to 3000; avoid conflicts
+    environment:
+      GF_SECURITY_ADMIN_USER: admin
+      GF_SECURITY_ADMIN_PASSWORD: lysnrai
+    volumes: [grafana-data:/var/lib/grafana]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # SHARED SERVICES (common-plat — no file: deps, pnpm workspace handles it)
+  # ══════════════════════════════════════════════════════
+  platform-service:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/platform-service/Dockerfile
+    ports: ['4003:4003']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4003
+      COSMOS_AUTO_INIT: 'true'
+    depends_on: [cosmos-emulator, azurite, mailpit]
+    labels:
+      - 'traefik.enable=true'
+      - 'traefik.http.routers.platform.rule=Host(`platform.local`)'
+      - 'traefik.http.services.platform.loadbalancer.server.port=4003'
+    restart: unless-stopped
+
+  extraction-service:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/extraction-service/Dockerfile
+    ports: ['4005:4005']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4005
+    depends_on: [cosmos-emulator]
+    restart: unless-stopped
+
+  mcp-server:
+    build:
+      context: ./learning_ai_common_plat
+      dockerfile: services/mcp-server/Dockerfile
+    ports: ['4007:4007']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 4007
+      PLATFORM_SERVICE_URL: http://platform-service:4003
+      EXTRACTION_SERVICE_URL: http://extraction-service:4005
+    depends_on: [platform-service, extraction-service]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # PRODUCT BACKENDS
+  # All have file: deps → must run docker-prep.sh first.
+  # ActionTrail + LocalMemGPT Dockerfiles use repo-root context.
+  # Others use backend/ subdir context.
+  # ══════════════════════════════════════════════════════
+  lysnrai-backend:
+    build: ./learning_voice_ai_agent/backend # Needs Dockerfile (missing)
+    ports: ['4015:4015']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4015', SERVICE_NAME: lysnrai-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  mindlyst-backend:
+    build: ./learning_multimodal_memory_agents/backend # Needs Dockerfile (missing)
+    ports: ['4014:4014']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4014', SERVICE_NAME: mindlyst-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  chronomind-backend:
+    build: ./learning_ai_clock/backend # Needs Dockerfile (missing)
+    ports: ['4011:4011']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4011', SERVICE_NAME: chronomind-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  jarvisjr-backend:
+    build: ./learning_ai_jarvis_jr/backend # Needs Dockerfile (missing)
+    ports: ['4012:4012']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4012', SERVICE_NAME: jarvisjr-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  nomgap-backend:
+    build: ./learning_ai_fastgap/backend
+    ports: ['4013:4013']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4013', SERVICE_NAME: nomgap-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  peakpulse-backend:
+    build: ./learning_ai_peakpulse/backend # Needs Dockerfile (missing)
+    ports: ['4010:4010']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4010', SERVICE_NAME: peakpulse-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  flowmonk-backend:
+    build: ./learning_ai_flowmonk/backend # Needs Dockerfile (missing)
+    ports: ['4017:4017']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4017', SERVICE_NAME: flowmonk-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  notelett-backend:
+    build: ./learning_ai_notes/backend
+    ports: ['4016:4016']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4016', SERVICE_NAME: notelett-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  actiontrail-backend:
+    build:
+      context: ./learning_ai_trails # Dockerfile expects repo-root context
+      dockerfile: backend/Dockerfile
+    ports: ['4018:4018']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4018', SERVICE_NAME: actiontrail-backend }
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  localmemgpt-backend:
+    build:
+      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
+      dockerfile: backend/Dockerfile
+    ports: ['4019:4019']
+    env_file: [.env.ecosystem]
+    environment: { PORT: '4019', OLLAMA_URL: 'http://host.docker.internal:11434' }
+    volumes: [localmemgpt-data:/app/db]
+    restart: unless-stopped
+
+  # ══════════════════════════════════════════════════════
+  # WEB DASHBOARDS
+  # IMPORTANT: Most webs default to port 3000 internally.
+  # Use PORT env var to override, or remap via host:container ports.
+  # ══════════════════════════════════════════════════════
+  admin-web:
+    build: ./learning_ai_common_plat/dashboards/admin-web
+    ports: ['3001:3001']
+    env_file: [.env.ecosystem]
+    environment:
+      PORT: 3001 # admin-web has NO port override; it defaults to 3000 without this!
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  user-dashboard:
+    build: ./learning_voice_ai_agent/user-dashboard-web
+    ports: ['3002:3002']
+    env_file: [.env.ecosystem]
+    depends_on: [lysnrai-backend]
+    restart: unless-stopped
+
+  tracker-web:
+    build: ./learning_ai_common_plat/dashboards/tracker-web
+    ports: ['3003:3003']
+    env_file: [.env.ecosystem]
+    depends_on: [platform-service]
+    restart: unless-stopped
+
+  nomgap-web:
+    build: ./learning_ai_fastgap/web
+    ports: ['3040:3040']
+    environment:
+      PORT: 3040
+      NEXT_PUBLIC_NOMGAP_API_URL: http://nomgap-backend:4013/api
+      NEXT_PUBLIC_PLATFORM_SERVICE_URL: http://platform-service:4003/api
+    depends_on: [nomgap-backend]
+    restart: unless-stopped
+
+  actiontrail-web:
+    build: ./learning_ai_trails/web
+    ports: ['3060:3000'] # Internal 3000 → external 3060
+    environment:
+      NEXT_PUBLIC_API_URL: http://actiontrail-backend:4018
+    depends_on: [actiontrail-backend]
+    restart: unless-stopped
+
+  localmemgpt-web:
+    build:
+      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
+      dockerfile: web/Dockerfile
+    ports: ['3070:3070']
+    environment:
+      NEXT_PUBLIC_BACKEND_URL: http://localmemgpt-backend:4019
+    depends_on: [localmemgpt-backend]
+    restart: unless-stopped
+
+  notelett-web:
+    build: ./learning_ai_notes/web
+    ports: ['3054:3000'] # Internal 3000 → external 3054
+    environment:
+      NEXT_PUBLIC_BACKEND_URL: http://notelett-backend:4016
+    depends_on: [notelett-backend]
+    restart: unless-stopped
+
+  # Remaining webs need Dockerfiles + output:'standalone' in next.config.ts:
+  # chronomind-web (3051), jarvisjr-web (3052), flowmonk-web (3053), mindlyst-web (3050)
+
+volumes:
+  azurite-data:
+  loki-data:
+  grafana-data:
+  localmemgpt-data:
+```
+
+### 4.2 Phase 2: K3s (Single-Node Kubernetes)
+
+#### Install K3s on the VM
+
+```bash
+# Install K3s (30 seconds, includes kubectl + containerd)
+curl -sfL https://get.k3s.io | sh -
+
+# Verify
+sudo kubectl get nodes
+# NAME   STATUS   ROLES                  AGE   VERSION
+# myvm   Ready    control-plane,master   30s   v1.30.x+k3s1
+
+# Copy kubeconfig for non-root usage
+mkdir -p ~/.kube
+sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
+sudo chown $(id -u):$(id -g) ~/.kube/config
+
+# Install Helm
+curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
+```
+
+#### Namespace Layout
+
+```bash
+kubectl create namespace bytelyst-infra     # Cosmos, Azurite, Mailpit, Loki, Grafana
+kubectl create namespace bytelyst-platform  # platform-service, extraction, mcp
+kubectl create namespace bytelyst-products  # 10 product backends
+kubectl create namespace bytelyst-web       # All Next.js dashboards
+```
+
+#### Example K8s Manifest (one backend)
+
+```yaml
+# k8s/products/lysnrai-backend.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: lysnrai-backend
+  namespace: bytelyst-products
+  labels:
+    app: lysnrai-backend
+    product: lysnrai
+spec:
+  replicas: 1 # Scale to 2+ when ready
+  selector:
+    matchLabels:
+      app: lysnrai-backend
+  template:
+    metadata:
+      labels:
+        app: lysnrai-backend
+    spec:
+      containers:
+        - name: lysnrai-backend
+          image: bytelyst/lysnrai-backend:latest
+          # Image is imported locally (see §9); a ':latest' tag otherwise defaults to Always
+          imagePullPolicy: IfNotPresent
+          ports:
+            - containerPort: 4015
+          envFrom:
+            - configMapRef:
+                name: bytelyst-common-config
+            - secretRef:
+                name: bytelyst-secrets
+          env:
+            - name: PORT
+              value: '4015'
+            - name: SERVICE_NAME
+              value: lysnrai-backend
+          resources:
+            requests:
+              memory: '128Mi'
+              cpu: '100m'
+            limits:
+              memory: '256Mi'
+              cpu: '500m'
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: 4015
+            initialDelaySeconds: 10
+            periodSeconds: 30
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: 4015
+            initialDelaySeconds: 5
+            periodSeconds: 10
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: lysnrai-backend
+  namespace: bytelyst-products
+spec:
+  selector:
+    app: lysnrai-backend
+  ports:
+    - port: 4015
+      targetPort: 4015
+```
+
+#### Ingress (Traefik, built into K3s)
+
+```yaml
+# k8s/ingress.yaml
+# An Ingress can only route to Services in its own namespace,
+# so products and platform-service each get their own Ingress.
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bytelyst-ingress
+  namespace: bytelyst-products
+  annotations:
+    traefik.ingress.kubernetes.io/router.entrypoints: web
+spec:
+  rules:
+    - host: lysnrai.local
+      http:
+        paths:
+          - path: /api
+            pathType: Prefix
+            backend:
+              service:
+                name: lysnrai-backend
+                port:
+                  number: 4015
+    # ... repeat per product
+---
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: bytelyst-platform-ingress
+  namespace: bytelyst-platform
+  annotations:
+    traefik.ingress.kubernetes.io/router.entrypoints: web
+spec:
+  rules:
+    - host: platform.local
+      http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: platform-service
+                port:
+                  number: 4003
+```
+
+---
+
+## 5. Docker Compose → K3s Migration Cheat Sheet
+
+| Docker Compose | K3s Equivalent |
+| --- | --- |
+| `services:` | `Deployment` + `Service` |
+| `ports:` | `Service` (ClusterIP/NodePort) |
+| `env_file:` | `ConfigMap` + `Secret` |
+| `depends_on:` | `initContainers` or readiness probes |
+| `volumes:` | `PersistentVolumeClaim` (local-path) |
+| `restart: unless-stopped` | Built-in (K8s always restarts pods) |
+| `labels: traefik.*` | `Ingress` resource |
+| `docker compose up` | `kubectl apply -k k8s/` |
+| `docker compose logs` | `kubectl logs -f deploy/X` or Loki/Grafana |
+| `docker compose ps` | `kubectl get pods -A` |
+| Scale: `docker compose up -d --scale X=3` (clashes with fixed host ports) | `kubectl scale deploy/X --replicas=3` |
+
+---
+
+## 6. K3s Practice Exercises (on single VM)
+
+These exercises simulate real production scenarios:
+
+### Exercise 1: Rolling Update
+
+```bash
+# Build new image, import it into K3s containerd, deploy with zero downtime
+docker build -t bytelyst/lysnrai-backend:v2 ./learning_voice_ai_agent/backend
+docker save bytelyst/lysnrai-backend:v2 | sudo k3s ctr images import -
+kubectl set image deploy/lysnrai-backend lysnrai-backend=bytelyst/lysnrai-backend:v2 -n bytelyst-products
+kubectl rollout status deploy/lysnrai-backend -n bytelyst-products
+```
+
+### Exercise 2: Scale Horizontally
+
+```bash
+kubectl scale deploy/platform-service --replicas=3 -n bytelyst-platform
+# Requests to the platform-service Service are now balanced across all 3 pods
+```
+
+### Exercise 3: ConfigMap / Secret Rotation
+
+```bash
+kubectl create secret generic bytelyst-secrets \
+  --from-literal=JWT_SECRET=new-secret \
+  --from-literal=COSMOS_KEY=new-key \
+  -n bytelyst-platform --dry-run=client -o yaml | kubectl apply -f -
+kubectl rollout restart deploy -n bytelyst-platform
+```
+
+### Exercise 4: Resource Limits + HPA
+
+```yaml
+# Auto-scale platform-service 1→5 pods based on CPU.
+# Needs CPU requests on the Deployment; K3s bundles metrics-server by default.
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: platform-service-hpa
+  namespace: bytelyst-platform
+spec:
+  scaleTargetRef:
+    apiVersion: apps/v1
+    kind: Deployment
+    name: platform-service
+  minReplicas: 1
+  maxReplicas: 5
+  metrics:
+    - type: Resource
+      resource:
+        name: cpu
+        target:
+          type: Utilization
+          averageUtilization: 70
+```
+
+### Exercise 5: Helm Chart (packaged deploy)
+
+```bash
+# Create chart scaffold
+helm create bytelyst-ecosystem
+# Templatize all 25+ services into one chart
+# Deploy: helm install bytelyst ./bytelyst-ecosystem -n bytelyst --create-namespace
+```
+
+---
+
+## 7. Scaling Path: Single VM → Multi-Node
+
+```
+Phase 1: Docker Compose          Phase 2: K3s (1 node)
+┌─────────────────────┐          ┌──────────────────────┐
+│  Single VM          │    →     │  Single VM + K3s     │
+│  docker compose up  │          │  kubectl apply -k    │
+│  ~25 containers     │          │  ~25 pods            │
+└─────────────────────┘          └──────────────────────┘
+                                            │
+                                            ▼
+Phase 3: K3s (3 nodes)           Phase 4: Managed K8s
+┌──────────────────────┐         ┌──────────────────────┐
+│  1 server + 2 agents │    →    │  AKS / EKS / GKE     │
+│  Same manifests!     │         │  Same manifests!     │
+│  Real HA             │         │  Auto-scaling nodes  │
+└──────────────────────┘         └──────────────────────┘
+```
+
+**Adding a worker node to K3s is one command:**
+
+```bash
+# On the worker VM (the token is stored on the server at
+# /var/lib/rancher/k3s/server/node-token):
+curl -sfL https://get.k3s.io | K3S_URL=https://server-ip:6443 K3S_TOKEN= sh -
+```
+
+---
+
+## 8. Recommended Directory Structure
+
+```
+~/code/mygh/
+├── docker-compose.ecosystem.yml    # Phase 1: all-in-one compose
+├── .env.ecosystem                  # Shared env vars
+├── k8s/                            # Phase 2: K3s manifests
+│   ├── kustomization.yaml          # Kustomize root
+│   ├── infra/                      # Cosmos emulator, Azurite, Mailpit, Loki, Grafana
+│   ├── platform/                   # platform-service, extraction, mcp
+│   ├── products/                   # 10 product backends
+│   ├── web/                        # 10+ Next.js dashboards
+│   ├── config/                     # ConfigMaps
+│   └── secrets/                    # Secrets (gitignored)
+├── helm/                           # Phase 3: Helm chart
+│   └── bytelyst-ecosystem/
+│       ├── Chart.yaml
+│       ├── values.yaml
+│       └── templates/
+└── scripts/
+    ├── ecosystem-up.sh             # docker compose -f docker-compose.ecosystem.yml up -d
+    ├── ecosystem-k3s-deploy.sh     # kubectl apply -k k8s/
+    └── ecosystem-build-all.sh      # Build all Docker images
+```
+
+---
+
+## 9. Quick Start Commands
+
+```bash
+# ── Phase 1: Docker Compose ───────────────────────────
+cd ~/code/mygh
+
+# Build all images (first time, ~15-20 min)
+docker compose -f docker-compose.ecosystem.yml build
+
+# Start everything
+docker compose -f docker-compose.ecosystem.yml up -d
+
+# Check status
+docker compose -f docker-compose.ecosystem.yml ps
+
+# View logs
+docker compose -f docker-compose.ecosystem.yml logs -f platform-service
+
+# Tear down
+docker compose -f docker-compose.ecosystem.yml down
+
+# ── Phase 2: K3s ──────────────────────────────────────
+# Build + load images into K3s containerd
+# (same context + Dockerfile as the compose file: common-plat root, per-service Dockerfile)
+docker build -t bytelyst/platform-service:latest \
+  -f learning_ai_common_plat/services/platform-service/Dockerfile \
+  ./learning_ai_common_plat
+# Pipe through stdin: sudo does not pass a <(...) process-substitution fd through
+docker save bytelyst/platform-service:latest | sudo k3s ctr images import -
+
+# Deploy all
+kubectl apply -k k8s/
+
+# Check pods
+kubectl get pods -A
+
+# Port-forward for local access
+kubectl port-forward svc/platform-service 4003:4003 -n bytelyst-platform
+```
+
+---
+
+## 10. What's NOT Dockerized Yet (gaps)
+
+| Repo | Backend Dockerfile | Web Dockerfile | `docker-prep.sh` | `output:'standalone'` | Status |
+| --- | --- | --- | --- | --- | --- |
+| **LysnrAI** | ❌ | ✅ user-dashboard | ❌ | ✅ (conditional) | Need backend Dockerfile + docker-prep.sh |
+| **MindLyst** | ❌ | ❌ | ❌ | ❌ | Need all 4 |
+| **ChronoMind** | ❌ | ❌ | ❌ | ❌ | Need all 4 |
+| **JarvisJr** | ❌ | ❌ | ❌ | ❌ | Need all 4 |
+| **PeakPulse** | ❌ | ❌ | ❌ | ❌ | Need all 4 |
+| **FlowMonk** | ❌ | ❌ | ❌ | ❌ | Need all 4 |
+| **NomGap** | ✅ ⚠️ | ✅ | ✅ | ✅ | Backend Dockerfile ignores `file:` deps; see §12.F3 |
+| **NoteLett** | ✅ ⚠️ | ✅ | ✅ | ✅ | Backend Dockerfile `COPY .` pulls broken symlinks; see §12.F4 |
+| **ActionTrail** | ✅ | ✅ | ✅ | ✅ | Ready (uses `.tarballs/` pattern) |
+| **LocalMemGPT** | ✅ | ✅ | ✅ | ✅ | Ready (repo-root build context) |
+| **admin-web** | N/A | ✅ (in common-plat) | N/A (pnpm) | ✅ (conditional) | Ready |
+| **tracker-web** | N/A | ✅ (in common-plat) | N/A (pnpm) | ✅ (conditional) | Ready |
+
+**6 repos need Dockerfiles** + `docker-prep.sh` + `output:'standalone'`. 2 existing Dockerfiles have issues.
+
+---
+
+## 11. Dockerfile Template (for missing repos)
+
+> **Critical:** These templates assume you run `docker-prep.sh` first to pack `@bytelyst/*` file: deps into `.tarballs/`. Without this, `npm ci` will fail because `file:../../learning_ai_common_plat/packages/*` doesn't exist inside the Docker build context.
+
+### Backend (Fastify 5 + TypeScript)
+
+```dockerfile
+# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
+FROM node:22-alpine AS builder
+WORKDIR /app
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci --ignore-scripts
+
+COPY tsconfig.json ./
+COPY src/ ./src/
+RUN npx tsc
+
+# Production stage
+FROM node:22-alpine
+WORKDIR /app
+ENV NODE_ENV=production
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci --omit=dev --ignore-scripts
+
+COPY --from=builder /app/dist ./dist
+# Uncomment if the backend reads shared/product.json at runtime.
+# COPY is not run by a shell, so `2>/dev/null || true` cannot make it optional;
+# the path must exist in the build context or the build fails.
+# COPY shared/ ./shared/
+
+EXPOSE ${PORT:-4010}
+CMD ["node", "dist/server.js"]
+```
+
+### Web (Next.js 16)
+
+> **Prerequisite:** `next.config.ts` MUST have `output: 'standalone'` for the standalone Dockerfile pattern to work. Without it, `.next/standalone/` won't be generated and the COPY will fail.
+
+```dockerfile
+# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
+FROM node:22-alpine AS builder
+WORKDIR /app
+
+COPY package.json package-lock.json ./
+COPY .tarballs/ ./.tarballs/
+RUN npm ci
+
+COPY . .
+
+# Dummy env vars for Next.js build-time static page collection
+ENV NEXT_PUBLIC_BACKEND_URL=http://localhost:4010
+ENV NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003
+
+RUN npm run build
+
+FROM node:22-alpine
+WORKDIR /app
+ENV NODE_ENV=production
+# Bind the standalone server on all interfaces inside the container
+ENV HOSTNAME=0.0.0.0
+
+COPY --from=builder /app/.next/standalone ./
+COPY --from=builder /app/.next/static ./.next/static
+# COPY is not run by a shell, so `2>/dev/null || true` is not valid here;
+# public/ must exist in the repo (add an empty one if needed)
+COPY --from=builder /app/public ./public
+
+EXPOSE 3000
+CMD ["node", "server.js"]
+```
+
+### docker-prep.sh (for repos that don't have one yet)
+
+Copy from `learning_ai_trails/scripts/docker-prep.sh`. It handles both `backend/` and `web/` targets, packs all `file:` refs into `.tarballs/`, and rewrites `package.json` to point at them.
+
+```bash
+cp learning_ai_trails/scripts/docker-prep.sh /scripts/docker-prep.sh
+chmod +x /scripts/docker-prep.sh
+```
+
+---
+
+## 12. Audit Findings (Review 2026-03-22)
+
+Systematic code review of all claims in this document against the actual codebase.
+
+### F1. Port Conflicts (CRITICAL)
+
+**Grafana** uses port 3000. The following webs also default to 3000:
+
+- admin-web (no port in package.json)
+- ChronoMind web (no port override)
+- JarvisJr web (no port override)
+- FlowMonk web (no port override)
+- NoteLett web (Dockerfile EXPOSE 3000)
+- ActionTrail web (Dockerfile EXPOSE 3000)
+
+**Fix:** Set `PORT` env var in compose for each, or use host:container port remapping.
+
+### F2. `file:` Dependencies Break Docker Builds (CRITICAL)
+
+**Every** product backend and web has `file:../../learning_ai_common_plat/packages/*` dependencies in package.json. These resolve locally via symlinks but **fail inside Docker** because the sibling repo isn't in the build context.
+
+**Pattern:** Each repo needs a `docker-prep.sh` that:
+
+1. Runs `pnpm build` in common-plat
+2. Packs each `@bytelyst/*` package into a `.tarballs/*.tgz`
+3. Rewrites package.json `file:` refs → `file:.tarballs/bytelyst-*.tgz`
+
+**Repos with `docker-prep.sh`:** ActionTrail ✅, LocalMemGPT ✅, NoteLett ✅, NomGap ✅
+**Repos missing `docker-prep.sh`:** LysnrAI, MindLyst, ChronoMind, JarvisJr, PeakPulse, FlowMonk
+
+### F3. NomGap Backend Dockerfile Ignores `file:` Deps (BUG)
+
+`learning_ai_fastgap/backend/Dockerfile` does `COPY package.json → npm ci` but doesn't copy `.tarballs/`, so the rewritten `file:` refs fail to resolve. It needs the `.tarballs/` COPY step added before `npm ci`.
+
+### F4. NoteLett Backend Dockerfile Copies Everything (BUG)
+
+`learning_ai_notes/backend/Dockerfile` does `COPY . .` in the build stage, which includes broken `node_modules` symlinks from `file:` deps. It should use explicit `COPY` of `src/`, `tsconfig.json`, and `.tarballs/` instead.
+
+### F5. Missing `output: 'standalone'` in next.config.ts (CRITICAL)
+
+The Dockerfile template copies from `.next/standalone/`. This directory only exists when `output: 'standalone'` is set in `next.config.ts`.
+
+| Web | Has `output: 'standalone'`? | Notes |
+| --- | --- | --- |
+| NomGap | ✅ | Set directly |
+| NoteLett | ✅ | Set directly |
+| ActionTrail | ✅ | Set directly |
+| LocalMemGPT | ✅ | Set directly |
+| admin-web | ✅ | Conditional: `process.env.VERCEL ? {} : { output: 'standalone' }` |
+| tracker-web | ✅ | Conditional (same) |
+| user-dashboard | ✅ | Conditional (same) |
+| ChronoMind | ❌ | **Must add** |
+| JarvisJr | ❌ | **Must add** |
+| FlowMonk | ❌ | **Must add** |
+| MindLyst | ❓ | Unknown; needs check |
+
+### F6. Build Context Mismatch for ActionTrail + LocalMemGPT
+
+Their Dockerfiles expect repo-root as build context (they `COPY backend/...` and `COPY shared/...`). The compose `build:` must use `context: ./repo-name` + `dockerfile: backend/Dockerfile`, not `build: ./repo-name/backend`.
+
+**Already correct in the compose above.** Calling it out so future editors don't "simplify" it.
+
+### F7. Node.js Version Inconsistency
+
+Existing Dockerfiles use mixed Node versions:
+
+- NomGap, NoteLett: `node:20-alpine`
+- ActionTrail, LocalMemGPT: `node:22-alpine` / `node:22-slim`
+
+**Recommendation:** Standardize on `node:22-alpine` for all new Dockerfiles. Existing ones work but should be updated for consistency.
+
+### F8. Missing `--webpack` Flag for Next.js Builds
+
+Several web apps require the `--webpack` flag for builds (Serwist PWA is incompatible with Turbopack, or `@bytelyst/*` file: refs need transpilation). The Dockerfile template runs `npm run build`, which should map to `next build --webpack` in package.json. Verify each repo's `build` script.
+
+### F9. Missing `.env.ecosystem` Template
+
+The compose references `.env.ecosystem` but the doc doesn't define its contents. Key vars needed:
+
+```env
+# .env.ecosystem — shared env for all services
+# The emulator runs with PROTOCOL=http in the compose file, so the endpoint is http://
+COSMOS_ENDPOINT=http://cosmos-emulator:8081
+COSMOS_KEY=
+COSMOS_DATABASE=bytelyst
+JWT_SECRET=dev-ecosystem-secret-change-me
+AZURE_BLOB_CONNECTION_STRING=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=...;BlobEndpoint=http://azurite:10000/devstoreaccount1;
+PLATFORM_SERVICE_URL=http://platform-service:4003
+EXTRACTION_SERVICE_URL=http://extraction-service:4005
+DB_PROVIDER=memory
+NODE_ENV=production
+CORS_ORIGIN=*
+SMTP_HOST=mailpit
+SMTP_PORT=1025
+```
+
+### F10. `host.docker.internal` Only Works on Docker Desktop (Mac/Windows)
+
+LocalMemGPT uses `OLLAMA_URL: 'http://host.docker.internal:11434'`. This works on Docker Desktop but **not on Linux VMs** (which is the likely deployment target).
+
+**Fix on Linux:** Add `extra_hosts: ['host.docker.internal:host-gateway']` to the service (Docker Engine 20.10+), or use `network_mode: host`.
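+
+A minimal sketch of the first option, applied to the `localmemgpt-backend` service from the compose file above. It assumes Ollama runs directly on the VM host and is reachable from the Docker bridge; Ollama binds to 127.0.0.1 by default, so the `OLLAMA_HOST` setting mentioned in the comment is an assumption about the host setup, not something this repo configures.
+
+```yaml
+# Fragment for docker-compose.ecosystem.yml (merge into the existing service)
+services:
+  localmemgpt-backend:
+    environment:
+      OLLAMA_URL: 'http://host.docker.internal:11434'
+    extra_hosts:
+      # Maps host.docker.internal to the host's gateway IP on Linux.
+      # Assumption: Ollama on the host listens beyond loopback,
+      # e.g. it was started with OLLAMA_HOST=0.0.0.0.
+      - 'host.docker.internal:host-gateway'
+```
+
+The `network_mode: host` alternative is simpler to reason about for Ollama access, but it bypasses the `ports:` mapping and takes the container off the compose network, so other services can no longer reach it by service name.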
+
+### Summary of Required Work Before Compose Works
+
+| Priority | Item | Count |
+| --- | --- | --- |
+| **P0** | Create missing `docker-prep.sh` | 6 repos |
+| **P0** | Create missing backend Dockerfiles | 6 repos |
+| **P0** | Create missing web Dockerfiles | 5 repos |
+| **P0** | Add `output: 'standalone'` to next.config.ts | 3 webs |
+| **P1** | Fix NomGap backend Dockerfile (add `.tarballs/` COPY) | 1 file |
+| **P1** | Fix NoteLett backend Dockerfile (explicit COPY, not `.`) | 1 file |
+| **P1** | Create `.env.ecosystem` template | 1 file |
+| **P2** | Standardize Node.js version to 22-alpine | 4 Dockerfiles |
+| **P2** | Add `extra_hosts` for Linux VM Ollama access | 1 service |
+
+---
+
+## Summary
+
+| Question | Answer |
+| --- | --- |
+| **Can deploy on single VM?** | **Yes.** All ~25 services fit in 32 GB RAM. |
+| **All Dockerized?** | 4/10 product repos have Dockerfiles (2 of those need fixes). 6 need Dockerfiles (copy-paste template). |
+| **K8s practice on single VM?** | **K3s**: certified K8s, single binary, same manifests scale to multi-node or AKS/EKS/GKE. |
+| **Recommended VM?** | 8 vCPU / 32 GB (min) or 16 vCPU / 64 GB (with Ollama). Hetzner ~$45/mo for dev. |
+| **Path to production K8s?** | Phase 1 (compose) → Phase 2 (K3s single) → Phase 3 (K3s multi) → Phase 4 (managed). Same manifests from Phase 2 onward. |