saravanakumardb1 ae2af43d71 docs(devops): add single-VM deployment guide with audit findings

2026-03-22 00:18:17 -07:00

ByteLyst Ecosystem — Single-VM Deployment Guide

Deploy the entire ByteLyst ecosystem on one VM, fully Dockerized, with a K3s Kubernetes layer for production-readiness practice.

1. Service Inventory

Shared Infrastructure (common-plat)

Service	Port	Image	RAM Est.
platform-service	4003	Fastify 5 + TS	~200 MB
extraction-service	4005	Fastify 5 + Python sidecar	~350 MB
mcp-server	4007	Fastify 5 + TS	~150 MB
Cosmos DB Emulator	8081, 1234	`mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview`	~2 GB
Azurite (blob)	10000	`mcr.microsoft.com/azure-storage/azurite`	~100 MB
Mailpit (SMTP)	1025, 8025	`axllent/mailpit`	~50 MB
Traefik (gateway)	80, 8080	`traefik:v3.3`	~100 MB
Loki (logs)	3100	`grafana/loki`	~200 MB
Grafana (dashboards)	3000	`grafana/grafana`	~200 MB

Product Backends (Fastify 5 + TypeScript)

Product	Port	RAM Est.
LysnrAI backend	4015	~150 MB
MindLyst backend	4014	~150 MB
ChronoMind backend	4011	~150 MB
JarvisJr backend	4012	~150 MB
NomGap backend	4013	~150 MB
PeakPulse backend	4010	~150 MB
FlowMonk backend	4017	~150 MB
NoteLett backend	4016	~150 MB
ActionTrail backend	4018	~150 MB
LocalMemGPT backend	4019	~150 MB

Web Dashboards (Next.js 16)

Dashboard	Default Port	Compose Port	RAM Est.	Notes
admin-web	3000	3001	~250 MB	No port in package.json; must set `PORT=3001` env
user-dashboard-web	3002	3002	~250 MB	Port set in package.json
tracker-web	3003	3003	~200 MB	Port set in package.json
NomGap web	3040	3040	~200 MB	Port set in Dockerfile
ChronoMind web	3000	3051	~200 MB	No port override; must set `PORT` env
JarvisJr web	3000	3052	~200 MB	No port override; must set `PORT` env
FlowMonk web	3000	3053	~200 MB	No port override; must set `PORT` env
NoteLett web	3000	3054	~200 MB	Dockerfile EXPOSE 3000; remap in compose
ActionTrail web	3000	3060	~200 MB	Dockerfile EXPOSE 3000; remap in compose
LocalMemGPT web	3070	3070	~200 MB	Port set in package.json + Dockerfile
MindLyst web	3050	3050	~200 MB	Port set in package.json (`-p 3050`)

Port conflict warning: Grafana publishes host port 3000, and admin-web, ChronoMind, JarvisJr, FlowMonk, NoteLett, and ActionTrail webs all default to 3000 too. Container ports do not clash with each other, only host ports do, so the compose file must either set the PORT env var or remap the host side via a ports: mapping.
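A quick way to spot such clashes before writing the compose file is to list every service with its default port and look for duplicates. The sketch below is illustrative only: `find_port_conflicts` is a hypothetical helper, and the sample input is a subset of the defaults from the tables above.

```shell
# find_port_conflicts: read "service port" pairs on stdin and print every
# port claimed by more than one service, with the services that claim it.
find_port_conflicts() {
  awk '
    {
      count[$2]++
      if (owners[$2] == "") { owners[$2] = $1 }
      else { owners[$2] = owners[$2] "," $1 }
    }
    END {
      for (port in count) {
        if (count[port] > 1) { print port ": " owners[port] }
      }
    }
  ' | sort
}

# Sample input: a subset of the default ports listed in section 1.
find_port_conflicts <<'EOF'
grafana 3000
admin-web 3000
chronomind-web 3000
tracker-web 3003
nomgap-web 3040
EOF
```

Any port printed here needs either a `PORT` override or a distinct host-side port in the compose file.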

Optional / AI

Service	Port	RAM Est.
Ollama (LLM)	11434	4–16 GB (model-dependent)

2. VM Sizing

Minimum (dev/staging, no Ollama)

Spec	Value
vCPUs	8
RAM	32 GB
Disk	100 GB SSD
OS	Ubuntu 24.04 LTS

Breakdown:

Cosmos Emulator: ~2 GB
10 Fastify backends × 150 MB = ~1.5 GB
3 shared services (200 + 350 + 150 MB) = ~0.7 GB
11 Next.js webs (2 × 250 MB + 9 × 200 MB) = ~2.3 GB
Infra (Traefik, Loki, Grafana, Azurite, Mailpit) = ~0.65 GB
K3s overhead = ~0.5 GB
Subtotal: ~7.7 GB. Sizing the VM at 32 GB leaves headroom for spikes and build cache.
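To re-run this estimate when the service list changes, the per-service figures can be summed mechanically. The sketch below assumes the estimates from the section 1 tables (they are estimates, not measurements) and uses individual per-service values rather than group averages, so it may differ slightly from a rounded breakdown. `sum_ram_mb` is a hypothetical helper name.

```shell
# sum_ram_mb: read "label count mb_each" rows on stdin, print the total in MB.
sum_ram_mb() {
  awk '{ total += $2 * $3 } END { print total + 0 }'
}

# Per-service estimates from the section 1 tables.
sum_ram_mb <<'EOF'
cosmos-emulator    1 2000
product-backends  10  150
platform-service   1  200
extraction-service 1  350
mcp-server         1  150
webs-250mb         2  250
webs-200mb         9  200
traefik            1  100
loki               1  200
grafana            1  200
azurite            1  100
mailpit            1   50
k3s-overhead       1  500
EOF
```

With these inputs the total is 7650 MB, well under the 32 GB minimum even before Ollama is considered.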

Recommended (with Ollama, small models)

Spec	Value
vCPUs	16
RAM	64 GB
Disk	200 GB NVMe SSD
GPU	Optional NVIDIA T4/A10 for fast LLM inference
OS	Ubuntu 24.04 LTS

Cloud Equivalents

Provider	Instance	vCPU	RAM	Price (approx)
Azure	Standard_D8s_v5	8	32 GB	~$280/mo
Azure	Standard_D16s_v5	16	64 GB	~$560/mo
AWS	m6i.2xlarge	8	32 GB	~$280/mo
AWS	m6i.4xlarge	16	64 GB	~$560/mo
Hetzner	CPX51	16	32 GB	~$45/mo
Hetzner	CCX63	48	192 GB	~$230/mo
Home	Mac Mini M4 Pro	12	48 GB	One-time ~$1,600

Cost tip: Hetzner is 5–10× cheaper than Azure/AWS for dev/staging.

3. Architecture: Docker Compose → K3s Migration Path

Phase 1: Docker Compose (after prerequisite work)

⚠️ Prerequisite: 6 repos need Dockerfiles created, 3 webs need output: 'standalone' in next.config.ts, and ALL product repos must run docker-prep.sh before building (see §12 Audit Findings).

Create a unified docker-compose.ecosystem.yml that brings everything up.

Phase 2: K3s (single-node Kubernetes)

K3s is a lightweight, certified Kubernetes distro that runs on a single node. It gives you real kubectl, Helm, Ingress, and CRDs — identical APIs to production EKS/AKS/GKE.

Why K3s over minikube/kind?

Production-grade (CNCF certified, used by Rancher)
Single binary, ~70 MB, installs in 30 seconds
Built-in Traefik Ingress (you already use Traefik!)
Built-in local-path StorageClass
Runs as systemd service (survives reboot)
Can scale to multi-node later by just joining worker nodes

4. Implementation Plan

4.1 Phase 1 — Unified Docker Compose

Create docker-compose.ecosystem.yml at workspace root (~/code/mygh/) that composes all services:

⚠️ Critical prerequisite — run BEFORE docker compose build:

# Pack @bytelyst/* file: dependencies into tarballs for each product repo.
# Every product repo has file: refs to ../learning_ai_common_plat/packages/*
# which don't resolve inside Docker build context. docker-prep.sh packs them.
# Run from the workspace root (~/code/mygh/).
for repo in learning_ai_trails learning_ai_local_memory_gpt learning_ai_notes learning_ai_fastgap; do
  (cd "$repo" && ./scripts/docker-prep.sh) || echo "docker-prep failed in $repo" >&2
done
# Repos without docker-prep.sh yet need it created (see §12 Audit Findings)
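Before building, it can help to confirm which repos actually ship the script. A minimal sketch, assuming each repo keeps it at scripts/docker-prep.sh as the loop above does; `missing_prep` is a hypothetical helper, not part of any repo.

```shell
# missing_prep ROOT REPO...: print each repo under ROOT that has no
# executable scripts/docker-prep.sh.
missing_prep() {
  root=$1
  shift
  for repo in "$@"; do
    if [ ! -x "$root/$repo/scripts/docker-prep.sh" ]; then
      echo "$repo"
    fi
  done
}

# Example (directory names as used in the compose file):
# missing_prep ~/code/mygh learning_ai_trails learning_ai_clock learning_ai_flowmonk
```

An empty result means every listed repo can be prepped; anything printed needs the script created first.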

# ~/code/mygh/docker-compose.ecosystem.yml
# NOTE: All product backends/webs have file: deps to @bytelyst/* packages.
# You MUST run docker-prep.sh for each repo first (see above).

services:
  # ══════════════════════════════════════════════════════
  # INFRASTRUCTURE
  # ══════════════════════════════════════════════════════
  cosmos-emulator:
    image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:vnext-preview
    ports: ['8081:8081', '1234:1234']
    environment:
      PROTOCOL: http
      ENABLE_EXPLORER: 'true'
    restart: unless-stopped

  azurite:
    image: mcr.microsoft.com/azure-storage/azurite:3.35.0
    command: azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --skipApiVersionCheck
    ports: ['10000:10000']
    volumes: [azurite-data:/data]
    restart: unless-stopped

  mailpit:
    image: axllent/mailpit:v1.27.5
    ports: ['1025:1025', '8025:8025']
    restart: unless-stopped

  traefik:
    image: traefik:v3.3
    command:
      - '--api.insecure=true'
      - '--providers.docker=true'
      - '--providers.docker.exposedbydefault=false'
      - '--entrypoints.web.address=:80'
    ports: ['80:80', '8080:8080']
    volumes: ['/var/run/docker.sock:/var/run/docker.sock:ro']
    restart: unless-stopped

  loki:
    image: grafana/loki:3.3.2
    ports: ['3100:3100']
    volumes: [loki-data:/loki]
    restart: unless-stopped

  grafana:
    image: grafana/grafana:11.4.0
    ports: ['3000:3000'] # NOTE: many Next.js webs also default to 3000 — avoid conflicts
    environment:
      GF_SECURITY_ADMIN_USER: admin
      GF_SECURITY_ADMIN_PASSWORD: lysnrai
    volumes: [grafana-data:/var/lib/grafana]
    restart: unless-stopped

  # ══════════════════════════════════════════════════════
  # SHARED SERVICES (common-plat — no file: deps, pnpm workspace handles it)
  # ══════════════════════════════════════════════════════
  platform-service:
    build:
      context: ./learning_ai_common_plat
      dockerfile: services/platform-service/Dockerfile
    ports: ['4003:4003']
    env_file: [.env.ecosystem]
    environment:
      PORT: 4003
      COSMOS_AUTO_INIT: 'true'
    depends_on: [cosmos-emulator, azurite, mailpit]
    labels:
      - 'traefik.enable=true'
      - 'traefik.http.routers.platform.rule=Host(`platform.local`)'
      - 'traefik.http.services.platform.loadbalancer.server.port=4003'
    restart: unless-stopped

  extraction-service:
    build:
      context: ./learning_ai_common_plat
      dockerfile: services/extraction-service/Dockerfile
    ports: ['4005:4005']
    env_file: [.env.ecosystem]
    environment:
      PORT: 4005
    depends_on: [cosmos-emulator]
    restart: unless-stopped

  mcp-server:
    build:
      context: ./learning_ai_common_plat
      dockerfile: services/mcp-server/Dockerfile
    ports: ['4007:4007']
    env_file: [.env.ecosystem]
    environment:
      PORT: 4007
      PLATFORM_SERVICE_URL: http://platform-service:4003
      EXTRACTION_SERVICE_URL: http://extraction-service:4005
    depends_on: [platform-service, extraction-service]
    restart: unless-stopped

  # ══════════════════════════════════════════════════════
  # PRODUCT BACKENDS
  # All have file: deps → must run docker-prep.sh first.
  # ActionTrail + LocalMemGPT Dockerfiles use repo-root context.
  # Others use backend/ subdir context.
  # ══════════════════════════════════════════════════════
  lysnrai-backend:
    build: ./learning_voice_ai_agent/backend # Needs Dockerfile (missing)
    ports: ['4015:4015']
    env_file: [.env.ecosystem]
    environment: { PORT: '4015', SERVICE_NAME: lysnrai-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  mindlyst-backend:
    build: ./learning_multimodal_memory_agents/backend # Needs Dockerfile (missing)
    ports: ['4014:4014']
    env_file: [.env.ecosystem]
    environment: { PORT: '4014', SERVICE_NAME: mindlyst-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  chronomind-backend:
    build: ./learning_ai_clock/backend # Needs Dockerfile (missing)
    ports: ['4011:4011']
    env_file: [.env.ecosystem]
    environment: { PORT: '4011', SERVICE_NAME: chronomind-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  jarvisjr-backend:
    build: ./learning_ai_jarvis_jr/backend # Needs Dockerfile (missing)
    ports: ['4012:4012']
    env_file: [.env.ecosystem]
    environment: { PORT: '4012', SERVICE_NAME: jarvisjr-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  nomgap-backend:
    build: ./learning_ai_fastgap/backend
    ports: ['4013:4013']
    env_file: [.env.ecosystem]
    environment: { PORT: '4013', SERVICE_NAME: nomgap-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  peakpulse-backend:
    build: ./learning_ai_peakpulse/backend # Needs Dockerfile (missing)
    ports: ['4010:4010']
    env_file: [.env.ecosystem]
    environment: { PORT: '4010', SERVICE_NAME: peakpulse-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  flowmonk-backend:
    build: ./learning_ai_flowmonk/backend # Needs Dockerfile (missing)
    ports: ['4017:4017']
    env_file: [.env.ecosystem]
    environment: { PORT: '4017', SERVICE_NAME: flowmonk-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  notelett-backend:
    build: ./learning_ai_notes/backend
    ports: ['4016:4016']
    env_file: [.env.ecosystem]
    environment: { PORT: '4016', SERVICE_NAME: notelett-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  actiontrail-backend:
    build:
      context: ./learning_ai_trails # Dockerfile expects repo-root context
      dockerfile: backend/Dockerfile
    ports: ['4018:4018']
    env_file: [.env.ecosystem]
    environment: { PORT: '4018', SERVICE_NAME: actiontrail-backend }
    depends_on: [platform-service]
    restart: unless-stopped

  localmemgpt-backend:
    build:
      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
      dockerfile: backend/Dockerfile
    ports: ['4019:4019']
    env_file: [.env.ecosystem]
    environment: { PORT: '4019', OLLAMA_URL: 'http://host.docker.internal:11434' }
    volumes: [localmemgpt-data:/app/db]
    restart: unless-stopped

  # ══════════════════════════════════════════════════════
  # WEB DASHBOARDS
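  # IMPORTANT: Most webs default to port 3000 internally.
  # Use PORT env var to override, or remap via host:container ports.
  # NOTE: NEXT_PUBLIC_* values are inlined into the client bundle at build time,
  # so setting them here only affects server-side code. Browser-side calls also
  # cannot resolve Docker service names such as nomgap-backend.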
  # ══════════════════════════════════════════════════════
  admin-web:
    build: ./learning_ai_common_plat/dashboards/admin-web
    ports: ['3001:3001']
    env_file: [.env.ecosystem]
    environment:
      PORT: 3001 # admin-web has NO port override — defaults to 3000 without this!
    depends_on: [platform-service]
    restart: unless-stopped

  user-dashboard:
    build: ./learning_voice_ai_agent/user-dashboard-web
    ports: ['3002:3002']
    env_file: [.env.ecosystem]
    depends_on: [lysnrai-backend]
    restart: unless-stopped

  tracker-web:
    build: ./learning_ai_common_plat/dashboards/tracker-web
    ports: ['3003:3003']
    env_file: [.env.ecosystem]
    depends_on: [platform-service]
    restart: unless-stopped

  nomgap-web:
    build: ./learning_ai_fastgap/web
    ports: ['3040:3040']
    environment:
      PORT: 3040
      NEXT_PUBLIC_NOMGAP_API_URL: http://nomgap-backend:4013/api
      NEXT_PUBLIC_PLATFORM_SERVICE_URL: http://platform-service:4003/api
    depends_on: [nomgap-backend]
    restart: unless-stopped

  actiontrail-web:
    build: ./learning_ai_trails/web
    ports: ['3060:3000'] # Internal 3000 → external 3060
    environment:
      NEXT_PUBLIC_API_URL: http://actiontrail-backend:4018
    depends_on: [actiontrail-backend]
    restart: unless-stopped

  localmemgpt-web:
    build:
      context: ./learning_ai_local_memory_gpt # Dockerfile expects repo-root context
      dockerfile: web/Dockerfile
    ports: ['3070:3070']
    environment:
      NEXT_PUBLIC_BACKEND_URL: http://localmemgpt-backend:4019
    depends_on: [localmemgpt-backend]
    restart: unless-stopped

  notelett-web:
    build: ./learning_ai_notes/web
    ports: ['3054:3000'] # Internal 3000 → external 3054
    environment:
      NEXT_PUBLIC_BACKEND_URL: http://notelett-backend:4016
    depends_on: [notelett-backend]
    restart: unless-stopped

  # Remaining webs need Dockerfiles + output:'standalone' in next.config.ts:
  # chronomind-web (3051), jarvisjr-web (3052), flowmonk-web (3053), mindlyst-web (3050)

volumes:
  azurite-data:
  loki-data:
  grafana-data:
  localmemgpt-data:
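The product backend entries in this file differ only by service name, build path, and port, so they can be generated rather than hand-copied. This is a sketch under that assumption; `backend_stanza` is a hypothetical helper, and entries that need a repo-root build context (ActionTrail, LocalMemGPT) would still be written by hand.

```shell
# backend_stanza NAME BUILD_DIR PORT: print one compose service entry in the
# same shape as the simple product backends in the compose file.
backend_stanza() {
  name=$1 dir=$2 port=$3
  cat <<EOF
  ${name}:
    build: ${dir}
    ports: ['${port}:${port}']
    env_file: [.env.ecosystem]
    environment: { PORT: '${port}', SERVICE_NAME: ${name} }
    depends_on: [platform-service]
    restart: unless-stopped
EOF
}

backend_stanza peakpulse-backend ./learning_ai_peakpulse/backend 4010
```

The output keeps the two-space indentation so it can be pasted under `services:` as is.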

4.2 Phase 2 — K3s (Single-Node Kubernetes)

Install K3s on the VM

# Install K3s (30 seconds, includes kubectl + containerd)
curl -sfL https://get.k3s.io | sh -

# Verify
sudo kubectl get nodes
# NAME       STATUS   ROLES                  AGE   VERSION
# myvm       Ready    control-plane,master   30s   v1.30.x+k3s1

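# Copy kubeconfig for non-root usage
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown "$(id -u):$(id -g)" ~/.kube/config
chmod 600 ~/.kube/config
# K3s's bundled kubectl reads /etc/rancher/k3s/k3s.yaml unless KUBECONFIG is set
export KUBECONFIG=~/.kube/config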

# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Namespace Layout

kubectl create namespace bytelyst-infra      # Cosmos, Azurite, Mailpit, Loki, Grafana
kubectl create namespace bytelyst-platform   # platform-service, extraction, mcp
kubectl create namespace bytelyst-products   # 10 product backends
kubectl create namespace bytelyst-web        # All Next.js dashboards
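`kubectl create namespace` fails if the namespace already exists, which gets in the way when a setup script is re-run. One option is the same dry-run-then-apply pattern this guide uses for Secrets in Exercise 3. A sketch, with `ensure_namespaces` as a hypothetical wrapper:

```shell
# ensure_namespaces NS...: create each namespace if absent; safe to re-run.
ensure_namespaces() {
  for ns in "$@"; do
    kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -
  done
}

# Example:
# ensure_namespaces bytelyst-infra bytelyst-platform bytelyst-products bytelyst-web
```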

Example K8s Manifest (one backend)

# k8s/products/lysnrai-backend.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lysnrai-backend
  namespace: bytelyst-products
  labels:
    app: lysnrai-backend
    product: lysnrai
spec:
  replicas: 1 # Scale to 2+ when ready
  selector:
    matchLabels:
      app: lysnrai-backend
  template:
    metadata:
      labels:
        app: lysnrai-backend
    spec:
      containers:
        - name: lysnrai-backend
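          image: bytelyst/lysnrai-backend:latest
          imagePullPolicy: IfNotPresent # ':latest' defaults to Always, which would skip locally imported images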
          ports:
            - containerPort: 4015
          envFrom:
            - configMapRef:
                name: bytelyst-common-config
            - secretRef:
                name: bytelyst-secrets
          env:
            - name: PORT
              value: '4015'
            - name: SERVICE_NAME
              value: lysnrai-backend
          resources:
            requests:
              memory: '128Mi'
              cpu: '100m'
            limits:
              memory: '256Mi'
              cpu: '500m'
          livenessProbe:
            httpGet:
              path: /health
              port: 4015
            initialDelaySeconds: 10
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 4015
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: lysnrai-backend
  namespace: bytelyst-products
spec:
  selector:
    app: lysnrai-backend
  ports:
    - port: 4015
      targetPort: 4015

Ingress (Traefik, built into K3s)

# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bytelyst-ingress
  namespace: bytelyst-products
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: lysnrai.local
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: lysnrai-backend
                port:
                  number: 4015
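    # ... repeat per product
---
# An Ingress can only route to Services in its own namespace, so
# platform-service gets a separate Ingress in bytelyst-platform.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: bytelyst-platform-ingress
  namespace: bytelyst-platform
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: platform.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: platform-service
                port:
                  number: 4003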
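Hostnames such as lysnrai.local and platform.local only route through Traefik if the client resolves them to the VM. For a dev setup, one approach is a few /etc/hosts lines on the client machine. The sketch below assumes that approach; `hosts_entries` is a hypothetical helper and 203.0.113.10 is a placeholder address, not a real VM IP.

```shell
# hosts_entries IP HOST...: print one /etc/hosts line per hostname.
hosts_entries() {
  ip=$1
  shift
  for host in "$@"; do
    printf '%s\t%s\n' "$ip" "$host"
  done
}

# Example (replace the placeholder IP with the VM's address):
# hosts_entries 203.0.113.10 lysnrai.local platform.local | sudo tee -a /etc/hosts
```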

5. Docker Compose → K3s Migration Cheat Sheet

Docker Compose	K3s Equivalent
`services:`	`Deployment` + `Service`
`ports:`	`Service` (ClusterIP/NodePort)
`env_file:`	`ConfigMap` + `Secret`
`depends_on:`	`initContainers` or readiness probes
`volumes:`	`PersistentVolumeClaim` (local-path)
`restart: unless-stopped`	Built-in (K8s always restarts pods)
`labels: traefik.*`	`Ingress` resource
`docker compose up`	`kubectl apply -k k8s/`
`docker compose logs`	`kubectl logs -f deploy/X` or Loki/Grafana
`docker compose ps`	`kubectl get pods -A`
Scale: change nothing	`kubectl scale deploy/X --replicas=3`

6. K3s Practice Exercises (on single VM)

These exercises simulate real production scenarios:

Exercise 1: Rolling Update

# Build new image, import it into K3s containerd, deploy with zero downtime
docker build -t bytelyst/lysnrai-backend:v2 ./learning_voice_ai_agent/backend
docker save bytelyst/lysnrai-backend:v2 | sudo k3s ctr images import -
kubectl set image deploy/lysnrai-backend lysnrai-backend=bytelyst/lysnrai-backend:v2 -n bytelyst-products
kubectl rollout status deploy/lysnrai-backend -n bytelyst-products
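A natural extension of this exercise is to roll back automatically when the new version never becomes ready. The sketch below combines `kubectl rollout status --timeout` with `kubectl rollout undo`; the wrapper name `deploy_or_rollback` and the 120s default are assumptions for illustration.

```shell
# deploy_or_rollback DEPLOY CONTAINER IMAGE NAMESPACE [TIMEOUT]
# Set the new image, wait for the rollout, and undo it if it does not finish.
deploy_or_rollback() {
  deploy=$1 container=$2 image=$3 ns=$4 timeout=${5:-120s}
  kubectl set image "deploy/$deploy" "$container=$image" -n "$ns" || return 1
  if kubectl rollout status "deploy/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $deploy succeeded"
  else
    echo "rollout of $deploy failed; rolling back" >&2
    kubectl rollout undo "deploy/$deploy" -n "$ns"
    return 1
  fi
}

# Example:
# deploy_or_rollback lysnrai-backend lysnrai-backend bytelyst/lysnrai-backend:v2 bytelyst-products
```

The rollback relies on the readiness probe in the manifest: a pod that never passes /health keeps the rollout from completing, which triggers the undo.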

Exercise 2: Scale Horizontally

kubectl scale deploy/platform-service --replicas=3 -n bytelyst-platform
# Traefik auto-balances across all 3 pods

Exercise 3: ConfigMap / Secret Rotation

kubectl create secret generic bytelyst-secrets \
  --from-literal=JWT_SECRET=new-secret \
  --from-literal=COSMOS_KEY=new-key \
  -n bytelyst-platform --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deploy -n bytelyst-platform
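The literals new-secret and new-key above are placeholders. For an actual rotation, a random value can be generated on the VM. A minimal sketch using only /dev/urandom, od, and tr; `gen_secret` is a hypothetical helper name.

```shell
# gen_secret [BYTES]: print BYTES random bytes (default 32) as lowercase hex.
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Example:
# kubectl create secret generic bytelyst-secrets \
#   --from-literal=JWT_SECRET="$(gen_secret)" \
#   -n bytelyst-platform --dry-run=client -o yaml | kubectl apply -f -
```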

Exercise 4: Resource Limits + HPA

# Auto-scale platform-service 1→5 pods based on CPU.
# Needs metrics-server (bundled with K3s by default) and a CPU request on the target pods.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: platform-service-hpa
  namespace: bytelyst-platform
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: platform-service
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
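To predict what this HPA will do under load, the core rule from the Kubernetes documentation is desired = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. The sketch below implements only that rule; it deliberately ignores the controller's tolerance band and stabilization window, and `hpa_desired` is a hypothetical helper.

```shell
# hpa_desired CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# desired = ceil(current_replicas * current_util / target_util), clamped to [MIN, MAX].
hpa_desired() {
  cur=$1 util=$2 target=$3 min=$4 max=$5
  desired=$(( (cur * util + target - 1) / target ))   # integer ceiling
  [ "$desired" -lt "$min" ] && desired=$min
  [ "$desired" -gt "$max" ] && desired=$max
  echo "$desired"
}

hpa_desired 2 90 70 1 5   # 2 pods at 90% CPU against the 70% target → 3
```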

Exercise 5: Helm Chart (packaged deploy)

# Create chart scaffold
helm create bytelyst-ecosystem
# Templatize all 25+ services into one chart
# Deploy: helm install bytelyst ./bytelyst-ecosystem -n bytelyst --create-namespace

7. Scaling Path: Single VM → Multi-Node

Phase 1: Docker Compose           Phase 2: K3s (1 node)
┌──────────────────────┐          ┌──────────────────────┐
│  Single VM           │    →     │  Single VM + K3s     │
│  docker compose up   │          │  kubectl apply -k    │
│  ~25 containers      │          │  ~25 pods            │
└──────────────────────┘          └──────────────────────┘
                                           │
                                           ▼
Phase 3: K3s (3 nodes)            Phase 4: Managed K8s
┌──────────────────────┐          ┌──────────────────────┐
│  1 server + 2 agents │    →     │  AKS / EKS / GKE     │
│  Same manifests!     │          │  Same manifests!     │
│  Real HA             │          │  Auto-scaling nodes  │
└──────────────────────┘          └──────────────────────┘

Adding a worker node to K3s is one command:

# On the server, read the join token:
#   sudo cat /var/lib/rancher/k3s/server/node-token
# On the worker VM:
curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -

8. Recommended Directory Structure

~/code/mygh/
├── docker-compose.ecosystem.yml     # Phase 1: all-in-one compose
├── .env.ecosystem                   # Shared env vars
├── k8s/                             # Phase 2: K3s manifests
│   ├── kustomization.yaml           # Kustomize root
│   ├── infra/                       # Cosmos emulator, Azurite, Mailpit, Loki, Grafana
│   ├── platform/                    # platform-service, extraction, mcp
│   ├── products/                    # 10 product backends
│   ├── web/                         # 10+ Next.js dashboards
│   ├── config/                      # ConfigMaps
│   └── secrets/                     # Secrets (gitignored)
├── helm/                            # Helm chart (see Exercise 5)
│   └── bytelyst-ecosystem/
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
└── scripts/
    ├── ecosystem-up.sh              # docker compose -f docker-compose.ecosystem.yml up -d
    ├── ecosystem-k3s-deploy.sh      # kubectl apply -k k8s/
    └── ecosystem-build-all.sh       # Build all Docker images

9. Quick Start Commands

# ── Phase 1: Docker Compose ───────────────────────────
cd ~/code/mygh

# Build all images (first time, ~15-20 min)
docker compose -f docker-compose.ecosystem.yml build

# Start everything
docker compose -f docker-compose.ecosystem.yml up -d

# Check status
docker compose -f docker-compose.ecosystem.yml ps

# View logs
docker compose -f docker-compose.ecosystem.yml logs -f platform-service

# Tear down
docker compose -f docker-compose.ecosystem.yml down

# ── Phase 2: K3s ──────────────────────────────────────
# Build + load images into K3s containerd
# (same build context and Dockerfile path as the compose file)
docker build -t bytelyst/platform-service:latest \
  -f learning_ai_common_plat/services/platform-service/Dockerfile \
  ./learning_ai_common_plat
docker save bytelyst/platform-service:latest | sudo k3s ctr images import -

# Deploy all
kubectl apply -k k8s/

# Check pods
kubectl get pods -A

# Port-forward for local access
kubectl port-forward svc/platform-service 4003:4003 -n bytelyst-platform
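After `docker compose up -d` or a port-forward, services take a few seconds to start answering. A small polling loop avoids guessing. The sketch below assumes each backend serves /health (the path the K8s probes in this guide use); `wait_for_health` is a hypothetical helper.

```shell
# wait_for_health URL [TRIES] [DELAY_SECONDS]: poll URL until curl succeeds
# (no connection error and no HTTP 4xx/5xx), or give up after TRIES attempts.
wait_for_health() {
  url=$1 tries=${2:-30} delay=${3:-2}
  i=1
  while [ "$i" -le "$tries" ]; do
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $tries tries: $url" >&2
  return 1
}

# Example:
# wait_for_health http://localhost:4003/health
```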

10. What's NOT Dockerized Yet (gaps)

Repo	Backend Dockerfile	Web Dockerfile	`docker-prep.sh`	`output:'standalone'`	Status
LysnrAI	❌	✅ user-dashboard	❌	✅ (conditional)	Need backend Dockerfile + docker-prep.sh
MindLyst	❌	❌	❌	❌	Need all 4
ChronoMind	❌	❌	❌	❌	Need all 4
JarvisJr	❌	❌	❌	❌	Need all 4
PeakPulse	❌	❌	❌	❌	Need all 4
FlowMonk	❌	❌	❌	❌	Need all 4
NomGap	✅ ⚠️	✅	✅	✅	Backend Dockerfile ignores `file:` deps — see §12.F3
NoteLett	✅ ⚠️	✅	✅	✅	Backend Dockerfile `COPY .` pulls broken symlinks — see §12.F4
ActionTrail	✅	✅	✅	✅	Ready (uses `.tarballs/` pattern)
LocalMemGPT	✅	✅	✅	✅	Ready (repo-root build context)
admin-web	—	✅ (in common-plat)	N/A (pnpm)	✅ (conditional)	Ready
tracker-web	—	✅ (in common-plat)	N/A (pnpm)	✅ (conditional)	Ready

6 repos need a backend Dockerfile and docker-prep.sh; 5 of those also need a web Dockerfile, and at least 3 need output:'standalone'. 2 existing Dockerfiles have issues.

11. Dockerfile Template (for missing repos)

Critical: These templates assume you run docker-prep.sh first to pack @bytelyst/* file: deps into .tarballs/. Without this, npm ci will fail because file:../../learning_ai_common_plat/packages/* doesn't exist inside the Docker build context.

Backend (Fastify 5 + TypeScript)

# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
FROM node:22-alpine AS builder
WORKDIR /app

COPY package.json package-lock.json ./
COPY .tarballs/ ./.tarballs/
RUN npm ci --ignore-scripts

COPY tsconfig.json ./
COPY src/ ./src/
RUN npx tsc

# Production stage
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production

COPY package.json package-lock.json ./
COPY .tarballs/ ./.tarballs/
RUN npm ci --omit=dev --ignore-scripts

COPY --from=builder /app/dist ./dist
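# Copy shared/product.json only if the backend reads it at runtime.
# COPY cannot take shell redirections such as "2>/dev/null || true", and
# shared/ must be inside the build context, so enable this line per repo:
# COPY shared/ ./shared/

# EXPOSE is resolved at build time; set the port with --build-arg PORT=<port>.
ARG PORT=4010
EXPOSE ${PORT}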
CMD ["node", "dist/server.js"]

Web (Next.js 16)

Prerequisite: next.config.ts MUST have output: 'standalone' for the standalone Dockerfile pattern to work. Without it, .next/standalone/ won't be generated and the COPY will fail.

# Pre-requisite: run ./scripts/docker-prep.sh to pack @bytelyst/* tarballs
FROM node:22-alpine AS builder
WORKDIR /app

COPY package.json package-lock.json ./
COPY .tarballs/ ./.tarballs/
RUN npm ci

COPY . .

# Dummy env vars for Next.js build-time static page collection
ENV NEXT_PUBLIC_BACKEND_URL=http://localhost:4010
ENV NEXT_PUBLIC_PLATFORM_SERVICE_URL=http://localhost:4003

RUN npm run build

FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production

COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
# COPY cannot take "2>/dev/null || true"; this line requires public/ to exist in the repo.
COPY --from=builder /app/public ./public

EXPOSE 3000
CMD ["node", "server.js"]

docker-prep.sh (for repos that don't have one yet)

Copy from learning_ai_trails/scripts/docker-prep.sh — it handles both backend/ and web/ targets, packs all file: refs into .tarballs/, and rewrites package.json to point at them.

mkdir -p <target-repo>/scripts
cp learning_ai_trails/scripts/docker-prep.sh <target-repo>/scripts/docker-prep.sh
chmod +x <target-repo>/scripts/docker-prep.sh

12. Audit Findings (Review 2026-03-22)

Systematic code review of all claims in this document against the actual codebase.

F1. Port Conflicts (CRITICAL)

Grafana uses port 3000. The following webs also default to 3000:

admin-web (no port in package.json)
ChronoMind web (no port override)
JarvisJr web (no port override)
FlowMonk web (no port override)
NoteLett web (Dockerfile EXPOSE 3000)
ActionTrail web (Dockerfile EXPOSE 3000)

Fix: Set PORT env var in compose for each, or use host:container port remapping.

F2. `file:` Dependencies Break Docker Builds (CRITICAL)

Every product backend and web has file:../../learning_ai_common_plat/packages/* dependencies in package.json. These resolve locally via symlinks but fail inside Docker because the sibling repo isn't in the build context.

Pattern: Each repo needs a docker-prep.sh that:

Runs pnpm build in common-plat
Packs each @bytelyst/* package into a .tarballs/*.tgz
Rewrites package.json file: refs → file:.tarballs/bytelyst-*.tgz

Repos with docker-prep.sh: ActionTrail ✅, LocalMemGPT ✅, NoteLett ✅, NomGap ✅

Repos missing docker-prep.sh: LysnrAI, MindLyst, ChronoMind, JarvisJr, PeakPulse, FlowMonk
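The rewrite step of that pattern can be illustrated with a single sed substitution. This is only a sketch of the idea, not the contents of the real script: the tarball naming (bytelyst-<name>.tgz without a version) and the package name auth in the example are assumptions, and learning_ai_trails/scripts/docker-prep.sh remains the reference.

```shell
# rewrite_file_refs: read package.json text on stdin and point every
# "file:.../packages/<name>" dependency at ".tarballs/bytelyst-<name>.tgz".
# ASSUMPTION: the tarball naming is illustrative; match whatever the real
# docker-prep.sh writes into .tarballs/.
rewrite_file_refs() {
  sed -E 's#"file:[^"]*/packages/([^"/]+)"#"file:.tarballs/bytelyst-\1.tgz"#g'
}

# Example (hypothetical package name):
# echo '"@bytelyst/auth": "file:../../learning_ai_common_plat/packages/auth"' | rewrite_file_refs
```

Dependencies that are not file: refs pass through unchanged.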

F3. NomGap Backend Dockerfile Ignores `file:` Deps (BUG)

learning_ai_fastgap/backend/Dockerfile runs COPY package.json and then npm ci but never copies .tarballs/, so the file: refs fail to resolve. Add a COPY .tarballs/ step before npm ci.

F4. NoteLett Backend Dockerfile Copies Everything (BUG)

learning_ai_notes/backend/Dockerfile runs COPY . . in the build stage, which pulls in broken node_modules symlinks left by the file: deps. Use explicit COPY steps for src/, tsconfig.json, and .tarballs/ instead.

F5. Missing `output: 'standalone'` in next.config.ts (CRITICAL)

The Dockerfile template copies from .next/standalone/ — this directory only exists when output: 'standalone' is set in next.config.ts.

Web	Has `output: 'standalone'`?	Notes
NomGap	✅	Set directly
NoteLett	✅	Set directly
ActionTrail	✅	Set directly
LocalMemGPT	✅	Set directly
admin-web	✅	Conditional: `process.env.VERCEL ? {} : { output: 'standalone' }`
tracker-web	✅	Conditional (same)
user-dashboard	✅	Conditional (same)
ChronoMind	❌	Must add
JarvisJr	❌	Must add
FlowMonk	❌	Must add
MindLyst	❓	Not verified; check next.config.ts

F6. Build Context Mismatch for ActionTrail + LocalMemGPT

Their Dockerfiles expect repo-root as build context (they COPY backend/... and COPY shared/...). The compose build: must use context: ./repo-name + dockerfile: backend/Dockerfile, not build: ./repo-name/backend.

Already correct in the compose above. Calling it out so future editors don't "simplify" it.

F7. Node.js Version Inconsistency

Existing Dockerfiles use mixed Node versions:

NomGap, NoteLett: node:20-alpine
ActionTrail, LocalMemGPT: node:22-alpine / node:22-slim

Recommendation: Standardize on node:22-alpine for all new Dockerfiles. Existing ones work but should be updated for consistency.

F8. Missing `--webpack` Flag for Next.js Builds

Several web apps require the --webpack flag for builds (Serwist PWA is incompatible with Turbopack, or @bytelyst/* file: refs need transpilation). The Dockerfile template runs npm run build, which should map to next build --webpack in package.json. Verify each repo's build script.

F9. Missing `.env.ecosystem` Template

The compose references .env.ecosystem but the doc doesn't define its contents. Key vars needed:

# .env.ecosystem — shared env for all services
# The compose file starts the emulator with PROTOCOL: http, so use http:// here
COSMOS_ENDPOINT=http://cosmos-emulator:8081
COSMOS_KEY=<emulator-key>
COSMOS_DATABASE=bytelyst
JWT_SECRET=dev-ecosystem-secret-change-me
AZURE_BLOB_CONNECTION_STRING=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=...;BlobEndpoint=http://azurite:10000/devstoreaccount1;
PLATFORM_SERVICE_URL=http://platform-service:4003
EXTRACTION_SERVICE_URL=http://extraction-service:4005
DB_PROVIDER=memory
NODE_ENV=production
CORS_ORIGIN=*
SMTP_HOST=mailpit
SMTP_PORT=1025
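Since every service loads this file, a missing or empty variable tends to surface late as a runtime error. A pre-flight check is cheap. A minimal sketch, with `missing_env_vars` as a hypothetical helper and the variable list taken from the template above:

```shell
# missing_env_vars FILE VAR...: print each VAR that has no non-empty
# "VAR=value" line in FILE.
missing_env_vars() {
  file=$1
  shift
  for var in "$@"; do
    grep -Eq "^${var}=.+" "$file" || echo "$var"
  done
}

# Example:
# missing_env_vars .env.ecosystem COSMOS_ENDPOINT COSMOS_KEY JWT_SECRET SMTP_HOST SMTP_PORT
```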

F10. `host.docker.internal` Only Works on Docker Desktop (Mac/Windows)

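LocalMemGPT uses OLLAMA_URL: 'http://host.docker.internal:11434'. Docker Desktop defines that hostname automatically, but Docker Engine on a Linux VM (the likely deployment target) does not.

Fix on Linux: Add extra_hosts: ['host.docker.internal:host-gateway'] to the service (Docker Engine 20.10+). Alternatively use network_mode: host, in which case OLLAMA_URL becomes http://localhost:11434 and the service's ports: mapping no longer applies.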

Summary of Required Work Before Compose Works

Priority	Item	Count
P0	Create missing `docker-prep.sh`	6 repos
P0	Create missing backend Dockerfiles	6 repos
P0	Create missing web Dockerfiles	5 repos
P0	Add `output: 'standalone'` to next.config.ts	3 webs (4 if MindLyst lacks it)
P1	Fix NomGap backend Dockerfile (add `.tarballs/` COPY)	1 file
P1	Fix NoteLett backend Dockerfile (explicit COPY, not `.`)	1 file
P1	Create `.env.ecosystem` template	1 file
P2	Standardize Node.js version to 22-alpine	4 Dockerfiles
P2	Add `extra_hosts` for Linux VM Ollama access	1 service

Summary

Question	Answer
Can deploy on single VM?	Yes. All ~25 services fit in 32 GB RAM.
All Dockerized?	4/10 product repos have backend and web Dockerfiles, 2 of them with known issues (§12 F3, F4). 6 need Dockerfiles (copy-paste template, §11).
K8s practice on single VM?	K3s — certified K8s, single binary, same manifests scale to multi-node or AKS/EKS/GKE.
Recommended VM?	8 vCPU / 32 GB (min) or 16 vCPU / 64 GB (with Ollama). Hetzner ~$45/mo for dev.
Time to production K8s?	Phase 1 (compose) → Phase 2 (K3s single) → Phase 3 (K3s multi) → Phase 4 (managed). The same manifests carry over from Phase 2 onward.
