
ByteLyst Single-VM Kubernetes Deployment (k3s)

Deploy the entire ByteLyst ecosystem (30 services, 10 products) on Kubernetes using k3s — a lightweight, CNCF-certified K8s distribution. Production-grade for ~50 beta users on a single Azure VM.


Quick Start

# Step 1: Run Docker setup phases 1-5 (system deps, Gitea, repos, packages)
cd /opt/bytelyst/learning_ai_common_plat/docs/devops/single_azure_vm
sudo ./docker/setup.sh --resume      # Runs phases 1-5 (skip 6-8)

# Step 2: Deploy to Kubernetes
sudo ./k8s/setup-k8s.sh              # 6 phases: preflight → k3s → images → config → deploy → health

# Step 3: Verify
/opt/bytelyst/check-health-k8s.sh    # 32 health checks
kubectl get pods -A                   # All pods

Prerequisites

  • Azure VM: Ubuntu 24.04 LTS, Standard_D8s_v5 (8 vCPU, 32 GB RAM, 128 GB disk)
  • Docker setup phases 1-5 completed (system deps, Gitea, repos, packages built + published)

Why k3s?

| Feature | k3s | minikube | kind | microk8s |
| --- | --- | --- | --- | --- |
| RAM overhead | ~512 MB | ~2 GB | ~1 GB | ~800 MB |
| Production-grade | Yes (CNCF certified) | No | No | Yes |
| Built-in Traefik | Yes | No | No | Optional |
| Single binary | Yes | No | No | No (snap) |
| SQLite backend | Yes (no etcd needed) | N/A | N/A | Dqlite |

Architecture

Ubuntu 24.04 VM
├── k3s (single-node cluster)
│   ├── kube-system namespace
│   │   ├── CoreDNS
│   │   ├── Traefik Ingress Controller
│   │   ├── Local Path Provisioner
│   │   └── Metrics Server
│   │
│   ├── bytelyst-infra namespace
│   │   ├── cosmos-emulator (StatefulSet + PVC)
│   │   ├── azurite (StatefulSet + PVC)
│   │   ├── mailpit (Deployment)
│   │   ├── loki (StatefulSet + PVC)
│   │   └── grafana (Deployment + PVC)
│   │
│   ├── bytelyst-platform namespace
│   │   ├── platform-service (Deployment, replicas: 1)
│   │   ├── extraction-service (Deployment, replicas: 1)
│   │   └── mcp-server (Deployment, replicas: 1)
│   │
│   ├── bytelyst-dashboards namespace
│   │   ├── admin-web (Deployment, replicas: 1)
│   │   └── tracker-web (Deployment, replicas: 1)
│   │
│   └── bytelyst-products namespace
│       ├── *-backend (10 Deployments)
│       └── *-web (9 Deployments)
│
├── Ollama (systemd, host network — :11434)
└── Gitea (Docker container — :3300, used for build-time only)

File Structure

k8s/
├── README.md                    # This file
├── setup-k8s.sh                 # Bootstrap script (6 phases)
├── namespaces.yaml              # 4 namespaces
├── config/
│   ├── configmap.yaml           # Shared env vars (replaces .env.ecosystem)
│   └── secrets.yaml             # JWT_SECRET template (generated at deploy)
├── infra/
│   ├── cosmos-emulator.yaml     # StatefulSet + Service + PVC + NodePort
│   ├── azurite.yaml             # StatefulSet + Service + PVC + NodePort
│   ├── mailpit.yaml             # Deployment + Service + NodePort
│   ├── loki.yaml                # StatefulSet + Service + PVC + NodePort
│   ├── grafana.yaml             # Deployment + Service + PVC + NodePort
│   └── ollama-external.yaml     # Service + Endpoints → host Ollama
├── platform/
│   ├── platform-service.yaml    # Deployment + Service + NodePort (:4003)
│   ├── extraction-service.yaml  # Deployment + Service + NodePort (:4005)
│   └── mcp-server.yaml          # Deployment + Service + NodePort (:4007)
├── dashboards/
│   ├── admin-web.yaml           # Deployment + Service + NodePort (:3001)
│   └── tracker-web.yaml         # Deployment + Service + NodePort (:3003)
└── products/
    ├── backends.yaml            # 10 backend Deployments + Services + NodePorts
    └── webs.yaml                # 9 web Deployments + Services + NodePorts

Setup Phases

| Phase | Duration | What happens |
| --- | --- | --- |
| 1. Pre-flight | ~10s | Verify Docker phases 1-5 completed, check disk/RAM |
| 2. Install k3s | ~2 min | k3s with Docker runtime, NodePort range 1024-32767 |
| 3. Build images | ~15 min | Docker compose build + tag as bytelyst/<service>:latest |
| 4. Generate config | ~30s | Namespaces, ConfigMap (3 copies), Secrets (JWT), Ollama endpoint |
| 5. Deploy | ~5 min | Apply manifests: infra → platform → dashboards → products |
| 6. Health check | ~1 min | 32 endpoint checks + kubectl pod status |

Key Design Decisions

k3s with Docker Runtime

k3s is installed with the --docker flag, so it reuses the existing Docker daemon and its images. No containerd import step is needed: the images built for Docker Compose work directly.
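
As a rough illustration, the install step could look like the command printed below. This is a sketch, not the contents of setup-k8s.sh: the flags are the two named in this README, passed through the upstream installer's INSTALL_K3S_EXEC variable, and the names K3S_FLAGS and k3s_install_cmd are made up for the example. The command is printed rather than executed so it can be reviewed first.

```shell
# Flags taken from this README; the variable and function names are illustrative.
K3S_FLAGS="--docker --kube-apiserver-arg=service-node-port-range=1024-32767"

# Print the install command instead of running it.
k3s_install_cmd() {
  printf 'curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="%s" sh -\n' "$K3S_FLAGS"
}

k3s_install_cmd
```
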

4-Namespace Isolation

  • bytelyst-infra — Cosmos emulator, Azurite, Mailpit, Loki, Grafana
  • bytelyst-platform — platform-service, extraction-service, mcp-server
  • bytelyst-dashboards — admin-web, tracker-web
  • bytelyst-products — 10 backends + 9 web apps

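ConfigMaps and Secrets are namespace-scoped, so a pod can only reference the ones in its own namespace. The setup script therefore copies the ConfigMap and Secrets into all 3 app namespaces (platform, dashboards, products).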
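
The copy step might be sketched as a loop over the three app namespaces. This assumes the manifests do not hard-code a metadata.namespace (kubectl rejects a mismatch with -n) and that secrets.yaml can be applied as-is, which may not hold since the README describes it as a template. The KUBECTL override and the function name copy_config are illustrative.

```shell
# App namespaces named in this README.
APP_NAMESPACES="bytelyst-platform bytelyst-dashboards bytelyst-products"

# Apply both config manifests into every app namespace.
# KUBECTL can be overridden (e.g. with echo) to preview the commands.
copy_config() {
  for ns in $APP_NAMESPACES; do
    for manifest in config/configmap.yaml config/secrets.yaml; do
      "${KUBECTL:-kubectl}" apply -n "$ns" -f "$manifest"
    done
  done
}

# Preview only: prints the six kubectl argument lists without touching a cluster.
( KUBECTL=echo; copy_config )
```

Running `copy_config` without the override, from the k8s/ directory, would issue the real kubectl calls.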

Cross-Namespace DNS

K8s DNS: <service>.<namespace>.svc.cluster.local

  • Backends reach Cosmos: cosmos-emulator.bytelyst-infra.svc:8081
  • Webs reach backends: flowmonk-backend.bytelyst-products.svc:4017
  • Everything reaches platform: platform-service.bytelyst-platform.svc:4003
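
The naming pattern can be spelled out with a small helper. The function name svc_addr is hypothetical; it only formats the fully qualified name. The shorter `<service>.<namespace>.svc` form used in the bullets resolves inside pods as well, because the pod's DNS search list supplies the cluster.local suffix.

```shell
# Build the fully qualified in-cluster address for a Service.
svc_addr() {
  # $1 = service, $2 = namespace, $3 = port
  printf '%s.%s.svc.cluster.local:%s\n' "$1" "$2" "$3"
}

svc_addr cosmos-emulator bytelyst-infra 8081
svc_addr flowmonk-backend bytelyst-products 4017
svc_addr platform-service bytelyst-platform 4003
```
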

Ollama as External Service

Ollama stays on the host (systemd). A headless Service + Endpoints pair in bytelyst-infra points to the node's internal IP, so pods reach it as ollama.bytelyst-infra.svc:11434. The setup script auto-detects the node IP.
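
A selector-less Service paired with a manually managed Endpoints object of the same name is the usual way to express this. The sketch below renders such a pair for a given IP; it is not the actual contents of ollama-external.yaml, and the function name, the port name http, and the fallback IP 10.0.0.4 are placeholders.

```shell
# Render a headless Service plus matching Endpoints for host-level Ollama.
render_ollama_external() {
  # $1 = node internal IP
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: bytelyst-infra
spec:
  clusterIP: None
  ports:
    - name: http
      port: 11434
      targetPort: 11434
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ollama
  namespace: bytelyst-infra
subsets:
  - addresses:
      - ip: $1
    ports:
      - name: http
        port: 11434
EOF
}

# One possible way to detect the IP on Linux: NODE_IP=$(hostname -I | awk '{print $1}')
# 10.0.0.4 is only a placeholder for this demo.
render_ollama_external "${NODE_IP:-10.0.0.4}"
```

Piping the output into `kubectl apply -f -` would create both objects.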

NodePort for External Access

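All services are exposed on the same ports as under Docker Compose (e.g., :4003, :3002, :3030). Kubernetes restricts NodePorts to 30000-32767 by default, so k3s is configured with --kube-apiserver-arg=service-node-port-range=1024-32767 to allow these lower ports.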
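
Before adding a new service, it can help to confirm that its port falls inside the configured range. A minimal check, with illustrative names and port 80 included as a deliberate counter-example:

```shell
# Range configured for k3s in this README.
NODEPORT_MIN=1024
NODEPORT_MAX=32767

# Succeeds when the given port is usable as a NodePort.
in_nodeport_range() {
  [ "$1" -ge "$NODEPORT_MIN" ] && [ "$1" -le "$NODEPORT_MAX" ]
}

for port in 4003 3002 3030 80; do
  if in_nodeport_range "$port"; then
    echo "$port ok"
  else
    echo "$port outside ${NODEPORT_MIN}-${NODEPORT_MAX}"
  fi
done
```
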

Resource Limits (tuned for 32 GB VM, 50 beta users)

| Service type | CPU request | CPU limit | Memory request | Memory limit |
| --- | --- | --- | --- | --- |
| Backend (×10) | 100m | 500m | 256Mi | 512Mi |
| Web app (×9) | 100m | 500m | 256Mi | 512Mi |
| Platform (×3) | 200m | 1000m | 384Mi | 768Mi |
| Cosmos emulator | 500m | 2000m | 2Gi | 3Gi |
| Grafana | 100m | 500m | 128Mi | 256Mi |
| Mailpit / Loki | 50-100m | 500m | 64-128Mi | 512Mi |
| k3s overhead | | | ~512Mi | |
| Ollama (host) | | | ~3Gi | |
| Total | | | ~10 Gi | ~19 Gi |

Even if every service reaches its memory limit, the total (~19 Gi) fits in 32 GB with ~13 GB of headroom.
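
The limit-side total can be checked with shell arithmetic. This is a back-of-the-envelope sum under stated assumptions: Mailpit and Loki are counted at 512Mi each, the k3s and Ollama estimates are added as-is, Azurite is left out because the table has no row for it, and the VM is treated as 32 × 1024 Mi.

```shell
# Memory limits in Mi, read off the table above.
backends=$((10 * 512))
webs=$((9 * 512))
platform=$((3 * 768))
cosmos=$((3 * 1024))
grafana=256
mailpit_loki=$((2 * 512))   # assumes 512Mi each
k3s=512                     # estimate, not a pod limit
ollama=$((3 * 1024))        # estimate, host process

total_mi=$((backends + webs + platform + cosmos + grafana + mailpit_loki + k3s + ollama))
headroom_mi=$((32 * 1024 - total_mi))

echo "limits: ${total_mi}Mi, headroom: ${headroom_mi}Mi"
```

That works out to 19968Mi (about 19.5 Gi) of limits and 12800Mi (about 12.5 Gi) of headroom, in line with the rounded figures quoted here.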

Readiness + Liveness Probes

Every service gets both:

  • Readiness: GET /health every 10s (traffic only when ready)
  • Liveness: GET /health every 30s (auto-restart on failure)
  • Backends and web apps: initialDelaySeconds: 15
  • Cosmos emulator: initialDelaySeconds: 60 (slow startup)

Operations Cheat Sheet

# ── Cluster status ─────────────────────────────────
kubectl get nodes                              # Node health
kubectl get pods -A                            # All pods
kubectl top pods -A                            # Resource usage (CPU/memory)

# ── Deploy / update ────────────────────────────────
kubectl apply -f k8s/products/                 # Re-apply product manifests
kubectl rollout restart deploy/flowmonk-backend -n bytelyst-products  # Rolling restart

# ── Scaling (for load testing) ─────────────────────
kubectl scale deploy/platform-service --replicas=2 -n bytelyst-platform
kubectl autoscale deploy/flowmonk-backend --min=1 --max=3 --cpu-percent=70 -n bytelyst-products

# ── Debugging ──────────────────────────────────────
kubectl logs deploy/platform-service -n bytelyst-platform -f        # Stream logs
kubectl describe pod <name> -n bytelyst-platform                    # Pod events
kubectl exec -it deploy/platform-service -n bytelyst-platform -- sh # Shell into pod

# ── Teardown ───────────────────────────────────────
sudo ./k8s/setup-k8s.sh --teardown             # Delete all namespaces (keep k3s)
/usr/local/bin/k3s-uninstall.sh                # Uninstall k3s completely

Port Map (same as Docker Compose)

| Service | Port | Health check |
| --- | --- | --- |
| Gitea (npm) | 3300 | http://localhost:3300/api/v1/version |
| Ollama (LLM) | 11434 | http://localhost:11434/api/version |
| Cosmos Explorer | 1234 | http://localhost:1234 |
| Azurite (Blob) | 10000 | http://localhost:10000/devstoreaccount1?comp=list |
| Mailpit UI | 8025 | http://localhost:8025 |
| Loki | 3100 | http://localhost:3100/ready |
| Grafana | 3000 | http://localhost:3000/api/health |
| platform-service | 4003 | /health |
| extraction-service | 4005 | /health |
| mcp-server | 4007 | /health |
| admin-web | 3001 | / |
| tracker-web | 3003 | / |
| Backends | 4010-4019 | /health |
| Web apps | 3002, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070 | / |
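
A few of these endpoints can be polled by hand with curl. The sketch below is not check-health-k8s.sh; it covers three rows of the table, and the function names check and run_checks are made up for the example.

```shell
# Probe one endpoint; -f makes curl fail on HTTP status >= 400.
check() {
  # $1 = label, $2 = URL
  if curl -fsS --max-time 5 -o /dev/null "$2" 2>/dev/null; then
    echo "OK   $1"
  else
    echo "FAIL $1"
    return 1
  fi
}

# Run a handful of checks and report how many failed.
run_checks() {
  failed=0
  check gitea http://localhost:3300/api/v1/version || failed=$((failed + 1))
  check loki http://localhost:3100/ready || failed=$((failed + 1))
  check platform-service http://localhost:4003/health || failed=$((failed + 1))
  echo "$failed failed"
}
```

Calling `run_checks` on the VM prints one OK or FAIL line per endpoint followed by the failure count.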

Switching Between Docker Compose and K8s

Both approaches can live on the same VM, but they bind the same host ports, so stop one stack before starting the other:

# Docker → K8s
cd /opt/bytelyst/learning_ai_common_plat
docker compose -f docker-compose.ecosystem.yml down   # Stop compose stack
sudo ./docs/devops/single_azure_vm/k8s/setup-k8s.sh   # Deploy to k3s

# K8s → Docker
cd /opt/bytelyst/learning_ai_common_plat/docs/devops/single_azure_vm
sudo ./k8s/setup-k8s.sh --teardown                    # Remove k8s resources
sudo ./docker/setup.sh --phase=7                      # Re-deploy via compose

Both share: Gitea registry (Docker container), Ollama (systemd), and built Docker images.