saravanakumardb1 7d0c469858 refactor(infra): reorganize single_azure_vm into docker/ and k8s/ subfolders
- Move setup.sh, README.md, prompt.md into docker/ subfolder
- Create top-level README.md comparing both approaches
- Create k8s/README.md with full design doc: k3s architecture,
  namespace strategy, manifest structure, ConfigMap/Secret design,
  Cosmos emulator StatefulSet, Ollama host service, resource limits,
  5-phase implementation plan, and kubectl cheat sheet
2026-03-24 14:11:50 -07:00

ByteLyst Single-VM Kubernetes Deployment (k3s)

Deploy the ByteLyst ecosystem on Kubernetes using k3s — a lightweight, certified K8s distribution that runs on a single VM with ~512 MB overhead.

Status: Planning — see design decisions below.


Prerequisites

Same VM as the Docker Compose approach:

  • Azure VM: Ubuntu 24.04 LTS, Standard_D8s_v5 (8 vCPU, 32 GB RAM)
  • Disk: 128 GB+
  • Docker images: Built by docker/setup.sh phases 1-5 (reused, not rebuilt)

Why k3s?

Feature            k3s                    minikube   kind    microk8s
RAM overhead       ~512 MB                ~2 GB      ~1 GB   ~800 MB
Production-grade   Yes (CNCF certified)   No         No      Yes
Built-in Traefik   Yes                    No         No      Optional
Single binary      Yes                    No         No      No (snap)
SQLite backend     Yes (no etcd needed)   N/A        N/A     Dqlite

Architecture

Ubuntu 24.04 VM
├── k3s (single-node cluster)
│   ├── kube-system namespace
│   │   ├── CoreDNS
│   │   ├── Traefik Ingress Controller
│   │   ├── Local Path Provisioner
│   │   └── Metrics Server
│   │
│   ├── bytelyst-infra namespace
│   │   ├── cosmos-emulator (StatefulSet + PVC)
│   │   ├── azurite (StatefulSet + PVC)
│   │   ├── mailpit (Deployment)
│   │   ├── loki (StatefulSet + PVC)
│   │   └── grafana (Deployment + PVC)
│   │
│   ├── bytelyst-platform namespace
│   │   ├── platform-service (Deployment, replicas: 1)
│   │   ├── extraction-service (Deployment, replicas: 1)
│   │   └── mcp-server (Deployment, replicas: 1)
│   │
│   ├── bytelyst-dashboards namespace
│   │   ├── admin-web (Deployment, replicas: 1)
│   │   └── tracker-web (Deployment, replicas: 1)
│   │
│   └── bytelyst-products namespace
│       ├── *-backend (10 Deployments)
│       └── *-web (9 Deployments)
│
├── Ollama (systemd, host network — :11434)
└── Gitea (Docker container — :3300, used for build-time only)

Manifest Structure (planned)

k8s/
├── README.md                    # This file
├── setup-k8s.sh                 # Bootstrap script (installs k3s, applies manifests)
├── namespaces.yaml              # 4 namespaces
├── config/
│   ├── configmap.yaml           # Shared env vars (replaces .env.ecosystem)
│   └── secrets.yaml             # JWT_SECRET, COSMOS_KEY, etc.
├── infra/
│   ├── cosmos-emulator.yaml     # StatefulSet + Service + PVC
│   ├── azurite.yaml             # StatefulSet + Service + PVC
│   ├── mailpit.yaml             # Deployment + Service
│   ├── loki.yaml                # StatefulSet + Service + PVC
│   └── grafana.yaml             # Deployment + Service + PVC
├── platform/
│   ├── platform-service.yaml    # Deployment + Service
│   ├── extraction-service.yaml  # Deployment + Service
│   └── mcp-server.yaml          # Deployment + Service
├── dashboards/
│   ├── admin-web.yaml           # Deployment + Service
│   └── tracker-web.yaml         # Deployment + Service
├── products/
│   ├── _backend-template.yaml   # Helm-like template (for reference)
│   ├── peakpulse-backend.yaml
│   ├── chronomind-backend.yaml
│   ├── ... (8 more backends)
│   ├── lysnrai-dashboard.yaml
│   ├── chronomind-web.yaml
│   └── ... (7 more web apps)
└── ingress/
    └── ingress.yaml             # Traefik IngressRoute rules
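
As a rough illustration of what namespaces.yaml could contain, the sketch below prints one Namespace manifest for each of the four namespaces in the architecture tree. The function name render_namespaces is hypothetical and not part of the planned layout.

```shell
#!/bin/sh
# Sketch: emit a multi-document YAML stream with the four ByteLyst namespaces.
set -eu

render_namespaces() {
  # Namespace names are taken from the architecture tree in this README.
  for ns in bytelyst-infra bytelyst-platform bytelyst-dashboards bytelyst-products; do
    cat <<EOF
---
apiVersion: v1
kind: Namespace
metadata:
  name: $ns
EOF
  done
}

render_namespaces
```

The output can be redirected to k8s/namespaces.yaml or piped straight into kubectl apply -f -.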

Key Design Decisions

1. Image Source: Import from Docker

k3s uses containerd, not Docker. We import the Docker-built images:

# Build images with Docker (phases 1-5 from docker/setup.sh), then import them into containerd
docker save platform-service:latest | sudo k3s ctr images import -

# Or build directly with nerdctl (k3s-native)
nerdctl build -t platform-service:latest -f services/platform-service/Dockerfile .

Decision: Import from Docker first (simpler), migrate to nerdctl later.
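
The import step could be looped over several images, roughly as sketched here. Only platform-service:latest appears in this README; the other two image tags are assumed to follow the same naming, and the DRY_RUN switch is a hypothetical convenience that prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: copy locally built Docker images into the containerd store used by k3s.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

import_image() {
  image="$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker save $image | sudo k3s ctr images import -"
  else
    docker save "$image" | sudo k3s ctr images import -
  fi
}

# Assumed image names; adjust to whatever docker/setup.sh actually tags.
for image in platform-service:latest extraction-service:latest mcp-server:latest; do
  import_image "$image"
done
```

After a real run, sudo k3s ctr images ls should list the imported tags. Note that Kubernetes defaults imagePullPolicy to Always for :latest tags, so manifests that rely on imported images need IfNotPresent or Never.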

2. Cosmos Emulator: StatefulSet with PVC

The Cosmos emulator needs persistent storage and specific env vars. Use a StatefulSet (not a Deployment) so the pod keeps a stable network identity and is re-attached to the same PVC after a restart:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cosmos-emulator
  namespace: bytelyst-infra
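spec:
  replicas: 1
  serviceName: cosmos-emulator   # a headless Service with this name must exist
  selector:                      # required; must match the template labels
    matchLabels:
      app: cosmos-emulator
  template:
    metadata:
      labels:
        app: cosmos-emulator
    spec:
      containers:
      - name: cosmos
        image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:latest
        ports:
        - containerPort: 8081
        - containerPort: 1234
        env:
        - name: AZURE_COSMOS_EMULATOR_ENABLE_DATA_PERSISTENCE
          value: "true"
        - name: ENABLE_EXPLORER
          value: "true"
        volumeMounts:
        - name: cosmos-data      # binds the PVC from volumeClaimTemplates
          mountPath: /tmp/cosmos/appdata   # assumed data directory; confirm for the image tag in use
        resources:
          requests:              # values from the Resource Limits table
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "3Gi"
            cpu: "2"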
  volumeClaimTemplates:
  - metadata:
      name: cosmos-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

3. Ollama: Host Network

Ollama stays as a systemd service on the host. Pods reach it via hostNetwork or a manually created Endpoints + Service pointing to the node IP:

apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: bytelyst-products
spec:
  ports:
  - port: 11434
---
apiVersion: v1
kind: Endpoints
metadata:
  name: ollama
  namespace: bytelyst-products
subsets:
- addresses:
  - ip: 172.17.0.1    # Default docker0 bridge address, not the node IP; prefer the node's InternalIP (kubectl get nodes -o wide)
  ports:
  - port: 11434
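
Because the host address differs per VM, the Endpoints object may be easier to generate than to hard-code. The sketch below takes the address as an argument and prints the manifest; render_ollama_endpoints is a hypothetical helper, and the check only confirms the argument looks like a dotted-quad IPv4 address.

```shell
#!/bin/sh
# Sketch: render the Ollama Endpoints manifest for a given node IP.
set -eu

render_ollama_endpoints() {
  node_ip="$1"
  # Shape check only: four dot-separated groups of 1-3 digits.
  if ! echo "$node_ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "not an IPv4 address: $node_ip" >&2
    return 1
  fi
  cat <<EOF
apiVersion: v1
kind: Endpoints
metadata:
  name: ollama
  namespace: bytelyst-products
subsets:
- addresses:
  - ip: $node_ip
  ports:
  - port: 11434
EOF
}

# Example with a placeholder address; 10.0.0.4 is illustrative only.
render_ollama_endpoints 10.0.0.4
```

On the VM, the address could come from kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}'. Ollama listens on 127.0.0.1 by default, so the systemd unit would also need OLLAMA_HOST=0.0.0.0 (or the node IP) before pods can reach it.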

4. ConfigMap replaces .env.ecosystem

apiVersion: v1
kind: ConfigMap
metadata:
  name: bytelyst-config
  namespace: bytelyst-platform
data:
  COSMOS_ENDPOINT: "http://cosmos-emulator.bytelyst-infra.svc:8081"
  COSMOS_DATABASE: "bytelyst"
  DB_PROVIDER: "cosmos"
  PLATFORM_SERVICE_URL: "http://platform-service.bytelyst-platform.svc:4003"
  EXTRACTION_SERVICE_URL: "http://extraction-service.bytelyst-platform.svc:4005"

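Note: Cross-namespace access uses the <service>.<namespace>.svc form, which is short for <service>.<namespace>.svc.cluster.local. ConfigMaps are namespaced, so this one is visible only to pods in bytelyst-platform; other namespaces need their own copy.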
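
The URL pattern used in the ConfigMap can be expressed as a small helper, which may be handy when generating per-namespace copies. Both function names are hypothetical, and cluster.local is assumed to be the cluster domain (the Kubernetes default).

```shell
#!/bin/sh
# Sketch: build in-cluster service URLs from service name, namespace, and port.
set -eu

svc_url() {
  # usage: svc_url <service> <namespace> <port>
  echo "http://$1.$2.svc:$3"
}

svc_fqdn() {
  # usage: svc_fqdn <service> <namespace> [cluster-domain]
  echo "$1.$2.svc.${3:-cluster.local}"
}

svc_url platform-service bytelyst-platform 4003
svc_fqdn cosmos-emulator bytelyst-infra
```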

5. Secrets for sensitive values

apiVersion: v1
kind: Secret
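metadata:
  name: bytelyst-secrets
  namespace: bytelyst-platform   # Secrets are namespaced; repeat in each consuming namespace
type: Opaque
stringData:
  JWT_SECRET: "<generated>"
  # The two keys below are the well-known defaults of the Cosmos emulator and Azurite, not real credentials
  COSMOS_KEY: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
  AZURE_BLOB_ACCOUNT_KEY: "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="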
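
One way the "<generated>" placeholder might be filled is sketched below: 32 random bytes, base64-encoded, rendered into a Secret for a chosen namespace. The helper names and the 32-byte length are assumptions, not decisions recorded in this README.

```shell
#!/bin/sh
# Sketch: generate a random JWT secret and render it into a Secret manifest.
set -eu

gen_secret() {
  # 32 random bytes encode to 44 base64 characters.
  head -c 32 /dev/urandom | base64
}

render_jwt_secret() {
  # usage: render_jwt_secret <namespace> <jwt-secret>
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: bytelyst-secrets
  namespace: $1
type: Opaque
stringData:
  JWT_SECRET: "$2"
EOF
}

render_jwt_secret bytelyst-platform "$(gen_secret)"
```

Piping the output into kubectl apply -f - avoids writing the generated value to disk; the emulator keys from the manifest above would be added to the same stringData block.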

6. Health Checks → Readiness/Liveness Probes

Every backend gets K8s-native probes:

readinessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 15
  periodSeconds: 10
livenessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 30
  periodSeconds: 30

7. Resource Limits

Service type       CPU request   CPU limit   Memory request   Memory limit
Backend            100m          500m        256Mi            512Mi
Web app            100m          500m        256Mi            512Mi
Platform service   200m          1000m       384Mi            768Mi
Cosmos emulator    1000m         2000m       2Gi              3Gi
Ollama             (host)        (host)      (host)           (host)
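
A quick sum of these figures against the VM (8 vCPU, 32 GB) shows how much headroom the plan leaves. The workload counts are assumptions read off the architecture tree: 10 backends, 11 web apps (9 product plus 2 dashboards), and the "Platform service" row applied to all 3 workloads in bytelyst-platform. Azurite, Mailpit, Loki, and Grafana are left out because the table gives no values for them.

```shell
#!/bin/sh
# Sketch: total the planned requests and limits (memory in Mi, CPU in millicores).
set -eu

backends=10   # *-backend Deployments
webapps=11    # 9 product web apps + admin-web + tracker-web (assumed to use the "Web app" row)
platform=3    # platform-service, extraction-service, mcp-server (assumed to share one row)

mem_req=$(( (backends + webapps) * 256 + platform * 384 + 2048 ))
mem_lim=$(( (backends + webapps) * 512 + platform * 768 + 3072 ))
cpu_req=$(( (backends + webapps) * 100 + platform * 200 + 1000 ))
cpu_lim=$(( (backends + webapps) * 500 + platform * 1000 + 2000 ))

echo "memory requests: ${mem_req}Mi, limits: ${mem_lim}Mi (VM: 32768Mi)"
echo "cpu requests: ${cpu_req}m, limits: ${cpu_lim}m (VM: 8000m)"
```

Under these assumptions the requests total 8576Mi and 3700m, which fits comfortably. The limits total 16128Mi and 15500m, so CPU limits are overcommitted on 8 vCPU; that is allowed by the scheduler, which places pods by requests, but sustained load could lead to throttling.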

Implementation Phases

Phase A: Foundation (Day 1)

  • Install k3s on VM
  • Create 4 namespaces
  • Deploy ConfigMap + Secrets
  • Deploy cosmos-emulator + azurite (StatefulSets)
  • Verify: kubectl get pods -A shows infra running

Phase B: Platform (Day 1-2)

  • Import platform-service Docker image
  • Deploy platform-service (Deployment + Service)
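  • Verify: kubectl exec into a pod in bytelyst-platform, then curl http://platform-service:4003/health (the short service name resolves only within the same namespace)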
  • Deploy extraction-service + mcp-server
  • Deploy admin-web + tracker-web

Phase C: Products (Day 2-3)

  • Template: create one backend manifest, verify it works
  • Replicate for all 10 backends
  • Create web app manifests (9 services)
  • Verify: all 30 services running
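
The "one template, then replicate" step could be approached with a small generator like the sketch below, which combines the probe settings and the Backend row of the resource table. The function name, the app label, the image naming ($name:latest), and port 4010 in the example are all assumptions; real ports would come from each backend's configuration.

```shell
#!/bin/sh
# Sketch: render a Deployment + Service pair for one product backend.
set -eu

render_backend() {
  # usage: render_backend <name> <port>
  name="$1"
  port="$2"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
  namespace: bytelyst-products
spec:
  replicas: 1
  selector:
    matchLabels:
      app: $name
  template:
    metadata:
      labels:
        app: $name
    spec:
      containers:
      - name: $name
        image: $name:latest
        imagePullPolicy: IfNotPresent   # use the imported image instead of pulling
        ports:
        - containerPort: $port
        readinessProbe:
          httpGet:
            path: /health
            port: $port
          initialDelaySeconds: 15
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: $port
          initialDelaySeconds: 30
          periodSeconds: 30
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: $name
  namespace: bytelyst-products
spec:
  selector:
    app: $name
  ports:
  - port: $port
    targetPort: $port
EOF
}

# Example: 4010 is a placeholder port, not taken from this README.
render_backend peakpulse-backend 4010
```

Redirecting the output to k8s/products/peakpulse-backend.yaml would produce one of the planned per-backend files; environment variables from the ConfigMap and Secret are omitted here for brevity.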

Phase D: Networking (Day 3)

  • Set up Traefik IngressRoute for external access
  • Configure NodePort services for direct port access
  • Create Ollama external service endpoint
  • Verify: health check script works against K8s services
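
For the external-access step, an IngressRoute could look roughly like the output of this sketch. The hostname and port in the example are placeholders, and the apiVersion is an assumption: recent Traefik releases use traefik.io/v1alpha1, while older ones use traefik.containo.us/v1alpha1, so it is worth checking which CRDs the installed k3s ships.

```shell
#!/bin/sh
# Sketch: render a Traefik IngressRoute that maps one hostname to one Service.
set -eu

render_ingressroute() {
  # usage: render_ingressroute <host> <service> <namespace> <port>
  cat <<EOF
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: $2
  namespace: $3
spec:
  entryPoints:
  - web
  routes:
  - match: Host(\`$1\`)
    kind: Rule
    services:
    - name: $2
      port: $4
EOF
}

# Placeholder host and port; neither is specified in this README.
render_ingressroute admin.example.test admin-web bytelyst-dashboards 3000
```

kubectl get crd | grep -i ingressroute shows which API group is actually installed before the manifest is applied.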

Phase E: Operations (Day 4+)

  • kubectl scale deployment/flowmonk-backend --replicas=2 — test scaling
  • kubectl rollout restart deployment/platform-service — test rolling update
  • kubectl top pods — resource usage monitoring
  • Set up HorizontalPodAutoscaler for one service
  • Practice: kubectl logs, kubectl exec, kubectl describe

Useful Commands (cheat sheet)

# Cluster status
kubectl get nodes
kubectl get pods -A                    # All namespaces
kubectl get pods -n bytelyst-products  # Product namespace

# Deploy / update
kubectl apply -R -f k8s/               # Apply all manifests (-R recurses into subfolders)
kubectl apply -f k8s/products/         # Apply product manifests
kubectl rollout restart deployment/flowmonk-backend -n bytelyst-products

# Debugging
kubectl logs deployment/platform-service -n bytelyst-platform -f
kubectl describe pod <pod-name> -n bytelyst-platform
kubectl exec -it deployment/platform-service -n bytelyst-platform -- sh

# Scaling
kubectl scale deployment/flowmonk-backend --replicas=2 -n bytelyst-products
kubectl autoscale deployment/flowmonk-backend --min=1 --max=3 --cpu-percent=70 -n bytelyst-products

# Resource monitoring
kubectl top pods -n bytelyst-products
kubectl top nodes

Migration from Docker Compose

Both approaches can coexist on the same VM, with one stack running at a time:

  1. docker/setup.sh builds images and publishes packages (phases 1-5)
  2. docker compose down stops the compose stack
  3. setup-k8s.sh imports images into k3s and applies manifests
  4. Both share the same Gitea registry and Ollama instance
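
The switch-over could be scripted along these lines. This is a sketch of a possible setup-k8s.sh flow, not the planned script itself: the run helper and DRY_RUN switch are hypothetical, the install line is the standard k3s installer command, and only one image import is shown.

```shell
#!/bin/sh
# Sketch: stop the compose stack, install k3s, import images, apply manifests.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = print each step, 0 = execute it

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

migrate() {
  run docker compose down
  run sh -c 'curl -sfL https://get.k3s.io | sh -'
  run sh -c 'docker save platform-service:latest | sudo k3s ctr images import -'
  # Namespaces first, so namespaced objects in the subfolders have somewhere to land.
  run sudo k3s kubectl apply -f k8s/namespaces.yaml
  run sudo k3s kubectl apply -R -f k8s/
}

migrate
```

Running with DRY_RUN=1 (the default here) prints the five steps without touching the VM, which makes the sequence easy to review before executing it with DRY_RUN=0.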