learning_ai_common_plat/docs/devops/KUBERNETES_ROADMAP.md
2026-03-23 18:04:18 -07:00

ByteLyst Ecosystem — Kubernetes Roadmap

This document is the standalone roadmap for moving the ByteLyst ecosystem from Docker Compose on a single VM to local Kubernetes practice and eventually production-grade Kubernetes deployment.

Scope

Use this roadmap for:

  • Docker Compose → Docker Desktop Kubernetes / K3s transition planning
  • local Kubernetes validation strategy
  • Helm/chart planning
  • Kubernetes best practices for deployments, security, probes, ingress, and scaling
  • secrets progression from .env.ecosystem to Kubernetes Secret objects and later Azure Key Vault integration
  • CI/CD expectations for image promotion and chart versioning

This document does not replace docs/devops/SINGLE_VM_DEPLOYMENT.md.

SINGLE_VM_DEPLOYMENT.md remains the source of truth for:

  • single-VM deployment scope
  • Docker Compose ecosystem architecture
  • Dockerization and package-manager-aware deployment guidance
  • current implementation status and audit findings

Current State

Completed foundation

  • Docker Compose ecosystem architecture is documented
  • product repos have Dockerfiles and docker-prep.sh
  • shared services have been built and validated in the ecosystem stack
  • LocalMemGPT Linux-host Ollama access is addressed in Compose via extra_hosts
  • deployment docs now separate Compose/source-of-truth concerns from Kubernetes roadmap concerns

Not yet completed

  • standalone local Kubernetes assets
  • Helm charts / values structure in-repo
  • Kubernetes manifests for the ecosystem
  • local K8s deployment script implementation
  • full K3s / Docker Desktop K8s validation

Phase Plan

Phase 1 — Docker Compose baseline

Goal: keep Compose as the operational baseline while Docker/build/runtime contracts stabilize.

Success criteria:

  • all ecosystem images build successfully
  • all required services start in Docker Compose
  • health endpoints are reachable for shared services and product backends
  • major host/container networking assumptions are documented

Phase 2 — Local Kubernetes practice

Goal: run the same ecosystem architecture on a single-node Kubernetes environment to practice production-readiness patterns.

Two supported paths:

Option A: Docker Desktop Kubernetes

Best for:

  • macOS / Windows development
  • quick iteration
  • visual debugging

Characteristics:

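  • built-in single-node cluster (kubeadm-based by default; newer Docker Desktop releases can also provision it with kind)
  • images built with the local Docker engine are generally usable by the cluster without a registry push, provided imagePullPolicy is not Always (the default for :latest tags)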
  • easiest local path for validating manifests and Helm shape

Option B: K3s

Best for:

  • Linux VMs
  • Hetzner or cloud-hosted single-node practice
  • future multi-node growth

Characteristics:

  • lightweight CNCF-certified Kubernetes distro
  • built-in Traefik ingress
  • built-in local-path storage class
  • can evolve from single-node to multi-node more naturally than Docker Desktop

Phase 3 — Production-grade Kubernetes shape

Goal: make local K8s patterns production-ready enough to port to AKS/EKS/GKE later without redesign.

Key outcomes:

  • health probes standardized
  • rolling update behavior standardized
  • security context standardized
  • ingress and SSE/WebSocket behavior standardized
  • Helm values layering defined
  • secret management progression defined

Phase 4 — Managed Kubernetes target

Goal: preserve the same deployment model while moving to managed infrastructure.

Expected direction:

  • managed ingress controller and TLS
  • chart/image promotion flow
  • Azure Key Vault CSI integration
  • HPA and environment-specific overlays

Local Kubernetes Best Practices

1. Deployment rollout safety

Use zero-downtime defaults:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    spec:
      terminationGracePeriodSeconds: 45
      containers:
        - lifecycle:
            preStop:
              exec:
                command: ['sleep', '5']

Guidance:

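  • never use aggressive maxUnavailable values for user-facing services
  • set terminationGracePeriodSeconds longer than the preStop delay plus the application's worst-case graceful shutdown time; the grace period also covers the preStop hook
  • use the preStop delay so endpoint removal can reach the ingress or load balancer before the container receives SIGTERM (the image must contain a sleep binary)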

2. Pod security context

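Default posture (runAsNonRoot, runAsUser, and runAsGroup may be set at pod or container level; allowPrivilegeEscalation and readOnlyRootFilesystem are valid only in a container's securityContext):

securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  # container-level only:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true

If writable paths are needed (volumes belongs to the pod spec, volumeMounts to the container):

# pod spec
volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
# container
volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: cache
    mountPath: /home/node/.cache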

Guidance:

  • Fastify backends should generally tolerate read-only root filesystems
  • Next.js standalone servers may need writable /tmp
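The two fragments above sit at different levels of the pod template, which is easy to get wrong. The following is a minimal sketch of one possible placement, not the ecosystem's actual manifest: the Deployment name example-backend, its labels, and the image example/backend:local are hypothetical placeholders.

```yaml
# Sketch only: names, labels, and image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-backend
  template:
    metadata:
      labels:
        app: example-backend
    spec:
      # Pod-level: identity settings inherited by every container.
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
      containers:
        - name: backend
          image: example/backend:local
          # Container-level: these two fields are not accepted at pod level.
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        # emptyDir keeps /tmp writable even though the root filesystem is read-only.
        - name: tmp
          emptyDir: {}
```

Splitting the settings this way keeps the identity defaults in one place while leaving the hardening flags on each container, where Kubernetes expects them.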

3. Health probes

Use dedicated /health endpoints:

livenessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
readinessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 5

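Guidance:

  • do not use heavy endpoints like /openapi.json for liveness
  • keep timeouts short enough to expose real failures quickly; with the default failureThreshold of 3, the liveness settings above restart a hung container after roughly 30 seconds
  • if /health ever checks downstream dependencies, use it for readiness only; a liveness restart does not fix a dependency outage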

4. Ingress for SSE / WebSocket traffic

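For streaming or long-lived connections behind ingress-nginx (other controllers, including the Traefik ingress bundled with K3s, ignore these annotations):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-buffering: 'off'
    nginx.ingress.kubernetes.io/proxy-http-version: '1.1'
    # ingress-nginx normally sets the WebSocket upgrade headers itself, and
    # snippet annotations are disabled by default in recent releases; keep
    # this only if the controller has allow-snippet-annotations enabled.
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";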

Applies to:

  • FlowMonk SSE
  • LocalMemGPT streaming
  • future realtime features
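For orientation, here is a fuller sketch of an Ingress that carries the timeout and buffering annotations together with a routing rule. It assumes ingress-nginx is installed with the ingress class name nginx. The resource name, host, and backend Service are hypothetical, and port 4003 is borrowed from the probe example only for illustration.

```yaml
# Sketch only: name, host, Service, and port are hypothetical placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-streaming
  annotations:
    # Allow idle streams to stay open for up to 30 minutes.
    nginx.ingress.kubernetes.io/proxy-read-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '1800'
    # Deliver SSE events as they are produced instead of buffering them.
    nginx.ingress.kubernetes.io/proxy-buffering: 'off'
spec:
  ingressClassName: nginx
  rules:
    - host: stream.example.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-streaming
                port:
                  number: 4003
```

Keeping the streaming routes in their own Ingress object limits the long timeouts to the services that need them.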

5. HPA API choice

Use:

apiVersion: autoscaling/v2

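Avoid (v1 supports only a CPU utilization target; v2 adds memory, custom and external metrics, and scaling behavior settings):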

apiVersion: autoscaling/v1
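A minimal autoscaling/v2 object might look like the sketch below. The target Deployment name, replica bounds, and 70% CPU threshold are illustrative assumptions, not values from the ecosystem.

```yaml
# Sketch only: target name, replica bounds, and threshold are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-backend
  minReplicas: 2
  maxReplicas: 5
  metrics:
    # Scale on average CPU utilization relative to the containers' CPU requests.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Utilization targets only work when the target containers declare CPU requests and the cluster exposes resource metrics, for example through metrics-server.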

Docker and Image Guidance for K8s Readiness

| Practice | Do | Avoid |
| --- | --- | --- |
| ENTRYPOINT form | ENTRYPOINT ["node", "dist/server.js"] | shell-form entrypoints |
| COPY scope | selective COPY steps | broad COPY . . |
| Layer count | combine related RUN steps | fragmented install layers |
| Non-root | run as node or non-root UID | root runtime |
| Local variant | allow local Dockerfile variants where needed | one Dockerfile that only works in one network environment |
| Build args | use ARG/ENV deliberately | hardcoded deployment assumptions |

Helm Values Layering

Recommended structure:

helm/bytelyst-ecosystem/
├── values.yaml
└── env/
    ├── local.yaml
    ├── dev.yaml
    └── prod.yaml

Recommended usage (later -f files override earlier ones, so the environment file goes last):

helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/local.yaml
helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/dev.yaml
helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/prod.yaml
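To make the layering concrete, the sketch below shows a base file and a local overlay as two YAML documents. All keys and values are hypothetical; the chart's real values schema is not defined yet.

```yaml
# values.yaml: chart defaults (hypothetical keys)
replicaCount: 2
image:
  tag: "0.1.0"
ingress:
  enabled: true
---
# env/local.yaml: only the differences for a local cluster
replicaCount: 1
ingress:
  enabled: false
```

Helm merges maps key by key, so with the local overlay applied the effective values would be replicaCount 1, image.tag 0.1.0, and ingress.enabled false. Overlays stay small because they list only what differs from the base.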

Namespace Strategy

Use helpers rather than hardcoded namespaces:

{{ include "myapp.namespace" . }}

Avoid:

{{ .Values.namespace }}
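One possible shape for such a helper is sketched below, together with a manifest that consumes it. The namespaceOverride values key and the ConfigMap are assumptions for illustration; only the helper name myapp.namespace comes from the text above.

```yaml
{{/* templates/_helpers.tpl (sketch): namespaceOverride is an assumed values key */}}
{{- define "myapp.namespace" -}}
{{- default .Release.Namespace .Values.namespaceOverride -}}
{{- end -}}

# templates/configmap.yaml (sketch): the manifest reads its namespace from the helper
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config
  namespace: {{ include "myapp.namespace" . }}
data:
  LOG_LEVEL: info
```

With this shape the chart follows the namespace passed to helm via --namespace unless an override is set, and every template picks up a change from a single definition.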

Secrets Progression

| Phase | Strategy | Complexity |
| --- | --- | --- |
| Phase 1 | .env.ecosystem file (gitignored) | Trivial |
| Phase 2 | Native Kubernetes Secret objects | Low |
| Phase 3 | Azure Key Vault via CSI SecretProviderClass | Medium |
| Phase 4 | AKV + operator/CRD auto-sync model | High |
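As a rough illustration of Phases 2 and 3, the sketch below shows a native Secret followed by a SecretProviderClass for the Azure provider of the Secrets Store CSI driver. Every name and value here (ecosystem-env, EXAMPLE_API_KEY, example-kv, the tenant ID placeholder) is hypothetical. The identity settings the Azure provider needs for authentication are left out because the roadmap does not define them yet.

```yaml
# Phase 2 (sketch): native Secret; stringData accepts plain text and is stored base64-encoded.
apiVersion: v1
kind: Secret
metadata:
  name: ecosystem-env
type: Opaque
stringData:
  EXAMPLE_API_KEY: replace-me
---
# Phase 3 (sketch): Key Vault objects mounted through the Secrets Store CSI driver.
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: ecosystem-akv
spec:
  provider: azure
  parameters:
    keyvaultName: example-kv
    tenantId: "<tenant-id>"
    objects: |
      array:
        - |
          objectName: example-api-key
          objectType: secret
```

For local clusters, a Phase 2 Secret can also be generated directly from the existing file with kubectl create secret generic and its --from-env-file option, which avoids committing a manifest that contains values.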

CI/CD Expectations

| Practice | Expectation |
| --- | --- |
| Semantic release | keep feat: / fix: conventions usable for release automation |
| Image promotion | build once, promote later; do not rebuild for prod |
| Branch pipelines | branch-specific quality and deploy stages |
| Security gates | SAST/SCA in pipeline |
| Quality gates | tests, coverage, type safety, build verification |
| Chart versioning | publish/version charts independently |

Local K8s Deployment Workflow Shape

A future local K8s script should do the following:

  1. detect Docker Desktop K8s vs K3s
  2. build required images
  3. load/import images into the local cluster runtime when needed
  4. create namespace
  5. create secrets from .env.ecosystem
  6. deploy Helm chart with local overlay
  7. wait for rollout
  8. print verification commands and port-forward hints

Next now

  • run full Docker Compose ecosystem validation end-to-end
  • capture blockers by service
  • decide whether K8s phase starts with Docker Desktop K8s or K3s first

Next after Compose validation

  • define helm/bytelyst-ecosystem/ layout
  • define namespace and secret model
  • draft minimal shared-service-first Kubernetes manifests or chart values
  • create local K8s deploy helper script

Hold for later

  • full Helm/K3s implementation across the ecosystem
  • managed cluster rollout details
  • advanced autoscaling and production ingress hardening

Quick Reference

| Practice | Compose | Local K8s | Prod K8s |
| --- | --- | --- | --- |
| Zero-downtime rolling update | N/A | Apply | Apply |
| Pod security context | N/A | Apply | Apply |
| Health probes | use Docker healthcheck where relevant | Apply | Apply |
| SSE/WebSocket ingress tuning | N/A | If needed | Apply |
| HPA v2 | N/A | Optional | Apply |
| Exec-form entrypoint | Apply now | Apply | Apply |
| Selective COPY | Apply now | Apply | Apply |
| Non-root user | Apply now | Apply | Apply |
| Values layering | N/A | Apply | Apply |
| AKV CSI | N/A | N/A | Apply |
| Image promotion | N/A | N/A | Apply |

Status

  • standalone Kubernetes roadmap: created
  • Compose source-of-truth split: done
  • Helm/K3s implementation: held pending validation