# ByteLyst Ecosystem — Kubernetes Roadmap

This document is the standalone roadmap for moving the ByteLyst ecosystem from Docker Compose on a single VM to local Kubernetes practice and eventually production-grade Kubernetes deployment.

## Scope

Use this roadmap for:

- Docker Compose → Docker Desktop Kubernetes / K3s transition planning
- local Kubernetes validation strategy
- Helm/chart planning
- Kubernetes best practices for deployments, security, probes, ingress, and scaling
- secrets progression from `.env.ecosystem` to Kubernetes `Secret` objects and later Azure Key Vault integration
- CI/CD expectations for image promotion and chart versioning

This document does **not** replace `docs/devops/SINGLE_VM_DEPLOYMENT.md`.

`SINGLE_VM_DEPLOYMENT.md` remains the source of truth for:

- single-VM deployment scope
- Docker Compose ecosystem architecture
- Dockerization and package-manager-aware deployment guidance
- current implementation status and audit findings

## Current State

### Completed foundation

- Docker Compose ecosystem architecture is documented
- product repos have Dockerfiles and `docker-prep.sh`
- shared services have been built and validated in the ecosystem stack
- LocalMemGPT Linux-host Ollama access is addressed in Compose via `extra_hosts`
- deployment docs now separate Compose/source-of-truth concerns from Kubernetes roadmap concerns

### Not yet completed

- standalone local Kubernetes assets
- Helm charts / values structure in-repo
- Kubernetes manifests for the ecosystem
- local K8s deployment script implementation
- full K3s / Docker Desktop K8s validation

## Phase Plan
### Phase 1 — Docker Compose baseline

Goal: keep Compose as the operational baseline while Docker/build/runtime contracts stabilize.

Success criteria:

- all ecosystem images build successfully
- all required services start in Docker Compose
- health endpoints are reachable for shared services and product backends
- major host/container networking assumptions are documented

### Phase 2 — Local Kubernetes practice

Goal: run the same ecosystem ideas on a single-node Kubernetes environment for production-readiness practice.

Two supported paths:

#### Option A: Docker Desktop Kubernetes

Best for:

- macOS / Windows development
- quick iteration
- visual debugging

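Characteristics:

- built-in single-node cluster (provisioned with kubeadm by default; recent Docker Desktop releases also offer a `kind`-based option)
- images built with the local Docker engine are usually visible to the cluster without a registry push
- easiest local path for validating manifests and Helm shape
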
#### Option B: K3s

Best for:

- Linux VMs
- Hetzner or cloud-hosted single-node practice
- future multi-node growth

Characteristics:

- lightweight CNCF-certified Kubernetes distro
- built-in Traefik ingress
- built-in local-path storage class
- can evolve from single-node to multi-node more naturally than Docker Desktop

### Phase 3 — Production-grade Kubernetes shape

Goal: make local K8s patterns production-ready enough to port to AKS/EKS/GKE later without redesign.

Key outcomes:

- health probes standardized
- rolling update behavior standardized
- security context standardized
- ingress and SSE/WebSocket behavior standardized
- Helm values layering defined
- secret management progression defined

### Phase 4 — Managed Kubernetes target

Goal: preserve the same deployment model while moving to managed infrastructure.

Expected direction:

- managed ingress controller and TLS
- chart/image promotion flow
- Azure Key Vault CSI integration
- HPA and environment-specific overlays

## Local Kubernetes Best Practices

### 1. Deployment rollout safety

Use zero-downtime defaults:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0
      maxSurge: 1
  template:
    spec:
      terminationGracePeriodSeconds: 45
      containers:
        - lifecycle:
            preStop:
              exec:
                command: ['sleep', '5']
```

Guidance:

- keep `maxUnavailable` at `0` (or very low) for user-facing services so a rollout never drops serving capacity
- set `terminationGracePeriodSeconds` to cover the `preStop` delay plus the application's own graceful shutdown time
- use the `preStop` delay to give the load balancer time to stop routing to the pod before it receives `SIGTERM`

### 2. Pod security context

Default posture (set on each container's `securityContext`):

```yaml
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
```

If writable paths are needed:

```yaml
# pod spec: declare the scratch volumes
volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
# container spec: mount them at the paths that must stay writable
volumeMounts:
  - name: tmp
    mountPath: /tmp
  - name: cache
    mountPath: /home/node/.cache
```

Guidance:

- Fastify backends should generally tolerate read-only root filesystems
- Next.js standalone servers may need writable `/tmp`

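The fragments above sit at different levels of a manifest, so a combined sketch may help. This is a minimal illustration rather than the ecosystem's actual manifest: the Deployment name `example-backend`, the labels, and the image reference are hypothetical placeholders.

```yaml
# Hypothetical Deployment showing where each fragment belongs.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-backend                 # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-backend
  template:
    metadata:
      labels:
        app: example-backend
    spec:
      volumes:                          # pod level: writable scratch volumes
        - name: tmp
          emptyDir: {}
        - name: cache
          emptyDir: {}
      containers:
        - name: app
          image: registry.example.com/example-backend:0.0.0   # placeholder image
          securityContext:              # container level: hardened defaults
            runAsNonRoot: true
            runAsUser: 1000
            runAsGroup: 1000
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          volumeMounts:                 # container level: the only writable paths
            - name: tmp
              mountPath: /tmp
            - name: cache
              mountPath: /home/node/.cache
```

UID and GID `1000` correspond to the `node` user in the official Node.js images; a different base image may need different values.
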
### 3. Health probes

Use dedicated `/health` endpoints:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
readinessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 5
```

Guidance:

- do not use heavy endpoints like `/openapi.json` for liveness
- keep timeouts short enough to expose real failures quickly

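To make "short enough" concrete, the time a probe takes to react is roughly `periodSeconds` multiplied by `failureThreshold`. The sketch below spells out `failureThreshold`, which Kubernetes sets to 3 when the field is omitted; the numbers are illustrative and not tuned for any particular ByteLyst service.

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3   # container restarts after about 3 x 10s = 30s of consecutive failures
readinessProbe:
  httpGet:
    path: /health
    port: 4003
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 3   # pod leaves Service endpoints after about 3 x 5s = 15s of consecutive failures
```
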
### 4. Ingress for SSE / WebSocket traffic

For streaming or long-lived connections behind the NGINX ingress controller (`ingress-nginx`):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-buffering: 'off'
    nginx.ingress.kubernetes.io/proxy-http-version: '1.1'
    nginx.ingress.kubernetes.io/configuration-snippet: |
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
```

Applies to:

- FlowMonk SSE
- LocalMemGPT streaming
- future realtime features

Guidance:

- these annotations are specific to `ingress-nginx`; the Traefik ingress bundled with K3s ignores them and needs its own equivalent settings
- `configuration-snippet` only takes effect when the controller allows snippet annotations, which recent `ingress-nginx` releases disable by default

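For context, a complete Ingress carrying the timeout and buffering annotations might look like the sketch below. It assumes `ingress-nginx` is installed with the ingress class `nginx`; the resource name, host, and backend Service name are hypothetical, and port `4003` is simply reused from the probe example.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-streaming               # placeholder name
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-send-timeout: '1800'
    nginx.ingress.kubernetes.io/proxy-buffering: 'off'
spec:
  ingressClassName: nginx               # assumption: ingress-nginx with class "nginx"
  rules:
    - host: stream.example.local        # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-backend   # placeholder Service
                port:
                  number: 4003
```
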
### 5. HPA API choice

Use `autoscaling/v2`, which supports multiple metric types and configurable scaling behavior:

```yaml
apiVersion: autoscaling/v2
```

Avoid `autoscaling/v1`, which only scales on average CPU utilization:

```yaml
apiVersion: autoscaling/v1
```

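A minimal `autoscaling/v2` object could be sketched as follows. The target Deployment name, replica bounds, and utilization threshold are hypothetical values chosen for illustration, not recommendations for any ByteLyst service.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-backend                 # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-backend               # placeholder Deployment to scale
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70        # percent of the containers' CPU requests
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # require 5 minutes of lower load before scaling down
```

CPU-based scaling only works when the cluster has a metrics source such as metrics-server and the target containers declare CPU `requests`.
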
## Docker and Image Guidance for K8s Readiness

| Practice            | Do                                           | Avoid                                                     |
| ------------------- | -------------------------------------------- | --------------------------------------------------------- |
| **ENTRYPOINT form** | `ENTRYPOINT ["node", "dist/server.js"]`      | shell-form entrypoints                                    |
| **COPY scope**      | selective `COPY` steps                       | broad `COPY . .`                                          |
| **Layer count**     | combine related `RUN` steps                  | fragmented install layers                                 |
| **Non-root**        | run as `node` or non-root UID                | root runtime                                              |
| **Local variant**   | allow local Dockerfile variants where needed | one Dockerfile that only works in one network environment |
| **Build args**      | use `ARG`/`ENV` deliberately                 | hardcoded deployment assumptions                          |

## Helm Values Layering

Recommended structure:

```text
values.yaml
├── env/local.yaml
├── env/dev.yaml
└── env/prod.yaml
```

Recommended usage:

```bash
helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/local.yaml
helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/dev.yaml
helm upgrade --install bytelyst ./helm/bytelyst-ecosystem -f helm/bytelyst-ecosystem/values.yaml -f helm/bytelyst-ecosystem/env/prod.yaml
```

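When several `-f` files are passed, Helm merges them key by key and the rightmost file wins. The sketch below shows how a base file and an overlay could divide responsibilities; every key in it (`replicaCount`, `image`, `ingress`) is a hypothetical example, since the chart's real values schema is not yet defined. The two files are shown as two YAML documents purely for illustration.

```yaml
# values.yaml: base defaults shared by every environment (hypothetical keys)
replicaCount: 1
image:
  repository: registry.example.com/example-backend   # placeholder repository
  tag: "0.0.0"
ingress:
  enabled: false
---
# env/prod.yaml: overlay that lists only the keys that differ from the base
replicaCount: 3
ingress:
  enabled: true
```

With both files applied in that order, the release would use `replicaCount: 3` and `ingress.enabled: true` while keeping the `image` block from the base file.
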
## Namespace Strategy

Use helpers rather than hardcoded namespaces, so that any fallback logic (for example, defaulting to `.Release.Namespace`) lives in one place:

```yaml
{{ include "myapp.namespace" . }}
```

Avoid:

```yaml
{{ .Values.namespace }}
```

## Secrets Progression

| Phase       | Strategy                                      | Complexity |
| ----------- | --------------------------------------------- | ---------- |
| **Phase 1** | `.env.ecosystem` file (gitignored)            | Trivial    |
| **Phase 2** | Native Kubernetes `Secret` objects            | Low        |
| **Phase 3** | Azure Key Vault via CSI `SecretProviderClass` | Medium     |
| **Phase 4** | AKV + operator/CRD auto-sync model            | High       |

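As an illustration of the Phase 2 row, a native `Secret` and the container fragment that consumes it could look like the sketch below. The Secret name `ecosystem-env`, the key `EXAMPLE_API_KEY`, and the image reference are hypothetical placeholders, not names taken from `.env.ecosystem`.

```yaml
# Native Secret holding values that previously lived in .env.ecosystem
apiVersion: v1
kind: Secret
metadata:
  name: ecosystem-env                   # placeholder name
type: Opaque
stringData:
  EXAMPLE_API_KEY: replace-me           # placeholder key and value
---
# Fragment of a pod template: expose every key of the Secret as an environment variable
containers:
  - name: app
    image: registry.example.com/example-backend:0.0.0   # placeholder image
    envFrom:
      - secretRef:
          name: ecosystem-env
```

Secret values are stored base64-encoded rather than encrypted by default, so manifests containing real values should stay out of version control just like `.env.ecosystem`.
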
## CI/CD Expectations

| Practice             | Expectation                                                     |
| -------------------- | --------------------------------------------------------------- |
| **Semantic release** | keep `feat:` / `fix:` conventions usable for release automation |
| **Image promotion**  | build once, promote later; do not rebuild for prod              |
| **Branch pipelines** | branch-specific quality and deploy stages                       |
| **Security gates**   | SAST/SCA in pipeline                                            |
| **Quality gates**    | tests, coverage, type safety, build verification                |
| **Chart versioning** | publish/version charts independently                            |

## Local K8s Deployment Workflow Shape

A future local K8s script should do the following:

1. detect Docker Desktop K8s vs K3s
2. build required images
3. load/import images into the local cluster runtime when needed
4. create namespace
5. create secrets from `.env.ecosystem`
6. deploy Helm chart with local overlay
7. wait for rollout
8. print verification commands and port-forward hints

## Recommended Next Items

### Next now

- run full Docker Compose ecosystem validation end-to-end
- capture blockers by service
- decide whether K8s phase starts with Docker Desktop K8s or K3s first

### Next after Compose validation

- define `helm/bytelyst-ecosystem/` layout
- define namespace and secret model
- draft minimal shared-service-first Kubernetes manifests or chart values
- create local K8s deploy helper script

### Hold for later

- full Helm/K3s implementation across the ecosystem
- managed cluster rollout details
- advanced autoscaling and production ingress hardening

## Quick Reference

| Practice                     | Compose                                 | Local K8s | Prod K8s |
| ---------------------------- | --------------------------------------- | --------- | -------- |
| Zero-downtime rolling update | N/A                                     | Apply     | Apply    |
| Pod security context         | N/A                                     | Apply     | Apply    |
| Health probes                | use Docker `healthcheck` where relevant | Apply     | Apply    |
| SSE/WebSocket ingress tuning | N/A                                     | If needed | Apply    |
| HPA v2                       | N/A                                     | Optional  | Apply    |
| Exec-form entrypoint         | Apply now                               | Apply     | Apply    |
| Selective COPY               | Apply now                               | Apply     | Apply    |
| Non-root user                | Apply now                               | Apply     | Apply    |
| Values layering              | N/A                                     | Apply     | Apply    |
| AKV CSI                      | N/A                                     | N/A       | Apply    |
| Image promotion              | N/A                                     | N/A       | Apply    |

## Status

- standalone Kubernetes roadmap: **created**
- Compose source-of-truth split: **done**
- Helm/K3s implementation: **held pending validation**