learning_ai_common_plat/docs/devops
saravanakumardb1 8a568932b4 feat(infra): add production-grade k3s Kubernetes setup for single VM
Complete K8s deployment alternative to Docker Compose, targeting
~50 beta users on a Standard_D8s_v5 Azure VM (8 vCPU, 32 GB RAM).

setup-k8s.sh (6 phases):
  1. Pre-flight: verify Docker phases 1-5 ran, disk/RAM checks
  2. Install k3s: Docker runtime, NodePort range 1024-32767
  3. Build images: docker compose build + tag as bytelyst/<svc>
  4. Config: namespaces, ConfigMap (3 copies), Secrets (JWT + blob keys), Ollama
  5. Deploy: infra -> platform -> dashboards -> products (ordered)
  6. Health check: 32 endpoints + kubectl pod status
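
The ordered rollout in phase 5 could be sketched roughly as below. This is an illustration only, not the contents of setup-k8s.sh: the `k8s/<tier>/` directory layout, the function name, and the timeout are assumptions. Only the tier order and the namespace names come from the description above.

```python
# Hypothetical sketch of phase 5; the manifest directory layout is assumed.
DEPLOY_ORDER = ["infra", "platform", "dashboards", "products"]


def deploy_commands(manifest_root="k8s", timeout="300s"):
    """Return the kubectl invocations, tier by tier, without running them."""
    commands = []
    for tier in DEPLOY_ORDER:
        # Apply every manifest for this tier ...
        commands.append(["kubectl", "apply", "-f", f"{manifest_root}/{tier}/"])
        # ... then wait for its pods to become Ready before the next tier starts.
        commands.append([
            "kubectl", "wait", "--for=condition=Ready", "pods", "--all",
            "-n", tier, f"--timeout={timeout}",
        ])
    return commands


if __name__ == "__main__":
    for cmd in deploy_commands():
        print(" ".join(cmd))
```

Waiting on each tier before applying the next is one way to keep platform services from starting before the infra they depend on is reachable.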

K8s manifests (18 files):
  - 4 namespaces (infra, platform, dashboards, products)
  - 6 infra (cosmos StatefulSet+PVC, azurite StatefulSet+PVC,
    mailpit, loki StatefulSet+PVC, grafana+PVC, ollama external)
  - 3 platform (Deployment+Service+NodePort each)
  - 2 dashboards (Deployment+Service+NodePort each)
  - 10 backends + 9 webs (all with readiness+liveness probes,
    resource limits, product-specific NEXT_PUBLIC_* env vars)
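
As a rough picture of what one backend manifest with probes and limits might contain, the sketch below builds a Deployment as a Python dict and prints it as JSON (which `kubectl apply -f` also accepts). The service name, port, `/health` path, probe timings, and limit values are all placeholders, not values taken from the actual manifests. Only the `bytelyst/<svc>` image naming and the `products` namespace come from the description above.

```python
import json


def backend_deployment(svc, port, namespace="products",
                       health_path="/health", cpu="500m", memory="512Mi"):
    """Build a Deployment dict with readiness/liveness probes and limits.

    All defaults here are illustrative assumptions.
    """
    probe = {"httpGet": {"path": health_path, "port": port}}
    container = {
        "name": svc,
        "image": f"bytelyst/{svc}",
        # Assumption: images exist only in the local Docker daemon, so the
        # kubelet should not try to pull an untagged (:latest) image.
        "imagePullPolicy": "IfNotPresent",
        "ports": [{"containerPort": port}],
        "readinessProbe": {**probe, "periodSeconds": 10},
        "livenessProbe": {**probe, "initialDelaySeconds": 30,
                          "periodSeconds": 20},
        "resources": {"limits": {"cpu": cpu, "memory": memory}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": svc, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": svc}},
            "template": {
                "metadata": {"labels": {"app": svc}},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    # "example-backend" and port 4000 are hypothetical.
    print(json.dumps(backend_deployment("example-backend", 4000), indent=2))
```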

Design decisions:
  - k3s --docker: reuses existing Docker images, no containerd import
  - Same ports as Docker Compose (NodePort with extended range)
  - ConfigMap replaces .env.ecosystem, copied to 3 app namespaces
  - Blob storage keys injected at deploy time via Secret (not in YAML)
  - Cross-namespace DNS: <svc>.<ns>.svc for service discovery
  - Ollama as Endpoints+Service pointing to host node IP
  - Resource limits: ~19 Gi total, fits in 32 GB with 13 GB headroom
  - Teardown: --teardown flag deletes namespaces, keeps k3s
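
Two of the decisions above reduce to simple rules that a short sketch can make concrete: the `<svc>.<ns>.svc` naming used for cross-namespace discovery, and the headroom arithmetic behind the resource limits. The helper names and the example port are hypothetical, and the headroom check treats Gi and GB as interchangeable, which is only approximately true.

```python
def service_dns(service, namespace):
    """Cross-namespace name in the <svc>.<ns>.svc form described above."""
    return f"{service}.{namespace}.svc"


def service_url(service, namespace, port, scheme="http"):
    """Convenience wrapper; the port and scheme are caller-supplied."""
    return f"{scheme}://{service_dns(service, namespace)}:{port}"


def headroom_gb(vm_ram_gb, limits_total_gb):
    """Memory left over once every container hits its limit."""
    return vm_ram_gb - limits_total_gb


if __name__ == "__main__":
    # mailpit lives in the infra namespace; port 8080 is a placeholder.
    print(service_url("mailpit", "infra", 8080))
    # 32 GB VM minus roughly 19 Gi of limits leaves about 13 GB.
    print(headroom_gb(32, 19))
```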
2026-03-24 14:47:17 -07:00
single_azure_vm feat(infra): add production-grade k3s Kubernetes setup for single VM 2026-03-24 14:47:17 -07:00
AZURE_KEY_VAULT_AND_SECRETS_ROTATION.md docs(devops): update stale mywisprai/MyWisprAI branding across 5 AKV docs 2026-03-21 09:15:30 -07:00
AZURE_PORTAL_SETUP.md docs(devops): update stale mywisprai/MyWisprAI branding across 5 AKV docs 2026-03-21 09:15:30 -07:00
AZURE_RESOURCE_INVENTORY.md docs(devops): update stale mywisprai/MyWisprAI branding across 5 AKV docs 2026-03-21 09:15:30 -07:00
CODING_AGENT_AUTOMATION_PLAYBOOK.md docs(devops): refresh backend audit baseline 2026-03-21 10:45:55 -07:00
ECOSYSTEM_DEPLOYMENT.md docs(infra): add complete CLI reference, examples, and phase docs to setup.sh + ECOSYSTEM_DEPLOYMENT.md 2026-03-24 12:24:16 -07:00
END_TO_END_ENCRYPTION_DESIGN.md docs(devops): fix 14 bugs/gaps in E2EE design + roadmap after codebase review 2026-03-21 09:01:35 -07:00
END_TO_END_ENCRYPTION_ROADMAP.md docs(e2ee): detailed SQLCipher + AKV implementation plan for LocalMemGPT Sprint 5.4 2026-03-21 13:39:01 -07:00
ENVIRONMENT_VARIABLES_AND_KEYVAULT_AUDIT.md docs(devops): update stale mywisprai/MyWisprAI branding across 5 AKV docs 2026-03-21 09:15:30 -07:00
GITEA_LOCAL_CI.md fix(ci): prefer ipv4 for local runner registration 2026-03-23 19:39:11 -07:00
GITEA_NPM_REGISTRY_MIGRATION.md docs: clean stale sections in GITEA_NPM_REGISTRY_MIGRATION.md 2026-03-24 08:44:29 -07:00
KUBERNETES_ROADMAP.md fix(docs): restore valid helm template examples 2026-03-23 18:16:01 -07:00
RAILWAY_DEPLOYMENT_RUNBOOK.md chore(devops): improve railway deploy script, add env sync and deployment runbook 2026-03-05 20:03:59 -08:00
REMOTE_DIAGNOSTICS_ROADMAP.md docs(roadmap): mark Phase 3.2 Session Detail View complete 2026-03-03 09:48:15 -08:00
SINGLE_VM_DEPLOYMENT.md docs: remove versioning refs and stale transition language from deployment docs 2026-03-24 08:10:17 -07:00
SINGLE_VM_ENHANCED_PLAN.md docs: remove versioning refs and stale transition language from deployment docs 2026-03-24 08:10:17 -07:00
USER_ISSUE_REPORTING_ROADMAP.md docs(feedback): mark all TODOs as completed in roadmap 2026-03-03 07:20:56 -08:00