diff --git a/docs/devops/single_azure_vm/README.md b/docs/devops/single_azure_vm/README.md index 1e5d3a82..cdfab326 100644 --- a/docs/devops/single_azure_vm/README.md +++ b/docs/devops/single_azure_vm/README.md @@ -1,188 +1,60 @@ # ByteLyst Single-VM Deployment -> Deploy the **entire ByteLyst ecosystem** (30 services, 10 products) on a single **raw** Azure VM. -> Nothing pre-installed required — the script handles everything from a blank Ubuntu machine. -> Two files: this README and `setup.sh`. Copy both to the VM and run the script. +> Deploy the **entire ByteLyst ecosystem** (30 services, 10 products) on a single Azure VM. +> Two orchestration approaches — pick one or learn both side by side. --- -## Prerequisites +## Approaches -- **Azure VM:** Ubuntu 24.04 LTS (or 22.04), **Standard_D8s_v5** (8 vCPU, 32 GB RAM) recommended -- **Disk:** 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts) -- **Network:** NSG allowing inbound on ports listed in the Port Map below -- **GitHub access:** Repos must be accessible (public or `GITHUB_TOKEN` for private) -- **Nothing else needed** — the script installs Docker, Node.js, pnpm, Gitea, Ollama, and everything +### [`docker/`](docker/) — Docker Compose (Production-ready) -## Quick Start +Proven, battle-tested deployment using `docker-compose.ecosystem.yml`. +Installs everything from scratch on a raw Ubuntu VM in ~20 minutes. ```bash -# 1. SSH into your Azure VM -ssh azureuser@ - -# 2. Copy setup.sh and make executable -chmod +x setup.sh - -# 3. Run — provide your GitHub username (repos are cloned via HTTPS) -# If repos are private, also export GITHUB_TOKEN first. -sudo ./setup.sh - -# 4. Wait ~15-25 minutes for full build + deploy - -# 5. Verify -/opt/bytelyst/check-health.sh +sudo ./docker/setup.sh # Full install +sudo ./docker/setup.sh --resume # Resume after disconnect +/opt/bytelyst/check-health.sh # Verify all 30 services ``` -### Resume & Retry +**Use this if:** You want reliable deployment now. 
-Phase completion is tracked. If anything fails, you don't have to start over: +### [`k8s/`](k8s/) — Kubernetes via k3s (Learning / Future-ready) -```bash -sudo ./setup.sh --phase=7 # Retry just the deploy phase -sudo ./setup.sh --resume # Auto-resume after SSH disconnect -sudo ./setup.sh --resume-from=7 # Jump to deploy after manual fix -sudo ./setup.sh --status # Check what's done -sudo ./setup.sh --reset # Start completely over -sudo ./setup.sh --help # Show full usage +Same 30 services orchestrated by Kubernetes on a single VM using k3s. +Builds on the same Docker images — no Dockerfile changes needed. + +**Use this if:** You want to learn K8s with real services, practice `kubectl`, +and prepare for multi-node scaling later. + +--- + +## Architecture (shared by both approaches) + +``` +Raw Ubuntu 24.04 VM (Standard_D8s_v5: 8 vCPU, 32 GB RAM) +├── Ollama (systemd, :11434) ─── local LLM inference +├── Gitea (Docker/:3300) ──────── npm package registry +└── 30 Services + ├── Infrastructure (6): cosmos-emulator, azurite, mailpit, loki, grafana, traefik + ├── Platform (3): platform-service, extraction-service, mcp-server + ├── Dashboards (2): admin-web, tracker-web + ├── Backends (10): peakpulse, chronomind, jarvisjr, nomgap, mindlyst, + │ lysnrai, notelett, flowmonk, actiontrail, localmemgpt + └── Web Apps (9): lysnrai-dashboard, chronomind-web, jarvisjr-web, flowmonk-web, + notelett-web, mindlyst-web, nomgap-web, actiontrail-web, localmemgpt-web ``` -## What the Script Installs & Does +## Comparison -### Software installed on the VM (from scratch) - -| Software | Version | Purpose | -|----------|---------|----------| -| **Docker CE** | latest | Container runtime + Compose + BuildKit | -| **Node.js** | 22 LTS | Build toolchain for TypeScript packages | -| **pnpm** | 10.6.5 | Package manager (workspace-aware) | -| **Gitea** | 1.22 (Docker) | Local npm package registry on `:3300` | -| **Ollama** | latest | Local LLM inference for LocalMemGPT on `:11434` | -| **git, 
jq, curl** | latest | System utilities | - -### Execution phases - -| Phase | Duration | Description | -|-------|----------|-------------| -| 1. System | ~3 min | Pre-flight checks (disk ≥40 GB, RAM ≥16 GB), install Docker, Node.js 22, pnpm 10.6.5, Ollama, git, jq, build-essential | -| 2. Gitea | ~1 min | Start Gitea Docker container, create admin + org + API token | -| 3. Clone | ~3 min | Clone all 11 repos to `/opt/bytelyst/` | -| 4. Build | ~5 min | `pnpm install && pnpm -r build` all `@bytelyst/*` packages | -| 5. Publish | ~3 min | Publish all packages to local Gitea npm registry | -| 6. Env | instant | Generate `.env.ecosystem` with Cosmos emulator key, Azurite key, JWT secret | -| 7. Deploy | ~10 min | Stop Ollama (free RAM), per-service Docker build + deploy (30 services, with fallback), prune build cache, restart Ollama | -| 8. Verify | ~1 min | Health-check all 30+ endpoints + create `/opt/bytelyst/check-health.sh` | - -## Port Map (after deployment) - -### Infrastructure (installed by setup.sh) -| Service | Port | URL | -|---------|------|-----| -| Gitea (npm registry) | 3300 | `http://:3300` | -| Ollama (LLM API) | 11434 | `http://:11434` | -| Cosmos Data Explorer | 1234 | `http://:1234` | -| Azurite (Blob) | 10000 | `http://:10000` | -| Mailpit UI | 8025 | `http://:8025` | -| Loki (Logs) | 3100 | `http://:3100/ready` | -| Grafana | 3000 | `http://:3000` | -| Traefik Dashboard | 8080 | `http://:8080` | - -### Platform Services -| Service | Port | URL | -|---------|------|-----| -| platform-service | 4003 | `http://:4003/health` | -| extraction-service | 4005 | `http://:4005/health` | -| mcp-server | 4007 | `http://:4007/health` | - -### Platform Dashboards -| Dashboard | Port | URL | -|-----------|------|-----| -| Admin Console | 3001 | `http://:3001` | -| Issue Tracker | 3003 | `http://:3003` | - -### Product Backends -| Product | Port | Health | -|---------|------|--------| -| PeakPulse | 4010 | `http://:4010/health` | -| ChronoMind | 4011 | 
`http://:4011/health` | -| JarvisJr | 4012 | `http://:4012/health` | -| NomGap | 4013 | `http://:4013/health` | -| MindLyst | 4014 | `http://:4014/health` | -| LysnrAI | 4015 | `http://:4015/health` | -| NoteLett | 4016 | `http://:4016/health` | -| FlowMonk | 4017 | `http://:4017/health` | -| ActionTrail | 4018 | `http://:4018/health` | -| LocalMemGPT | 4019 | `http://:4019/health` | - -### Product Web Apps -| Product | Port | URL | -|---------|------|-----| -| LysnrAI Dashboard | 3002 | `http://:3002` | -| ChronoMind | 3030 | `http://:3030` | -| JarvisJr | 3035 | `http://:3035` | -| FlowMonk | 3040 | `http://:3040` | -| NoteLett | 3045 | `http://:3045` | -| MindLyst | 3050 | `http://:3050` | -| NomGap | 3055 | `http://:3055` | -| ActionTrail | 3060 | `http://:3060` | -| LocalMemGPT | 3070 | `http://:3070` | - -## Post-Deployment Commands - -```bash -# Check all service health -/opt/bytelyst/check-health.sh - -# View logs for a specific service -docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \ - logs -f platform-service - -# Restart a specific service -docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \ - restart flowmonk-backend - -# Stop everything -docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down - -# Stop and wipe all data -docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down -v -``` - -## Environment Variables - -All optional — defaults work for most setups: - -| Variable | Default | Description | -|----------|---------|-------------| -| `GITHUB_USER` | `saravanakumardb1` | GitHub org/user to clone repos from | -| `GITHUB_TOKEN` | (empty) | Set for private repos (HTTPS auth) | -| `GITEA_ADMIN` | `bytelyst-admin` | Gitea admin username | -| `GITEA_PASS` | `ByteLyst2026!` | Gitea admin password | -| `OLLAMA_MODEL` | `llama3.2:3b` | Default LLM model to pull | -| `SKIP_CLONE` | `0` | Set `1` to skip cloning 
(re-runs) | -| `SKIP_BUILD` | `0` | Set `1` to skip package build+publish (re-runs) | - -## CLI Flags - -| Flag | Description | -|------|-------------| -| `--resume` | Auto-resume from last completed phase | -| `--resume-from=N` | Resume from phase N (1-8) | -| `--phase=N` | Run ONLY phase N (useful for retrying) | -| `--reset` | Clear phase markers and start fresh | -| `--status` | Show completed phases and exit | -| `-h`, `--help` | Show usage help | - -## Troubleshooting - -- **Cosmos emulator slow:** It needs 20-30s on first boot. Services wait via health checks. -- **Out of memory:** Use at least 32 GB RAM. Cosmos emulator needs ~4 GB, Ollama needs ~4 GB for 3B models. -- **Build failures:** Check Gitea is running (`docker ps | grep gitea`) and packages published (`curl http://localhost:3300/api/packages/bytelyst/npm/`). Per-service build logs: `/opt/bytelyst/.setup-state/builds/.log`. Retry: `sudo ./setup.sh --phase=7`. -- **Ollama not responding:** Check `systemctl status ollama` or `curl http://localhost:11434/api/version`. -- **Port conflicts:** Ensure nothing else runs on the listed ports before deploying. - -## Known Limitations - -- **Remote browser access:** Product web apps fall back to `http://localhost:` for API calls. This works when browsing from the VM itself but **not from a remote browser** (e.g., laptop accessing `http://:3060`). For remote access, set up a reverse proxy (Traefik rules) or SSH port-forwarding. Health checks and server-side rendering still work regardless. -- **Cosmos emulator is x86-only:** Do not use ARM-based VMs (e.g., Dpsv6). Stick with `Standard_D8s_v5` or similar Intel/AMD instances. -- **Memory pressure:** Phase 7 automatically stops Ollama (~3 GB) during Docker builds and restarts it after. If builds still OOM on 32 GB, retry with `sudo ./setup.sh --phase=7` (per-service fallback skips what already built). -- **Corporate proxy in Dockerfiles:** Already removed at source across all repos. No runtime stripping needed. 
+| | Docker Compose | K8s (k3s) | +|--|----------------|-----------| +| **Setup time** | ~20 min | ~30 min | +| **RAM overhead** | ~100 MB | ~600 MB | +| **Config files** | 1 compose + 1 .env | ~30 manifests (or Helm) | +| **Scaling** | Manual | `kubectl scale` / HPA | +| **Rolling updates** | Restart-based | Zero-downtime | +| **Resource limits** | Basic | Fine-grained per pod | +| **Multi-VM ready** | Docker Swarm | Native (join more nodes with `k3s agent`) | +| **Learning value** | Low | High (transferable to AKS/EKS/GKE) | diff --git a/docs/devops/single_azure_vm/docker/README.md b/docs/devops/single_azure_vm/docker/README.md new file mode 100644 index 00000000..1e5d3a82 --- /dev/null +++ b/docs/devops/single_azure_vm/docker/README.md @@ -0,0 +1,188 @@ +# ByteLyst Single-VM Deployment + +> Deploy the **entire ByteLyst ecosystem** (30 services, 10 products) on a single **raw** Azure VM. +> Nothing pre-installed required: the script handles everything from a blank Ubuntu machine. +> Two files: this README and `setup.sh`. Copy both to the VM and run the script. + +--- + +## Prerequisites + +- **Azure VM:** Ubuntu 24.04 LTS (or 22.04), **Standard_D8s_v5** (8 vCPU, 32 GB RAM) recommended +- **Disk:** 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts) +- **Network:** NSG allowing inbound on ports listed in the Port Map below +- **GitHub access:** Repos must be accessible (public or `GITHUB_TOKEN` for private) +- **Nothing else needed:** the script installs Docker, Node.js, pnpm, Gitea, Ollama, and everything + +## Quick Start + +```bash +# 1. SSH into your Azure VM +ssh azureuser@ + +# 2. Copy setup.sh and make executable +chmod +x setup.sh + +# 3. Run: provide your GitHub username (repos are cloned via HTTPS) +# If repos are private, also export GITHUB_TOKEN first. +sudo ./setup.sh + +# 4. Wait ~15-25 minutes for full build + deploy + +# 5. Verify +/opt/bytelyst/check-health.sh +``` + +### Resume & Retry + +Phase completion is tracked. 
If anything fails, you don't have to start over: + +```bash +sudo ./setup.sh --phase=7 # Retry just the deploy phase +sudo ./setup.sh --resume # Auto-resume after SSH disconnect +sudo ./setup.sh --resume-from=7 # Jump to deploy after manual fix +sudo ./setup.sh --status # Check what's done +sudo ./setup.sh --reset # Start completely over +sudo ./setup.sh --help # Show full usage +``` + +## What the Script Installs & Does + +### Software installed on the VM (from scratch) + +| Software | Version | Purpose | +|----------|---------|----------| +| **Docker CE** | latest | Container runtime + Compose + BuildKit | +| **Node.js** | 22 LTS | Build toolchain for TypeScript packages | +| **pnpm** | 10.6.5 | Package manager (workspace-aware) | +| **Gitea** | 1.22 (Docker) | Local npm package registry on `:3300` | +| **Ollama** | latest | Local LLM inference for LocalMemGPT on `:11434` | +| **git, jq, curl** | latest | System utilities | + +### Execution phases + +| Phase | Duration | Description | +|-------|----------|-------------| +| 1. System | ~3 min | Pre-flight checks (disk ≥40 GB, RAM ≥16 GB), install Docker, Node.js 22, pnpm 10.6.5, Ollama, git, jq, build-essential | +| 2. Gitea | ~1 min | Start Gitea Docker container, create admin + org + API token | +| 3. Clone | ~3 min | Clone all 11 repos to `/opt/bytelyst/` | +| 4. Build | ~5 min | `pnpm install && pnpm -r build` all `@bytelyst/*` packages | +| 5. Publish | ~3 min | Publish all packages to local Gitea npm registry | +| 6. Env | instant | Generate `.env.ecosystem` with Cosmos emulator key, Azurite key, JWT secret | +| 7. Deploy | ~10 min | Stop Ollama (free RAM), per-service Docker build + deploy (30 services, with fallback), prune build cache, restart Ollama | +| 8. 
Verify | ~1 min | Health-check all 30+ endpoints + create `/opt/bytelyst/check-health.sh` | + +## Port Map (after deployment) + +### Infrastructure (installed by setup.sh) +| Service | Port | URL | +|---------|------|-----| +| Gitea (npm registry) | 3300 | `http://:3300` | +| Ollama (LLM API) | 11434 | `http://:11434` | +| Cosmos Data Explorer | 1234 | `http://:1234` | +| Azurite (Blob) | 10000 | `http://:10000` | +| Mailpit UI | 8025 | `http://:8025` | +| Loki (Logs) | 3100 | `http://:3100/ready` | +| Grafana | 3000 | `http://:3000` | +| Traefik Dashboard | 8080 | `http://:8080` | + +### Platform Services +| Service | Port | URL | +|---------|------|-----| +| platform-service | 4003 | `http://:4003/health` | +| extraction-service | 4005 | `http://:4005/health` | +| mcp-server | 4007 | `http://:4007/health` | + +### Platform Dashboards +| Dashboard | Port | URL | +|-----------|------|-----| +| Admin Console | 3001 | `http://:3001` | +| Issue Tracker | 3003 | `http://:3003` | + +### Product Backends +| Product | Port | Health | +|---------|------|--------| +| PeakPulse | 4010 | `http://:4010/health` | +| ChronoMind | 4011 | `http://:4011/health` | +| JarvisJr | 4012 | `http://:4012/health` | +| NomGap | 4013 | `http://:4013/health` | +| MindLyst | 4014 | `http://:4014/health` | +| LysnrAI | 4015 | `http://:4015/health` | +| NoteLett | 4016 | `http://:4016/health` | +| FlowMonk | 4017 | `http://:4017/health` | +| ActionTrail | 4018 | `http://:4018/health` | +| LocalMemGPT | 4019 | `http://:4019/health` | + +### Product Web Apps +| Product | Port | URL | +|---------|------|-----| +| LysnrAI Dashboard | 3002 | `http://:3002` | +| ChronoMind | 3030 | `http://:3030` | +| JarvisJr | 3035 | `http://:3035` | +| FlowMonk | 3040 | `http://:3040` | +| NoteLett | 3045 | `http://:3045` | +| MindLyst | 3050 | `http://:3050` | +| NomGap | 3055 | `http://:3055` | +| ActionTrail | 3060 | `http://:3060` | +| LocalMemGPT | 3070 | `http://:3070` | + +## Post-Deployment Commands + 
+```bash +# Check all service health +/opt/bytelyst/check-health.sh + +# View logs for a specific service +docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \ + logs -f platform-service + +# Restart a specific service +docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \ + restart flowmonk-backend + +# Stop everything +docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down + +# Stop and wipe all data +docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down -v +``` + +## Environment Variables + +All optional — defaults work for most setups: + +| Variable | Default | Description | +|----------|---------|-------------| +| `GITHUB_USER` | `saravanakumardb1` | GitHub org/user to clone repos from | +| `GITHUB_TOKEN` | (empty) | Set for private repos (HTTPS auth) | +| `GITEA_ADMIN` | `bytelyst-admin` | Gitea admin username | +| `GITEA_PASS` | `ByteLyst2026!` | Gitea admin password | +| `OLLAMA_MODEL` | `llama3.2:3b` | Default LLM model to pull | +| `SKIP_CLONE` | `0` | Set `1` to skip cloning (re-runs) | +| `SKIP_BUILD` | `0` | Set `1` to skip package build+publish (re-runs) | + +## CLI Flags + +| Flag | Description | +|------|-------------| +| `--resume` | Auto-resume from last completed phase | +| `--resume-from=N` | Resume from phase N (1-8) | +| `--phase=N` | Run ONLY phase N (useful for retrying) | +| `--reset` | Clear phase markers and start fresh | +| `--status` | Show completed phases and exit | +| `-h`, `--help` | Show usage help | + +## Troubleshooting + +- **Cosmos emulator slow:** It needs 20-30s on first boot. Services wait via health checks. +- **Out of memory:** Use at least 32 GB RAM. Cosmos emulator needs ~4 GB, Ollama needs ~4 GB for 3B models. +- **Build failures:** Check Gitea is running (`docker ps | grep gitea`) and packages published (`curl http://localhost:3300/api/packages/bytelyst/npm/`). 
Per-service build logs: `/opt/bytelyst/.setup-state/builds/.log`. Retry: `sudo ./setup.sh --phase=7`. +- **Ollama not responding:** Check `systemctl status ollama` or `curl http://localhost:11434/api/version`. +- **Port conflicts:** Ensure nothing else runs on the listed ports before deploying. + +## Known Limitations + +- **Remote browser access:** Product web apps fall back to `http://localhost:` for API calls. This works when browsing from the VM itself but **not from a remote browser** (e.g., laptop accessing `http://:3060`). For remote access, set up a reverse proxy (Traefik rules) or SSH port-forwarding. Health checks and server-side rendering still work regardless. +- **Cosmos emulator is x86-only:** Do not use ARM-based VMs (e.g., Dpsv6). Stick with `Standard_D8s_v5` or similar Intel/AMD instances. +- **Memory pressure:** Phase 7 automatically stops Ollama (~3 GB) during Docker builds and restarts it after. If builds still OOM on 32 GB, retry with `sudo ./setup.sh --phase=7` (per-service fallback skips what already built). +- **Corporate proxy in Dockerfiles:** Already removed at source across all repos. No runtime stripping needed. 
diff --git a/docs/devops/single_azure_vm/prompt.md b/docs/devops/single_azure_vm/docker/prompt.md similarity index 100% rename from docs/devops/single_azure_vm/prompt.md rename to docs/devops/single_azure_vm/docker/prompt.md diff --git a/docs/devops/single_azure_vm/setup.sh b/docs/devops/single_azure_vm/docker/setup.sh similarity index 100% rename from docs/devops/single_azure_vm/setup.sh rename to docs/devops/single_azure_vm/docker/setup.sh diff --git a/docs/devops/single_azure_vm/k8s/README.md b/docs/devops/single_azure_vm/k8s/README.md new file mode 100644 index 00000000..cbee6174 --- /dev/null +++ b/docs/devops/single_azure_vm/k8s/README.md @@ -0,0 +1,310 @@ +# ByteLyst Single-VM Kubernetes Deployment (k3s) + +> Deploy the ByteLyst ecosystem on Kubernetes using **k3s**, a lightweight, certified K8s distribution +> that runs on a single VM with ~512 MB overhead. + +**Status:** Planning; see design decisions below. + +--- + +## Prerequisites + +Same VM as the Docker Compose approach: +- **Azure VM:** Ubuntu 24.04 LTS, **Standard_D8s_v5** (8 vCPU, 32 GB RAM) +- **Disk:** 128 GB+ +- **Docker images:** Built by `docker/setup.sh` phases 1-7 (the per-service image build happens in phase 7; images are reused, not rebuilt) + +## Why k3s? 
+ +| Feature | k3s | minikube | kind | microk8s | +|---------|-----|----------|------|----------| +| RAM overhead | ~512 MB | ~2 GB | ~1 GB | ~800 MB | +| Production-grade | Yes (CNCF certified) | No | No | Yes | +| Built-in Traefik | Yes | No | No | Optional | +| Single binary | Yes | No | No | No (snap) | +| SQLite backend | Yes (no etcd needed) | N/A | N/A | Dqlite | + +## Architecture + +``` +Ubuntu 24.04 VM +├── k3s (single-node cluster) +│ ├── kube-system namespace +│ │ ├── CoreDNS +│ │ ├── Traefik Ingress Controller +│ │ ├── Local Path Provisioner +│ │ └── Metrics Server +│ │ +│ ├── bytelyst-infra namespace +│ │ ├── cosmos-emulator (StatefulSet + PVC) +│ │ ├── azurite (StatefulSet + PVC) +│ │ ├── mailpit (Deployment) +│ │ ├── loki (StatefulSet + PVC) +│ │ └── grafana (Deployment + PVC) +│ │ +│ ├── bytelyst-platform namespace +│ │ ├── platform-service (Deployment, replicas: 1) +│ │ ├── extraction-service (Deployment, replicas: 1) +│ │ └── mcp-server (Deployment, replicas: 1) +│ │ +│ ├── bytelyst-dashboards namespace +│ │ ├── admin-web (Deployment, replicas: 1) +│ │ └── tracker-web (Deployment, replicas: 1) +│ │ +│ └── bytelyst-products namespace +│ ├── *-backend (10 Deployments) +│ └── *-web (9 Deployments) +│ +├── Ollama (systemd, host network — :11434) +└── Gitea (Docker container — :3300, used for build-time only) +``` + +## Manifest Structure (planned) + +``` +k8s/ +├── README.md # This file +├── setup-k8s.sh # Bootstrap script (installs k3s, applies manifests) +├── namespaces.yaml # 4 namespaces +├── config/ +│ ├── configmap.yaml # Shared env vars (replaces .env.ecosystem) +│ └── secrets.yaml # JWT_SECRET, COSMOS_KEY, etc. 
+├── infra/ +│ ├── cosmos-emulator.yaml # StatefulSet + Service + PVC +│ ├── azurite.yaml # StatefulSet + Service + PVC +│ ├── mailpit.yaml # Deployment + Service +│ ├── loki.yaml # StatefulSet + Service + PVC +│ └── grafana.yaml # Deployment + Service + PVC +├── platform/ +│ ├── platform-service.yaml # Deployment + Service +│ ├── extraction-service.yaml # Deployment + Service +│ └── mcp-server.yaml # Deployment + Service +├── dashboards/ +│ ├── admin-web.yaml # Deployment + Service +│ └── tracker-web.yaml # Deployment + Service +├── products/ +│ ├── _backend-template.yaml # Helm-like template (for reference) +│ ├── peakpulse-backend.yaml +│ ├── chronomind-backend.yaml +│ ├── ... (8 more backends) +│ ├── lysnrai-dashboard.yaml +│ ├── chronomind-web.yaml +│ └── ... (7 more web apps) +└── ingress/ + └── ingress.yaml # Traefik IngressRoute rules +``` + +## Key Design Decisions + +### 1. Image Source: Import from Docker + +k3s uses containerd, not Docker. We import the Docker-built images: + +```bash +# Build images with Docker (phases 1-7 from docker/setup.sh) +docker save platform-service:latest | k3s ctr images import - + +# Or build directly with nerdctl (k3s-native) +nerdctl build -t platform-service:latest -f services/platform-service/Dockerfile . +``` + +**Decision:** Import from Docker first (simpler), migrate to nerdctl later. + +### 2. Cosmos Emulator: StatefulSet with PVC + +The Cosmos emulator needs persistent storage and specific env vars. 
+Use a `StatefulSet` (not Deployment) for stable network identity: + +```yaml +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: cosmos-emulator + namespace: bytelyst-infra +spec: + replicas: 1 + serviceName: cosmos-emulator + selector: + matchLabels: + app: cosmos-emulator + template: + metadata: + labels: + app: cosmos-emulator + spec: + containers: + - name: cosmos + image: mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator:latest + ports: + - containerPort: 8081 + - containerPort: 1234 + env: + - name: AZURE_COSMOS_EMULATOR_ENABLE_DATA_PERSISTENCE + value: "true" + - name: ENABLE_EXPLORER + value: "true" + resources: + limits: + memory: "3Gi" + cpu: "2" + volumeClaimTemplates: + - metadata: + name: cosmos-data + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: 10Gi +``` + +### 3. Ollama: Host Network + +Ollama stays as a systemd service on the host. Pods reach it via `hostNetwork` +or a manually created Endpoints + Service pointing to the node IP: + +```yaml +apiVersion: v1 +kind: Service +metadata: + name: ollama + namespace: bytelyst-products +spec: + ports: + - port: 11434 +--- +apiVersion: v1 +kind: Endpoints +metadata: + name: ollama + namespace: bytelyst-products +subsets: +- addresses: + - ip: 172.17.0.1 # Placeholder (Docker's default bridge gateway); replace with the node's internal IP + ports: + - port: 11434 +``` + +### 4. ConfigMap replaces .env.ecosystem + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: bytelyst-config + namespace: bytelyst-platform +data: + COSMOS_ENDPOINT: "http://cosmos-emulator.bytelyst-infra.svc:8081" + COSMOS_DATABASE: "bytelyst" + DB_PROVIDER: "cosmos" + PLATFORM_SERVICE_URL: "http://platform-service.bytelyst-platform.svc:4003" + EXTRACTION_SERVICE_URL: "http://extraction-service.bytelyst-platform.svc:4005" +``` + +Note: K8s DNS uses the `<service>.<namespace>.svc` format for cross-namespace access. + +### 5. 
Secrets for sensitive values + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: bytelyst-secrets +type: Opaque +stringData: + JWT_SECRET: "" + COSMOS_KEY: "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==" + AZURE_BLOB_ACCOUNT_KEY: "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==" +``` + +### 6. Health Checks → Readiness/Liveness Probes + +Every backend gets K8s-native probes: + +```yaml +readinessProbe: + httpGet: + path: /health + port: 4003 + initialDelaySeconds: 15 + periodSeconds: 10 +livenessProbe: + httpGet: + path: /health + port: 4003 + initialDelaySeconds: 30 + periodSeconds: 30 +``` + +### 7. Resource Limits + +| Service type | CPU request | CPU limit | Memory request | Memory limit | +|-------------|------------|-----------|---------------|-------------| +| Backend | 100m | 500m | 256Mi | 512Mi | +| Web app | 100m | 500m | 256Mi | 512Mi | +| Platform service | 200m | 1000m | 384Mi | 768Mi | +| Cosmos emulator | 1000m | 2000m | 2Gi | 3Gi | +| Ollama | (host) | (host) | (host) | (host) | + +## Implementation Phases + +### Phase A: Foundation (Day 1) +- [ ] Install k3s on VM +- [ ] Create 4 namespaces +- [ ] Deploy ConfigMap + Secrets +- [ ] Deploy cosmos-emulator + azurite (StatefulSets) +- [ ] Verify: `kubectl get pods -A` shows infra running + +### Phase B: Platform (Day 1-2) +- [ ] Import platform-service Docker image +- [ ] Deploy platform-service (Deployment + Service) +- [ ] Verify: `kubectl exec` + `curl http://platform-service:4003/health` +- [ ] Deploy extraction-service + mcp-server +- [ ] Deploy admin-web + tracker-web + +### Phase C: Products (Day 2-3) +- [ ] Template: create one backend manifest, verify it works +- [ ] Replicate for all 10 backends +- [ ] Create web app manifests (9 services) +- [ ] Verify: all 30 services running + +### Phase D: Networking (Day 3) +- [ ] Set up Traefik IngressRoute for external access +- [ ] Configure NodePort services 
for direct port access +- [ ] Create Ollama external service endpoint +- [ ] Verify: health check script works against K8s services + +### Phase E: Operations (Day 4+) +- [ ] `kubectl scale deployment/flowmonk-backend --replicas=2 -n bytelyst-products`: test scaling +- [ ] `kubectl rollout restart deployment/platform-service -n bytelyst-platform`: test rolling update +- [ ] `kubectl top pods -A`: resource usage monitoring +- [ ] Set up HorizontalPodAutoscaler for one service +- [ ] Practice: `kubectl logs`, `kubectl exec`, `kubectl describe` + +## Useful Commands (cheat sheet) + +```bash +# Cluster status +kubectl get nodes +kubectl get pods -A # All namespaces +kubectl get pods -n bytelyst-products # Product namespace + +# Deploy / update +kubectl apply -R -f k8s/ # Apply all manifests (-R recurses into subdirectories) +kubectl apply -f k8s/products/ # Apply product manifests +kubectl rollout restart deployment/flowmonk-backend -n bytelyst-products + +# Debugging +kubectl logs deployment/platform-service -n bytelyst-platform -f +kubectl describe pod -n bytelyst-platform +kubectl exec -it deployment/platform-service -n bytelyst-platform -- sh + +# Scaling +kubectl scale deployment/flowmonk-backend --replicas=2 -n bytelyst-products +kubectl autoscale deployment/flowmonk-backend --min=1 --max=3 --cpu-percent=70 -n bytelyst-products + +# Resource monitoring +kubectl top pods -n bytelyst-products +kubectl top nodes +``` + +## Migration from Docker Compose + +Both approaches can be used on the same VM, one at a time: +1. `docker/setup.sh` builds and publishes packages (phases 1-5) and builds the Docker images (phase 7) +2. `docker compose down` stops the compose stack +3. `setup-k8s.sh` imports images into k3s and applies manifests +4. Both share the same Gitea registry and Ollama instance
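
Step 3 of the migration (importing the Docker-built images into k3s) could look roughly like the sketch below. It is not the planned `setup-k8s.sh`, only an illustration built on the `docker save ... | k3s ctr images import -` command from design decision 1. The `IMAGES` list, the `:latest` tags, and the `DRY_RUN` switch are assumptions made for this example; the real image names depend on what `docker/setup.sh` actually tags.

```bash
#!/usr/bin/env bash
# Sketch: copy locally built Docker images into k3s's containerd image store.
# Assumption: images are tagged <service>:latest, like the platform-service
# example in design decision 1. Adjust IMAGES to the tags that really exist.
set -eu
# Enable pipefail where the shell supports it, so a failed `docker save`
# is not masked by a successful import on the right side of the pipe.
if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi

# Hypothetical subset of services; extend to every image you want in the cluster.
IMAGES="platform-service extraction-service mcp-server"

# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 and run as root to perform the import.
DRY_RUN="${DRY_RUN:-1}"

import_image() {
  image="$1:latest"
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker save $image | k3s ctr images import -"
  else
    docker save "$image" | k3s ctr images import -
  fi
}

import_all() {
  for name in $IMAGES; do
    import_image "$name"
  done
}

import_all
```

Run it once with the default `DRY_RUN=1` to review the commands, then with `DRY_RUN=0` to import. Pods that use imported images would probably also need `imagePullPolicy: IfNotPresent` (or `Never`), since a `:latest` tag otherwise defaults to pulling from a registry.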