diff --git a/docs/devops/GITEA_NPM_REGISTRY_MIGRATION.md b/docs/devops/GITEA_NPM_REGISTRY_MIGRATION.md
index fd2d3c7a..fd4759c2 100644
--- a/docs/devops/GITEA_NPM_REGISTRY_MIGRATION.md
+++ b/docs/devops/GITEA_NPM_REGISTRY_MIGRATION.md
@@ -182,6 +182,43 @@ That blocker must be resolved before we can claim full local E2E completion.
 
 ---
 
+## 5.2 Remaining Gaps From Local Mac Validation
+
+The local rehearsal on this Mac has proven enough to validate the **host-side** registry model, but it has **not** yet proven the full VM-ready or K8s-ready path.
+
+### Gaps still open on this Mac
+
+- Docker / BuildKit package installs from the local Homebrew Gitea instance are still not green
+- the Homebrew Gitea `ROOT_URL` / package tarball URL behavior is still fragile across host and Docker boundaries
+- local Gitea Actions has not yet been validated end-to-end for the package build → publish → consumer flow
+- the FlowMonk pilot is currently validated only for backend/web host-side consumption; Docker is still blocked and mobile remains outside the registry-backed pilot slice
+- not all `@bytelyst/*` packages have been revalidated under the local Gitea registry with a fresh consumer install after the `pnpm pack` publish flow changes
+
+### Why this matters for Azure
+
+Azure should be a replication step, not a redesign step.
+
+That means the Azure VM rollout should wait until the remaining local gaps are cleared for:
+
+- package metadata correctness
+- tarball URL correctness
+- Docker consumer correctness
+- local CI correctness
+
+If we skip those validations locally, the first Azure VM trial becomes a debugging environment instead of a deployment environment.
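One way to make the "tarball URL correctness" check concrete is a small script that inspects npm package metadata and flags any `dist.tarball` URL whose origin differs from the registry base a given consumer uses. The sketch below is illustrative only: the function names, the package name `@bytelyst/example`, the hostnames, the port, and the URL paths are all assumptions, not values taken from our setup. It relies only on the standard npm metadata shape (`versions.<version>.dist.tarball`).

```python
from urllib.parse import urlsplit


def origin(url: str) -> tuple:
    """Return (scheme, hostname, port) so two URLs can be compared by origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname or "", parts.port)


def find_mismatched_tarballs(metadata: dict, expected_base: str) -> list:
    """List (version, tarball_url) pairs whose origin differs from expected_base."""
    expected = origin(expected_base)
    mismatches = []
    for version, info in metadata.get("versions", {}).items():
        tarball = info.get("dist", {}).get("tarball", "")
        if origin(tarball) != expected:
            mismatches.append((version, tarball))
    return mismatches


# Hypothetical metadata: one version advertises a host-only URL,
# the other a URL that a Docker build could also reach.
sample_metadata = {
    "name": "@bytelyst/example",
    "versions": {
        "1.0.0": {"dist": {"tarball": "http://localhost:3000/example-1.0.0.tgz"}},
        "1.1.0": {"dist": {"tarball": "http://host.docker.internal:3000/example-1.1.0.tgz"}},
    },
}

# Assumed registry base as seen from inside a Docker build.
docker_view = find_mismatched_tarballs(sample_metadata, "http://host.docker.internal:3000")
for version, url in docker_view:
    print(f"version {version} advertises an unreachable tarball origin: {url}")
```

Running the same check once with the host-side base URL and once with the Docker-side base URL would show whether a single `ROOT_URL` setting can satisfy both consumers.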
+
+### Concrete local-exit criteria before Azure
+
+Before starting Azure VM rollout, we should be able to demonstrate all of the following on this Mac:
+
+- one Gitea deployment shape whose package URLs work for both host installs and Docker builds
+- one publish path for `@bytelyst/*` packages that works repeatably with `pnpm pack`
+- one pilot repo that installs from the registry on the host and inside Docker without fallback tarballs
+- one local Gitea Actions path that can build/publish/install with the same registry assumptions
+- one documented rollback path that cleanly returns a pilot repo to tarball-based Docker consumption if needed
+
+---
+
 ## 6. Migration Strategy
 
 ### Stage A — Local registry rehearsal
@@ -463,6 +500,34 @@ After the local rehearsal is green, Azure should follow the same validated recip
 
 Azure should be a replication step, not a redesign step.
 
+### What Azure still needs beyond current local proof
+
+Even though the codebase is already designed to be highly configurable, Azure VM rollout still requires a few validations that have not been fully cleared by the local Mac rehearsal yet:
+
+- Gitea running in a deployment shape where package tarball URLs are stable for both host consumers and containerized consumers
+- registry-backed Docker builds for the pilot repo without `docker-prep.sh`
+- local Gitea Actions or equivalent host-runner proof for package build/publish/install
+- a final decision on whether mobile-facing `@bytelyst/*` packages stay on local `file:` links during the pilot phase or join the registry migration in a later wave
+- one Azure-ready secrets model for Gitea token handling, service envs, and registry auth that maps cleanly from local env vars to VM secrets/config files
+
+### What should require minimal change when moving to Azure
+
+The good news is that most of the ecosystem has already been implemented in a way that keeps scale-up mostly configuration-driven once the registry and image flows are proven.
+
+Expected low-change areas:
+
+- service code already externalizes most environment-specific values via env/config
+- Compose and K8s models already map cleanly to Deployments, Services, Ingress, ConfigMaps, and Secrets
+- Traefik, readiness probes, service ports, and namespace separation are already documented in a K8s-friendly way
+- scaling stateless services should mostly mean changing replica counts, resource requests/limits, and HPA settings
+- moving from single-node K3s to multi-node K3s or managed Kubernetes should mostly reuse the same manifests with infra-level adjustments
+
+What is **not** yet proven enough to call low-change:
+
+- the package-distribution layer for Docker/K8s image builds
+- the exact image build/publish flow for the full ecosystem after registry migration
+- the complete repo-by-repo removal of tarball-based Docker prep
+
 ---
 
 ## 14. Definition Of Done
diff --git a/docs/devops/SINGLE_VM_DEPLOYMENT.md b/docs/devops/SINGLE_VM_DEPLOYMENT.md
index 543f6754..2388cf74 100644
--- a/docs/devops/SINGLE_VM_DEPLOYMENT.md
+++ b/docs/devops/SINGLE_VM_DEPLOYMENT.md
@@ -669,6 +669,84 @@ spec:
 
 ---
 
+## 5.1 Scaling-Readiness Reality Check
+
+The ecosystem code and infra design are already **mostly** aligned for a low-effort scaling path.
+
+That does **not** mean the whole end-to-end deployment path is proven yet.
+
+### What is already designed for low-change scaling
+
+- service identity, ports, and inter-service URLs are already env/config driven in most backends and dashboards
+- Compose service boundaries already map cleanly to Kubernetes Deployments + Services
+- Traefik-based routing already maps cleanly to Kubernetes Ingress resources
+- resource requests/limits and replica counts are already represented in a K8s-friendly way in the documented manifest examples
+- the namespace split (`infra`, `platform`, `products`, `web`) already gives a usable organizational model for K8s
+- moving from 1 replica to N replicas for stateless services should mostly be infra config work, not application rewrites
+- moving from Docker Desktop K8s to K3s or later managed K8s should mostly reuse the same manifest model
+
+### What is not yet proven enough to call low-change
+
+- the shared package distribution path for container builds after the Gitea npm registry migration
+- the pilot Docker build path from the local Gitea registry
+- the final removal of `docker-prep.sh` / tarball fallback across the wider ecosystem
+- the complete image build/publish/deploy workflow for all repos under one consistent registry strategy
+
+### Practical interpretation
+
+If we solve the package registry + image build path cleanly, then scaling the running system should mostly mean:
+
+- increasing replica counts
+- applying HPAs where justified
+- adjusting resource requests and limits
+- moving from local-path storage to stronger persistent storage where needed
+- adding worker nodes or moving to a managed cluster
+
+That is the sense in which the ecosystem is already mostly configurable and scale-friendly.
+
+The remaining work is concentrated more in the **build and package-distribution layer** than in the application service code.
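As a rough illustration of "scaling by configuration", the sketch below merges per-environment overrides (replica count, resource limits) into a base set of Helm-values-style settings without touching anything else. The helper name `apply_overrides`, the dictionary layout, and all numbers are hypothetical; they are not taken from our manifests and only show the shape of the idea.

```python
import copy


def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical single-VM defaults for one stateless service.
base_values = {
    "replicas": 1,
    "resources": {
        "requests": {"cpu": "100m", "memory": "128Mi"},
        "limits": {"cpu": "500m", "memory": "256Mi"},
    },
}

# Hypothetical overrides for a larger environment: only the deltas are listed.
scaled_overrides = {
    "replicas": 3,
    "resources": {"limits": {"memory": "512Mi"}},
}

scaled_values = apply_overrides(base_values, scaled_overrides)
print(scaled_values)
```

The point of the design is that the scaled environment is described entirely by the small overrides dictionary, which matches the claim that scale-up should be infra config work rather than application rewrites.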
+
+---
+
+## 5.2 Remaining Gaps Before Azure VM / K3s Rollout
+
+### Local Mac rehearsal gaps still open
+
+- local Homebrew Gitea package metadata and tarball URLs are not yet reliable across both host installs and Docker builds
+- pilot Docker builds from the local Gitea registry are still blocked
+- local Gitea Actions has not yet been fully validated for the package publish + consumer install path
+- the pilot registry migration is only validated for FlowMonk backend/web host-side usage, not end-to-end across Docker and all surfaces
+
+### Azure VM readiness gaps
+
+Before calling the Azure VM rollout low-risk, we still need one validated answer for each of these:
+
+- where Gitea runs and what `ROOT_URL` / package URL shape it advertises
+- how Docker image builds authenticate to and consume the package registry
+- whether package builds happen on the VM directly, inside CI runners, or via a dedicated package-publish pipeline
+- how image publishing is handled for K3s / later multi-node rollout
+- which repos are fully registry-backed versus still temporarily using tarball fallback during migration
+
+### K3s / scaling readiness gaps
+
+The application architecture is largely ready for scale-by-configuration, but these operational items still need proof or stronger artifacts:
+
+- concrete K8s manifests or Helm values for the full ecosystem, not just examples
+- image registry strategy for K3s and later multi-node or managed Kubernetes rollout
+- persistent volume strategy for Gitea, Grafana, Loki, Azurite, and any stateful product workloads
+- autoscaling thresholds and resource defaults validated under representative load
+- deployment ordering and health-check gating for infra → platform → products → web surfaces
+
+### Recommended order to close these gaps
+
+1. clear the local Gitea package URL and Docker build problem
+2. validate one full pilot path: package publish → Docker build → runtime start
+3. validate the same path in local Gitea Actions
+4. choose the image registry / image distribution pattern for Azure VM and K3s
+5. generate and validate concrete K8s/Helm artifacts for the platform + one pilot product first
+
+---
+
 ## 6. K3s Practice Exercises (on single VM)
 
 These exercises simulate real production scenarios: