From 626e19f7761931f2624c0926654ccd084a1f0b51 Mon Sep 17 00:00:00 2001
From: root
Date: Sun, 29 Mar 2026 22:27:31 +0000
Subject: [PATCH] docs(devops): add secure single-vm api exposure guidance

---
 docs/devops/single_azure_vm/docker/README.md | 4 +
 .../docker/SECURE_API_EXPOSURE.md | 456 ++++++++++++++++++
 2 files changed, 460 insertions(+)
 create mode 100644 docs/devops/single_azure_vm/docker/SECURE_API_EXPOSURE.md

diff --git a/docs/devops/single_azure_vm/docker/README.md b/docs/devops/single_azure_vm/docker/README.md
index bc65db78..ca52dbcd 100644
--- a/docs/devops/single_azure_vm/docker/README.md
+++ b/docs/devops/single_azure_vm/docker/README.md
@@ -4,6 +4,10 @@
 > Nothing pre-installed required — the script handles everything from a blank Ubuntu machine.
 > Two files: this README and `setup.sh`. Copy both to the VM and run the script.

+Related:
+
+- [`SECURE_API_EXPOSURE.md`](./SECURE_API_EXPOSURE.md) — recommended public API exposure model, alternatives, and security guidance for client-facing URLs
+
 ---

 ## Prerequisites
diff --git a/docs/devops/single_azure_vm/docker/SECURE_API_EXPOSURE.md b/docs/devops/single_azure_vm/docker/SECURE_API_EXPOSURE.md
new file mode 100644
index 00000000..9fde43d8
--- /dev/null
+++ b/docs/devops/single_azure_vm/docker/SECURE_API_EXPOSURE.md
@@ -0,0 +1,456 @@
+# Secure API Exposure for Single-Azure-VM Docker Deployment
+
+> Decision note for the single-VM Docker deployment in this folder.
+> Goal: stop exposing raw `http://:` backend URLs directly to web and mobile clients.
+
+---
+
+## Summary
+
+For the current ByteLyst single-VM Docker deployment, the recommended public access pattern is:
+
+1. Use one public DNS name such as `api.`.
+2. Terminate TLS on the VM at the HTTP gateway layer.
+3. Keep backend containers on the Docker network and avoid publishing their service ports publicly.
+4.
Route requests by path prefix at the gateway:
+   - `/platform`
+   - `/extraction`
+   - `/mcp`
+   - `/peakpulse`
+   - `/chronomind`
+   - `/jarvisjr`
+   - `/nomgap`
+   - `/mindlyst`
+   - `/lysnrai`
+   - `/notelett`
+   - `/flowmonk`
+   - `/actiontrail`
+   - `/localmemgpt`
+
+For this repo and deployment model, the best near-term choice is:
+
+- **Primary recommendation:** keep the existing gateway pattern and harden it into a single HTTPS API entrypoint.
+- **Implementation preference:** either:
+  - **Traefik with ACME/Let's Encrypt** if you want to stay aligned with the existing stack, or
+  - **Caddy** if you want the simplest single-VM HTTPS setup.
+
+Do **not** keep exposing `4003`, `4005`, `4007`, and `4010-4019` as public app-facing endpoints for production-like clients.
+
+---
+
+## Problem
+
+The current single-VM documentation and deployment model expose many direct backend ports:
+
+- platform services on `4003`, `4005`, `4007`
+- product backends on `4010` through `4019`
+
+That creates avoidable issues:
+
+- raw VM IPs leak into app config
+- every public port must be opened in the Azure NSG
+- no central TLS termination point
+- no single place for CORS, auth forwarding, rate limiting, or request logging
+- harder DNS migration if the VM changes
+- mobile and web clients become coupled to infrastructure details
+
+The right abstraction for clients is a stable HTTPS domain, not a VM IP and port.
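
The path-prefix routing named in the summary can be illustrated with a small Python sketch of the lookup a gateway performs. This is only a model of the idea, not Traefik or Caddy configuration. The prefixes come from the list above; the container hostnames, the port assigned to each prefix, and the choice to strip the prefix before forwarding are all assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical prefix -> upstream table. The prefixes are from this note;
# the container names and the prefix-to-port pairing are assumptions.
ROUTES = {
    "/platform": "http://platform:4003",
    "/extraction": "http://extraction:4005",
    "/chronomind": "http://chronomind:4012",
}


def resolve(path: str) -> Optional[Tuple[str, str]]:
    """Return (upstream, forwarded_path) for a request path, or None if no prefix matches."""
    for prefix, upstream in ROUTES.items():
        # Match only on a path-segment boundary so "/platformx" does not hit "/platform".
        if path == prefix or path.startswith(prefix + "/"):
            # Assumption: the gateway strips the prefix before forwarding.
            forwarded = path[len(prefix):] or "/"
            return upstream, forwarded
    return None


if __name__ == "__main__":
    print(resolve("/platform/health"))
    print(resolve("/unknown"))
```

Matching on a segment boundary is one form of the route hygiene that path-prefix designs depend on: a prefix should never capture traffic meant for a longer, unrelated path.
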
+
+---
+
+## Options Compared
+
+### Option A: Keep direct IP:port exposure
+
+Example:
+
+- `http://:4003`
+- `http://:4012`
+
+**Pros**
+
+- simplest to bootstrap
+- no reverse-proxy configuration required
+- easy to debug locally
+
+**Cons**
+
+- poor security posture
+- difficult to hide infrastructure details
+- not suitable for polished web/mobile clients
+- many inbound ports remain publicly reachable
+- forces client config changes if host or port mapping changes
+- HTTPS must be solved separately per surface
+
+**Recommendation**
+
+- acceptable only for temporary prototype smoke testing
+- not recommended for client-facing environments
+
+---
+
+### Option B: Single HTTPS domain with path-based routing on the VM
+
+Example:
+
+- `https://api./platform/...`
+- `https://api./chronomind/...`
+
+**Pros**
+
+- one public URL for all apps
+- only `80/443` need to be public
+- TLS termination is centralized
+- easiest client configuration model
+- easy to add middleware for auth, CORS, rate limits, headers, and logging
+- easiest migration path from one VM to another
+
+**Cons**
+
+- gateway config must be maintained
+- path-prefix design requires route hygiene across services
+
+**Recommendation**
+
+- **best fit for the current single-VM ByteLyst deployment**
+
+---
+
+### Option C: Per-service HTTPS subdomains
+
+Example:
+
+- `https://platform-api.`
+- `https://chronomind-api.`
+- `https://jarvisjr-api.`
+
+**Pros**
+
+- clean separation by product/service
+- natural fit if products will be separated operationally later
+- easier to apply product-specific policies at DNS/cert level
+
+**Cons**
+
+- more DNS records
+- more certificates/listeners to manage
+- more client env variables than a single API hostname
+
+**Recommendation**
+
+- good alternative if product isolation is a stronger goal than client simplicity
+- still better than raw IP:port exposure
+
+---
+
+### Option D: Azure Application Gateway in front of the VM
+
+**Pros**
+
+- Azure-native L7
entrypoint
+- TLS termination
+- host/path routing
+- WAF support
+- health probes and backend pools
+
+**Cons**
+
+- extra Azure cost and operational layer
+- more infrastructure to provision and maintain
+- heavier than needed for one raw VM unless WAF/compliance is required
+
+**Recommendation**
+
+- good when you need Azure-managed edge controls or WAF
+- otherwise more than this deployment currently needs
+
+---
+
+### Option E: Azure Front Door in front of the VM
+
+**Pros**
+
+- global edge entrypoint
+- TLS termination
+- WAF support
+- good for multi-region or globally distributed traffic
+
+**Cons**
+
+- overkill for a single-region, single-VM deployment
+- extra cost and configuration
+- best value appears when you have multiple regions or origins
+
+**Recommendation**
+
+- not the first choice for this repo's current single-VM architecture
+
+---
+
+### Option F: Azure API Management in front of the VM
+
+**Pros**
+
+- strongest API product/governance story
+- subscriptions, API versioning, developer portal, transformations, quotas
+- very good if third parties consume your APIs
+
+**Cons**
+
+- significantly heavier and costlier than a simple gateway
+- more than needed for first-party web/mobile clients on one VM
+
+**Recommendation**
+
+- consider later if ByteLyst needs external developer APIs or formal API governance
+- not the first thing to add for this deployment
+
+---
+
+### Option G: Caddy on the VM
+
+**Pros**
+
+- very simple configuration
+- automatic HTTPS
+- good default fit for one VM and one domain
+- lower operational overhead than a more feature-rich proxy
+
+**Cons**
+
+- less aligned with the current repo's Traefik-based stack references
+- if the team already standardizes on Traefik, introduces another gateway pattern
+
+**Recommendation**
+
+- best choice if simplicity is the priority and gateway features stay modest
+
+---
+
+### Option H: Traefik on the VM
+
+**Pros**
+
+- already present in the ecosystem stack and docs
+-
supports host/path routing and ACME
+- works well with Docker labels and compose-driven deployments
+- consistent with the repo's current gateway direction
+
+**Cons**
+
+- current docs and compose posture still expose too many ports directly
+- must be hardened:
+  - remove insecure dashboard exposure
+  - add TLS
+  - stop publishing backend ports publicly
+
+**Recommendation**
+
+- best choice if we want minimal architectural change from the current stack
+
+---
+
+## Decision Matrix
+
+| Option                       | Security | Complexity  | Azure cost  | Fit for current repo          | Recommended |
+| ---------------------------- | -------- | ----------- | ----------- | ----------------------------- | ----------- |
+| Direct IP:port               | Low      | Low         | Low         | Prototype only                | No          |
+| Single domain + path routing | High     | Medium      | Low         | Excellent                     | **Yes**     |
+| Per-service subdomains       | High     | Medium      | Low         | Good                          | Yes         |
+| Application Gateway          | High     | Medium/High | Medium      | Good                          | Maybe later |
+| Front Door                   | High     | High        | Medium/High | Weak for single VM            | No for now  |
+| API Management               | High     | High        | High        | Weak for first-party apps now | Later only  |
+| Caddy                        | High     | Low         | Low         | Good                          | **Yes**     |
+| Traefik                      | High     | Medium      | Low         | Excellent                     | **Yes**     |
+
+---
+
+## Recommended Path
+
+### Recommendation 1: Public clients should know only one API hostname
+
+Use:
+
+- `https://api.`
+
+Do not use:
+
+- `http://:4003`
+- `http://:4012`
+- `http://:4019`
+
+### Recommendation 2: Use path-based routing for all APIs
+
+Suggested mapping:
+
+- `https://api./platform`
+- `https://api./extraction`
+- `https://api./mcp`
+- `https://api./peakpulse`
+- `https://api./chronomind`
+- `https://api./jarvisjr`
+- `https://api./nomgap`
+- `https://api./mindlyst`
+- `https://api./lysnrai`
+- `https://api./notelett`
+- `https://api./flowmonk`
+- `https://api./actiontrail`
+- `https://api./localmemgpt`
+
+This is the cleanest model for web and mobile apps because they only need:
+
+- one host
+- one TLS policy
+- one
place for future cross-cutting controls
+
+### Recommendation 3: Keep only `80/443` public for app traffic
+
+Publicly reachable:
+
+- `80`
+- `443`
+
+Not publicly reachable for clients:
+
+- `4003`
+- `4005`
+- `4007`
+- `4010-4019`
+
+Operational ports should also be restricted or removed from the public internet where possible:
+
+- `3000` Grafana
+- `3300` Gitea
+- `8025` Mailpit
+- `8080` Traefik dashboard
+- `10000` Azurite
+- `11434` Ollama
+
+### Recommendation 4: Choose Traefik or Caddy based on team preference
+
+#### If staying close to the current repo
+
+Use **Traefik** and harden it:
+
+- enable ACME / Let's Encrypt
+- remove insecure dashboard exposure
+- use host/path routers for each backend
+- stop exposing backend container ports publicly
+
+#### If optimizing for the fewest moving parts
+
+Use **Caddy**:
+
+- simpler config
+- automatic HTTPS
+- excellent fit for one VM and one API domain
+
+### Recommendation 5: Do not add Azure Front Door or API Management yet
+
+Those make sense later if one of these becomes true:
+
+- global traffic and multi-region failover matter
+- a WAF is mandatory at the Azure edge
+- third-party API consumers need subscriptions, products, quotas, or a developer portal
+
+For the current single-VM Docker deployment, they add more complexity than value.
+
+---
+
+## Concrete Recommendation for ByteLyst
+
+### Near-term
+
+- Keep the current single VM.
+- Keep Docker Compose.
+- Keep a gateway in front of services.
+- Move clients to **one HTTPS API hostname**.
+- Use **path-based routing**.
+- Remove public exposure of backend ports.
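
From the client's side, "one HTTPS API hostname" plus "path-based routing" could look roughly like the Python sketch below. It builds a per-service base URL from a single configured API base and rejects plain HTTP or raw port-style bases. The function name `service_base_url` and the host `api.example.test` are hypothetical; the service names are the path prefixes listed in this note.

```python
from urllib.parse import urlsplit

# Service names taken from the path prefixes listed in this note.
SERVICES = {
    "platform", "extraction", "mcp", "peakpulse", "chronomind", "jarvisjr",
    "nomgap", "mindlyst", "lysnrai", "notelett", "flowmonk", "actiontrail",
    "localmemgpt",
}


def service_base_url(api_base: str, service: str) -> str:
    """Build the base URL for one service under a single HTTPS API host.

    Hypothetical helper: raises ValueError for non-HTTPS bases, explicit
    non-443 ports, missing hosts, and unknown service names.
    """
    parts = urlsplit(api_base)
    if parts.scheme != "https":
        raise ValueError("API base must use https")
    if not parts.hostname:
        raise ValueError("API base must include a host")
    if parts.port not in (None, 443):
        raise ValueError("API base must not point at a raw service port")
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    return f"https://{parts.hostname}/{service}"


if __name__ == "__main__":
    # "api.example.test" is a placeholder host, not the real deployment domain.
    print(service_base_url("https://api.example.test", "chronomind"))
```

Keeping the host in one configuration value means a VM or DNS change touches one setting rather than one per service.
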
+
+### Preferred implementation
+
+- **Preferred:** harden the existing Traefik-based gateway approach
+- **Alternative:** replace Traefik with Caddy if the team values simpler config over consistency with current stack references
+
+### Why this is the best fit
+
+- smallest change from the current architecture
+- lowest added Azure cost
+- best client developer experience
+- hides VM IP and port details
+- keeps future migration options open
+
+---
+
+## Suggested Client Base URLs
+
+### Web and mobile
+
+Use one shared base host:
+
+- `https://api.`
+
+Then per-service prefixes:
+
+- platform: `https://api./platform`
+- extraction: `https://api./extraction`
+- mcp: `https://api./mcp`
+- peakpulse: `https://api./peakpulse`
+- chronomind: `https://api./chronomind`
+- jarvisjr: `https://api./jarvisjr`
+- nomgap: `https://api./nomgap`
+- mindlyst: `https://api./mindlyst`
+- lysnrai: `https://api./lysnrai`
+- notelett: `https://api./notelett`
+- flowmonk: `https://api./flowmonk`
+- actiontrail: `https://api./actiontrail`
+- localmemgpt: `https://api./localmemgpt`
+
+---
+
+## Implementation Notes for This Folder
+
+To make this real in the single-VM Docker deployment, this folder should eventually be updated to:
+
+1. change the documented public access model from `http://:` to `https://api./`
+2. reduce Azure NSG public inbound rules to `80/443` for app traffic
+3. stop publishing backend ports publicly in compose for client access
+4. add gateway routing rules for all backend services
+5. add TLS automation
+6.
lock down or remove public access to operational tools
+
+---
+
+## Final Recommendation
+
+**Recommended architecture for this repo's single-Azure-VM Docker deployment:**
+
+- **One HTTPS API domain**
+- **Path-based routing**
+- **Backends private on the Docker network**
+- **Traefik hardened and kept, or Caddy adopted for simplicity**
+- **Do not expose raw service ports to web/mobile clients**
+
+If the team wants the least disruptive route, use **Traefik + ACME + path routing**.
+If the team wants the simplest operator experience, use **Caddy + path routing**.
+
+For the current deployment, **Azure Front Door**, **Application Gateway**, and **API Management** are valid alternatives, but they are not the best first move.
+
+---
+
+## References
+
+- Microsoft Learn, Application Gateway components:
+  - https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-components
+- Microsoft Learn, Azure Front Door guidance (Azure SignalR how-to):
+  - https://learn.microsoft.com/en-us/azure/azure-signalr/signalr-howto-work-with-azure-front-door
+- Caddy documentation, automatic HTTPS and reverse proxy:
+  - https://caddyserver.com/docs/
+  - https://caddyserver.com/docs/caddyfile/directives/reverse_proxy
+- Traefik documentation:
+  - https://doc.traefik.io/traefik/
+- Existing repo context:
+  - `docs/devops/single_azure_vm/docker/README.md`
+  - `docs/devops/SINGLE_VM_ENHANCED_PLAN.md`
+  - `docs/audits/AI_SECURITY_AUDIT_REPORT.md`