docs(devops): add secure single-vm api exposure guidance

root 2026-03-29 22:27:31 +00:00
parent ac6d1e0911
commit 626e19f776
2 changed files with 460 additions and 0 deletions

@@ -4,6 +4,10 @@
> Nothing pre-installed required — the script handles everything from a blank Ubuntu machine.
> Two files: this README and `setup.sh`. Copy both to the VM and run the script.
Related:
- [`SECURE_API_EXPOSURE.md`](./SECURE_API_EXPOSURE.md) — recommended public API exposure model, alternatives, and security guidance for client-facing URLs
---
## Prerequisites

@@ -0,0 +1,456 @@
# Secure API Exposure for Single-Azure-VM Docker Deployment
> Decision note for the single-VM Docker deployment in this folder.
> Goal: stop exposing raw `http://<vm-ip>:<port>` backend URLs directly to web and mobile clients.
---
## Summary
For the current ByteLyst single-VM Docker deployment, the recommended public access pattern is:
1. Use one public DNS name such as `api.<domain>`.
2. Terminate TLS on the VM at the HTTP gateway layer.
3. Keep backend containers on the Docker network and avoid publishing their service ports publicly.
4. Route requests by path prefix at the gateway:
   - `/platform`
   - `/extraction`
   - `/mcp`
   - `/peakpulse`
   - `/chronomind`
   - `/jarvisjr`
   - `/nomgap`
   - `/mindlyst`
   - `/lysnrai`
   - `/notelett`
   - `/flowmonk`
   - `/actiontrail`
   - `/localmemgpt`
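
As an illustration of the routing step above, here is a minimal Python sketch of how a gateway might resolve a request path to a backend. It assumes each upstream is named after its prefix and that the prefix is stripped before forwarding; neither detail is taken from the repo, and a real gateway (Traefik or Caddy) would do this through its own configuration rather than application code.

```python
from typing import Dict, Optional, Tuple

# Prefixes from the list above. Mapping each prefix to an upstream of the
# same name is an assumption made for this sketch.
PREFIXES = (
    "platform", "extraction", "mcp", "peakpulse", "chronomind",
    "jarvisjr", "nomgap", "mindlyst", "lysnrai", "notelett",
    "flowmonk", "actiontrail", "localmemgpt",
)
ROUTES: Dict[str, str] = {f"/{name}": name for name in PREFIXES}


def resolve(path: str, routes: Dict[str, str] = ROUTES) -> Optional[Tuple[str, str]]:
    """Return (upstream, path_without_prefix), or None if no prefix matches."""
    # Try longer prefixes first so a more specific route would win.
    for prefix in sorted(routes, key=len, reverse=True):
        # Match on a segment boundary: "/platformx" must not hit "/platform".
        if path == prefix or path.startswith(prefix + "/"):
            rest = path[len(prefix):] or "/"
            return routes[prefix], rest
    return None


if __name__ == "__main__":
    print(resolve("/chronomind/v1/timers"))
    print(resolve("/unknown"))
```

Matching on a segment boundary is the kind of route hygiene the path-prefix model relies on: two services must never claim overlapping prefixes.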
For this repo and deployment model, the best near-term choice is:
- **Primary recommendation:** keep the existing gateway pattern and harden it into a single HTTPS API entrypoint.
- **Implementation preference:** either
  - **Traefik with ACME/Let's Encrypt** if you want to stay aligned with the existing stack, or
  - **Caddy** if you want the simplest single-VM HTTPS setup.
Do **not** keep exposing ports `4003`, `4005`, `4007`, and `4010-4019` as public endpoints for any client-facing or production-like environment.
---
## Problem
The current single-VM documentation and deployment model expose many direct backend ports:
- platform services on `4003`, `4005`, `4007`
- product backends on `4010` through `4019`
That creates avoidable issues:
- raw VM IPs leak into app config
- every public port must be opened in the Azure NSG
- no central TLS termination point
- no single place for CORS, auth forwarding, rate limiting, or request logging
- harder DNS migration if the VM changes
- mobile and web clients become coupled to infrastructure details
The right abstraction for clients is a stable HTTPS domain, not a VM IP and port.
---
## Options Compared
### Option A: Keep direct IP:port exposure
Example:
- `http://<vm-ip>:4003`
- `http://<vm-ip>:4012`
**Pros**
- simplest to bootstrap
- no reverse-proxy configuration required
- easy to debug locally
**Cons**
- poor security posture
- difficult to hide infrastructure details
- not suitable for polished web/mobile clients
- many inbound ports remain publicly reachable
- forces client config changes if host or port mapping changes
- TLS has to be set up separately for every exposed port
**Recommendation**
- acceptable only for temporary prototype smoke testing
- not recommended for client-facing environments
---
### Option B: Single HTTPS domain with path-based routing on the VM
Example:
- `https://api.<domain>/platform/...`
- `https://api.<domain>/chronomind/...`
**Pros**
- one public URL for all apps
- only `80/443` need to be public
- TLS termination is centralized
- easiest client configuration model
- easy to add middleware for auth, CORS, rate limits, headers, and logging
- easiest migration path from one VM to another
**Cons**
- gateway config must be maintained
- path-prefix design requires route hygiene across services
**Recommendation**
- **best fit for the current single-VM ByteLyst deployment**
---
### Option C: Per-service HTTPS subdomains
Example:
- `https://platform-api.<domain>`
- `https://chronomind-api.<domain>`
- `https://jarvisjr-api.<domain>`
**Pros**
- clean separation by product/service
- natural fit if products will be separated operationally later
- easier to apply product-specific policies at DNS/cert level
**Cons**
- more DNS records
- more certificates/listeners to manage
- more client env variables than a single API hostname
**Recommendation**
- good alternative if product isolation is a stronger goal than client simplicity
- still better than raw IP:port exposure
---
### Option D: Azure Application Gateway in front of the VM
**Pros**
- Azure-native L7 entrypoint
- TLS termination
- host/path routing
- WAF support
- health probes and backend pools
**Cons**
- extra Azure cost and operational layer
- more infrastructure to provision and maintain
- heavier than a single VM needs unless a WAF or compliance requirement applies
**Recommendation**
- good when you need Azure-managed edge controls or WAF
- otherwise more than this deployment currently needs
---
### Option E: Azure Front Door in front of the VM
**Pros**
- global edge entrypoint
- TLS termination
- WAF support
- good for multi-region or globally distributed traffic
**Cons**
- overkill for a single-region, single-VM deployment
- extra cost and configuration
- delivers most of its value only when there are multiple regions or origins
**Recommendation**
- not the first choice for this repo's current single-VM architecture
---
### Option F: Azure API Management in front of the VM
**Pros**
- strongest API product/governance story
- subscriptions, API versioning, developer portal, transformations, quotas
- very good if third parties consume your APIs
**Cons**
- significantly heavier and costlier than a simple gateway
- more than needed for first-party web/mobile clients on one VM
**Recommendation**
- consider later if ByteLyst needs external developer APIs or formal API governance
- not the first thing to add for this deployment
---
### Option G: Caddy on the VM
**Pros**
- very simple configuration
- automatic HTTPS
- good default fit for one VM and one domain
- lower operational overhead than a more feature-rich proxy
**Cons**
- less aligned with the current repo's Traefik-based stack references
- introduces a second gateway pattern if the team already standardizes on Traefik
**Recommendation**
- best choice if simplicity is the priority and gateway features stay modest
---
### Option H: Traefik on the VM
**Pros**
- already present in the ecosystem stack and docs
- supports host/path routing and ACME
- works well with Docker labels and compose-driven deployments
- consistent with the repo's current gateway direction
**Cons**
- current docs and compose posture still expose too many ports directly
- must be hardened:
  - remove insecure dashboard exposure
  - add TLS
  - stop publishing backend ports publicly
**Recommendation**
- best choice if we want minimal architectural change from the current stack
---
## Decision Matrix
| Option | Security | Complexity | Azure cost | Fit for current repo | Recommended |
| ---------------------------- | -------- | ----------- | ----------- | ----------------------------- | ----------- |
| Direct IP:port | Low | Low | Low | Prototype only | No |
| Single domain + path routing | High | Medium | Low | Excellent | **Yes** |
| Per-service subdomains | High | Medium | Low | Good | Yes |
| Application Gateway | High | Medium/High | Medium | Good | Maybe later |
| Front Door | High | High | Medium/High | Weak for single VM | No for now |
| API Management | High | High | High | Weak for first-party apps now | Later only |
| Caddy | High | Low | Low | Good | **Yes** |
| Traefik | High | Medium | Low | Excellent | **Yes** |
---
## Recommended Path
### Recommendation 1: Public clients should know only one API hostname
Use:
- `https://api.<domain>`
Do not use:
- `http://<vm-ip>:4003`
- `http://<vm-ip>:4012`
- `http://<vm-ip>:4019`
### Recommendation 2: Use path-based routing for all APIs
Suggested mapping:
- `https://api.<domain>/platform`
- `https://api.<domain>/extraction`
- `https://api.<domain>/mcp`
- `https://api.<domain>/peakpulse`
- `https://api.<domain>/chronomind`
- `https://api.<domain>/jarvisjr`
- `https://api.<domain>/nomgap`
- `https://api.<domain>/mindlyst`
- `https://api.<domain>/lysnrai`
- `https://api.<domain>/notelett`
- `https://api.<domain>/flowmonk`
- `https://api.<domain>/actiontrail`
- `https://api.<domain>/localmemgpt`
This is the cleanest model for web and mobile apps because they only need:
- one host
- one TLS policy
- one place for future cross-cutting controls
### Recommendation 3: Keep only `80/443` public for app traffic
Publicly reachable:
- `80`
- `443`
Not publicly reachable for clients:
- `4003`
- `4005`
- `4007`
- `4010-4019`
Operational ports should also be restricted or removed from the public internet where possible:
- `3000` Grafana
- `3300` Gitea
- `8025` Mailpit
- `8080` Traefik dashboard
- `10000` Azurite
- `11434` Ollama
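
One way to check this is to audit the `ports:` entries in the compose file. The sketch below is a rough helper, not part of the repo: it takes short-syntax mappings (`HOST:CONTAINER` or `IP:HOST:CONTAINER`) that have already been loaded from YAML, and reports every host port other than `80`/`443` that is bound on all interfaces. The service names in the usage example are hypothetical, and IPv6 or long-syntax mappings are not handled.

```python
from typing import Dict, List

PUBLIC_ALLOWED = {80, 443}               # ports meant to stay public
LOOPBACK = {"127.0.0.1", "localhost"}    # bindings that are not reachable externally


def host_ports(mapping: str) -> List[int]:
    """Host ports that a short-syntax mapping binds on a non-loopback address."""
    spec = mapping.split("/")[0]          # drop a protocol suffix such as "/tcp"
    parts = spec.split(":")
    if len(parts) < 2 or not parts[-2]:
        return []                         # no fixed host port: not checked here
    host_ip = parts[0] if len(parts) == 3 else "0.0.0.0"
    if host_ip in LOOPBACK:
        return []
    low, _, high = parts[-2].partition("-")   # supports ranges like "4010-4019"
    return list(range(int(low), int(high or low) + 1))


def audit(services: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Map each service to the non-allowed ports it publishes publicly."""
    findings: Dict[str, List[int]] = {}
    for name, mappings in services.items():
        exposed = [p for m in mappings for p in host_ports(m) if p not in PUBLIC_ALLOWED]
        if exposed:
            findings[name] = exposed
    return findings


if __name__ == "__main__":
    example = {
        "gateway": ["80:80", "443:443"],
        "platform": ["4003:4003"],
        "grafana": ["127.0.0.1:3000:3000"],
    }
    print(audit(example))
```

An empty result only says that compose is not publishing extra ports; the Azure NSG inbound rules still need to be reviewed separately.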
### Recommendation 4: Choose Traefik or Caddy based on team preference
#### If staying close to the current repo
Use **Traefik** and harden it:
- enable ACME / Let's Encrypt
- remove insecure dashboard exposure
- use host/path routers for each backend
- stop exposing backend container ports publicly
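
To make the host/path router idea concrete, the following Python sketch generates Docker labels in the Traefik v2/v3 style for one backend. Everything specific here is an assumption rather than repo configuration: the `api.example.com` host, the `websecure` entrypoint name, the `letsencrypt` certificate resolver name, the port passed in, and whether the backend wants its prefix stripped.

```python
from typing import List


def traefik_labels(
    name: str,
    port: int,
    host: str = "api.example.com",      # placeholder for api.<domain>
    strip_prefix: bool = True,          # assumption: backend serves from "/"
) -> List[str]:
    """Build Traefik Docker labels that route /<name> on one host to a container port."""
    router = f"traefik.http.routers.{name}"
    labels = [
        "traefik.enable=true",
        f"{router}.rule=Host(`{host}`) && PathPrefix(`/{name}`)",
        f"{router}.entrypoints=websecure",            # entrypoint name is assumed
        f"{router}.tls.certresolver=letsencrypt",     # resolver name is assumed
        f"traefik.http.services.{name}.loadbalancer.server.port={port}",
    ]
    if strip_prefix:
        middleware = f"{name}-strip"
        labels += [
            f"traefik.http.middlewares.{middleware}.stripprefix.prefixes=/{name}",
            f"{router}.middlewares={middleware}",
        ]
    return labels


if __name__ == "__main__":
    # 4003 is one of the platform ports named in this note; pairing it with
    # the "platform" prefix is a guess made for the example.
    for label in traefik_labels("platform", 4003):
        print(f'- "{label}"')
```

With labels like these on each backend, the service no longer needs a `ports:` entry of its own; only the gateway container publishes `80` and `443`.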
#### If optimizing for the fewest moving parts
Use **Caddy**:
- simpler config
- automatic HTTPS
- excellent fit for one VM and one API domain
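
For comparison, a Caddy setup for the same routes could be rendered from a small table. This Python sketch prints a Caddyfile that uses `handle_path` (which strips the matched prefix) and `reverse_proxy`; the hostname and the `name:port` upstreams are placeholders, not values from the repo. Note that `/platform/*` does not match the bare path `/platform`, so a real config may need an extra matcher or redirect.

```python
from typing import Dict


def caddyfile(host: str, upstreams: Dict[str, str]) -> str:
    """Render a Caddyfile site block with one handle_path route per prefix."""
    lines = [f"{host} {{"]
    for prefix, upstream in upstreams.items():
        lines += [
            f"    handle_path /{prefix}/* {{",
            f"        reverse_proxy {upstream}",
            "    }",
        ]
    # Anything that matches no prefix gets a plain 404 from the gateway.
    lines += ["    handle {", "        respond 404", "    }", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical container names and ports on the Docker network.
    print(caddyfile("api.example.com", {
        "platform": "platform:4003",
        "chronomind": "chronomind:4012",
    }))
```

Because the site address is a real domain name, Caddy would obtain and renew the certificate on its own, which is the main reason the configuration stays this short.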
### Recommendation 5: Do not add Azure Front Door or API Management yet
Those make sense later if one of these becomes true:
- global traffic and multi-region failover matter
- a WAF is mandatory at the Azure edge
- third-party API consumers need subscriptions, products, quotas, or a developer portal
For the current single-VM Docker deployment, they add more complexity than value.
---
## Concrete Recommendation for ByteLyst
### Near-term
- Keep the current single VM.
- Keep Docker Compose.
- Keep a gateway in front of services.
- Move clients to **one HTTPS API hostname**.
- Use **path-based routing**.
- Remove public exposure of backend ports.
### Preferred implementation
- **Preferred:** harden the existing Traefik-based gateway approach
- **Alternative:** replace Traefik with Caddy if the team values simpler config over consistency with current stack references
### Why this is the best fit
- smallest change from the current architecture
- lowest added Azure cost
- best client developer experience
- hides VM IP and port details
- keeps future migration options open
---
## Suggested Client Base URLs
### Web and mobile
Use one shared base host:
- `https://api.<domain>`
Then per-service prefixes:
- platform: `https://api.<domain>/platform`
- extraction: `https://api.<domain>/extraction`
- mcp: `https://api.<domain>/mcp`
- peakpulse: `https://api.<domain>/peakpulse`
- chronomind: `https://api.<domain>/chronomind`
- jarvisjr: `https://api.<domain>/jarvisjr`
- nomgap: `https://api.<domain>/nomgap`
- mindlyst: `https://api.<domain>/mindlyst`
- lysnrai: `https://api.<domain>/lysnrai`
- notelett: `https://api.<domain>/notelett`
- flowmonk: `https://api.<domain>/flowmonk`
- actiontrail: `https://api.<domain>/actiontrail`
- localmemgpt: `https://api.<domain>/localmemgpt`
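
On the client side, this reduces configuration to one host plus a service name. A small Python sketch of such a helper follows; `api.example.com` stands in for `api.<domain>`, and the endpoint path in the usage example is made up.

```python
SERVICES = frozenset({
    "platform", "extraction", "mcp", "peakpulse", "chronomind",
    "jarvisjr", "nomgap", "mindlyst", "lysnrai", "notelett",
    "flowmonk", "actiontrail", "localmemgpt",
})


def service_url(service: str, path: str = "", host: str = "api.example.com") -> str:
    """Build an HTTPS URL for a service prefix on the shared API host."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    base = f"https://{host}/{service}"
    suffix = path.lstrip("/")           # tolerate a leading slash in the path
    return f"{base}/{suffix}" if suffix else base


if __name__ == "__main__":
    print(service_url("chronomind", "/v1/timers"))  # hypothetical endpoint
```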
---
## Implementation Notes for This Folder
To implement this in the single-VM Docker deployment, this folder should eventually be updated to:
1. change the documented public access model from `http://<vm-ip>:<port>` to `https://api.<domain>/<service>`
2. reduce Azure NSG public inbound rules to `80/443` for app traffic
3. stop publishing backend ports publicly in compose for client access
4. add gateway routing rules for all backend services
5. add TLS automation
6. lock down or remove public access to operational tools
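
For step 1, existing client configs and docs that still contain `http://<vm-ip>:<port>` URLs could be rewritten mechanically. The Python sketch below shows the idea; the port-to-service table is hypothetical (this note does not state which port belongs to which service), and it assumes the gateway strips the prefix so backend paths stay unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: fill in from the real compose file before use.
PORT_TO_SERVICE = {
    4003: "platform",
    4012: "chronomind",
}


def migrate_url(url: str, api_host: str = "api.example.com") -> str:
    """Rewrite http://<vm-ip>:<port>/path to https://<api_host>/<service>/path."""
    parts = urlsplit(url)
    service = PORT_TO_SERVICE.get(parts.port)
    if service is None:
        raise ValueError(f"no service known for port {parts.port!r} in {url}")
    new_path = f"/{service}{parts.path}"
    # Query string and fragment are carried over unchanged.
    return urlunsplit(("https", api_host, new_path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(migrate_url("http://203.0.113.10:4003/health?verbose=1"))
```

Raising on an unknown port is deliberate: a URL that cannot be mapped should be reviewed by hand rather than silently left pointing at the VM.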
---
## Final Recommendation
**Recommended architecture for this repo's single-Azure-VM Docker deployment:**
- **One HTTPS API domain**
- **Path-based routing**
- **Backends private on the Docker network**
- **Traefik hardened and kept, or Caddy adopted for simplicity**
- **Do not expose raw service ports to web/mobile clients**
If the team wants the least disruptive route, use **Traefik + ACME + path routing**.
If the team wants the simplest operator experience, use **Caddy + path routing**.
For the current deployment, **Azure Front Door**, **Application Gateway**, and **API Management** are valid alternatives, but they are not the best first move.
---
## References
- Microsoft Learn, Application Gateway components:
  - https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-components
- Microsoft Learn, Azure Front Door guidance (Azure SignalR Service how-to):
  - https://learn.microsoft.com/en-us/azure/azure-signalr/signalr-howto-work-with-azure-front-door
- Caddy documentation, automatic HTTPS and reverse proxy:
  - https://caddyserver.com/docs/
  - https://caddyserver.com/docs/caddyfile/directives/reverse_proxy
- Traefik documentation:
  - https://doc.traefik.io/traefik/
- Existing repo context:
  - `docs/devops/single_azure_vm/docker/README.md`
  - `docs/devops/SINGLE_VM_ENHANCED_PLAN.md`
  - `docs/audits/AI_SECURITY_AUDIT_REPORT.md`