docs(docker): update README, prompt.md, .env.ecosystem.example with audit fixes
- README: NSG port list inline, phase 7 count 31, CORS/NODE_ENV troubleshooting, SSH port-forwarding example
- prompt.md: mark tasks 5+6 done, add 8 new bug fixes to table, update definition of done with llmlab-dashboard
- .env.ecosystem.example: add NODE_ENV=production and CORS_ORIGIN=*
parent f9a20e4612
commit 91a651805c
@@ -70,6 +70,12 @@ FIELD_ENCRYPT_KEY_PROVIDER=memory
 # ── Product Identity ─────────────────────────────────────────────
 DEFAULT_PRODUCT_ID=lysnrai

+# ── Runtime environment ─────────────────────────────────────────
+NODE_ENV=production
+
+# ── CORS (allow all origins for dev/test — restrict in production) ──
+CORS_ORIGIN=*
+
 # ── Webhooks (optional) ─────────────────────────────────────────
 WEBHOOK_INVITATION_REDEEMED_URL=
 WEBHOOK_REFERRAL_STATUS_URL=
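Review note on the hunk above: the `CORS_ORIGIN=*` comment says to restrict the value in production, yet the same file now sets `NODE_ENV=production`. A minimal sketch of a pre-deploy check for that combination follows. `env_value` and `warn_open_cors` are hypothetical helpers for illustration, not part of `setup.sh`, and the env file path is whatever the caller passes in.

```bash
# Print the value of KEY from an env file (last assignment wins).
# Hypothetical helper for illustration; not part of setup.sh.
env_value() (
  file=$1
  key=$2
  grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
)

# Print a warning when wildcard CORS is combined with NODE_ENV=production.
# Prints nothing when the combination is absent.
warn_open_cors() (
  file=$1
  if [ "$(env_value "$file" NODE_ENV)" = "production" ] &&
     [ "$(env_value "$file" CORS_ORIGIN)" = "*" ]; then
    printf 'warning: CORS_ORIGIN=* with NODE_ENV=production in %s\n' "$file"
  fi
)
```

Calling `warn_open_cors` with the path of the generated `.env.ecosystem` would print a single warning line for the values added in this commit.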
@@ -10,7 +10,7 @@

 - **Azure VM:** Ubuntu 24.04 LTS (or 22.04), **Standard_D8s_v5** (8 vCPU, 32 GB RAM) recommended
 - **Disk:** 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts)
-- **Network:** NSG allowing inbound on ports listed in the Port Map below
+- **Network:** NSG allowing inbound on ports: `22, 80, 1025, 1234, 3000-3003, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070, 3075, 3100, 3300, 4003, 4005, 4007, 4010-4019, 8025, 8080, 10000, 11434`
 - **GitHub access:** Repos must be accessible (public or `GITHUB_TOKEN` for private)
 - **Nothing else needed** — the script installs Docker, Node.js, pnpm, Gitea, Ollama, and everything

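Review note on the hunk above: the inline NSG list mixes single ports with ranges (`3000-3003`, `4010-4019`), which makes it awkward to feed into a reachability check. The sketch below expands the list into one port per line. `expand_ports` is a hypothetical helper, not something `setup.sh` provides; the port string is copied from the added README line.

```bash
# Port list copied from the README line added in this commit.
NSG_PORTS="22, 80, 1025, 1234, 3000-3003, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070, 3075, 3100, 3300, 4003, 4005, 4007, 4010-4019, 8025, 8080, 10000, 11434"

# Expand a comma-separated port list with ranges into one port per line.
# Hypothetical helper for illustration; not part of setup.sh.
expand_ports() (
  printf '%s\n' "$1" | tr ',' '\n' | tr -d ' ' | while IFS= read -r entry; do
    [ -n "$entry" ] || continue
    case "$entry" in
      *-*)
        # Range such as 3000-3003: count from the first port to the last.
        port=${entry%-*}
        last=${entry#*-}
        while [ "$port" -le "$last" ]; do
          printf '%s\n' "$port"
          port=$((port + 1))
        done
        ;;
      *)
        printf '%s\n' "$entry"
        ;;
    esac
  done
)
```

`expand_ports "$NSG_PORTS"` yields 36 individual ports, which could then be piped into whatever connectivity probe is available on the client machine.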
@@ -69,7 +69,7 @@ sudo ./setup.sh --help # Show full usage
 | 4. Build | ~5 min | `pnpm install && pnpm -r build` all `@bytelyst/*` packages |
 | 5. Publish | ~3 min | Publish all packages to local Gitea npm registry |
 | 6. Env | instant | Generate `.env.ecosystem` with Cosmos emulator key, Azurite key, JWT secret |
-| 7. Deploy | ~10 min | Stop Ollama (free RAM), per-service Docker build + deploy (30 services, with fallback), prune build cache, restart Ollama |
+| 7. Deploy | ~10 min | Stop Ollama (free RAM), per-service Docker build + deploy (31 services, with fallback), prune build cache, restart Ollama |
 | 8. Verify | ~1 min | Health-check all 31+ endpoints + create `/opt/bytelyst/check-health.sh` |

 ## Port Map (after deployment)
@@ -185,10 +185,25 @@ All optional — defaults work for most setups:
 - **Build failures:** Check Gitea is running (`docker ps | grep gitea`) and packages published (`curl http://localhost:3300/api/packages/bytelyst/npm/`). Per-service build logs: `/opt/bytelyst/.setup-state/builds/<service>.log`. Retry: `sudo ./setup.sh --phase=7`.
 - **Ollama not responding:** Check `systemctl status ollama` or `curl http://localhost:11434/api/version`.
 - **Port conflicts:** Ensure nothing else runs on the listed ports before deploying.
+- **CORS errors in browser:** The generated `.env.ecosystem` sets `CORS_ORIGIN=*` for dev/test. If you restrict it, update the value to match your access URL.
+- **Services in development mode:** `.env.ecosystem` now sets `NODE_ENV=production` for all services. If you need debug logging, remove or change this value.

 ## Known Limitations

-- **Remote browser access:** Product web apps fall back to `http://localhost:<port>` for API calls. This works when browsing from the VM itself but **not from a remote browser** (e.g., laptop accessing `http://<vm-ip>:3060`). For remote access, set up a reverse proxy (Traefik rules) or SSH port-forwarding. Health checks and server-side rendering still work regardless.
+- **Remote browser access:** Product web apps use `http://localhost:<port>` for browser-side API calls (baked at Next.js build time via `NEXT_PUBLIC_*` args). This works when browsing from the VM itself but **not from a remote browser** (e.g., laptop accessing `http://<vm-ip>:3060`). For remote access, use SSH port-forwarding:
+```bash
+# Forward all product ports to your laptop (run from your laptop)
+ssh -N -L 3001:localhost:3001 -L 3002:localhost:3002 -L 3030:localhost:3030 \
+    -L 3035:localhost:3035 -L 3040:localhost:3040 -L 3045:localhost:3045 \
+    -L 3050:localhost:3050 -L 3055:localhost:3055 -L 3060:localhost:3060 \
+    -L 3070:localhost:3070 -L 3075:localhost:3075 \
+    -L 4003:localhost:4003 -L 4010:localhost:4010 -L 4011:localhost:4011 \
+    -L 4012:localhost:4012 -L 4013:localhost:4013 -L 4014:localhost:4014 \
+    -L 4015:localhost:4015 -L 4016:localhost:4016 -L 4017:localhost:4017 \
+    -L 4018:localhost:4018 -L 4019:localhost:4019 \
+    azureuser@<vm-ip>
+```
+Then open `http://localhost:3060` etc. on your laptop. Server-side code (API routes, SSR) uses Docker service names and works regardless.
 - **Cosmos emulator is x86-only:** Do not use ARM-based VMs (e.g., Dpsv6). Stick with `Standard_D8s_v5` or similar Intel/AMD instances.
 - **Memory pressure:** Phase 7 automatically stops Ollama (~3 GB) during Docker builds and restarts it after. If builds still OOM on 32 GB, retry with `sudo ./setup.sh --phase=7` (per-service fallback skips what already built).
 - **Corporate proxy in Dockerfiles:** Already removed at source across all repos. No runtime stripping needed.
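Review note on the hunk above: the hand-written SSH command repeats every port twice, so it is easy for it to drift from the port list as services are added. One possible alternative is to generate the `-L` flags from a single list, sketched below. `forward_flags` and `print_forward_command` are hypothetical names; the ports and the `azureuser` login are taken from the README example, and the VM address is supplied by the caller.

```bash
# Ports copied from the README's SSH port-forwarding example.
FORWARD_PORTS="3001 3002 3030 3035 3040 3045 3050 3055 3060 3070 3075 4003 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019"

# Print "-L port:localhost:port" for each port in a space-separated list.
# Hypothetical helper for illustration; not part of the repository.
forward_flags() (
  flags=""
  for port in $1; do
    flags="$flags -L $port:localhost:$port"
  done
  # Drop the leading space left by the first concatenation.
  printf '%s\n' "${flags# }"
)

# Print (do not run) the full ssh command for the given VM address.
print_forward_command() {
  printf 'ssh -N %s azureuser@%s\n' "$(forward_flags "$FORWARD_PORTS")" "$1"
}
```

Printing the command rather than executing it keeps the sketch side-effect free; the output can be inspected and then pasted into a terminal on the laptop.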
@@ -14,7 +14,7 @@ This folder contains three files you must work with:
 - **`README.md`** — Deployment guide documenting what the script does, ports, troubleshooting
 - **`prompt.md`** — This file (agent instructions)

-The script installs everything from scratch (Docker, Node.js, pnpm, Gitea, act_runner, Ollama) then clones 12 repos, builds + publishes ~57 `@bytelyst/*` npm packages to a local Gitea registry, generates environment config, and deploys 31 Docker Compose services (6 infra + 3 platform + 2 dashboards + 10 backends + 9 webs + 1 standalone).
+The script installs everything from scratch (Docker, Node.js, pnpm, Gitea, act_runner, Ollama) then clones 12 repos, builds + publishes ~57 `@bytelyst/*` npm packages to a local Gitea registry, generates environment config, and deploys 31 Docker Compose services (6 infra + 3 platform + 2 dashboards + 10 backends + 9 webs + 1 standalone LLM Lab dashboard).

 ### Current State (ALREADY IMPLEMENTED — do NOT redo)

@@ -23,7 +23,7 @@ The following features are already built and tested in `setup.sh`:
 - **Resume/retry support:** `--resume`, `--resume-from=N`, `--phase=N`, `--reset`, `--status`, `--help` CLI flags
 - **Phase completion markers:** Stored in `/opt/bytelyst/.setup-state/phaseN.done`
 - **GITEA_NPM_TOKEN auto-restore:** Token saved to `/opt/bytelyst/.gitea_token`, restored on resume
-- **Per-service Docker build:** Phase 7 builds each of 30 services individually with `[N/30]` progress
+- **Per-service Docker build:** Phase 7 builds each of 31 services individually with `[N/31]` progress
 - **Per-service fallback:** Failed builds are skipped, remaining services still start
 - **Build logs:** Saved per-service to `/opt/bytelyst/.setup-state/builds/<service>.log`
 - **Phase 7 partial failure handling:** Phase 7 NOT marked done if builds fail, so `--resume` retries it
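Review note on the hunk above: the list describes resume support in terms of `phaseN.done` marker files. A minimal sketch of how a resume point could be derived from those markers is shown below. This is an assumption about the mechanism, not the actual `setup.sh` code: `next_phase` is a hypothetical function, and the phase count of 8 comes from the README's phase table.

```bash
# Marker directory named in prompt.md; overridable for testing.
STATE_DIR="${STATE_DIR:-/opt/bytelyst/.setup-state}"

# Print the first phase (1-8) that has no phaseN.done marker.
# Prints nothing when every phase is marked done.
# Hypothetical sketch; the real setup.sh logic may differ.
next_phase() (
  phase=1
  while [ "$phase" -le 8 ]; do
    if [ ! -f "$STATE_DIR/phase$phase.done" ]; then
      printf '%s\n' "$phase"
      exit 0
    fi
    phase=$((phase + 1))
  done
)
```

Under this scheme, leaving Phase 7 unmarked after a failed build is what lets a later resume start there again.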
@@ -76,6 +76,13 @@ The following issues have already been identified and fixed in the current `setu
 | `detect_docker_host_ip()` uses `ip` command not in minimal installs | Added `iproute2` to apt deps | `ddd2db84` |
 | SSH disconnect loses all output | `exec > >(tee -a setup.log) 2>&1` | `ddd2db84` |
 | `localmemgpt-backend` can't reach Ollama on Linux | `extra_hosts: ['host.docker.internal:host-gateway']` in compose | `3b31709b` |
+| `llmlab-dashboard` missing from setup.sh service arrays | Added to WEB_SERVICES + check-health.sh | `d8908093` |
+| Service count inconsistent (30 vs 31 across files) | Fixed all comments/docs to 31 | `d8908093` |
+| Phase 3 `cd` side effect leaves CWD in last repo dir | Added `cd "$INSTALL_DIR"` after loop | `d8908093` |
+| No `CORS_ORIGIN` in .env.ecosystem (remote browser CORS errors) | Added `CORS_ORIGIN=*` to phase6_env | `d8908093` |
+| `NODE_ENV` not set for backends (run in dev mode) | Added `NODE_ENV=production` to phase6_env | `d8908093` |
+| 9 product web services missing healthchecks in compose | Added `healthcheck:` to all 9 web services | `f9a20e46` |
+| Dead `NEXT_PUBLIC_*` runtime env vars in compose (no effect on client code) | Replaced with non-prefixed server-side vars | `f9a20e46` |
 | Dashboard Dockerfiles had hardcoded corporate proxy | Converted to `ARG`-based proxy with empty defaults | `2b9fd717` |
 | `pnpm install --frozen-lockfile` fails on shallow clones | Removed `--frozen-lockfile` | `3b31709b` |
 | 3 service Dockerfiles had stale package.json COPY lists | Updated to all 57 packages + workspace members | `85aca553` |
@@ -106,8 +113,8 @@ The following issues have already been identified and fixed in the current `setu

 ## Your Tasks (in priority order)

-> **Tasks 1-3 are ALREADY DONE.** See "Current State" above and "Bugs Already Fixed" above.
-> Focus on Tasks 4-7 which are the remaining work.
+> **Tasks 1-6 are DONE.** See "Current State" above and "Bugs Already Fixed" above.
+> Only Task 4 (dry-run, low priority) and Task 7 (test plan) remain.

 ### ~~1. Audit `setup.sh` for correctness~~ ✅ DONE

@@ -120,7 +127,7 @@ The script has been audited and all identified bugs fixed (see table above). Pha
 - Phase 5 publish: tolerates 409 conflicts
 - Phase 6 env: heredoc with Cosmos/Azurite emulator keys, semicolons handled
 - Phase 7: per-service build with fallback, BuildKit secrets via `GITEA_NPM_TOKEN` env export
-- Phase 8: health check covers all 30 services + Gitea + Ollama
+- Phase 8: health check covers all 31 services + Gitea + Ollama

 ### ~~2. Fix every bug you find~~ ✅ DONE

@@ -136,7 +143,7 @@ Already implemented:
 - **Per-service fallback:** Failed Docker builds are skipped, remaining services start
 - **Build logs:** Per-service to `/opt/bytelyst/.setup-state/builds/<service>.log`

-### 4. Add a dry-run / validation mode (TODO)
+### 4. Add a dry-run / validation mode (TODO — low priority)

 Add `--dry-run` support that:

@@ -147,26 +154,28 @@ Add `--dry-run` support that:
 - Does NOT build, publish, or deploy
 - Prints a summary of what WOULD happen

-### 5. Validate the `docker-compose.ecosystem.yml` integration
+### ~~5. Validate the `docker-compose.ecosystem.yml` integration~~ ✅ DONE

-Read `docker-compose.ecosystem.yml` (in the repo root) and verify:
+Validated and fixed:

-- Every service's `build.context` and `build.dockerfile` paths are correct relative to the compose file location
-- Every service's port mapping matches the backend's `PORT` env var
-- The `x-product-build` anchor correctly provides `GITEA_NPM_HOST` and `gitea_npm_token` secret
-- All `depends_on` conditions reference services that actually exist
-- The `localmemgpt-backend` service has `extra_hosts: ['host.docker.internal:host-gateway']` for Ollama access
-- **30 total services:** 6 infra (pre-built images) + 24 built from Dockerfiles
+- All 31 services verified: build contexts, Dockerfile paths, port mappings
+- `x-product-build` anchor correctly provides `GITEA_NPM_HOST` and `gitea_npm_token` secret
+- All `depends_on` conditions reference services that exist
+- `localmemgpt-backend` has `extra_hosts: ['host.docker.internal:host-gateway']`
+- Added healthchecks to all 9 product web services (were missing)
+- Removed dead `NEXT_PUBLIC_*` runtime env vars (Next.js bakes at build time only)
+- Replaced with non-prefixed server-side vars (`PLATFORM_SERVICE_URL`, `BACKEND_URL`, etc.)
+- **31 total services:** 6 infra (pre-built images) + 25 built from Dockerfiles

-### 6. Update `README.md`
+### ~~6. Update `README.md`~~ ✅ DONE

-After all fixes, update `README.md` to reflect:
+Updated:

-- CLI flags: `--resume`, `--resume-from=N`, `--phase=N`, `--reset`, `--status`, `--help`
-- Correct service count: 30 (not 27)
-- Updated duration estimates if phases changed
-- Any new troubleshooting entries
-- NSG port list: `22, 80, 1025, 1234, 3000-3003, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070, 3100, 3300, 4003, 4005, 4007, 4010-4019, 8025, 8080, 8081, 10000, 11434`
+- Service count: 31 (was 30 in some places)
+- NSG port list added inline in prerequisites (includes 3075 for llmlab-dashboard)
+- Phase 7 description: 31 services
+- Troubleshooting: added CORS and NODE_ENV entries
+- Known Limitations: expanded remote browser access with SSH port-forwarding command

 ### 7. Create a test plan

@@ -209,12 +218,13 @@ Add a section to `README.md` (or a separate `test-plan.md`) that describes how t

 - [ ] `setup.sh` runs flawlessly from `sudo ./setup.sh` on a raw Ubuntu 24.04 VM
 - [ ] All 8 phases complete without manual intervention
-- [ ] `/opt/bytelyst/check-health.sh` shows ALL 30+ services green
+- [ ] `/opt/bytelyst/check-health.sh` shows ALL 31 services green (including llmlab-dashboard :3075)
 - [ ] All 10 product backends respond to `/health` with `{"status":"ok",...}`
-- [ ] All 9 product web apps serve their landing page
+- [ ] All 10 product web apps serve their landing page (9 product + 1 LLM Lab)
 - [ ] Admin dashboard (`http://<vm-ip>:3001`) loads
 - [ ] Tracker dashboard (`http://<vm-ip>:3003`) loads
 - [ ] LocalMemGPT can reach Ollama (`curl http://localhost:4019/api/models` returns models)
+- [ ] LLM Lab dashboard (`http://<vm-ip>:3075`) loads and connects to Ollama
 - [ ] Gitea UI accessible at `http://<vm-ip>:3300` with all `@bytelyst/*` packages visible
 - [ ] Grafana accessible at `http://<vm-ip>:3000` (admin / bytelyst)
 - [ ] Mailpit accessible at `http://<vm-ip>:8025`
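Review note on the checklist above: the backend item expects `/health` to return a body containing `{"status":"ok",...}`. A rough sketch of how that single item could be automated is given below. Both functions are hypothetical and are not the contents of `check-health.sh`; the string match assumes the compact JSON shown in the checklist, and which ports host the ten backends is left to the caller.

```bash
# Succeed if a /health response body reports status "ok".
# Plain string match on the compact JSON form shown in the checklist.
body_is_ok() {
  case "$1" in
    *'"status":"ok"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe http://HOST:<port>/health and check the body; requires curl.
# HOST defaults to localhost. Hypothetical helper for illustration.
check_backend() {
  body="$(curl -fsS --max-time 5 "http://${HOST:-localhost}:$1/health" 2>/dev/null)" || return 1
  body_is_ok "$body"
}
```

A caller could loop over its backend ports and report each `check_backend` result, for example `check_backend 4019 && echo "4019 ok"` for the LocalMemGPT port mentioned in the checklist.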