ByteLyst Single-VM Deployment

Deploy the entire ByteLyst ecosystem (31 services, 11 products) on a single raw Azure VM. Nothing pre-installed required — the script handles everything from a blank Ubuntu machine. Two files: this README and setup.sh. Copy both to the VM and run the script.


Prerequisites

  • Azure VM: Ubuntu 24.04 LTS (or 22.04), Standard_D8s_v5 (8 vCPU, 32 GB RAM) recommended
  • Disk: 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts)
  • Network: NSG allowing inbound on ports: 22, 80, 1025, 1234, 3000-3003, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070, 3075, 3080, 3100, 3300, 4003, 4005, 4007, 4010-4019, 8025, 8080, 10000, 11434
  • GitHub access: Repos must be accessible (public or GITHUB_TOKEN for private)
  • Nothing else needed: the script installs Docker, Node.js, pnpm, Gitea, Ollama, and every other dependency

Quick Start

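# 1. From your laptop, copy setup.sh to the VM, then SSH in
scp setup.sh azureuser@<vm-ip>:~
ssh azureuser@<vm-ip>

# 2. Make the script executable
chmod +x setup.sh

# 3. Run. Repos are cloned via HTTPS from GITHUB_USER (see Environment Variables).
#    If repos are private, also set GITHUB_TOKEN. sudo resets the environment by
#    default, so pass variables inline (sudo GITHUB_TOKEN=<token> ./setup.sh) or use sudo -E.
sudo ./setup.sh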

# 4. Wait ~15-25 minutes for full build + deploy

# 5. Verify
/opt/bytelyst/check-health.sh

Resume & Retry

Phase completion is tracked. If anything fails, you don't have to start over:

sudo ./setup.sh --phase=7          # Retry just the deploy phase
sudo ./setup.sh --resume           # Auto-resume after SSH disconnect
sudo ./setup.sh --resume-from=7    # Jump to deploy after manual fix
sudo ./setup.sh --status           # Check what's done
sudo ./setup.sh --reset            # Start completely over
sudo ./setup.sh --help             # Show full usage
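
One way this kind of phase tracking can work is with one marker file per completed phase. The sketch below is an assumption about the mechanism, not the actual setup.sh logic: the state directory is taken from the build-log path mentioned under Troubleshooting, while the phase-N.done file names and the function names are hypothetical.

```shell
# Hypothetical marker location; the real script may store its state differently.
STATE_DIR="${STATE_DIR:-/opt/bytelyst/.setup-state}"

# Record that phase N finished successfully.
mark_done() {
  mkdir -p "$STATE_DIR" && : > "$STATE_DIR/phase-$1.done"
}

# Succeeds if phase N already has a marker file.
is_done() {
  [ -f "$STATE_DIR/phase-$1.done" ]
}

# Print the first phase (1-8) without a marker; print nothing when all are done.
next_phase() {
  n=1
  while [ "$n" -le 8 ]; do
    is_done "$n" || { echo "$n"; return 0; }
    n=$((n + 1))
  done
}
```

Under this model, --resume would start at the phase printed by next_phase, --status would list the existing markers, and --reset would delete them.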

What the Script Installs & Does

Software installed on the VM (from scratch)

| Software | Version | Purpose |
| --- | --- | --- |
| Docker CE | latest | Container runtime + Compose + BuildKit |
| Node.js | 22 LTS | Build toolchain for TypeScript packages |
| pnpm | 10.6.5 | Package manager (workspace-aware) |
| Gitea | 1.22 (Docker) | Local npm package registry on :3300 |
| Ollama | latest | Local LLM inference for LocalMemGPT on :11434 |
| git, jq, curl | latest | System utilities |

Execution phases

| Phase | Duration | Description |
| --- | --- | --- |
| 1. System | ~3 min | Pre-flight checks (disk ≥40 GB, RAM ≥16 GB), install Docker, Node.js 22, pnpm 10.6.5, Ollama, git, jq, build-essential |
| 2. Gitea + CI | ~2 min | Start Gitea Docker container, create admin + org + token, install act_runner |
| 3. Clone | ~3 min | Clone all 12 repos to /opt/bytelyst/, push to Gitea for CI |
| 4. Build | ~5 min | pnpm install && pnpm -r build for all @bytelyst/* packages |
| 5. Publish | ~3 min | Publish all packages to local Gitea npm registry |
| 6. Env | instant | Generate .env.ecosystem with Cosmos emulator key, Azurite key, JWT secret |
| 7. Deploy | ~10 min | Stop Ollama (free RAM), per-service Docker build + deploy (31 services, with fallback), prune build cache, restart Ollama |
| 8. Verify | ~1 min | Health-check all 31+ endpoints + create /opt/bytelyst/check-health.sh |
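
The Phase 1 pre-flight thresholds can be checked by hand before running the script. The following is a minimal sketch that assumes a Linux host with /proc/meminfo and measures free space on the root filesystem; how setup.sh actually measures these values is not documented here, and the function names are illustrative.

```shell
# Thresholds from the Phase 1 description.
MIN_DISK_GB=40
MIN_RAM_GB=16

# meets_minimum LABEL ACTUAL REQUIRED: print a verdict, fail when ACTUAL < REQUIRED.
meets_minimum() {
  if [ "$2" -ge "$3" ]; then
    echo "ok: $1 $2 GB (need >= $3 GB)"
  else
    echo "FAIL: $1 $2 GB (need >= $3 GB)"
    return 1
  fi
}

# Free space on / in whole GB (POSIX df output, 1 KB blocks, "Available" column).
disk_free_gb() {
  df -Pk / | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Total RAM rounded to the nearest GB (Linux only).
ram_total_gb() {
  awk '/^MemTotal:/ { print int($2 / 1024 / 1024 + 0.5) }' /proc/meminfo
}

# Run both checks; return non-zero if either one fails.
preflight() {
  rc=0
  meets_minimum disk "$(disk_free_gb)" "$MIN_DISK_GB" || rc=1
  meets_minimum RAM "$(ram_total_gb)" "$MIN_RAM_GB" || rc=1
  return "$rc"
}
```

Calling preflight on the VM prints one line per check. RAM is rounded to the nearest GB because the kernel reports slightly less than the nominal VM size.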

Port Map (after deployment)

Infrastructure (installed by setup.sh)

| Service | Port | URL |
| --- | --- | --- |
| Gitea (npm registry) | 3300 | http://<vm-ip>:3300 |
| Ollama (LLM API) | 11434 | http://<vm-ip>:11434 |
| Cosmos Data Explorer | 1234 | http://<vm-ip>:1234 |
| Azurite (Blob) | 10000 | http://<vm-ip>:10000 |
| Mailpit UI | 8025 | http://<vm-ip>:8025 |
| Loki (Logs) | 3100 | http://<vm-ip>:3100/ready |
| Grafana | 3000 | http://<vm-ip>:3000 |
| Traefik Dashboard | 8080 | http://<vm-ip>:8080 |

Platform Services

| Service | Port | URL |
| --- | --- | --- |
| platform-service | 4003 | http://<vm-ip>:4003/health |
| extraction-service | 4005 | http://<vm-ip>:4005/health |
| mcp-server | 4007 | http://<vm-ip>:4007/health |

Platform Dashboards

| Dashboard | Port | URL |
| --- | --- | --- |
| Admin Console | 3001 | http://<vm-ip>:3001 |
| Issue Tracker | 3003 | http://<vm-ip>:3003 |

Product Backends

| Product | Port | Health |
| --- | --- | --- |
| PeakPulse | 4010 | http://<vm-ip>:4010/health |
| ChronoMind | 4011 | http://<vm-ip>:4011/health |
| JarvisJr | 4012 | http://<vm-ip>:4012/health |
| NomGap | 4013 | http://<vm-ip>:4013/health |
| MindLyst | 4014 | http://<vm-ip>:4014/health |
| LysnrAI | 4015 | http://<vm-ip>:4015/health |
| NoteLett | 4016 | http://<vm-ip>:4016/health |
| FlowMonk | 4017 | http://<vm-ip>:4017/health |
| ActionTrail | 4018 | http://<vm-ip>:4018/health |
| LocalMemGPT | 4019 | http://<vm-ip>:4019/health |
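
The platform services and product backends all expose /health on their listed ports, so they can be probed in a loop. This is a rough stand-in, not the contents of the generated /opt/bytelyst/check-health.sh; the function names and the 5-second timeout are assumptions.

```shell
# Backend ports from the Platform Services and Product Backends tables.
BACKEND_PORTS="4003 4005 4007 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019"

# health_urls HOST: print one /health URL per backend port.
health_urls() {
  for port in $BACKEND_PORTS; do
    echo "http://$1:$port/health"
  done
}

# check_backends HOST: probe each URL, print UP or DOWN, fail if any is down.
check_backends() {
  failed=0
  for url in $(health_urls "$1"); do
    if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "UP   $url"
    else
      echo "DOWN $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Run check_backends localhost on the VM, or pass the VM address from a machine that can reach those ports.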

Product Web Apps

| Product | Port | URL |
| --- | --- | --- |
| LysnrAI Dashboard | 3002 | http://<vm-ip>:3002 |
| ChronoMind | 3030 | http://<vm-ip>:3030 |
| JarvisJr | 3035 | http://<vm-ip>:3035 |
| FlowMonk | 3040 | http://<vm-ip>:3040 |
| NoteLett | 3045 | http://<vm-ip>:3045 |
| MindLyst | 3050 | http://<vm-ip>:3050 |
| NomGap | 3055 | http://<vm-ip>:3055 |
| ActionTrail | 3060 | http://<vm-ip>:3060 |
| LocalMemGPT | 3070 | http://<vm-ip>:3070 |
| Efforise | 3080 | http://<vm-ip>:3080 |

Internal tooling web apps

| Tool | Port | URL |
| --- | --- | --- |
| LLM Lab Dashboard | 3075 | http://<vm-ip>:3075 |

VM-hosted web surfaces

These are the browser-facing UIs currently hosted on the VM and tracked by the admin ops inventory:

| Surface | Port | Audience | Notes |
| --- | --- | --- | --- |
| Admin Console | 3001 | internal | Primary ops and admin UI, including Mission Control, VM inventory, and Valkey tools |
| Issue Tracker | 3003 | internal | Internal tracker UI |
| Grafana | 3000 | internal | Observability dashboards |
| Gitea Registry | 3300 | internal | Source control and private package registry |
| Mailpit | 8025 | internal | Email sink UI |
| Traefik Dashboard | 8080 | internal | Legacy gateway dashboard |
| LysnrAI Dashboard | 3002 | internal | Product web app |
| ChronoMind | 3030 | internal | Product web app |
| JarvisJr | 3035 | internal | Product web app |
| FlowMonk | 3040 | internal | Product web app |
| NoteLett | 3045 | internal | Product web app |
| MindLyst | 3050 | internal | Product web app |
| NomGap | 3055 | internal | Product web app |
| ActionTrail | 3060 | internal | Product web app |
| LocalMemGPT | 3070 | public-candidate | Product web app if promoted beyond internal or prototype use |
| LLM Lab Dashboard | 3075 | internal | Internal LLM and Ollama tooling dashboard |
| Efforise | 3080 | internal | Product web app |

Post-Deployment Commands

# Check all service health
/opt/bytelyst/check-health.sh

# View logs for a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  logs -f platform-service

# Restart a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  restart flowmonk-backend

# Stop everything
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down

# Stop and wipe all data
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down -v
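
Because every command above repeats the same -f path, a small wrapper function can shorten day-to-day use. This is an optional convenience sketch; the name bl and the DRY_RUN switch are hypothetical and are not created by setup.sh.

```shell
# Compose file path as used in the commands above.
COMPOSE_FILE="/opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml"

# bl ARGS...: run docker compose against the ecosystem file.
# With DRY_RUN=1 the command is printed instead of executed.
bl() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker compose -f $COMPOSE_FILE $*"
  else
    docker compose -f "$COMPOSE_FILE" "$@"
  fi
}
```

With the function defined in your shell profile, bl logs -f platform-service and bl restart flowmonk-backend are equivalent to the longer forms above.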

Optional Phase 2 profiles

The compose file now includes opt-in profiles for the next internal-only infrastructure additions:

# Metrics stack: Prometheus + node-exporter + cadvisor
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  --profile phase2-observability up -d prometheus node-exporter cadvisor

# Shared cache/pubsub layer: Valkey
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  --profile phase2-shared up -d valkey

Notes:

  • these services are intended to stay internal-only on the VM
  • prometheus is provisioned for Grafana automatically through the Grafana datasource directory
  • neither prometheus nor valkey needs a raw public port exposure for normal operation
  • extraction-service now uses Valkey for shared per-product rate limiting when PRODUCT_RATE_LIMIT_STORE=valkey
  • extraction-service falls back to its in-memory limiter only if Valkey is unavailable at runtime
  • the admin Mission Control page now includes a VM inventory tab and a read-only Valkey inspector tab

Shared limiter env

extraction-service can be pointed at Valkey explicitly:

PRODUCT_RATE_LIMIT_STORE=valkey
VALKEY_URL=redis://valkey:6379

Live verification on the VM:

docker exec learning_ai_common_plat-extraction-service-1 \
  wget -qO- 'http://127.0.0.1:4005/api/extract/rate-limits/product?productId=<product-id>'

docker exec learning_ai_common_plat-valkey-1 \
  valkey-cli KEYS 'extraction:product-rate-limit:*'

Admin ops surface

The internal admin dashboard at http://<vm-ip>:3001/ops now exposes:

  • Mission Control health for the internal stack
  • a VM inventory view of Docker-managed services and host tooling
  • live status coverage for the VM-hosted web surfaces, including the product web apps
  • a read-only Valkey inspector for key pattern scans, TTLs, and small previews
  • restart buttons for allowlisted internal containers
  • guarded Valkey delete actions for exact keys and concrete prefix patterns

These views are protected by the admin dashboard auth layer.

Note:

  • restart actions require the Docker socket to be mounted into admin-web
  • the current implementation mounts /var/run/docker.sock into the admin container, so this route should remain internal-only and admin-only

Environment Variables

All optional — defaults work for most setups:

| Variable | Default | Description |
| --- | --- | --- |
| GITHUB_USER | saravanakumardb1 | GitHub org/user to clone repos from |
| GITHUB_TOKEN | (empty) | Set for private repos (HTTPS auth) |
| GITEA_ADMIN | bytelyst-admin | Gitea admin username |
| GITEA_PASS | ByteLyst2026! | Gitea admin password |
| OLLAMA_MODEL | llama3.2:3b | Default LLM model to pull |
| PRODUCT_RATE_LIMIT_STORE | valkey in compose | Shared product throttling backend for extraction-service |
| VALKEY_URL | redis://valkey:6379 | Internal Valkey connection string for shared counters |
| SKIP_CLONE | 0 | Set 1 to skip cloning (re-runs) |
| SKIP_BUILD | 0 | Set 1 to skip package build+publish (re-runs) |
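
To see which values a run would pick up, the documented defaults can be mirrored with shell parameter expansion. This sketch only reflects the table above; how setup.sh reads these variables internally is an assumption, and effective_settings is a hypothetical helper.

```shell
# Print the effective settings, falling back to the documented defaults.
effective_settings() {
  echo "GITHUB_USER=${GITHUB_USER:-saravanakumardb1}"
  echo "GITEA_ADMIN=${GITEA_ADMIN:-bytelyst-admin}"
  echo "OLLAMA_MODEL=${OLLAMA_MODEL:-llama3.2:3b}"
  echo "SKIP_CLONE=${SKIP_CLONE:-0}"
  echo "SKIP_BUILD=${SKIP_BUILD:-0}"
  # Never echo secrets; report only whether a token is present.
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN=(set)"
  else
    echo "GITHUB_TOKEN=(empty)"
  fi
}
```

Overrides are most reliably passed on the sudo command line itself, for example sudo SKIP_CLONE=1 ./setup.sh --resume-from=4, since variables exported in the calling shell are usually dropped by sudo.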

CLI Flags

| Flag | Description |
| --- | --- |
| --resume | Auto-resume from last completed phase |
| --resume-from=N | Resume from phase N (1-8) |
| --phase=N | Run ONLY phase N (useful for retrying) |
| --dry-run | Validate prerequisites without building or deploying |
| --reset | Clear phase markers and start fresh |
| --status | Show completed phases and exit |
| -h, --help | Show usage help |
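
The flag grammar above is small enough to parse with a single case statement. The following is a sketch of one possible parser, not the code in setup.sh; the mode names it prints (full, resume, single, and so on) are invented for illustration.

```shell
# parse_flags ARGS...: print "MODE" or "MODE PHASE" for the flags listed above.
parse_flags() {
  mode="full"
  phase=""
  for arg in "$@"; do
    case "$arg" in
      --resume)        mode="resume" ;;
      --resume-from=*) mode="resume-from"; phase="${arg#*=}" ;;
      --phase=*)       mode="single"; phase="${arg#*=}" ;;
      --dry-run)       mode="dry-run" ;;
      --reset)         mode="reset" ;;
      --status)        mode="status" ;;
      -h|--help)       mode="help" ;;
      *) echo "unknown flag: $arg" >&2; return 2 ;;
    esac
  done
  # Flags that take N accept only a single digit from 1 to 8.
  case "$mode:$phase" in
    resume-from:[1-8]|single:[1-8]) ;;
    resume-from:*|single:*) echo "phase must be 1-8" >&2; return 2 ;;
  esac
  echo "$mode${phase:+ $phase}"
}
```

For example, parse_flags --phase=7 prints "single 7", while an out-of-range value such as --phase=9 is rejected with a non-zero exit status.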

Troubleshooting

  • Cosmos emulator slow: It needs 20-30s on first boot. Services wait via health checks.
  • Out of memory: Use at least 32 GB RAM. Cosmos emulator needs ~4 GB, Ollama needs ~4 GB for 3B models.
  • Build failures: Check Gitea is running (docker ps | grep gitea) and packages published (curl http://localhost:3300/api/packages/bytelyst/npm/). Per-service build logs: /opt/bytelyst/.setup-state/builds/<service>.log. Retry: sudo ./setup.sh --phase=7.
  • Ollama not responding: Check systemctl status ollama or curl http://localhost:11434/api/version.
  • Port conflicts: Ensure nothing else runs on the listed ports before deploying.
  • CORS errors in browser: The generated .env.ecosystem sets CORS_ORIGIN=* for dev/test. If you restrict it, update the value to match your access URL.
  • Services in development mode: .env.ecosystem now sets NODE_ENV=production for all services. If you need debug logging, remove or change this value.
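
For the port-conflict item, listeners can be checked before deploying by filtering the output of ss -ltn. This is a minimal sketch that assumes the usual ss column layout, where the fourth column is Local Address:Port; ports_in_use is a hypothetical helper.

```shell
# ports_in_use PORT...: read `ss -ltn` output on stdin and print every
# requested port that already has a listener (each port at most once).
ports_in_use() {
  wanted=" $* "
  awk -v wanted="$wanted" '
    NR > 1 {
      # The port is whatever follows the last colon of column 4.
      n = split($4, parts, ":")
      port = parts[n]
      if (index(wanted, " " port " ") && !seen[port]++) print port
    }'
}
```

Usage on the VM: ss -ltn | ports_in_use 80 3000 3300 4003 8080. Empty output means none of the listed ports is taken.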

HTTPS Gateway

  • Public backend access is intended to flow through Caddy on https://api.bytelyst.com, not direct backend port exposure.
  • The gateway config lives at /opt/bytelyst/Caddyfile and is mounted into the caddy container.
  • Backend routes are path-based and strip their prefixes before proxying:
    • /platform/* → platform-service:4003
    • /extraction/* → extraction-service:4005
    • /mcp/* → mcp-server:4007
    • /peakpulse/* → peakpulse-backend:4010
    • /chronomind/* → chronomind-backend:4011
    • /jarvisjr/* → jarvisjr-backend:4012
    • /nomgap/* → nomgap-backend:4013
    • /mindlyst/* → mindlyst-backend:4014
    • /lysnrai/* → lysnrai-backend:4015
    • /notelett/* → notelett-backend:4016
    • /flowmonk/* → flowmonk-backend:4017
    • /actiontrail/* → actiontrail-backend:4018
    • /localmemgpt/* → localmemgpt-backend:4019
  • Keep backend ports closed publicly once DNS and NSG rules are aligned. Docker-internal service discovery remains unchanged.
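
Prefix-stripping routes like these map naturally onto Caddy's handle_path directive, which removes the matched prefix before proxying. The generator below is only a sketch of how such a config could be produced from a route table. The deployed /opt/bytelyst/Caddyfile may be structured differently, only four of the routes are listed for brevity, and the function names are hypothetical.

```shell
# Subset of the route list above: "<prefix> <upstream>" per line.
ROUTES="platform platform-service:4003
extraction extraction-service:4005
mcp mcp-server:4007
flowmonk flowmonk-backend:4017"

# Emit one handle_path block per route. handle_path strips the matched
# prefix before the request reaches reverse_proxy.
caddy_routes() {
  echo "$ROUTES" | while read -r prefix upstream; do
    printf 'handle_path /%s/* {\n    reverse_proxy %s\n}\n' "$prefix" "$upstream"
  done
}

# Wrap the routes in a site block for the public hostname.
caddyfile() {
  echo "api.bytelyst.com {"
  caddy_routes | sed 's/^/    /'
  echo "}"
}
```

Running caddyfile prints a complete site block to stdout. With routes of this shape, a request to https://api.bytelyst.com/flowmonk/health would reach flowmonk-backend:4017 as /health.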

Known Limitations

  • Remote browser access: Product web apps use http://localhost:<port> for browser-side API calls (baked at Next.js build time via NEXT_PUBLIC_* args). This works when browsing from the VM itself but not from a remote browser (e.g., laptop accessing http://<vm-ip>:3060). For remote access, use SSH port-forwarding:
    # Forward all product ports to your laptop (run from your laptop)
    ssh -N -L 3001:localhost:3001 -L 3002:localhost:3002 -L 3030:localhost:3030 \
      -L 3035:localhost:3035 -L 3040:localhost:3040 -L 3045:localhost:3045 \
      -L 3050:localhost:3050 -L 3055:localhost:3055 -L 3060:localhost:3060 \
      -L 3070:localhost:3070 -L 3075:localhost:3075 \
      -L 4003:localhost:4003 -L 4010:localhost:4010 -L 4011:localhost:4011 \
      -L 4012:localhost:4012 -L 4013:localhost:4013 -L 4014:localhost:4014 \
      -L 4015:localhost:4015 -L 4016:localhost:4016 -L 4017:localhost:4017 \
      -L 4018:localhost:4018 -L 4019:localhost:4019 \
      azureuser@<vm-ip>
    
    Then open http://localhost:3060 etc. on your laptop. Server-side code (API routes, SSR) uses Docker service names and works regardless.
  • Cosmos emulator is x86-only: Do not use ARM-based VMs (e.g., Dpsv6). Stick with Standard_D8s_v5 or similar Intel/AMD instances.
  • Memory pressure: Phase 7 automatically stops Ollama (~3 GB) during Docker builds and restarts it after. If builds still OOM on 32 GB, retry with sudo ./setup.sh --phase=7 (per-service fallback skips what already built).
  • Corporate proxy in Dockerfiles: Already removed at source across all repos. No runtime stripping needed.
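
When only one or two products are needed remotely, the long port-forwarding command can be generated from a short port list instead of being typed by hand. This is a convenience sketch; forward_flags and tunnel_command are hypothetical helpers, and the azureuser login is taken from the Quick Start.

```shell
# forward_flags PORT...: print one "-L port:localhost:port" flag per port.
forward_flags() {
  flags=""
  for port in "$@"; do
    flags="$flags -L $port:localhost:$port"
  done
  echo "${flags# }"
}

# tunnel_command HOST PORT...: print (not run) the matching ssh command.
tunnel_command() {
  host="$1"
  shift
  echo "ssh -N $(forward_flags "$@") azureuser@$host"
}
```

For example, tunnel_command with the VM address followed by 3060 4018 prints a command that forwards only the ActionTrail web app and its backend; add any other backend ports the app calls.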