
ByteLyst Single-VM Deployment

Deploy the entire ByteLyst ecosystem (30 services, 10 products) on a single raw Azure VM. Nothing needs to be pre-installed: the script handles everything, starting from a blank Ubuntu machine. There are two files, this README and setup.sh. Copy both to the VM and run the script.


Prerequisites

  • Azure VM: Ubuntu 24.04 LTS (or 22.04), Standard_D8s_v5 (8 vCPU, 32 GB RAM) recommended
  • Disk: 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts)
  • Network: NSG allowing inbound on ports listed in the Port Map below
  • GitHub access: Repos must be reachable over HTTPS (public, or set GITHUB_TOKEN for private repos)
  • Nothing else needed: the script installs Docker, Node.js, pnpm, Gitea, Ollama, and everything else
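
The hardware prerequisites above can be checked on the VM before starting. The following is a minimal sketch and is not part of setup.sh: the function names are hypothetical, and the thresholds (30 and 120 instead of the nominal 32 GB and 128 GB) are assumptions that leave room for the memory and filesystem overhead a VM normally reports.

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of setup.sh.

MIN_RAM_GB=30    # assumption: a 32 GB VM reports slightly less than 32 GiB
MIN_DISK_GB=120  # assumption: a 128 GB disk yields a slightly smaller filesystem

# meets_minimum ACTUAL REQUIRED: succeed when ACTUAL >= REQUIRED (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# total_ram_gb: total RAM in whole GiB, read from /proc/meminfo (Linux only).
total_ram_gb() {
  awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}

# root_disk_gb: size of the filesystem that holds / in whole GiB.
root_disk_gb() {
  df -Pk / | awk 'NR == 2 { print int($2 / 1024 / 1024) }'
}

# preflight: print one line per check; return non-zero if any check fails.
preflight() {
  rc=0
  if meets_minimum "$(total_ram_gb)" "$MIN_RAM_GB"; then
    echo "RAM ok"
  else
    echo "RAM below recommended size"; rc=1
  fi
  if meets_minimum "$(root_disk_gb)" "$MIN_DISK_GB"; then
    echo "Disk ok"
  else
    echo "Disk below recommended size"; rc=1
  fi
  return "$rc"
}
```

Calling preflight on the VM prints one line per check; a non-zero return suggests resizing the VM or disk before running setup.sh.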

Quick Start

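# 1. From your workstation, copy setup.sh (and this README) to the VM
scp setup.sh README.md azureuser@<vm-ip>:~

# 2. SSH into your Azure VM and make the script executable
ssh azureuser@<vm-ip>
chmod +x setup.sh

# 3. Run. Repos are cloned via HTTPS from the GITHUB_USER account (see Environment Variables).
#    If repos are private, GITHUB_TOKEN must reach the script. sudo typically resets the
#    environment, so pass it explicitly: sudo env GITHUB_TOKEN=<token> ./setup.sh
sudo ./setup.sh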

# 4. Wait ~15-25 minutes for full build + deploy

# 5. Verify
/opt/bytelyst/check-health.sh

Resume & Retry

Phase completion is tracked. If anything fails, you don't have to start over:

sudo ./setup.sh --phase=7          # Retry just the deploy phase
sudo ./setup.sh --resume           # Auto-resume after SSH disconnect
sudo ./setup.sh --resume-from=7    # Jump to deploy after manual fix
sudo ./setup.sh --status           # Check what's done
sudo ./setup.sh --reset            # Start completely over
sudo ./setup.sh --help             # Show full usage
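
Phase tracking of this kind is often implemented with one marker file per completed phase. The sketch below illustrates that idea only; it is not taken from setup.sh. The state directory matches the /opt/bytelyst/.setup-state path mentioned under Troubleshooting, but the marker file names and function names are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of marker-file phase tracking; setup.sh may differ.

STATE_DIR="${STATE_DIR:-/opt/bytelyst/.setup-state}"
TOTAL_PHASES=8

# mark_done N: record that phase N finished successfully.
mark_done() {
  mkdir -p "$STATE_DIR" && : > "$STATE_DIR/phase-$1.done"
}

# is_done N: succeed if phase N has a marker file.
is_done() {
  [ -f "$STATE_DIR/phase-$1.done" ]
}

# next_phase: print the first phase without a marker (nothing if all are done).
next_phase() {
  n=1
  while [ "$n" -le "$TOTAL_PHASES" ]; do
    is_done "$n" || { echo "$n"; return 0; }
    n=$((n + 1))
  done
}

# reset_phases: remove all markers so the next run starts at phase 1.
reset_phases() {
  rm -f "$STATE_DIR"/phase-*.done
}
```

Under these assumptions, a resume corresponds to starting at next_phase, a status report to listing the markers, and a reset to reset_phases.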

What the Script Installs & Does

Software installed on the VM (from scratch)

| Software | Version | Purpose |
|---|---|---|
| Docker CE | latest | Container runtime + Compose + BuildKit |
| Node.js | 22 LTS | Build toolchain for TypeScript packages |
| pnpm | 10.6.5 | Package manager (workspace-aware) |
| Gitea | 1.22 (Docker) | Local npm package registry on :3300 |
| Ollama | latest | Local LLM inference for LocalMemGPT on :11434 |
| git, jq, curl | latest | System utilities |

Execution phases

| Phase | Duration | Description |
|---|---|---|
| 1. System | ~3 min | Install Docker, Node.js 22, pnpm 10.6.5, Ollama, git, jq |
| 2. Gitea | ~1 min | Start Gitea Docker container, create admin + org + API token |
| 3. Clone | ~3 min | Clone all 11 repos to /opt/bytelyst/, strip corporate proxy from Dockerfiles |
| 4. Build | ~5 min | Run pnpm install && pnpm -r build for all @bytelyst/* packages |
| 5. Publish | ~3 min | Publish all packages to local Gitea npm registry |
| 6. Env | instant | Generate .env.ecosystem with Cosmos emulator key, Azurite key, JWT secret |
| 7. Deploy | ~10 min | Per-service Docker build + deploy (30 services, with fallback) |
| 8. Verify | ~1 min | Health-check all 30+ endpoints + create /opt/bytelyst/check-health.sh |
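
To make the per-service build in phase 7 concrete, here is a rough sketch of building services one at a time so that a single failure does not abort the rest. It is an illustration under assumptions, not the script's actual logic: build_services is a hypothetical name, the log directory follows the path given under Troubleshooting, and the fallback mentioned in the table is not modelled because this README does not describe it.

```shell
#!/bin/sh
# Hypothetical sketch of a per-service build loop; setup.sh may differ.

COMPOSE_FILE=/opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml
LOG_DIR="${LOG_DIR:-/opt/bytelyst/.setup-state/builds}"

# build_services SERVICE...: build each service separately, write one log
# per service, and return non-zero if any build failed.
build_services() {
  mkdir -p "$LOG_DIR"
  failed=""
  for svc in "$@"; do
    if docker compose -f "$COMPOSE_FILE" build "$svc" > "$LOG_DIR/$svc.log" 2>&1; then
      echo "built: $svc"
    else
      echo "FAILED: $svc (see $LOG_DIR/$svc.log)"
      failed="$failed $svc"
    fi
  done
  [ -z "$failed" ]
}

# Example (service names taken from this README):
#   build_services platform-service flowmonk-backend
```

Keeping one log per service is what makes a targeted retry practical: only the failed service's log needs reading before re-running the deploy phase.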

Port Map (after deployment)

Infrastructure (installed by setup.sh)

| Service | Port | URL |
|---|---|---|
| Gitea (npm registry) | 3300 | http://<vm-ip>:3300 |
| Ollama (LLM API) | 11434 | http://<vm-ip>:11434 |
| Cosmos Data Explorer | 1234 | http://<vm-ip>:1234 |
| Azurite (Blob) | 10000 | |
| Mailpit UI | 8025 | http://<vm-ip>:8025 |
| Grafana | 3000 | http://<vm-ip>:3000 |
| Traefik Dashboard | 8080 | http://<vm-ip>:8080 |

Platform Services

| Service | Port | URL |
|---|---|---|
| platform-service | 4003 | http://<vm-ip>:4003/health |
| extraction-service | 4005 | http://<vm-ip>:4005/health |
| mcp-server | 4007 | http://<vm-ip>:4007/health |

Platform Dashboards

| Dashboard | Port | URL |
|---|---|---|
| Admin Console | 3001 | http://<vm-ip>:3001 |
| Issue Tracker | 3003 | http://<vm-ip>:3003 |

Product Backends

| Product | Port | Health |
|---|---|---|
| PeakPulse | 4010 | http://<vm-ip>:4010/health |
| ChronoMind | 4011 | http://<vm-ip>:4011/health |
| JarvisJr | 4012 | http://<vm-ip>:4012/health |
| NomGap | 4013 | http://<vm-ip>:4013/health |
| MindLyst | 4014 | http://<vm-ip>:4014/health |
| LysnrAI | 4015 | http://<vm-ip>:4015/health |
| NoteLett | 4016 | http://<vm-ip>:4016/health |
| FlowMonk | 4017 | http://<vm-ip>:4017/health |
| ActionTrail | 4018 | http://<vm-ip>:4018/health |
| LocalMemGPT | 4019 | http://<vm-ip>:4019/health |

Product Web Apps

| Product | Port | URL |
|---|---|---|
| LysnrAI Dashboard | 3002 | http://<vm-ip>:3002 |
| ChronoMind | 3030 | http://<vm-ip>:3030 |
| JarvisJr | 3035 | http://<vm-ip>:3035 |
| FlowMonk | 3040 | http://<vm-ip>:3040 |
| NoteLett | 3045 | http://<vm-ip>:3045 |
| MindLyst | 3050 | http://<vm-ip>:3050 |
| NomGap | 3055 | http://<vm-ip>:3055 |
| ActionTrail | 3060 | http://<vm-ip>:3060 |
| LocalMemGPT | 3070 | http://<vm-ip>:3070 |
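
The /health endpoints in the port map can also be probed by hand. The sketch below is an illustration only and does not reproduce the generated /opt/bytelyst/check-health.sh, whose contents are not shown in this README. The function names are hypothetical; the port list is copied from the Platform Services and Product Backends tables.

```shell
#!/bin/sh
# Hypothetical health probe; not the generated check-health.sh.

HOST="${HOST:-localhost}"
BACKEND_PORTS="4003 4005 4007 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019"

# check_port PORT: succeed unless the request fails, times out after
# 5 seconds, or returns an HTTP error status (400 or above).
check_port() {
  curl -fs --max-time 5 -o /dev/null "http://$HOST:$1/health"
}

# check_backends: print UP/DOWN per port and return non-zero if any is down.
check_backends() {
  down=0
  for port in $BACKEND_PORTS; do
    if check_port "$port"; then
      echo "UP   :$port"
    else
      echo "DOWN :$port"
      down=$((down + 1))
    fi
  done
  echo "$down service(s) down"
  [ "$down" -eq 0 ]
}
```

To probe from outside the VM, set HOST to the VM's address before calling check_backends; this also confirms that the NSG rules allow the ports.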

Post-Deployment Commands

# Check all service health
/opt/bytelyst/check-health.sh

# View logs for a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  logs -f platform-service

# Restart a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  restart flowmonk-backend

# Stop everything
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down

# Stop and wipe all data
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down -v
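
Since every command above repeats the same compose file path, a small shell function can shorten them. This is a convenience sketch, not something setup.sh installs, and the name bl is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical shorthand for the ecosystem compose file; add to ~/.bashrc if useful.

COMPOSE_FILE=/opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml

# bl ARGS...: run "docker compose" against the ecosystem file.
bl() {
  docker compose -f "$COMPOSE_FILE" "$@"
}

# Examples:
#   bl logs -f platform-service
#   bl restart flowmonk-backend
#   bl ps
```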

Environment Variables

All optional — defaults work for most setups:

| Variable | Default | Description |
|---|---|---|
| GITHUB_USER | saravanakumardb1 | GitHub org/user to clone repos from |
| GITHUB_TOKEN | (empty) | Set for private repos (HTTPS auth) |
| GITEA_ADMIN | bytelyst-admin | Gitea admin username |
| GITEA_PASS | ByteLyst2026! | Gitea admin password |
| OLLAMA_MODEL | llama3.2:3b | Default LLM model to pull |
| SKIP_CLONE | 0 | Set 1 to skip cloning (re-runs) |
| SKIP_BUILD | 0 | Set 1 to skip package build+publish (re-runs) |
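
Because the script runs under sudo, and sudo typically resets the environment, exported variables may not reach it. One way to pass overrides is through env, as sketched below. The run_setup wrapper is a hypothetical helper, and the values in the examples are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper: pass VAR=value overrides to setup.sh through sudo.

# run_setup VAR=value...: run ./setup.sh as root with the given variables set.
run_setup() {
  sudo env "$@" ./setup.sh
}

# Examples:
#   run_setup GITHUB_USER=my-github-user GITHUB_TOKEN="$GITHUB_TOKEN"
#   run_setup SKIP_CLONE=1 SKIP_BUILD=1
```

Values passed this way appear in the process list while the command runs, which may matter for GITHUB_TOKEN on a shared machine.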

CLI Flags

| Flag | Description |
|---|---|
| --resume | Resume automatically after the last completed phase |
| --resume-from=N | Resume from phase N (1-8) |
| --phase=N | Run ONLY phase N (useful for retrying) |
| --reset | Clear phase markers and start fresh |
| --status | Show completed phases and exit |
| -h, --help | Show usage help |
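
For readers who want to see how flags of this shape are usually handled in a shell script, here is a minimal sketch. It illustrates the pattern only and is not copied from setup.sh; parse_flag, valid_phase, and the mode names they print are assumptions.

```shell
#!/bin/sh
# Hypothetical flag parser for setup.sh-style options.

# parse_flag ARG: print "<mode> [phase]" for one flag; return 2 if unknown.
parse_flag() {
  case "$1" in
    --resume)        echo "resume" ;;
    --resume-from=*) echo "resume-from ${1#--resume-from=}" ;;
    --phase=*)       echo "only ${1#--phase=}" ;;
    --reset)         echo "reset" ;;
    --status)        echo "status" ;;
    -h|--help)       echo "help" ;;
    *)               echo "unknown flag: $1" >&2; return 2 ;;
  esac
}

# valid_phase N: succeed only for a single digit from 1 to 8.
valid_phase() {
  case "$1" in
    [1-8]) return 0 ;;
    *)     return 1 ;;
  esac
}
```

A caller would reject a phase number for which valid_phase fails before doing any work, so that a typo such as --phase=9 does not run anything.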

Troubleshooting

  • Cosmos emulator slow: It needs 20-30s on first boot. Services wait via health checks.
  • Out of memory: Use at least 32 GB RAM. Cosmos emulator needs ~4 GB, Ollama needs ~4 GB for 3B models.
  • Build failures: Check Gitea is running (docker ps | grep gitea) and packages published (curl http://localhost:3300/api/packages/bytelyst/npm/). Per-service build logs: /opt/bytelyst/.setup-state/builds/<service>.log. Retry: sudo ./setup.sh --phase=7.
  • Ollama not responding: Check systemctl status ollama or curl http://localhost:11434/api/version.
  • Port conflicts: Ensure nothing else runs on the listed ports before deploying.
  • Corporate proxy in Dockerfiles: The script auto-strips hardcoded proxy ENVs from cloned Dockerfiles.
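
If a Dockerfile still carries proxy settings after cloning, they can be removed by hand. The sketch below shows one way to do so and is not the script's actual implementation: strip_proxy_env is a hypothetical name, and the pattern assumes each proxy variable sits on its own ENV line (an ENV line that sets several variables at once would be dropped entirely).

```shell
#!/bin/sh
# Hypothetical helper; setup.sh's own proxy stripping may work differently.

# strip_proxy_env FILE: delete ENV lines that set HTTP_PROXY, HTTPS_PROXY,
# or NO_PROXY (any letter case), rewriting FILE in place.
strip_proxy_env() {
  tmp="$(mktemp)" || return 1
  # grep exits 1 when no lines remain; only higher codes are real errors.
  grep -viE '^[[:space:]]*ENV[[:space:]]+(https?_proxy|no_proxy)[[:space:]=]' "$1" > "$tmp" \
    || [ "$?" -eq 1 ] \
    || { rm -f "$tmp"; return 1; }
  # cat keeps the original file's permissions and ownership.
  cat "$tmp" > "$1" && rm -f "$tmp"
}

# Example:
#   strip_proxy_env /path/to/Dockerfile
```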