
ByteLyst Single-VM Deployment

Deploy the entire ByteLyst ecosystem (31 services, 11 products) on a single raw Azure VM. Nothing pre-installed required — the script handles everything from a blank Ubuntu machine. Two files: this README and setup.sh. Copy both to the VM and run the script.

SECURE_API_EXPOSURE.md — recommended public API exposure model, alternatives, and security guidance for client-facing URLs
DEPLOYMENT_STATUS_2026-03-29.md — deployment snapshot: what completed on the Azure VM, what was manually fixed, and what remains

Prerequisites

Azure VM: Ubuntu 24.04 LTS (or 22.04), Standard_D8s_v5 (8 vCPU, 32 GB RAM) recommended
Disk: 128 GB+ (Docker images, Cosmos emulator, Ollama models, build artifacts)
Network: NSG allowing inbound on ports: 22, 80, 1025, 1234, 3000-3003, 3030, 3035, 3040, 3045, 3050, 3055, 3060, 3070, 3075, 3080, 3100, 3300, 4003, 4005, 4007, 4010-4019, 8025, 8080, 10000, 11434
GitHub access: Repos must be accessible (public or GITHUB_TOKEN for private)
Nothing else needed — the script installs Docker, Node.js, pnpm, Gitea, Ollama, and everything
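
The sizing requirements can be checked by hand before starting. The following is a minimal sketch, not the check that setup.sh runs: the function names are hypothetical, the thresholds are the Phase 1 pre-flight minimums stated in this README (disk ≥40 GB, RAM ≥16 GB), and reading "disk" as free space on `/` is an assumption.

```shell
# Hypothetical pre-flight helper; setup.sh performs its own checks in Phase 1.
# Thresholds mirror the documented minimums: disk >= 40 GB, RAM >= 16 GB, x86_64 only.

# meets_minimum VALUE MINIMUM -> succeeds when VALUE >= MINIMUM (integers)
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# arch_supported ARCH -> succeeds for x86_64 (the Cosmos emulator is x86-only)
arch_supported() {
  [ "$1" = "x86_64" ]
}

preflight() {
  # Free space on / in whole GB (assumption: "disk" means free space on the root volume)
  disk_gb=$(df -Pk / | awk 'NR==2 {print int($4 / 1024 / 1024)}')
  # Total RAM in GB, rounded to nearest, since a "16 GB" VM reports slightly less
  ram_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)
  arch=$(uname -m)
  status=0
  meets_minimum "$disk_gb" 40 || { echo "FAIL: ${disk_gb} GB free disk, need >= 40"; status=1; }
  meets_minimum "$ram_gb" 16 || { echo "FAIL: ${ram_gb} GB RAM, need >= 16"; status=1; }
  arch_supported "$arch" || { echo "FAIL: ${arch} is not x86_64"; status=1; }
  [ "$status" -eq 0 ] && echo "OK: ${disk_gb} GB free, ${ram_gb} GB RAM, ${arch}"
  return "$status"
}
```

Running `preflight` on the VM prints one line per failed check and returns non-zero if any check fails.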

Quick Start

# 1. SSH into your Azure VM
ssh azureuser@<vm-ip>

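# 2. Copy setup.sh to the VM (for example with scp, run from your machine), then make it executable
chmod +x setup.sh

# 3. Run. Repos are cloned via HTTPS from GITHUB_USER (see Environment Variables for the default).
#    If repos are private, also set GITHUB_TOKEN. sudo resets the environment by default,
#    so pass variables on the sudo command line (sudo GITHUB_TOKEN=<token> ./setup.sh)
#    or export them first and use sudo -E.
sudo ./setup.sh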

# 4. Wait ~15-25 minutes for full build + deploy

# 5. Verify
/opt/bytelyst/check-health.sh

Resume & Retry

Phase completion is tracked. If anything fails, you don't have to start over:

sudo ./setup.sh --phase=7          # Retry just the deploy phase
sudo ./setup.sh --resume           # Auto-resume after SSH disconnect
sudo ./setup.sh --resume-from=7    # Jump to deploy after manual fix
sudo ./setup.sh --status           # Check what's done
sudo ./setup.sh --reset            # Start completely over
sudo ./setup.sh --help             # Show full usage
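
For readers wondering how resumable phases can work, here is a minimal sketch based on marker files. It is an illustration only: this README confirms that phase completion is tracked and that a `/opt/bytelyst/.setup-state/` directory exists for build logs, but the `phase-N.done` naming, the `STATE_DIR` variable, and the function names below are assumptions, not taken from setup.sh.

```shell
# Illustrative only: one way a script can track completed phases with marker files.
# STATE_DIR and the phase-N.done naming are assumptions, not taken from setup.sh.
STATE_DIR="${STATE_DIR:-/tmp/setup-state-demo}"

# phase_done N -> succeeds if phase N has a marker file
phase_done() { [ -f "$STATE_DIR/phase-$1.done" ]; }

# mark_done N -> records completion of phase N with a UTC timestamp
mark_done() { mkdir -p "$STATE_DIR" && date -u +%FT%TZ > "$STATE_DIR/phase-$1.done"; }

# next_phase prints the first phase (1-8) without a marker, or nothing if all are done
next_phase() {
  n=1
  while [ "$n" -le 8 ]; do
    phase_done "$n" || { echo "$n"; return 0; }
    n=$((n + 1))
  done
}

# reset_phases removes all markers (what a --reset style flag would do)
reset_phases() { rm -f "$STATE_DIR"/phase-*.done; }
```

Under this scheme a `--resume` style run would start at `next_phase`, and a `--phase=N` style run would execute phase N regardless of its marker.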

What the Script Installs & Does

Software installed on the VM (from scratch)

Software	Version	Purpose
Docker CE	latest	Container runtime + Compose + BuildKit
Node.js	22 LTS	Build toolchain for TypeScript packages
pnpm	10.6.5	Package manager (workspace-aware)
Gitea	1.22 (Docker)	Local npm package registry on `:3300`
Ollama	latest	Local LLM inference for LocalMemGPT on `:11434`
git, jq, curl	latest	System utilities

Execution phases

Phase	Duration	Description
1. System	~3 min	Pre-flight checks (disk ≥40 GB, RAM ≥16 GB), install Docker, Node.js 22, pnpm 10.6.5, Ollama, git, jq, build-essential
2. Gitea + CI	~2 min	Start Gitea Docker container, admin + org + token, install act_runner
3. Clone	~3 min	Clone all 12 repos to `/opt/bytelyst/`, push to Gitea for CI
4. Build	~5 min	`pnpm install && pnpm -r build` all `@bytelyst/*` packages
5. Publish	~3 min	Publish all packages to local Gitea npm registry
6. Env	instant	Generate `.env.ecosystem` with Cosmos emulator key, Azurite key, JWT secret
7. Deploy	~10 min	Stop Ollama (free RAM), per-service Docker build + deploy (31 services, with fallback), prune build cache, restart Ollama
8. Verify	~1 min	Health-check all 31+ endpoints + create `/opt/bytelyst/check-health.sh`

Port Map (after deployment)

Infrastructure (installed by setup.sh)

Service	Port	URL
Gitea (npm registry)	3300	`http://<vm-ip>:3300`
Ollama (LLM API)	11434	`http://<vm-ip>:11434`
Cosmos Data Explorer	1234	`http://<vm-ip>:1234`
Azurite (Blob)	10000	`http://<vm-ip>:10000`
Mailpit UI	8025	`http://<vm-ip>:8025`
Loki (Logs)	3100	`http://<vm-ip>:3100/ready`
Grafana	3000	`http://<vm-ip>:3000`
Traefik Dashboard	8080	`http://<vm-ip>:8080`

Platform Services

Service	Port	URL
platform-service	4003	`http://<vm-ip>:4003/health`
extraction-service	4005	`http://<vm-ip>:4005/health`
mcp-server	4007	`http://<vm-ip>:4007/health`

Platform Dashboards

Dashboard	Port	URL
Admin Console	3001	`http://<vm-ip>:3001`
Issue Tracker	3003	`http://<vm-ip>:3003`

Product Backends

Product	Port	Health
PeakPulse	4010	`http://<vm-ip>:4010/health`
ChronoMind	4011	`http://<vm-ip>:4011/health`
JarvisJr	4012	`http://<vm-ip>:4012/health`
NomGap	4013	`http://<vm-ip>:4013/health`
MindLyst	4014	`http://<vm-ip>:4014/health`
LysnrAI	4015	`http://<vm-ip>:4015/health`
NoteLett	4016	`http://<vm-ip>:4016/health`
FlowMonk	4017	`http://<vm-ip>:4017/health`
ActionTrail	4018	`http://<vm-ip>:4018/health`
LocalMemGPT	4019	`http://<vm-ip>:4019/health`
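
The `/health` endpoints listed under Platform Services and Product Backends can be probed in one loop. This is a sketch, not the generated `/opt/bytelyst/check-health.sh`; the function names are hypothetical and the port list is copied from the tables above.

```shell
# Sketch of a health sweep over the documented /health endpoints.
# This is not the generated check-health.sh; function names are hypothetical.
BACKEND_PORTS="4003 4005 4007 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019"

# health_urls HOST -> one health URL per line
health_urls() {
  for port in $BACKEND_PORTS; do
    echo "http://$1:$port/health"
  done
}

# sweep [HOST] -> prints OK/FAIL per endpoint, returns non-zero if any failed
sweep() {
  failed=0
  for url in $(health_urls "${1:-localhost}"); do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Run `sweep` on the VM itself, or `sweep <vm-ip>` from a machine that can reach the backend ports directly.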

Product Web Apps

Product	Port	URL
LysnrAI Dashboard	3002	`http://<vm-ip>:3002`
ChronoMind	3030	`http://<vm-ip>:3030`
JarvisJr	3035	`http://<vm-ip>:3035`
FlowMonk	3040	`http://<vm-ip>:3040`
NoteLett	3045	`http://<vm-ip>:3045`
MindLyst	3050	`http://<vm-ip>:3050`
NomGap	3055	`http://<vm-ip>:3055`
ActionTrail	3060	`http://<vm-ip>:3060`
LocalMemGPT	3070	`http://<vm-ip>:3070`
Efforise	3080	`http://<vm-ip>:3080`

Internal tooling web apps

Tool	Port	URL
LLM Lab Dashboard	3075	`http://<vm-ip>:3075`

VM-hosted web surfaces

These are the browser-facing UIs currently hosted on the VM and tracked by the admin ops inventory:

Surface	Port	Audience	Notes
Admin Console	3001	internal	Primary ops and admin UI, including Mission Control, VM inventory, and Valkey tools
Issue Tracker	3003	internal	Internal tracker UI
Grafana	3000	internal	Observability dashboards
Gitea Registry	3300	internal	Source control and private package registry
Mailpit	8025	internal	Email sink UI
Traefik Dashboard	8080	internal	Legacy gateway dashboard
LysnrAI Dashboard	3002	internal	Product web app
ChronoMind	3030	internal	Product web app
JarvisJr	3035	internal	Product web app
FlowMonk	3040	internal	Product web app
NoteLett	3045	internal	Product web app
MindLyst	3050	internal	Product web app
NomGap	3055	internal	Product web app
ActionTrail	3060	internal	Product web app
LocalMemGPT	3070	public-candidate	Product web app if promoted beyond internal or prototype use
LLM Lab Dashboard	3075	internal	Internal LLM and Ollama tooling dashboard
Efforise	3080	internal	Product web app

Post-Deployment Commands

# Check all service health
/opt/bytelyst/check-health.sh

# View logs for a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  logs -f platform-service

# Restart a specific service
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  restart flowmonk-backend

# Stop everything
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down

# Stop and wipe all data
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml down -v
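
Because every command above repeats the same `-f` path, a small wrapper can save typing. This is a convenience sketch; the function name `bl` and the variable name are hypothetical, while the compose file path is the one used in the commands above.

```shell
# Convenience wrapper so the long -f path is typed once.
# The function name `bl` is hypothetical; the compose file path is the one used above.
COMPOSE_FILE_PATH="/opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml"

# bl ARGS... -> runs docker compose against the ecosystem compose file
bl() {
  docker compose -f "$COMPOSE_FILE_PATH" "$@"
}

# Examples:
#   bl ps
#   bl logs -f --tail=100 platform-service
#   bl restart flowmonk-backend
```

An alternative with the same effect is to export Docker Compose's own `COMPOSE_FILE` environment variable with that path and run plain `docker compose` commands.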

Optional Phase 2 profiles

The compose file now includes opt-in profiles for the next internal-only infrastructure additions:

# Metrics stack: Prometheus + node-exporter + cadvisor
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  --profile phase2-observability up -d prometheus node-exporter cadvisor

# Shared cache/pubsub layer: Valkey
docker compose -f /opt/bytelyst/learning_ai_common_plat/docker-compose.ecosystem.yml \
  --profile phase2-shared up -d valkey

Notes:

- these services are intended to stay internal-only on the VM
- prometheus is provisioned for Grafana automatically through the Grafana datasource directory
- neither prometheus nor valkey needs a raw public port exposure for normal operation
- extraction-service now uses Valkey for shared per-product rate limiting when PRODUCT_RATE_LIMIT_STORE=valkey
- extraction-service falls back to its in-memory limiter only if Valkey is unavailable at runtime
- the admin Mission Control page now includes a VM inventory tab and a read-only Valkey inspector tab

Shared limiter env

extraction-service can be pointed at Valkey explicitly:

PRODUCT_RATE_LIMIT_STORE=valkey
VALKEY_URL=redis://valkey:6379

Live verification on the VM:

docker exec learning_ai_common_plat-extraction-service-1 \
  wget -qO- 'http://127.0.0.1:4005/api/extract/rate-limits/product?productId=<product-id>'

docker exec learning_ai_common_plat-valkey-1 \
  valkey-cli KEYS 'extraction:product-rate-limit:*'
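
`KEYS` walks the whole keyspace in a single blocking call, which is fine for a quick check on a small instance. If the keyspace grows, a cursor-based scan is gentler. The sketch below uses `valkey-cli --scan --pattern`; the container name and key prefix are copied from the commands above, the function names are hypothetical, and whether the limiter keys carry a TTL is an assumption (`TTL` returns -1 for keys without an expiry).

```shell
# Non-blocking alternative to KEYS, using valkey-cli's cursor-based --scan mode.
# Container name and key prefix are copied from the commands above.
VALKEY_CONTAINER="learning_ai_common_plat-valkey-1"

# limiter_keys -> one limiter key per line
limiter_keys() {
  docker exec "$VALKEY_CONTAINER" \
    valkey-cli --scan --pattern 'extraction:product-rate-limit:*'
}

# limiter_ttls -> prints "<ttl-seconds> <key>" for each limiter key
limiter_ttls() {
  limiter_keys | while IFS= read -r key; do
    ttl=$(docker exec "$VALKEY_CONTAINER" valkey-cli TTL "$key")
    echo "$ttl $key"
  done
}
```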

Admin ops surface

The internal admin dashboard at http://<vm-ip>:3001/ops now exposes:

- Mission Control health for the internal stack
- a VM inventory view of Docker-managed services and host tooling
- live status coverage for the VM-hosted web surfaces, including the product web apps
- a read-only Valkey inspector for key pattern scans, TTLs, and small previews
- restart buttons for allowlisted internal containers
- guarded Valkey delete actions for exact keys and concrete prefix patterns

These views are protected by the admin dashboard auth layer.

Note:

- restart actions require the Docker socket to be mounted into admin-web
- the current implementation mounts /var/run/docker.sock into the admin container, so this route should remain internal-only and admin-only

Environment Variables

All optional — defaults work for most setups:

Variable	Default	Description
`GITHUB_USER`	`saravanakumardb1`	GitHub org/user to clone repos from
`GITHUB_TOKEN`	(empty)	Set for private repos (HTTPS auth)
`GITEA_ADMIN`	`bytelyst-admin`	Gitea admin username
`GITEA_PASS`	`ByteLyst2026!`	Gitea admin password
`OLLAMA_MODEL`	`llama3.2:3b`	Default LLM model to pull
`PRODUCT_RATE_LIMIT_STORE`	`valkey` in compose	Shared product throttling backend for `extraction-service`
`VALKEY_URL`	`redis://valkey:6379`	Internal Valkey connection string for shared counters
`SKIP_CLONE`	`0`	Set `1` to skip cloning (re-runs)
`SKIP_BUILD`	`0`	Set `1` to skip package build+publish (re-runs)

CLI Flags

Flag	Description
`--resume`	Auto-resume from last completed phase
`--resume-from=N`	Resume from phase N (1-8)
`--phase=N`	Run ONLY phase N (useful for retrying)
`--dry-run`	Validate prerequisites without building or deploying
`--reset`	Clear phase markers and start fresh
`--status`	Show completed phases and exit
`-h`, `--help`	Show usage help
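
To make the flag semantics concrete, here is a sketch of how flags of this shape can be parsed in POSIX shell. It is not the parser from setup.sh; the `MODE`, `PHASE`, and `DRY_RUN` variable names are hypothetical, and the 1-8 range check follows the phase numbering documented above.

```shell
# Sketch of parsing the documented flags; variable names are hypothetical and
# this is not the parser from setup.sh.
parse_flags() {
  MODE="full"; PHASE=""; DRY_RUN=0
  for arg in "$@"; do
    case "$arg" in
      --resume)         MODE="resume" ;;
      --resume-from=*)  MODE="resume-from"; PHASE="${arg#*=}" ;;
      --phase=*)        MODE="single"; PHASE="${arg#*=}" ;;
      --dry-run)        DRY_RUN=1 ;;
      --reset)          MODE="reset" ;;
      --status)         MODE="status" ;;
      -h|--help)        MODE="help" ;;
      *) echo "unknown flag: $arg" >&2; return 2 ;;
    esac
  done
  # Phase numbers are documented as 1-8; reject anything else.
  case "$MODE" in
    resume-from|single)
      case "$PHASE" in
        [1-8]) ;;
        *) echo "phase must be 1-8, got '$PHASE'" >&2; return 2 ;;
      esac ;;
  esac
}
```

For example, `parse_flags --phase=7` sets `MODE=single` and `PHASE=7`, while `parse_flags --phase=9` returns a non-zero status.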

Troubleshooting

Cosmos emulator slow: It needs 20-30s on first boot. Services wait via health checks.
Out of memory: Use at least 32 GB RAM. Cosmos emulator needs ~4 GB, Ollama needs ~4 GB for 3B models.
Build failures: Check Gitea is running (docker ps | grep gitea) and packages published (curl http://localhost:3300/api/packages/bytelyst/npm/). Per-service build logs: /opt/bytelyst/.setup-state/builds/<service>.log. Retry: sudo ./setup.sh --phase=7.
Ollama not responding: Check systemctl status ollama or curl http://localhost:11434/api/version.
Port conflicts: Ensure nothing else runs on the listed ports before deploying.
CORS errors in browser: The generated .env.ecosystem sets CORS_ORIGIN=* for dev/test. If you restrict it, update the value to match your access URL.
Services in development mode: .env.ecosystem now sets NODE_ENV=production for all services. If you need debug logging, remove or change this value.
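
For the port-conflict case, a quick way to see which documented ports are already taken is to filter the listening-socket table. The sketch below reads `ss -ltn` output on stdin; the function name is hypothetical and `WATCHED_PORTS` is only a subset of the ports in this README, so extend it as needed.

```shell
# busy_ports reads `ss -ltn` output on stdin and prints watched ports already in use.
# WATCHED_PORTS is a subset of the documented ports; extend it as needed.
WATCHED_PORTS="80 3000 3001 3300 4003 4005 4007 8025 8080 10000 11434"

busy_ports() {
  awk -v watched="$WATCHED_PORTS" '
    BEGIN { n = split(watched, w, " "); for (i = 1; i <= n; i++) want[w[i]] = 1 }
    $1 == "LISTEN" {
      port = $4
      sub(/.*:/, "", port)      # keep the text after the last colon
      if (port in want) print port
    }
  ' | sort -n -u
}

# Usage on the VM, before deploying:
#   ss -ltn | busy_ports
```

An empty result means none of the watched ports has a listener yet.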

HTTPS Gateway

Public backend access is intended to flow through Caddy on https://api.bytelyst.com, not direct backend port exposure.
The gateway config lives at /opt/bytelyst/Caddyfile and is mounted into the caddy container.
Backend routes are path-based and strip their prefixes before proxying:
- /platform/* → platform-service:4003
- /extraction/* → extraction-service:4005
- /mcp/* → mcp-server:4007
- /peakpulse/* → peakpulse-backend:4010
- /chronomind/* → chronomind-backend:4011
- /jarvisjr/* → jarvisjr-backend:4012
- /nomgap/* → nomgap-backend:4013
- /mindlyst/* → mindlyst-backend:4014
- /lysnrai/* → lysnrai-backend:4015
- /notelett/* → notelett-backend:4016
- /flowmonk/* → flowmonk-backend:4017
- /actiontrail/* → actiontrail-backend:4018
- /localmemgpt/* → localmemgpt-backend:4019
Keep backend ports closed publicly once DNS and NSG rules are aligned; the gateway itself needs inbound 443, which is not in the port list under Prerequisites. Docker-internal service discovery remains unchanged.
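
The route table above maps naturally onto Caddy's `handle_path` directive, which strips the matched prefix before the nested handlers run. The sketch below only prints route blocks of that shape from the documented prefixes and upstreams; it is an illustration, and the real /opt/bytelyst/Caddyfile may be structured differently. The `ROUTES` variable and `render_routes` function are hypothetical names.

```shell
# Prints Caddyfile-style route blocks for the documented path prefixes.
# Illustration only: the real /opt/bytelyst/Caddyfile may be structured differently.
# Each entry is prefix:upstream-service:port, copied from the route list above.
ROUTES="platform:platform-service:4003
extraction:extraction-service:4005
mcp:mcp-server:4007
peakpulse:peakpulse-backend:4010
chronomind:chronomind-backend:4011
jarvisjr:jarvisjr-backend:4012
nomgap:nomgap-backend:4013
mindlyst:mindlyst-backend:4014
lysnrai:lysnrai-backend:4015
notelett:notelett-backend:4016
flowmonk:flowmonk-backend:4017
actiontrail:actiontrail-backend:4018
localmemgpt:localmemgpt-backend:4019"

render_routes() {
  echo "$ROUTES" | while IFS=: read -r prefix upstream port; do
    printf 'handle_path /%s/* {\n\treverse_proxy %s:%s\n}\n' "$prefix" "$upstream" "$port"
  done
}
```

With prefix stripping in place, a request to https://api.bytelyst.com/platform/health should arrive at platform-service:4003 as /health.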

Known Limitations

Remote browser access: Product web apps use http://localhost:<port> for browser-side API calls (baked at Next.js build time via NEXT_PUBLIC_* args). This works when browsing from the VM itself but not from a remote browser (e.g., laptop accessing http://<vm-ip>:3060). For remote access, use SSH port-forwarding:

# Forward the dashboard, product web app, and backend ports to your laptop (run from your laptop)
ssh -N -L 3001:localhost:3001 -L 3002:localhost:3002 -L 3030:localhost:3030 \
  -L 3035:localhost:3035 -L 3040:localhost:3040 -L 3045:localhost:3045 \
  -L 3050:localhost:3050 -L 3055:localhost:3055 -L 3060:localhost:3060 \
  -L 3070:localhost:3070 -L 3075:localhost:3075 -L 3080:localhost:3080 \
  -L 4003:localhost:4003 -L 4010:localhost:4010 -L 4011:localhost:4011 \
  -L 4012:localhost:4012 -L 4013:localhost:4013 -L 4014:localhost:4014 \
  -L 4015:localhost:4015 -L 4016:localhost:4016 -L 4017:localhost:4017 \
  -L 4018:localhost:4018 -L 4019:localhost:4019 \
  azureuser@<vm-ip>

Then open http://localhost:3060 etc. on your laptop. Server-side code (API routes, SSR) uses Docker service names and works regardless.
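
When only one product is needed, the forward list can be built from a few ports instead of the full set. A small sketch, with a hypothetical helper name; the example ports are ActionTrail's web app (3060) and backend (4018) from the port map.

```shell
# forward_args PORT... -> prints one "-L PORT:localhost:PORT" pair per port given
forward_args() {
  for port in "$@"; do
    printf '%s %s:localhost:%s ' -L "$port" "$port"
  done
}

# Usage from the laptop (word splitting of the output is intentional):
#   ssh -N $(forward_args 3060 4018) azureuser@<vm-ip>
```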

Cosmos emulator is x86-only: Do not use ARM-based VMs (e.g., Dpsv6). Stick with Standard_D8s_v5 or similar Intel/AMD instances.
Memory pressure: Phase 7 automatically stops Ollama (~3 GB) during Docker builds and restarts it after. If builds still OOM on 32 GB, retry with sudo ./setup.sh --phase=7 (per-service fallback skips what already built).
Corporate proxy in Dockerfiles: Already removed at source across all repos. No runtime stripping needed.