e573e98cc1
docs(devops): add llmlab dns handoff
2026-03-31 02:32:01 -07:00
356c96e1d9
docs(devops): capture godaddy dns cutover and vm handoff
2026-03-31 02:25:58 -07:00
root
6bb17e1d9c
docs(devops): move localmemgpt web to vercel
2026-03-31 09:19:49 +00:00
root
68c8fc0d8d
docs(devops): clarify LLM UI hosting roles
2026-03-31 09:12:59 +00:00
root
a597646034
docs(devops): add GoDaddy DNS runbook for bytelyst
2026-03-31 09:00:11 +00:00
root
abfbb70583
docs(devops): document VM-hosted web surfaces
2026-03-31 08:55:52 +00:00
root
3d0183b8c6
docs(devops): note restart and valkey write controls
2026-03-31 08:44:44 +00:00
root
23d5ef44f3
docs(devops): note admin ops inventory and valkey panel
2026-03-31 08:32:50 +00:00
root
2cf557a2c8
docs(devops): document valkey-backed extraction throttling
2026-03-31 08:09:24 +00:00
root
b8661392c6
feat(observability): add phase 2 monitoring and valkey services
2026-03-31 06:57:12 +00:00
root
d4d8c48a4c
docs(architecture): extend internal-only policy to shared infra
2026-03-31 06:52:59 +00:00
root
4aba0a83cc
docs(devops): add phased VM stack recommendations
2026-03-31 06:52:01 +00:00
root
b7b3869014
docs(architecture): keep monitoring stacks internal on VM
2026-03-31 06:47:39 +00:00
root
5cec039905
docs(architecture): keep internal dashboards on VM Docker
2026-03-31 06:39:19 +00:00
root
bb85bf6176
docs(devops): refresh Track A handoff status
2026-03-30 00:11:45 +00:00
root
5cff282961
docs(architecture): move dashboards to Vercel
2026-03-30 00:05:50 +00:00
root
c0cf80d96b
docs(devops): add Track A handoff and prep gateway changes
2026-03-29 23:57:03 +00:00
root
eba6c7a641
chore(platform): align docker and package outputs
2026-03-29 23:41:08 +00:00
root
1b7a68c8a8
feat(devops): add efforise to single-vm ecosystem
2026-03-29 23:34:36 +00:00
saravanakumardb1
80e6268924
docs(vercel): improve Codex prompts with progress trackers, preconditions, verification gates, and per-repo checklists
2026-03-29 16:29:42 -07:00
saravanakumardb1
5fb5a7d468
docs(vercel): split Codex prompts into Track A (Azure VM) and Track B (Vercel code) — replace monolithic file
2026-03-29 16:15:49 -07:00
saravanakumardb1
133d9fe337
docs(vercel): add Codex agent prompts for remaining Vercel deployment work — 8 prompts in dependency order
2026-03-29 16:09:26 -07:00
saravanakumardb1
8dd0036fc4
docs(vercel): cross-reference Azure VM Caddy gateway — concrete gitea.bytelyst.com and api.bytelyst.com URLs across all roadmaps
2026-03-29 16:05:57 -07:00
saravanakumardb1
e6b625f4e2
docs(vercel): review and fix ecosystem web apps audit — update registry strategy to Gitea-on-Azure-VM, fix effort estimates, fix EffoRise path, remove spurious PeakPulse entry, add prerequisite section
2026-03-29 15:46:44 -07:00
saravanakumardb1
64885dbc33
docs: update documentation
2026-03-29 15:46:44 -07:00
root
b261c5d13f
fix(devops): harden single-vm gitea bootstrap
2026-03-29 22:44:02 +00:00
root
388d71a06f
docs(devops): add azure vm deployment status snapshot
2026-03-29 22:42:33 +00:00
root
626e19f776
docs(devops): add secure single-vm api exposure guidance
2026-03-29 22:29:08 +00:00
saravanakumardb1
21ff1058a4
docs(docker): rewrite prompt.md as execution guide for Codex agent on fresh VM
- Reframed from 'review and fix' to 'execute, monitor, fix failures, validate'
- 4 clear tasks: run script, handle failures, validate deployment, report results
- Moved bug history and development context to background reference
- Added copy-pastable validation commands for all 31 services
- Simplified constraints: don't modify unless actual runtime failure
2026-03-28 02:06:52 -07:00
saravanakumardb1
7c4f0bc3d9
feat(docker): add --dry-run mode + test-plan.md, complete all 7 prompt tasks
- Task 4: Add --dry-run flag that validates system, Docker, Node, Ollama, Gitea, repos, GitHub access, compose file, env file, and phase state without building or deploying
- Task 7: Create test-plan.md with phase-by-phase verification, functional smoke tests, idempotency/resume tests, remote connectivity via SSH forwarding, and service count summary
- Update README CLI flags table with --dry-run
- Mark all 7 tasks done in prompt.md
2026-03-28 01:58:15 -07:00
saravanakumardb1
91a651805c
docs(docker): update README, prompt.md, .env.ecosystem.example with audit fixes
- README: NSG port list inline, phase 7 count 31, CORS/NODE_ENV troubleshooting, SSH port-forwarding example
- prompt.md: mark tasks 5+6 done, add 8 new bug fixes to table, update definition of done with llmlab-dashboard
- .env.ecosystem.example: add NODE_ENV=production and CORS_ORIGIN=*
2026-03-28 00:45:38 -07:00
saravanakumardb1
d8908093fa
fix(docker): add llmlab-dashboard to setup.sh, fix service count to 31, add CORS_ORIGIN + NODE_ENV
- B1: Add llmlab-dashboard to WEB_SERVICES array (was missing, 30→31)
- B2: Add llmlab-dashboard to check-health.sh (port 3075)
- B3: Fix service count comments throughout (30→31)
- B6: Restore CWD after phase 3 git push loop
- G1: Add CORS_ORIGIN=* to phase6_env for remote browser access
- G2: Add NODE_ENV=production to phase6_env for all services
2026-03-28 00:40:25 -07:00
saravanakumardb1
fc12a8eaa2
feat(devops): add Local LLM Lab to ecosystem deployment
- docker-compose.ecosystem.yml: add llmlab-dashboard service (port 3075)
- setup.sh: add learning_ai_local_llms as 12th repo
- README.md: update to 31 services, 11 products, add Docker vs K8s recommendation
- docker/README.md: update port map, phase descriptions
- prompt.md: update repo list and service counts
2026-03-27 00:10:40 -07:00
saravanakumardb1
70fdc6b279
feat(devops): add Gitea CI (act_runner) to Azure VM setup
- Phase 2: install act_runner binary, register with Gitea, create systemd service
- Phase 3: push all 11 repos to VM Gitea after cloning from GitHub
- Expanded Gitea API token scopes (write:repository, write:user)
- Runner config: host mode, capacity 2, GITEA_NPM_TOKEN injected
- Enables CI on the VM for NETWORK!=corp usage
2026-03-26 23:19:37 -07:00
saravanakumardb1
aa139d5021
feat(ci): add auto-publish job for @bytelyst/* packages + update migration doc
- Add publish-packages job to CI workflow (runs after build-and-test)
- Publish 13 remaining packages to Gitea (56 total, up from 43)
- Update act_runner token to read+write scope
- Fix package counts throughout migration doc (43 → 56)
- Update CI status: all 10/10 repos now have CI workflows
- Add package inventory section (§15.1)
2026-03-26 23:18:05 -07:00
saravanakumardb1
5ba9518722
docs: update Gitea registry docs for NETWORK-aware GITEA_NPM_HOST
- GITEA_NPM_REGISTRY_MIGRATION.md: update .npmrc examples, add home
  row to network topology table, note switch-network.sh sets the host
- SINGLE_VM_DEPLOYMENT.md: consolidate .npmrc example to show unified
  ${GITEA_NPM_HOST}:3300 pattern (host-side + Docker-side)
- GITEA_LOCAL_CI.md: add NPM registry host note to Key Settings
2026-03-24 15:57:20 -07:00
saravanakumardb1
32522b218a
fix(k8s): setup-k8s.sh — fail phase 3 on build errors, fix non-root crash
- Phase 3 now exits with error if any image builds fail, preventing
  mark_phase_done from running. Previously it just warned and continued,
  which could lead to phase 5 deploying with missing images.
- Moved mkdir from top-level scope into mark_phase_done(). The old
  top-level mkdir -p /opt/bytelyst/.setup-state-k8s crashed non-root
  invocations (--status, --help) due to set -e + permission denied.
- Fixed header comment: 'containerd' → 'Docker runtime' (we use --docker).
- Added --resume to header usage block (was supported but undocumented).
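
The first two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in setup-k8s.sh: the state directory, the marker file name, the `build_image` stub, and the service names are all assumptions, and the sketch returns a non-zero status where the real script is described as exiting.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical state directory; the commit names /opt/bytelyst/.setup-state-k8s.
STATE_DIR="${STATE_DIR:-$(mktemp -d)/setup-state}"

mark_phase_done() {
  # Created lazily, so read-only invocations never need write access.
  mkdir -p "$STATE_DIR"
  touch "$STATE_DIR/phase-$1.done"   # marker name is an assumption
}

# Hypothetical build step: "web-b" stands in for an image that fails to build.
build_image() { [ "$1" != "web-b" ]; }

phase3_build() {
  local failed=0 svc
  for svc in "$@"; do
    if ! build_image "$svc"; then
      echo "build failed: $svc" >&2
      failed=$((failed + 1))
    fi
  done
  # Non-zero status when any build failed, so the caller skips mark_phase_done.
  [ "$failed" -eq 0 ]
}

run_phase3() {
  if phase3_build "$@"; then
    mark_phase_done 3
    echo "phase 3 done"
  else
    echo "phase 3 failed"
  fi
}

bad_run="$(run_phase3 api web-b 2>/dev/null)"
[ -e "$STATE_DIR/phase-3.done" ] && marker_after_bad=yes || marker_after_bad=no
good_run="$(run_phase3 api web-a)"
echo "$bad_run (marker: $marker_after_bad); $good_run"
```

In this sketch a failed build leaves no marker behind, so a later resume would presumably re-run the phase instead of deploying with missing images.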
2026-03-24 14:52:53 -07:00
saravanakumardb1
a25d6f7847
fix(k8s): remove YAML anchors that break across document separators
YAML anchors (&name/*name) are scoped per document. In multi-document
files (separated by ---), anchors defined in one document cannot be
referenced from another. This caused all backends/webs after the first
to fail kubectl apply with unknown alias errors.
Fixed by inlining envFrom, resources, and labels in every Deployment.
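
The per-document scoping described above can be illustrated with a naive lint. This sketch is an assumption-laden stand-in for a real YAML parser: it only splits on `---` lines and whitespace, the anchor name `common_labels` is invented for the example, and the message wording is illustrative rather than kubectl's actual output.

```shell
#!/usr/bin/env bash
set -eu

# Two-document manifest: the anchor lives in document 1, the alias in document 2.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: &common_labels
    app: demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: *common_labels
EOF

# Report aliases whose anchor was not defined earlier in the same document.
check_aliases() {
  awk '
    /^---[ \t]*$/ { doc++; split("", anchors); next }   # new document: forget anchors
    {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^&[A-Za-z0-9_-]+$/) anchors[substr($i, 2)] = 1
        if ($i ~ /^\*[A-Za-z0-9_-]+$/ && !(substr($i, 2) in anchors))
          printf "document %d: unknown alias %s\n", doc + 1, substr($i, 2)
      }
    }
  ' "$1"
}

lint_output="$(check_aliases "$manifest")"
echo "$lint_output"
```

Inlining the shared fields in every document, as the commit does, removes the aliases entirely, so a check like this would have nothing to report.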
2026-03-24 14:51:48 -07:00
saravanakumardb1
8a568932b4
feat(infra): add production-grade k3s Kubernetes setup for single VM
Complete K8s deployment alternative to Docker Compose, targeting
~50 beta users on a Standard_D8s_v5 Azure VM (8 vCPU, 32 GB RAM).
setup-k8s.sh (6 phases):
1. Pre-flight: verify docker phases 1-5 ran, disk/RAM checks
2. Install k3s: Docker runtime, NodePort range 1024-32767
3. Build images: docker compose build + tag as bytelyst/<svc>
4. Config: namespaces, ConfigMap (3 copies), Secrets (JWT + blob keys), Ollama
5. Deploy: infra -> platform -> dashboards -> products (ordered)
6. Health check: 32 endpoints + kubectl pod status
K8s manifests (18 files):
- 4 namespaces (infra, platform, dashboards, products)
- 6 infra (cosmos StatefulSet+PVC, azurite StatefulSet+PVC,
  mailpit, loki StatefulSet+PVC, grafana+PVC, ollama external)
- 3 platform (Deployment+Service+NodePort each)
- 2 dashboards (Deployment+Service+NodePort each)
- 10 backends + 9 webs (all with readiness+liveness probes,
  resource limits, product-specific NEXT_PUBLIC_* env vars)
Design decisions:
- k3s --docker: reuses existing Docker images, no containerd import
- Same ports as Docker Compose (NodePort with extended range)
- ConfigMap replaces .env.ecosystem, copied to 3 app namespaces
- Blob storage keys injected at deploy time via Secret (not in YAML)
- Cross-namespace DNS: <svc>.<ns>.svc for service discovery
- Ollama as Endpoints+Service pointing to host node IP
- Resource limits: ~19 Gi total, fits in 32 GB with 13 GB headroom
- Teardown: --teardown flag deletes namespaces, keeps k3s
2026-03-24 14:47:17 -07:00
saravanakumardb1
7d0c469858
refactor(infra): reorganize single_azure_vm into docker/ and k8s/ subfolders
- Move setup.sh, README.md, prompt.md into docker/ subfolder
- Create top-level README.md comparing both approaches
- Create k8s/README.md with full design doc: k3s architecture,
  namespace strategy, manifest structure, ConfigMap/Secret design,
  Cosmos emulator StatefulSet, Ollama host service, resource limits,
  5-phase implementation plan, and kubectl cheat sheet
2026-03-24 14:11:50 -07:00
saravanakumardb1
40731e06f4
docs(infra): update prompt.md with 15 new bug fixes and stale corrections
- Added 15 recent fixes to the Bugs Already Fixed table
- Fixed line count (~940 → ~990)
- Fixed stale lysnrai-web → lysnrai-dashboard in architecture diagram
- Fixed test plan service count (27+ → 30+)
- Updated constraint: compose/Dockerfile changes allowed with verification
2026-03-24 13:49:17 -07:00
saravanakumardb1
d64ea4fba7
fix(infra): add cd path to banner compose logs command
The banner showed bare COMPOSE_FILE filename without the directory,
making the command unusable via copy-paste. Now shows the cd first.
2026-03-24 13:48:05 -07:00
saravanakumardb1
e928ec6025
fix(infra): audit round 2 — token guard, frozen-lockfile, build cache, docs
- Add require_gitea_token() guard: fail early with actionable message
  if GITEA_NPM_TOKEN is empty after restore (prevents silent failures
  in Phase 4/5/7)
- Wire require_gitea_token() into phase4_build and setup_compose_env
- Remove --frozen-lockfile from admin-web + tracker-web Dockerfiles
  (Docker context is missing services/ and scripts/ workspace members;
  Phase 4 reconciles lockfile so --frozen-lockfile is unnecessary)
- Add docker builder prune after Phase 7 builds (reclaim 20-40 GB)
- Update README: pre-flight thresholds, Ollama stop/restart behavior,
  Loki + Azurite in port map, updated memory pressure note
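
A guard of the kind described in the first two bullets might look like the sketch below. The function names `require_gitea_token` and `phase4_build` come from the commit message; the bodies, the error text, and the example token value are assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Fail early with a message the operator can act on (wording is an assumption).
require_gitea_token() {
  if [ -z "${GITEA_NPM_TOKEN:-}" ]; then
    echo "ERROR: GITEA_NPM_TOKEN is empty; restore or recreate the token before building" >&2
    return 1
  fi
}

# Stand-in for the real build phase: it checks the guard before doing any work.
phase4_build() {
  require_gitea_token || return 1
  echo "building with token"
}

without_token="$(GITEA_NPM_TOKEN="" phase4_build 2>/dev/null || echo "aborted early")"
with_token="$(GITEA_NPM_TOKEN="example-token" phase4_build)"
echo "$without_token / $with_token"
```

Checking at the top of each phase that needs the token keeps the failure next to its cause, instead of surfacing later as an authentication error during a package install.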
2026-03-24 13:37:21 -07:00
saravanakumardb1
1a8697d8ed
fix(infra): fix last stale service count comment (27→30) in setup.sh
2026-03-24 13:18:12 -07:00
saravanakumardb1
f78d382d62
fix(infra): add Azurite + Loki to health check script
- Azurite blob storage (:10000) was missing from check-health.sh
- Loki log aggregation (:3100/ready) was missing from check-health.sh
- Now covers all 30 compose services + Gitea + Ollama = 32 endpoints
2026-03-24 13:08:12 -07:00
saravanakumardb1
1a1f7dd55c
fix(infra): harden setup.sh — pre-flight checks, pipefail safety, RAM management
- Add pre-flight disk space + memory checks after root validation
- Add --batch --yes to gpg dearmor calls (idempotent on re-run)
- Fix jq abort on malformed Gitea token response (|| echo guard)
- Wrap pnpm install/build in if-blocks with explicit fail() messages
- Stop Ollama during Phase 7 Docker builds to free ~3 GB RAM
- Restart Ollama after Phase 7 builds complete (before Phase 8 health check)
2026-03-24 13:06:05 -07:00
saravanakumardb1
c2ca7f53b4
fix(infra): harden setup.sh from independent audit findings
- Replace deprecated NodeSource curl|bash with modern GPG key + apt source
- Add build-essential + python3 to apt deps (native addons: better-sqlite3)
- Add --if-present to pnpm -r build (defensive: skip workspace members without build script)
- Fix README: remove stale proxy stripping reference from Phase 3
- Add Known Limitations section: remote browser access, ARM VM, memory pressure
- Remove AUDIT_PROMPT.md (served its purpose)
2026-03-24 12:56:43 -07:00
saravanakumardb1
35021b67b9
docs(infra): fix stale service count (27→30), update prompt.md + README.md for Codex agent readiness
- prompt.md: mark tasks 1-3 as DONE, add 'Current State' section listing
  all implemented features, update bugs-fixed table (16 items), fix service
  count in architecture diagram, add CLI reference, remove stale --frozen-lockfile
- README.md: add Resume & Retry section with examples, add CLI Flags table,
  fix service count in title/phases, update build failure troubleshooting
  with build log paths and retry command
- setup.sh: fix '27 services' → '30 services' in header comment and banner
2026-03-24 12:35:59 -07:00
saravanakumardb1
acbab75aaa
docs(infra): add complete CLI reference, examples, and phase docs to setup.sh + ECOSYSTEM_DEPLOYMENT.md
setup.sh header now includes:
- All 6 CLI flags (--resume, --resume-from, --phase, --reset, --status, --help)
- Phase descriptions (1-8)
- 6 usage examples (fresh install, retry, resume, jump, status, reset)
- Resume/retry explanation with state dir and build log paths
ECOSYSTEM_DEPLOYMENT.md now includes:
- Single-VM Bootstrap section with quick start
- Resume & Retry examples
- Phase table
- Per-service build & fallback explanation
- Health check script reference
2026-03-24 12:24:16 -07:00
saravanakumardb1
b634708da8
fix(infra): make ollama model pull non-fatal in setup.sh
ollama pull piped through tail with set -euo pipefail would abort the
entire 8-phase setup on a slow network or wrong model name. Only
LocalMemGPT needs the model — the other 9 products are unaffected.
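
The failure mode and one way to make the pull non-fatal can be sketched as below. `fake_pull` is a hypothetical stand-in for `ollama pull`, and the warning text is an assumption; the point is only how `pipefail` interacts with `set -e` and how an `if` keeps the failure from aborting the script.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for `ollama pull <model>`: it fails the way a slow
# network or a wrong model name would.
fake_pull() {
  echo "pulling manifest"
  echo "Error: model not found" >&2
  return 1
}

# With pipefail, the pipeline reports fake_pull's failure even though tail
# succeeds. As a bare statement that would trip `set -e` and abort the run.
# Used as an `if` condition, the failure is handled instead.
pull_status="ok"
if ! fake_pull 2>&1 | tail -n 1; then
  pull_status="failed"
  echo "WARN: model pull failed; continuing without the model" >&2
fi

echo "setup continues (pull_status=$pull_status)"
```

Recording the outcome in a variable like `pull_status` would let a later health check report the missing model without blocking the products that do not need it.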
2026-03-24 12:20:13 -07:00