docs(devops): Hostinger runner prompt v2 — org migration + monitoring + hardening

Adds the missing pieces revealed during review:

§1 Multi-repo registration decision: choose repo-level vs org-level up-front.
Default doc remains repo-level, but explicitly calls out org-level as the
scaling path for 20+ repos.

§2 Pre-flight check additions:
- Arch detection (x86_64 / aarch64) before downloading runner tarball
- github.com + objects.githubusercontent.com reachability check
- gh CLI auth status check (must be saravanakumardb1)

§4 Installation hardening:
- Step 1 is now idempotent (getent guards on useradd/usermod)
- Step 3 queries latest runner version via gh api (no more stale pin)
- Step 3 includes SHA256 verification of the downloaded tarball against
  the release-notes manifest, with explicit STOP-if-mismatch
- Step 3 has REGISTRATION_URL var with commented Option A/B for
  repo-level vs org-level scope

§5 Smoke test: added explicit git checkout/add/commit/push commands for
creating the runner/smoke branch (was implicit before).

§8 (renamed): comprehensive org migration guide:
- Side-by-side table: personal account today vs under-an-org
- Bash loop to transfer all 18 repos via gh api
- git remote set-url commands for each local clone
- Post-migration org-level registration token fetch
- Workflow propagation strategies (reusable workflow vs sync script)

§9 (new): Monitoring + observability:
- GitHub Actions tab per-repo + per-org workflow views
- Runner pool health (Settings → Actions → Runners) at repo + org level
- gh CLI commands for scripted monitoring (run watch, list, view, runners)
- Host-side journalctl + _diag/ inspection commands

§14 Questions: updated to ask about scope (repo vs org) first.

Section numbering shifted by +1 from §9 onward to make room for the new
Monitoring section.
2026-05-24 18:04:50 -07:00
commit 6bf15eae7a
parent d5e0778af6
1 changed files with 223 additions and 31 deletions
--- a/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md
+++ b/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md
@@ -9,10 +9,10 @@

 Set up a GitHub Actions self-hosted runner on the Hostinger VM that can:

-1. Receive workflow triggers from `saravanakumardb1/learning_ai_common_plat` (and, later, all `@bytelyst/*` repos).
-2. Build `@bytelyst/*` npm packages from a tagged release.
+1. Receive workflow triggers from **all 20+ `@bytelyst/*` repos** (see "Multi-repo registration decision" below for the org-vs-repo choice, and §8 for the org migration).
+2. Build `@bytelyst/*` npm packages from tagged releases.
 3. Publish them to **the local Gitea instance on this VM** (`http://localhost:3300/api/packages/bytelyst/npm/`).
-4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via a separate `bytelyst-sync` script described in a follow-up prompt).
+4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via the separate `bytelyst-sync` script described in a follow-up prompt).

 Self-hosted on Hostinger beats GitHub-hosted runners because:

@@ -20,6 +20,17 @@ Self-hosted on Hostinger beats GitHub-hosted runners because:
 - Gitea is on `localhost` from this VM → zero-latency publish, no public TLS needed.
 - VM is always on; runner is reachable indefinitely.

+### Multi-repo registration decision
+
+**Decide before Step 2** whether this runner serves a single repo or all 20+:
+
+| Approach                             | Registration scope        | Use when                                                                                                       |
+| ------------------------------------ | ------------------------- | -------------------------------------------------------------------------------------------------------------- |
+| **Repo-level** (default in this doc) | Single repo URL           | You're validating the runner first, or you only have 1–2 repos publishing packages                             |
+| **Org-level** (recommended at scale) | A GitHub Organization URL | You've migrated 20+ repos under one org — see §8 for migration steps. One registration, all org repos eligible |
+
+The install steps are identical; only the registration token (Step 2) and the `--url` flag in `config.sh` (Step 3) differ.
+
 ---

 ## 2. Pre-flight checks (run first, do not skip)
@@ -41,18 +52,34 @@ pnpm --version 2>/dev/null || echo "pnpm not installed (will install in Step 5)"
 # 5. Confirm gh CLI exists
 gh --version 2>/dev/null || echo "gh CLI not installed (will install in Step 5)"

-# 6. Disk free
+# 6a. Detect architecture (used for runner tarball selection in Step 3)
+ARCH=$(uname -m)
+case "$ARCH" in
+  x86_64)  export RUNNER_ARCH="linux-x64";;
+  aarch64) export RUNNER_ARCH="linux-arm64";;
+  *) unset RUNNER_ARCH; echo "Unsupported arch: $ARCH. STOP and report" >&2 ;;
+esac
+echo "Will install runner for: ${RUNNER_ARCH:-<none, unsupported arch>}"
+
+# 6b. Disk free
 df -h /     # Need ~5 GB headroom

 # 7. Confirm no existing runner
 ls -la ~/actions-runner 2>/dev/null && echo "Runner dir exists — STOP and confirm with human" || echo "No existing runner"

-# 8. Confirm github.com reachable
-curl -s -o /dev/null -w "%{http_code}\n" https://api.github.com/   # Expected: 200
+# 8. Confirm github.com reachable (both API and download CDN)
+curl -s -o /dev/null -w "api.github.com: %{http_code}\n" https://api.github.com/
+curl -s -o /dev/null -w "objects.githubusercontent.com: %{http_code}\n" -L \
+  https://objects.githubusercontent.com/   # runner tarball download host
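+# Expected: api.github.com returns 200; the CDN root may return 403, which is normal.
+# A code of 000 means curl never got an HTTP response (DNS, TLS, or network failure): STOP and report.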

 # 9. Confirm the Gitea token file exists somewhere on this VM
 sudo find /home /root -maxdepth 3 -name ".gitea_npm_token" 2>/dev/null | head -5
 # Expected: at least one path. Note the owning user — needed in Step 6.
+
+# 10. Confirm gh CLI is auth'd as saravanakumardb1 (needed for registration token)
+gh auth status 2>&1 | grep -E "Logged in to|saravanakumardb1" | head -5
+# If not logged in as saravanakumardb1, run: gh auth login (and pick saravanakumardb1)
 ```
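A minimal sketch of the arch mapping from check 6a, written as a function. It assumes a wrapper script wants a non-zero status for an unsupported architecture rather than continuing with an empty `RUNNER_ARCH`. The name `runner_arch_for` is hypothetical and not part of this doc.

```bash
# Map `uname -m` output to the runner tarball suffix; fail on anything else.
# runner_arch_for is a hypothetical helper name used only for illustration.
runner_arch_for() {
  case "$1" in
    x86_64)  echo "linux-x64" ;;
    aarch64) echo "linux-arm64" ;;
    *)       echo "Unsupported arch: $1" >&2; return 1 ;;
  esac
}

# Usage: RUNNER_ARCH stays unset when the architecture is unsupported.
if RUNNER_ARCH=$(runner_arch_for "$(uname -m)"); then
  export RUNNER_ARCH
  echo "Will install runner for: $RUNNER_ARCH"
else
  unset RUNNER_ARCH
fi
```

Keeping the mapping in a function lets later steps test its exit status instead of checking for an empty variable.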

 If any check fails or surprises you, **stop and report back** before proceeding.
@@ -75,11 +102,22 @@ If any check fails or surprises you, **stop and report back** before proceeding.

 ## 4. Installation

-### Step 1 — Create the dedicated runner user
+### Step 1 — Create the dedicated runner user (idempotent)

 ```bash
-sudo useradd -m -s /bin/bash gha-runner
-sudo usermod -aG docker gha-runner    # only if any workflow uses Docker
+if ! getent passwd gha-runner >/dev/null; then
+  sudo useradd -m -s /bin/bash gha-runner
+  echo "Created gha-runner user"
+else
+  echo "gha-runner user already exists — skipping useradd"
+fi
+
+# Add to docker group only if docker is on this host
+if getent group docker >/dev/null; then
+  sudo usermod -aG docker gha-runner
+  echo "Added gha-runner to docker group"
+fi
+
 id gha-runner
 ```

@@ -102,28 +140,54 @@ read -s RUNNER_TOKEN   # paste, press Enter (no echo)
 ### Step 3 — Download, verify, configure the runner

 ```bash
-sudo -iu gha-runner bash <<'EOF'
+# Query the latest runner version (don't hardcode)
+LATEST=$(gh api /repos/actions/runner/releases/latest --jq '.tag_name' | sed 's/^v//')
+echo "Latest runner version: $LATEST"
+# As of writing this doc: 2.319.1. If LATEST is wildly different, STOP and confirm with human.
+
+sudo -iu gha-runner bash <<EOF
 mkdir -p ~/actions-runner && cd ~/actions-runner
-RUNNER_VERSION="2.319.1"
+RUNNER_VERSION="$LATEST"
+RUNNER_ARCH="${RUNNER_ARCH}"   # from pre-flight 6a
+TARBALL="actions-runner-\${RUNNER_ARCH}-\${RUNNER_VERSION}.tar.gz"

-# Download
-curl -fSL -o "actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz" \
-  "https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
+# Download tarball
+curl -fSL -o "\$TARBALL" \
+  "https://github.com/actions/runner/releases/download/v\${RUNNER_VERSION}/\$TARBALL"

-# Verify SHA against the GitHub release page (https://github.com/actions/runner/releases/tag/v2.319.1).
-# If the sha doesn't match, STOP and report.
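+# Look up the expected SHA256 in the release notes and verify the tarball against it.
+# The lookup below is deliberately unescaped: it runs in the calling shell, where gh is
+# authenticated (pre-flight 10), because gha-runner has no gh login of its own.
+EXPECTED_SHA="$(gh api "/repos/actions/runner/releases/tags/v${LATEST}" --jq '.body' |
+  grep -F "actions-runner-${RUNNER_ARCH}-${LATEST}.tar.gz" | grep -oE '[0-9a-f]{64}' | head -n 1)"
+ACTUAL_SHA=\$(sha256sum "\$TARBALL" | awk '{print \$1}')

-tar xzf "./actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
+if [ -z "\$EXPECTED_SHA" ] || [ "\$EXPECTED_SHA" != "\$ACTUAL_SHA" ]; then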
+  echo "FAIL: SHA mismatch"
+  echo "  Expected: \$EXPECTED_SHA"
+  echo "  Actual:   \$ACTUAL_SHA"
+  exit 1
+fi
+echo "PASS: tarball SHA verified"
+
+tar xzf "./\$TARBALL"
 EOF
 ```

-Register:
+Note: if `gh api` parsing of the SHA from the release body fails (GitHub sometimes changes release-note formatting), fall back to the official hashes page:
+`https://github.com/actions/runner/releases/tag/v<version>`. If you can't verify the SHA, STOP and report — don't run unverified binaries.
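If the hash has to be copied by hand from the release page, the comparison can still be scripted. A minimal sketch, assuming `sha256sum` is available on the VM; `verify_sha256` is a hypothetical helper name, not something the runner or `gh` provides.

```bash
# verify_sha256 FILE EXPECTED_HEX
# Hypothetical helper: returns 0 on match, 1 on mismatch, 2 on bad input.
verify_sha256() {
  local file="$1" expected="$2" actual
  # Reject an empty or malformed expected hash, so a failed lookup can never pass.
  if ! printf '%s' "$expected" | grep -qE '^[0-9a-f]{64}$'; then
    echo "FAIL: expected hash is missing or malformed" >&2
    return 2
  fi
  [ -r "$file" ] || { echo "FAIL: cannot read $file" >&2; return 2; }
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" != "$expected" ]; then
    echo "FAIL: SHA mismatch for $file" >&2
    return 1
  fi
  echo "PASS: $file SHA verified"
}
```

Usage would look like `verify_sha256 "$TARBALL" "<hash copied from the release page>" || echo "STOP and report"`, run as the user that owns the tarball.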
+
+Register the runner. **Choose the URL based on your scope decision (§1):**

 ```bash
+# OPTION A — repo-level (default during validation)
+REGISTRATION_URL="https://github.com/saravanakumardb1/learning_ai_common_plat"
+
+# OPTION B — org-level (once you've migrated to an org per §8)
+# REGISTRATION_URL="https://github.com/<your-org-name>"
+
 sudo -u gha-runner -E -i bash -c "
 cd ~/actions-runner && \
 ./config.sh \
-  --url https://github.com/saravanakumardb1/learning_ai_common_plat \
+  --url $REGISTRATION_URL \
  --token $RUNNER_TOKEN \
  --name hostinger-bytelyst-1 \
  --labels self-hosted,linux,x64,hostinger,bytelyst \
@@ -189,7 +253,17 @@ sudo -u gha-runner bash -c 'wc -c < ~/.gitea_npm_token && stat -c "%a %U:%G" ~/.

 ## 5. Smoke test (basic — runner picks up jobs)

-Create branch `runner/smoke` in `learning_ai_common_plat` with this file:
+Create branch `runner/smoke` in `learning_ai_common_plat` with the workflow below.
+
+```bash
+cd ~/code/mygh/learning_ai_common_plat
+git checkout -b runner/smoke
+mkdir -p .github/workflows
+# (paste workflow below into .github/workflows/runner-smoke.yml)
+git add .github/workflows/runner-smoke.yml
+git commit -m "ci: add self-hosted runner smoke-test workflow"
+git push origin runner/smoke
+```

 ```yaml
 # .github/workflows/runner-smoke.yml
@@ -447,18 +521,134 @@ The runner's `GITHUB_TOKEN` (provided by GitHub Actions automatically) is scoped

 ---

-## 8. Scaling to more repos later
+## 8. Scaling to all 20+ repos — GitHub Organization migration

-A single runner installation can serve multiple repos **only if** registered at org level. For your personal-account setup:
+A single self-hosted runner can serve all 20+ `@bytelyst/*` repos **only if** registered at **GitHub Organization level**. The personal-account path (repo-level registration) doesn't scale beyond 1–3 repos.

-- **Recommended:** Move all 20+ repos to a free GitHub organization. Register the runner once at org level. Single runner serves everyone.
-- **Workaround for now:** Add the same runner to additional repos by re-running `config.sh` with each repo's URL and a fresh token (creates separate registrations sharing the same physical binary). Acceptable up to 2–3 repos.
+### Why migrate to an org

-Recommend evaluating the org migration before scaling beyond 2 actively-publishing repos.
+| Concern                          | Personal account today                     | Under an org                                       |
+| -------------------------------- | ------------------------------------------ | -------------------------------------------------- |
+| Self-hosted runner reuse         | One registration per repo                  | One registration covers all org repos              |
+| Secrets management               | Per-repo (duplicated)                      | Org-level secrets inherited by all repos           |
+| Visibility                       | Per-repo Actions tabs (no cross-repo view) | Org-level Actions dashboard across all repos       |
+| Permissions / team collaboration | Limited                                    | Teams, code owners, etc.                           |
+| Cost                             | Free                                       | Free for unlimited public + private repos          |
+| Move cost                        | —                                          | ~1–2 hours total for 20 repos (mostly automatable) |
+
+### Migration steps (do these BEFORE Step 2 if going org-level from day 1, because the registration token must be issued for the org)
+
+```bash
+# 1. Create the org via the GitHub UI:
+#    https://github.com/organizations/plan
+#    Choose "Free" plan. Suggested name: bytelyst-platform (or whatever fits).
+
+# 2. Transfer each repo to the org (one-time, preserves all history + issues + stars)
+for repo in learning_ai_common_plat learning_ai_clock learning_ai_notes \
+            learning_ai_flowmonk learning_ai_trails learning_ai_jarvis_jr \
+            learning_ai_fastgap learning_ai_peakpulse learning_ai_efforise \
+            learning_ai_auth_app learning_voice_ai_agent learning_multimodal_memory_agents \
+            learning_ai_local_memory_gpt learning_ai_local_llms learning_ai_talk2obsidian \
+            learning_ai_mac_tooling learning_ai_productivity_web learning_ai_smart_auth; do
+  echo "Transferring $repo..."
+  gh api -X POST "/repos/saravanakumardb1/$repo/transfer" -f new_owner="<your-org-name>"
+done
+
+# 3. Update your local clones to point to the new owner
+#    (run on each machine, in each repo dir)
+cd ~/code/mygh/<repo>
+git remote set-url origin https://github.com/<your-org-name>/<repo>.git
+```
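The per-clone `set-url` step can be scripted rather than repeated by hand. A minimal sketch, assuming GitHub remotes in either HTTPS or SSH form; `new_remote_url` is a hypothetical helper, and the commented loop assumes every clone sits directly under `~/code/mygh`.

```bash
# new_remote_url OLD_URL NEW_OWNER
# Hypothetical helper: prints OLD_URL with the GitHub owner segment replaced.
# Handles https://github.com/<owner>/<repo>(.git) and git@github.com:<owner>/<repo>(.git).
new_remote_url() {
  local url="$1" new_owner="$2"
  printf '%s\n' "$url" | sed -E "s#(github\.com[:/])[^/]+/#\1${new_owner}/#"
}

# Usage (assumed layout; review the printed URLs before dropping the echo):
# for dir in ~/code/mygh/*/; do
#   old=$(git -C "$dir" remote get-url origin) || continue
#   echo git -C "$dir" remote set-url origin "$(new_remote_url "$old" "<your-org-name>")"
# done
```

The loop rewrites every GitHub remote it finds, so clones that are not being transferred should be excluded first.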
+
+GitHub automatically sets up redirects from the old URLs, so external links won't break immediately — but you should update CI references, README badges, and any inter-repo URL references.
+
+### After migration
+
+- Get a runner registration token at the **org level**:
+  ```bash
+  gh api -X POST /orgs/<your-org-name>/actions/runners/registration-token --jq .token
+  ```
+- Use the org URL in Step 3's `config.sh` (Option B above).
+- The runner now picks up jobs from any repo in the org that targets `runs-on: [self-hosted, hostinger, bytelyst]`.
+
+### Workflow propagation across 20+ repos
+
+Once the runner is org-level, the next problem is propagating the `publish-packages.yml` workflow file to every repo that publishes packages. Two strategies:
+
+1. **Reusable workflow** (preferred) — define `publish-packages.yml` once as a `workflow_call` reusable workflow in `learning_ai_common_plat/.github/workflows/`, then each consuming repo has a tiny stub that calls it.
+2. **Per-repo copy maintained by a sync script** — follow the same pattern as the existing `sync-npmrc.sh` in `scripts/`. Less elegant but works fine for a small repo count.
+
+Workflow propagation is out of scope for this prompt; deliver it as a separate follow-up prompt.

 ---

-## 9. Deliverables — report back to the human
+## 9. Monitoring + observability — how to track this runner
+
+Runner state can be tracked at four levels: three on GitHub and one on the host.
+
+### a. Per-repo (or per-org) workflow runs
+
+`https://github.com/<owner>/<repo>/actions` (or `/orgs/<org>/actions/` after migration) shows every workflow run with live-streaming logs. The "Set up job" step always logs:
+
+```
+Runner name: 'hostinger-bytelyst-1'
+Runner group name: 'Default'
+Machine name: 'hostinger-vm'
+```
+
+This is how you confirm the right runner picked up the job.
+
+### b. Runner pool health
+
+- **Repo level:** `Settings → Actions → Runners`
+- **Org level:** `Org settings → Actions → Runners`
+
+Shows: status (`Idle` / `Active` / `Offline`), labels, OS, last connection time. This is where you debug "is my runner alive?".
+
+### c. Scripted monitoring via `gh` CLI
+
+```bash
+# Watch a specific run live
+gh run watch <run-id> --repo <owner>/<repo>   # omit <run-id> to pick a run interactively
+
+# List recent runs
+gh run list --repo <owner>/<repo> --limit 10
+
+# View finished run with full logs
+gh run view <run-id> --log --repo <owner>/<repo>
+
+# List runners + their status (admin scope required)
+gh api /repos/<owner>/<repo>/actions/runners \
+  --jq '.runners[] | {name, status, busy, labels: [.labels[].name]}'
+
+# Or at org level:
+gh api /orgs/<org>/actions/runners \
+  --jq '.runners[] | {name, status, busy}'
+```
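The runner listing above can feed a simple health check. A minimal sketch, assuming the runners API reports `status` as `online` or `offline`; `report_offline_runners` is a hypothetical helper that reads one `name status` pair per line.

```bash
# Reads "name status" lines on stdin, prints each offline runner,
# and exits 1 if any runner is offline (0 otherwise).
# report_offline_runners is a hypothetical helper name.
report_offline_runners() {
  awk '$2 == "offline" { print "OFFLINE: " $1; bad = 1 } END { exit bad + 0 }'
}

# Usage (needs an authenticated gh with admin access to the org or repo):
# gh api /orgs/<org>/actions/runners \
#   --jq '.runners[] | "\(.name) \(.status)"' | report_offline_runners
```

The non-zero exit status makes the check usable from cron or another scheduler that alerts on failure.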
+
+### d. Host-side observability (on the Hostinger VM)
+
+```bash
+SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'
+
+# Live tail
+sudo journalctl -u "$SVC_NAME" -f
+
+# Last 100 lines
+sudo journalctl -u "$SVC_NAME" -n 100 --no-pager
+
+# Per-run diagnostic logs
+ls -la /home/gha-runner/actions-runner/_diag/
+
+# Current systemd state
+sudo systemctl status "$SVC_NAME"
+```
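The hardcoded `SVC_NAME` above only fits the repo-level registration. A minimal sketch for deriving it, assuming the unit name follows the pattern visible in this doc, `actions.runner.<scope>.<runner-name>.service`, with `/` in the scope replaced by `-`. The org-level form is an assumption, and long names may be shortened by the installer, so confirm on the host with `systemctl list-units 'actions.runner.*'`.

```bash
# runner_service_name SCOPE RUNNER_NAME
# Hypothetical helper. SCOPE is "<owner>/<repo>" for repo-level registration
# or "<org>" for org-level registration (the org form is an assumption).
runner_service_name() {
  local scope runner="$2"
  scope=$(printf '%s' "$1" | tr '/' '-')
  printf 'actions.runner.%s.%s.service\n' "$scope" "$runner"
}

# Usage:
# SVC_NAME=$(runner_service_name saravanakumardb1/learning_ai_common_plat hostinger-bytelyst-1)
# sudo systemctl status "$SVC_NAME"
```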
+
+Use host-side logs when the runner shows "Offline" in GitHub UI but the VM is reachable — typically a daemon crash, expired registration, or network blip.
+
+---
+
+## 10. Deliverables — report back to the human

 When complete:

@@ -480,7 +670,7 @@ When complete:

 ---

-## 10. Guardrails
+## 11. Guardrails

 - **Do not** run the runner as root.
 - **Do not** persist the GitHub registration token to disk — memory only.
@@ -492,7 +682,7 @@ When complete:

 ---

-## 11. Rollback
+## 12. Rollback

 ```bash
 SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'
@@ -510,7 +700,7 @@ sudo userdel -r gha-runner

 ---

-## 12. Follow-up prompts (separate tasks)
+## 13. Follow-up prompts (separate tasks)

 Once this runner is verified end-to-end, the next prompts to issue:

@@ -520,9 +710,11 @@ Once this runner is verified end-to-end, the next prompts to issue:

 ---

-## 13. Questions to ask the human BEFORE starting if anything is ambiguous
+## 14. Questions to ask the human BEFORE starting if anything is ambiguous

-- "Which GitHub repo am I registering this runner for? (default: `saravanakumardb1/learning_ai_common_plat`)"
+- "Are we registering this runner at **repo level** (one repo only) or **org level** (after migrating 20+ repos to a GitHub org)? See §1 and §8."
+- "If org-level: what is the org name? Has the migration in §8 already happened?"
+- "If repo-level: which repo am I registering for? (default: `saravanakumardb1/learning_ai_common_plat`)"
 - "Is Docker required on the runner — i.e., does any planned workflow run `docker` commands? (default: no, only Gitea uses Docker)"
 - "What user currently owns `~/.gitea_npm_token` on this VM? (pre-flight check #9 will tell us)"
 - "Do you have a runner registration token, or should I fetch one via `gh api`?"