From 6bf15eae7ab446d10bb66e66591651da91d50d30 Mon Sep 17 00:00:00 2001
From: saravanakumardb1
Date: Sun, 24 May 2026 18:04:50 -0700
Subject: [PATCH] =?UTF-8?q?docs(devops):=20Hostinger=20runner=20prompt=20v?=
 =?UTF-8?q?2=20=E2=80=94=20org=20migration=20+=20monitoring=20+=20hardenin?=
 =?UTF-8?q?g?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the missing pieces revealed during review:

§1 Multi-repo registration decision — choose repo-level vs org-level
   up-front. Default doc remains repo-level, but explicitly calls out
   org-level as the scaling path for 20+ repos.

§2 Pre-flight check additions:
   - Arch detection (x86_64 / aarch64) before downloading runner tarball
   - github.com + objects.githubusercontent.com reachability check
   - gh CLI auth status check (must be saravanakumardb1)

§4 Installation hardening:
   - Step 1 is now idempotent (getent guards on useradd/usermod)
   - Step 3 queries latest runner version via gh api (no more stale pin)
   - Step 3 includes SHA256 verification of the downloaded tarball
     against the release-notes manifest, with explicit STOP-if-mismatch
   - Step 3 has REGISTRATION_URL var with commented Option A/B for
     repo-level vs org-level scope

§5 Smoke test — added explicit git checkout/add/commit/push commands
   for creating the runner/smoke branch (was implicit before).

§8 (renamed) — comprehensive org migration guide:
   - Side-by-side table: personal account today vs under-an-org
   - Bash loop to transfer all 18 repos via gh api
   - git remote set-url commands for each local clone
   - Post-migration org-level registration token fetch
   - Workflow propagation strategies (reusable workflow vs sync script)

§9 (new) — Monitoring + observability:
   - GitHub Actions tab per-repo + per-org workflow views
   - Runner pool health (Settings → Actions → Runners) at repo + org level
   - gh CLI commands for scripted monitoring (run watch, list, view, runners)
   - Host-side journalctl + _diag/ inspection commands

§14 Questions — updated to ask about scope (repo vs org) first.

Section numbering shifted by +1 from §9 onward to make room for the
new Monitoring section.
---
 docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md | 254 ++++++++++++++++---
 1 file changed, 223 insertions(+), 31 deletions(-)

diff --git a/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md b/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md
index a4fa0815..0c35880d 100644
--- a/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md
+++ b/docs/devops/HOSTINGER_GITHUB_RUNNER_SETUP.md
@@ -9,10 +9,10 @@
 
 Set up a GitHub Actions self-hosted runner on the Hostinger VM that can:
 
-1. Receive workflow triggers from `saravanakumardb1/learning_ai_common_plat` (and, later, all `@bytelyst/*` repos).
-2. Build `@bytelyst/*` npm packages from a tagged release.
+1. Receive workflow triggers from **all 20+ `@bytelyst/*` repos** (see §8 for the org-vs-repo registration decision).
+2. Build `@bytelyst/*` npm packages from tagged releases.
 3. Publish them to **the local Gitea instance on this VM** (`http://localhost:3300/api/packages/bytelyst/npm/`).
-4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via a separate `bytelyst-sync` script described in a follow-up prompt).
+4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via the separate `bytelyst-sync` script described in a follow-up prompt).
 
 Self-hosted on Hostinger beats GitHub-hosted runners because:
 
@@ -20,6 +20,17 @@ Self-hosted on Hostinger beats GitHub-hosted runners because:
 - Gitea is on `localhost` from this VM → zero-latency publish, no public TLS needed.
 - VM is always on; runner is reachable indefinitely.
 
+### Multi-repo registration decision
+
+**Decide before Step 2** whether this runner serves a single repo or all 20+:
+
+| Approach | Registration scope | Use when |
+| ------------------------------------ | ------------------------- | -------------------------------------------------------------------------------------------------------------- |
+| **Repo-level** (default in this doc) | Single repo URL | You're validating the runner first, or you only have 1–2 repos publishing packages |
+| **Org-level** (recommended at scale) | A GitHub Organization URL | You've migrated 20+ repos under one org — see §8 for migration steps. One registration, all org repos eligible |
+
+The install steps are identical — only the `--url` flag in `config.sh` (Step 3) changes.
+
 ---
 
 ## 2. Pre-flight checks (run first, do not skip)
@@ -41,18 +52,34 @@ pnpm --version 2>/dev/null || echo "pnpm not installed (will install in Step 5)"
 # 5. Confirm gh CLI exists
 gh --version 2>/dev/null || echo "gh CLI not installed (will install in Step 5)"
 
-# 6. Disk free
+# 6a. Detect architecture (used for runner tarball selection in Step 3)
+ARCH=$(uname -m)
+case "$ARCH" in
+  x86_64) export RUNNER_ARCH="linux-x64";;
+  aarch64) export RUNNER_ARCH="linux-arm64";;
+  *) echo "Unsupported arch: $ARCH — STOP and report"; ;;
+esac
+echo "Will install runner for: $RUNNER_ARCH"
+
+# 6b. Disk free
 df -h / # Need ~5 GB headroom
 
 # 7. Confirm no existing runner
 ls -la ~/actions-runner 2>/dev/null && echo "Runner dir exists — STOP and confirm with human" || echo "No existing runner"
 
-# 8. Confirm github.com reachable
-curl -s -o /dev/null -w "%{http_code}\n" https://api.github.com/ # Expected: 200
+# 8. Confirm github.com reachable (both API and download CDN)
+curl -s -o /dev/null -w "api.github.com: %{http_code}\n" https://api.github.com/
+curl -s -o /dev/null -w "objects.githubusercontent.com: %{http_code}\n" -L \
+  https://objects.githubusercontent.com/ # runner tarball download host
+# Expected: 200 / 403 (403 from CDN root is normal; what matters is non-network-error)
 
 # 9. Confirm the Gitea token file exists somewhere on this VM
 sudo find /home /root -maxdepth 3 -name ".gitea_npm_token" 2>/dev/null | head -5
 # Expected: at least one path. Note the owning user — needed in Step 6.
+
+# 10. Confirm gh CLI is auth'd as saravanakumardb1 (needed for registration token)
+gh auth status 2>&1 | grep -E "Logged in to|saravanakumardb1" | head -5
+# If not logged in as saravanakumardb1, run: gh auth login (and pick saravanakumardb1)
 ```
 
 If any check fails or surprises you, **stop and report back** before proceeding.
@@ -75,11 +102,22 @@ If any check fails or surprises you, **stop and report back** before proceeding.
 
 ## 4. Installation
 
-### Step 1 — Create the dedicated runner user
+### Step 1 — Create the dedicated runner user (idempotent)
 
 ```bash
-sudo useradd -m -s /bin/bash gha-runner
-sudo usermod -aG docker gha-runner # only if any workflow uses Docker
+if ! getent passwd gha-runner >/dev/null; then
+  sudo useradd -m -s /bin/bash gha-runner
+  echo "Created gha-runner user"
+else
+  echo "gha-runner user already exists — skipping useradd"
+fi
+
+# Add to docker group only if docker is on this host
+if getent group docker >/dev/null; then
+  sudo usermod -aG docker gha-runner
+  echo "Added gha-runner to docker group"
+fi
+
 id gha-runner
 ```
 
@@ -102,28 +140,54 @@ read -s RUNNER_TOKEN # paste, press Enter (no echo)
 
 ### Step 3 — Download, verify, configure the runner
 
 ```bash
-sudo -iu gha-runner bash <<'EOF'
+# Query the latest runner version (don't hardcode)
+LATEST=$(gh api /repos/actions/runner/releases/latest --jq '.tag_name' | sed 's/^v//')
+echo "Latest runner version: $LATEST"
+# As of writing this doc: 2.319.1. If LATEST is wildly different, STOP and confirm with human.
+
+sudo -iu gha-runner bash <`. If you can't verify the SHA, STOP and report — don't run unverified binaries.
+
+Register the runner. **Choose the URL based on your scope decision (§1):**
 
 ```bash
+# OPTION A — repo-level (default during validation)
+REGISTRATION_URL="https://github.com/saravanakumardb1/learning_ai_common_plat"
+
+# OPTION B — org-level (once you've migrated to an org per §8)
+# REGISTRATION_URL="https://github.com/"
+
 sudo -u gha-runner -E -i bash -c "
   cd ~/actions-runner && \
   ./config.sh \
-    --url https://github.com/saravanakumardb1/learning_ai_common_plat \
+    --url $REGISTRATION_URL \
     --token $RUNNER_TOKEN \
     --name hostinger-bytelyst-1 \
     --labels self-hosted,linux,x64,hostinger,bytelyst \
@@ -189,7 +253,17 @@ sudo -u gha-runner bash -c 'wc -c < ~/.gitea_npm_token && stat -c "%a %U:%G" ~/.
 
 ## 5. Smoke test (basic — runner picks up jobs)
 
-Create branch `runner/smoke` in `learning_ai_common_plat` with this file:
+Create branch `runner/smoke` in `learning_ai_common_plat` with the workflow below.
+ +```bash +cd ~/code/mygh/learning_ai_common_plat +git checkout -b runner/smoke +mkdir -p .github/workflows +# (paste workflow below into .github/workflows/runner-smoke.yml) +git add .github/workflows/runner-smoke.yml +git commit -m "ci: add self-hosted runner smoke-test workflow" +git push origin runner/smoke +``` ```yaml # .github/workflows/runner-smoke.yml @@ -447,18 +521,134 @@ The runner's `GITHUB_TOKEN` (provided by GitHub Actions automatically) is scoped --- -## 8. Scaling to more repos later +## 8. Scaling to all 20+ repos — GitHub Organization migration -A single runner installation can serve multiple repos **only if** registered at org level. For your personal-account setup: +A single self-hosted runner can serve all 20+ `@bytelyst/*` repos **only if** registered at **GitHub Organization level**. The personal-account path (repo-level registration) doesn't scale beyond 1–3 repos. -- **Recommended:** Move all 20+ repos to a free GitHub organization. Register the runner once at org level. Single runner serves everyone. -- **Workaround for now:** Add the same runner to additional repos by re-running `config.sh` with each repo's URL and a fresh token (creates separate registrations sharing the same physical binary). Acceptable up to 2–3 repos. +### Why migrate to an org -Recommend evaluating the org migration before scaling beyond 2 actively-publishing repos. +| Concern | Personal account today | Under an org | +| -------------------------------- | ------------------------------------------ | -------------------------------------------------- | +| Self-hosted runner reuse | One registration per repo | One registration covers all org repos | +| Secrets management | Per-repo (duplicated) | Org-level secrets inherited by all repos | +| Visibility | Per-repo Actions tabs (no cross-repo view) | Org-level Actions dashboard across all repos | +| Permissions / team collaboration | Limited | Teams, code owners, etc. 
| +| Cost | Free | Free for unlimited public + private repos | +| Move cost | — | ~1–2 hours total for 20 repos (mostly automatable) | + +### Migration steps (do these BEFORE Step 3 if going org-level from day 1) + +```bash +# 1. Create the org via the GitHub UI: +# https://github.com/organizations/plan +# Choose "Free" plan. Suggested name: bytelyst-platform (or whatever fits). + +# 2. Transfer each repo to the org (one-time, preserves all history + issues + stars) +for repo in learning_ai_common_plat learning_ai_clock learning_ai_notes \ + learning_ai_flowmonk learning_ai_trails learning_ai_jarvis_jr \ + learning_ai_fastgap learning_ai_peakpulse learning_ai_efforise \ + learning_ai_auth_app learning_voice_ai_agent learning_multimodal_memory_agents \ + learning_ai_local_memory_gpt learning_ai_local_llms learning_ai_talk2obsidian \ + learning_ai_mac_tooling learning_ai_productivity_web learning_ai_smart_auth; do + echo "Transferring $repo..." + gh api -X POST "/repos/saravanakumardb1/$repo/transfer" -f new_owner="" +done + +# 3. Update your local clones to point to the new owner +# (run on each machine, in each repo dir) +cd ~/code/mygh/ +git remote set-url origin https://github.com//.git +``` + +GitHub automatically sets up redirects from the old URLs, so external links won't break immediately — but you should update CI references, README badges, and any inter-repo URL references. + +### After migration + +- Get a runner registration token at the **org level**: + ```bash + gh api -X POST /orgs//actions/runners/registration-token --jq .token + ``` +- Use the org URL in Step 3's `config.sh` (Option B above). +- The runner now picks up jobs from any repo in the org that targets `runs-on: [self-hosted, hostinger, bytelyst]`. + +### Workflow propagation across 20+ repos + +Once the runner is org-level, the next problem is propagating the `publish-packages.yml` workflow file to every repo that publishes packages. Two strategies: + +1. 
**Reusable workflow** (preferred) — define `publish-packages.yml` once as a `workflow_call` reusable workflow in `learning_ai_common_plat/.github/workflows/`, then each consuming repo has a tiny stub that calls it. +2. **Per-repo copy maintained by a sync script** — follow the same pattern as the existing `sync-npmrc.sh` in `scripts/`. Less elegant but works fine for a small repo count. + +Deliver as a separate follow-up prompt. --- -## 9. Deliverables — report back to the human +## 9. Monitoring + observability — how to track this runner + +The GitHub Actions tab tracks runner state at three levels: + +### a. Per-repo (or per-org) workflow runs + +`https://github.com///actions` (or `/orgs//actions/` after migration) shows every workflow run with live-streaming logs. The "Set up job" step always logs: + +``` +Runner name: 'hostinger-bytelyst-1' +Runner group name: 'Default' +Machine name: 'hostinger-vm' +``` + +This is how you confirm the right runner picked up the job. + +### b. Runner pool health + +- **Repo level:** `Settings → Actions → Runners` +- **Org level:** `Org settings → Actions → Runners` + +Shows: status (`Idle` / `Active` / `Offline`), labels, OS, last connection time. This is where you debug "is my runner alive?". + +### c. Scripted monitoring via `gh` CLI + +```bash +# Watch a specific run live +gh run watch --repo / + +# List recent runs +gh run list --repo / --limit 10 + +# View finished run with full logs +gh run view --log --repo / + +# List runners + their status (admin scope required) +gh api /repos///actions/runners \ + --jq '.runners[] | {name, status, busy, labels: [.labels[].name]}' + +# Or at org level: +gh api /orgs//actions/runners \ + --jq '.runners[] | {name, status, busy}' +``` + +### d. 
Host-side observability (on the Hostinger VM) + +```bash +SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service' + +# Live tail +sudo journalctl -u "$SVC_NAME" -f + +# Last 100 lines +sudo journalctl -u "$SVC_NAME" -n 100 --no-pager + +# Per-run diagnostic logs +ls -la /home/gha-runner/actions-runner/_diag/ + +# Current systemd state +sudo systemctl status "$SVC_NAME" +``` + +Use host-side logs when the runner shows "Offline" in GitHub UI but the VM is reachable — typically a daemon crash, expired registration, or network blip. + +--- + +## 10. Deliverables — report back to the human When complete: @@ -480,7 +670,7 @@ When complete: --- -## 10. Guardrails +## 11. Guardrails - **Do not** run the runner as root. - **Do not** persist the GitHub registration token to disk — memory only. @@ -492,7 +682,7 @@ When complete: --- -## 11. Rollback +## 12. Rollback ```bash SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service' @@ -510,7 +700,7 @@ sudo userdel -r gha-runner --- -## 12. Follow-up prompts (separate tasks) +## 13. Follow-up prompts (separate tasks) Once this runner is verified end-to-end, the next prompts to issue: @@ -520,9 +710,11 @@ Once this runner is verified end-to-end, the next prompts to issue: --- -## 13. Questions to ask the human BEFORE starting if anything is ambiguous +## 14. Questions to ask the human BEFORE starting if anything is ambiguous -- "Which GitHub repo am I registering this runner for? (default: `saravanakumardb1/learning_ai_common_plat`)" +- "Are we registering this runner at **repo level** (one repo only) or **org level** (after migrating 20+ repos to a GitHub org)? See §1 and §8." +- "If org-level: what is the org name? Has the migration in §8 already happened?" +- "If repo-level: which repo am I registering for? 
(default: `saravanakumardb1/learning_ai_common_plat`)" - "Is Docker required on the runner — i.e., does any planned workflow run `docker` commands? (default: no, only Gitea uses Docker)" - "What user currently owns `~/.gitea_npm_token` on this VM? (pre-flight check #9 will tell us)" - "Do you have a runner registration token, or should I fetch one via `gh api`?"
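
Review note on the last §14 question (trailing text, not part of any hunk): fetching the registration token via `gh api` differs only in the API path between the two scopes. The sketch below is a minimal illustration under stated assumptions, not something the patch defines. `registration_endpoint` is a hypothetical helper name, and `example-org` is a stand-in for whatever org name is eventually chosen. The org path is the one quoted in §8; the repo path is assumed to be the matching repo-level endpoint of the GitHub REST API.

```shell
# Hypothetical helper (not in the patch): print the registration-token
# API path for the chosen scope. $1 = repo|org, $2 = owner/repo or org name.
registration_endpoint() {
  case "$1" in
    repo) printf '/repos/%s/actions/runners/registration-token\n' "$2" ;;
    org)  printf '/orgs/%s/actions/runners/registration-token\n' "$2" ;;
    *)    echo "unknown scope: $1 (expected repo or org)" >&2; return 1 ;;
  esac
}

# Building the path is side-effect free, so it can be checked before any API call.
registration_endpoint repo saravanakumardb1/learning_ai_common_plat
registration_endpoint org example-org   # example-org is a placeholder
```

With an authenticated `gh` CLI, the token fetch would then look something like `RUNNER_TOKEN=$(gh api -X POST "$(registration_endpoint repo saravanakumardb1/learning_ai_common_plat)" --jq .token)`, which keeps the token in a shell variable rather than on disk, in line with the guardrails.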