docs(devops): Hostinger runner prompt v2 — org migration + monitoring + hardening

Adds the missing pieces revealed during review:

§1 Multi-repo registration decision — choose repo-level vs org-level
   up-front. Default doc remains repo-level, but explicitly calls out
   org-level as the scaling path for 20+ repos.

§2 Pre-flight check additions:
  - Arch detection (x86_64 / aarch64) before downloading runner tarball
  - github.com + objects.githubusercontent.com reachability check
  - gh CLI auth status check (must be saravanakumardb1)

§4 Installation hardening:
  - Step 1 is now idempotent (getent guards on useradd/usermod)
  - Step 3 queries latest runner version via gh api (no more stale pin)
  - Step 3 includes SHA256 verification of the downloaded tarball
    against the release-notes manifest, with explicit STOP-if-mismatch
  - Step 3 has REGISTRATION_URL var with commented Option A/B for
    repo-level vs org-level scope

§5 Smoke test — added explicit git checkout/add/commit/push commands
   for creating the runner/smoke branch (was implicit before).

§8 (renamed) — comprehensive org migration guide:
  - Side-by-side table: personal account today vs under-an-org
  - Bash loop to transfer all 18 repos via gh api
  - git remote set-url commands for each local clone
  - Post-migration org-level registration token fetch
  - Workflow propagation strategies (reusable workflow vs sync script)

§9 (new) — Monitoring + observability:
  - GitHub Actions tab per-repo + per-org workflow views
  - Runner pool health (Settings → Actions → Runners) at repo + org level
  - gh CLI commands for scripted monitoring (run watch, list, view, runners)
  - Host-side journalctl + _diag/ inspection commands

§14 Questions — updated to ask about scope (repo vs org) first.

Section numbering shifted by +1 from §9 onward to make room for the
new Monitoring section.
saravanakumardb1 2026-05-24 18:04:50 -07:00
parent d5e0778af6
commit 6bf15eae7a


@ -9,10 +9,10 @@
Set up a GitHub Actions self-hosted runner on the Hostinger VM that can:
1. Receive workflow triggers from `saravanakumardb1/learning_ai_common_plat` (and, later, all `@bytelyst/*` repos).
2. Build `@bytelyst/*` npm packages from a tagged release.
1. Receive workflow triggers from **all 20+ `@bytelyst/*` repos** (see the registration decision later in this section, and §8 for the org migration).
2. Build `@bytelyst/*` npm packages from tagged releases.
3. Publish them to **the local Gitea instance on this VM** (`http://localhost:3300/api/packages/bytelyst/npm/`).
4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via a separate `bytelyst-sync` script described in a follow-up prompt).
4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via the separate `bytelyst-sync` script described in a follow-up prompt).
Self-hosted on Hostinger beats GitHub-hosted runners because:
@ -20,6 +20,17 @@ Self-hosted on Hostinger beats GitHub-hosted runners because:
- Gitea is on `localhost` from this VM → zero-latency publish, no public TLS needed.
- VM is always on; runner is reachable indefinitely.
### Multi-repo registration decision
**Decide before Step 2** whether this runner serves a single repo or all 20+:
| Approach | Registration scope | Use when |
| ------------------------------------ | ------------------------- | -------------------------------------------------------------------------------------------------------------- |
| **Repo-level** (default in this doc) | Single repo URL           | You're validating the runner first, or you only have 1-2 repos publishing packages                             |
| **Org-level** (recommended at scale) | A GitHub Organization URL | You've migrated 20+ repos under one org — see §8 for migration steps. One registration, all org repos eligible |
The install steps are identical — only the `--url` flag in `config.sh` (Step 3) changes.
---
## 2. Pre-flight checks (run first, do not skip)
@ -41,18 +52,34 @@ pnpm --version 2>/dev/null || echo "pnpm not installed (will install in Step 5)"
# 5. Confirm gh CLI exists
gh --version 2>/dev/null || echo "gh CLI not installed (will install in Step 5)"
# 6. Disk free
# 6a. Detect architecture (used for runner tarball selection in Step 3)
ARCH=$(uname -m)
case "$ARCH" in
  x86_64)  export RUNNER_ARCH="linux-x64" ;;
  aarch64) export RUNNER_ARCH="linux-arm64" ;;
  *)       unset RUNNER_ARCH; echo "Unsupported arch: $ARCH. STOP and report" >&2 ;;
esac
echo "Will install runner for: ${RUNNER_ARCH:-none (unsupported arch)}"
# 6b. Disk free
df -h / # Need ~5 GB headroom
# 7. Confirm no existing runner
ls -la ~/actions-runner 2>/dev/null && echo "Runner dir exists — STOP and confirm with human" || echo "No existing runner"
# 8. Confirm github.com reachable
curl -s -o /dev/null -w "%{http_code}\n" https://api.github.com/ # Expected: 200
# 8. Confirm github.com reachable (both API and download CDN)
curl -s -o /dev/null -w "api.github.com: %{http_code}\n" https://api.github.com/
curl -s -o /dev/null -w "objects.githubusercontent.com: %{http_code}\n" -L \
https://objects.githubusercontent.com/ # runner tarball download host
# Expected: 200 from api.github.com; 200 or 403 from the CDN root (403 is normal there).
# What matters is that curl receives an HTTP response rather than failing at the network level.
# 9. Confirm the Gitea token file exists somewhere on this VM
sudo find /home /root -maxdepth 3 -name ".gitea_npm_token" 2>/dev/null | head -5
# Expected: at least one path. Note the owning user — needed in Step 6.
# 10. Confirm gh CLI is auth'd as saravanakumardb1 (needed for registration token)
gh auth status 2>&1 | grep -E "Logged in to|saravanakumardb1" | head -5
# If not logged in as saravanakumardb1, run: gh auth login (and pick saravanakumardb1)
```
If any check fails or surprises you, **stop and report back** before proceeding.
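
The reachability checks in step 8 print one HTTP status per host. Below is a minimal sketch of how that status might be interpreted. The function name `classify_http_code` is hypothetical and not part of the original prompt, and it assumes curl's usual behavior of printing `000` for `%{http_code}` when no HTTP response was received.

```bash
# Hypothetical helper: interpret the status printed by `curl -w "%{http_code}"`.
classify_http_code() {
  local code="$1"
  case "$code" in
    000|"")          echo "network-error" ;;  # no HTTP response at all (DNS, TLS, firewall)
    [1-5][0-9][0-9]) echo "reachable" ;;      # any HTTP answer shows the host is reachable
    *)               echo "unknown" ;;        # anything that is not a 3-digit status
  esac
}

# Usage (performs a real request, so run it on the VM):
#   code=$(curl -s -o /dev/null -w "%{http_code}" https://api.github.com/)
#   [ "$(classify_http_code "$code")" = "reachable" ] || echo "STOP and report"
```

Treating a 403 as "reachable" matches the note in the pre-flight block: the CDN root refusing the request still shows that the network path works.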
@ -75,11 +102,22 @@ If any check fails or surprises you, **stop and report back** before proceeding.
## 4. Installation
### Step 1 — Create the dedicated runner user
### Step 1 — Create the dedicated runner user (idempotent)
```bash
sudo useradd -m -s /bin/bash gha-runner
sudo usermod -aG docker gha-runner # only if any workflow uses Docker
if ! getent passwd gha-runner >/dev/null; then
sudo useradd -m -s /bin/bash gha-runner
echo "Created gha-runner user"
else
echo "gha-runner user already exists — skipping useradd"
fi
# Add to docker group only if docker is on this host
if getent group docker >/dev/null; then
sudo usermod -aG docker gha-runner
echo "Added gha-runner to docker group"
fi
id gha-runner
```
@ -102,28 +140,54 @@ read -s RUNNER_TOKEN # paste, press Enter (no echo)
### Step 3 — Download, verify, configure the runner
```bash
sudo -iu gha-runner bash <<'EOF'
# Query the latest runner version (don't hardcode)
LATEST=$(gh api /repos/actions/runner/releases/latest --jq '.tag_name' | sed 's/^v//')
echo "Latest runner version: $LATEST"
# As of writing this doc: 2.319.1. If LATEST is wildly different, STOP and confirm with human.
sudo -iu gha-runner bash <<EOF
mkdir -p ~/actions-runner && cd ~/actions-runner
RUNNER_VERSION="2.319.1"
RUNNER_VERSION="$LATEST"
RUNNER_ARCH="${RUNNER_ARCH}" # from pre-flight 6a
TARBALL="actions-runner-\${RUNNER_ARCH}-\${RUNNER_VERSION}.tar.gz"
# Download
curl -fSL -o "actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz" \
"https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
# Download tarball
curl -fSL -o "\$TARBALL" \
"https://github.com/actions/runner/releases/download/v\${RUNNER_VERSION}/\$TARBALL"
# Verify SHA against the GitHub release page (https://github.com/actions/runner/releases/tag/v2.319.1).
# If the sha doesn't match, STOP and report.
# Download checksum manifest and verify (GitHub publishes SHA256 alongside each release)
EXPECTED_SHA=\$(gh api /repos/actions/runner/releases/tags/v\${RUNNER_VERSION} \
--jq ".body" | grep -oE "\b[0-9a-f]{64}\s+\$TARBALL\b" | awk '{print \$1}')
ACTUAL_SHA=\$(sha256sum "\$TARBALL" | awk '{print \$1}')
tar xzf "./actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
if [ "\$EXPECTED_SHA" != "\$ACTUAL_SHA" ]; then
echo "FAIL: SHA mismatch"
echo " Expected: \$EXPECTED_SHA"
echo " Actual: \$ACTUAL_SHA"
exit 1
fi
echo "PASS: tarball SHA verified"
tar xzf "./\$TARBALL"
EOF
```
Register:
Note: if `gh api` parsing of the SHA from the release body fails (GitHub sometimes changes release-note formatting), fall back to the official hashes page:
`https://github.com/actions/runner/releases/tag/v<version>`. If you can't verify the SHA, STOP and report — don't run unverified binaries.
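
If the hash has to be copied by hand from the release page, the comparison can still be scripted. The sketch below is an illustration only: `verify_sha256` is a hypothetical helper, not something the runner or `gh` provides, and it assumes GNU `sha256sum` is available on the VM. It refuses an empty or malformed expected hash so that a failed extraction can never be mistaken for a pass.

```bash
# Hypothetical helper: compare a file's SHA256 with an expected value.
# Returns 0 only when the expected hash is well-formed and matches the file.
verify_sha256() {
  local file="$1" expected="$2" actual
  # An empty or malformed expected hash means extraction failed; never treat it as a pass.
  if ! printf '%s' "$expected" | grep -Eq '^[0-9a-f]{64}$'; then
    echo "FAIL: expected SHA is missing or malformed" >&2
    return 2
  fi
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" != "$expected" ]; then
    echo "FAIL: SHA mismatch (expected $expected, actual $actual)" >&2
    return 1
  fi
  echo "PASS: $file SHA verified"
}

# Usage, with the hash pasted from the release page (placeholder shown):
#   verify_sha256 "$TARBALL" "<sha256-from-release-page>" || exit 1
```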
Register the runner. **Choose the URL based on your scope decision (§1):**
```bash
# OPTION A — repo-level (default during validation)
REGISTRATION_URL="https://github.com/saravanakumardb1/learning_ai_common_plat"
# OPTION B — org-level (once you've migrated to an org per §8)
# REGISTRATION_URL="https://github.com/<your-org-name>"
sudo -u gha-runner -E -i bash -c "
cd ~/actions-runner && \
./config.sh \
--url https://github.com/saravanakumardb1/learning_ai_common_plat \
--url $REGISTRATION_URL \
--token $RUNNER_TOKEN \
--name hostinger-bytelyst-1 \
--labels self-hosted,linux,x64,hostinger,bytelyst \
@ -189,7 +253,17 @@ sudo -u gha-runner bash -c 'wc -c < ~/.gitea_npm_token && stat -c "%a %U:%G" ~/.
## 5. Smoke test (basic — runner picks up jobs)
Create branch `runner/smoke` in `learning_ai_common_plat` with this file:
Create branch `runner/smoke` in `learning_ai_common_plat` with the workflow below.
```bash
cd ~/code/mygh/learning_ai_common_plat
git checkout -b runner/smoke
mkdir -p .github/workflows
# (paste workflow below into .github/workflows/runner-smoke.yml)
git add .github/workflows/runner-smoke.yml
git commit -m "ci: add self-hosted runner smoke-test workflow"
git push origin runner/smoke
```
```yaml
# .github/workflows/runner-smoke.yml
@ -447,18 +521,134 @@ The runner's `GITHUB_TOKEN` (provided by GitHub Actions automatically) is scoped
---
## 8. Scaling to more repos later
## 8. Scaling to all 20+ repos — GitHub Organization migration
A single runner installation can serve multiple repos **only if** registered at org level. For your personal-account setup:
A single self-hosted runner can serve all 20+ `@bytelyst/*` repos **only if** registered at **GitHub Organization level**. The personal-account path (repo-level registration) doesn't scale beyond 1-3 repos.
- **Recommended:** Move all 20+ repos to a free GitHub organization. Register the runner once at org level. Single runner serves everyone.
- **Workaround for now:** Add the same runner to additional repos by re-running `config.sh` with each repo's URL and a fresh token (creates separate registrations sharing the same physical binary). Acceptable up to 2-3 repos.
### Why migrate to an org
Recommend evaluating the org migration before scaling beyond 2 actively-publishing repos.
| Concern | Personal account today | Under an org |
| -------------------------------- | ------------------------------------------ | -------------------------------------------------- |
| Self-hosted runner reuse | One registration per repo | One registration covers all org repos |
| Secrets management | Per-repo (duplicated) | Org-level secrets inherited by all repos |
| Visibility | Per-repo Actions tabs (no cross-repo view) | Org-level Actions dashboard across all repos |
| Permissions / team collaboration | Limited | Teams, code owners, etc. |
| Cost | Free | Free for unlimited public + private repos |
| Move cost                        | n/a                                        | ~1-2 hours total for 20 repos (mostly automatable) |
### Migration steps (do these BEFORE Step 3 if going org-level from day 1)
```bash
# 1. Create the org via the GitHub UI:
# https://github.com/organizations/plan
# Choose "Free" plan. Suggested name: bytelyst-platform (or whatever fits).
# 2. Transfer each repo to the org (one-time, preserves all history + issues + stars)
for repo in learning_ai_common_plat learning_ai_clock learning_ai_notes \
learning_ai_flowmonk learning_ai_trails learning_ai_jarvis_jr \
learning_ai_fastgap learning_ai_peakpulse learning_ai_efforise \
learning_ai_auth_app learning_voice_ai_agent learning_multimodal_memory_agents \
learning_ai_local_memory_gpt learning_ai_local_llms learning_ai_talk2obsidian \
learning_ai_mac_tooling learning_ai_productivity_web learning_ai_smart_auth; do
echo "Transferring $repo..."
gh api -X POST "/repos/saravanakumardb1/$repo/transfer" -f new_owner="<your-org-name>"
done
# 3. Update your local clones to point to the new owner
# (run on each machine, in each repo dir)
cd ~/code/mygh/<repo>
git remote set-url origin https://github.com/<your-org-name>/<repo>.git
```
GitHub automatically sets up redirects from the old URLs, so external links won't break immediately — but you should update CI references, README badges, and any inter-repo URL references.
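
Updating remotes by hand in 18 clones is error-prone, so the URL rewrite can be factored into a small function. This is a sketch under assumptions: `rewrite_remote_owner` is a hypothetical helper, it only handles the two common GitHub URL shapes (HTTPS and SSH), and the usage loop assumes all clones live under `~/code/mygh/` as in the example above.

```bash
# Hypothetical helper: swap the owner segment of a GitHub remote URL.
# URLs that do not belong to the old owner are returned unchanged.
rewrite_remote_owner() {
  local url="$1" old="$2" new="$3" prefix rest
  case "$url" in
    "https://github.com/$old/"*)
      prefix="https://github.com/$old/"
      rest=${url#"$prefix"}
      echo "https://github.com/$new/$rest" ;;
    "git@github.com:$old/"*)
      prefix="git@github.com:$old/"
      rest=${url#"$prefix"}
      echo "git@github.com:$new/$rest" ;;
    *)
      echo "$url" ;;
  esac
}

# Usage (modifies local git config, so review before running):
#   for dir in ~/code/mygh/*/; do
#     current=$(git -C "$dir" remote get-url origin) || continue
#     git -C "$dir" remote set-url origin \
#       "$(rewrite_remote_owner "$current" saravanakumardb1 "<your-org-name>")"
#   done
```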
### After migration
- Get a runner registration token at the **org level**:
```bash
gh api -X POST /orgs/<your-org-name>/actions/runners/registration-token --jq .token
```
- Use the org URL in Step 3's `config.sh` (Option B above).
- The runner now picks up jobs from any repo in the org that targets `runs-on: [self-hosted, hostinger, bytelyst]`.
### Workflow propagation across 20+ repos
Once the runner is org-level, the next problem is propagating the `publish-packages.yml` workflow file to every repo that publishes packages. Two strategies:
1. **Reusable workflow** (preferred) — define `publish-packages.yml` once as a `workflow_call` reusable workflow in `learning_ai_common_plat/.github/workflows/`, then each consuming repo has a tiny stub that calls it.
2. **Per-repo copy maintained by a sync script** — follow the same pattern as the existing `sync-npmrc.sh` in `scripts/`. Less elegant but works fine for a small repo count.
Workflow propagation is not part of this task; it will be handled in a separate follow-up prompt.
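
For illustration only, strategy 2 could be sketched as a small shell function that copies one canonical workflow file into several local checkouts. The name `sync_workflow` is hypothetical, and because the contents of `sync-npmrc.sh` are not shown here, this does not claim to mirror it. Committing and pushing the copied files is deliberately left out.

```bash
# Hypothetical sketch: copy one canonical workflow file into several repo checkouts.
# Usage: sync_workflow <canonical-file> <repo-dir>...
sync_workflow() {
  local src="$1"
  shift
  local repo dest
  for repo in "$@"; do
    dest="$repo/.github/workflows/$(basename "$src")"
    # Skip checkouts that already carry an identical copy.
    if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
      echo "unchanged: $dest"
      continue
    fi
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    echo "updated:   $dest"
  done
}

# Example call (paths are assumptions):
#   sync_workflow ~/code/mygh/learning_ai_common_plat/.github/workflows/publish-packages.yml \
#     ~/code/mygh/learning_ai_clock ~/code/mygh/learning_ai_notes
```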
---
## 9. Deliverables — report back to the human
## 9. Monitoring + observability — how to track this runner
Runner state can be tracked at four levels, three on the GitHub side and one on the host:
### a. Per-repo (or per-org) workflow runs
`https://github.com/<owner>/<repo>/actions` (or `/orgs/<org>/actions/` after migration) shows every workflow run with live-streaming logs. The "Set up job" step always logs:
```
Runner name: 'hostinger-bytelyst-1'
Runner group name: 'Default'
Machine name: 'hostinger-vm'
```
This is how you confirm the right runner picked up the job.
### b. Runner pool health
- **Repo level:** `Settings → Actions → Runners`
- **Org level:** `Org settings → Actions → Runners`
Shows: status (`Idle` / `Active` / `Offline`), labels, OS, last connection time. This is where you debug "is my runner alive?".
### c. Scripted monitoring via `gh` CLI
```bash
# Watch a specific run live
gh run watch --repo <owner>/<repo>
# List recent runs
gh run list --repo <owner>/<repo> --limit 10
# View finished run with full logs
gh run view <run-id> --log --repo <owner>/<repo>
# List runners + their status (admin scope required)
gh api /repos/<owner>/<repo>/actions/runners \
--jq '.runners[] | {name, status, busy, labels: [.labels[].name]}'
# Or at org level:
gh api /orgs/<org>/actions/runners \
--jq '.runners[] | {name, status, busy}'
```
### d. Host-side observability (on the Hostinger VM)
```bash
SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'
# Live tail
sudo journalctl -u "$SVC_NAME" -f
# Last 100 lines
sudo journalctl -u "$SVC_NAME" -n 100 --no-pager
# Per-run diagnostic logs
ls -la /home/gha-runner/actions-runner/_diag/
# Current systemd state
sudo systemctl status "$SVC_NAME"
```
Use host-side logs when the runner shows "Offline" in GitHub UI but the VM is reachable — typically a daemon crash, expired registration, or network blip.
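
To notice an offline runner without opening the UI, the runner list from `gh api` can be filtered in a script. The sketch below assumes the REST API reports each runner's `status` as `online` or `offline` (the UI labels `Idle` and `Active` both correspond to `online`). `offline_runners` is a hypothetical helper that reads tab-separated `name` and `status` pairs on stdin.

```bash
# Hypothetical helper: read "name<TAB>status" lines on stdin
# and print the names of runners whose status is not "online".
offline_runners() {
  awk -F '\t' '$2 != "online" { print $1 }'
}

# Usage on a machine with an authenticated gh CLI (repo-level endpoint shown):
#   gh api /repos/<owner>/<repo>/actions/runners \
#     --jq '.runners[] | [.name, .status] | @tsv' | offline_runners
```

Empty output means every registered runner is online; any printed name is a cue to check the host-side logs above.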
---
## 10. Deliverables — report back to the human
When complete:
@ -480,7 +670,7 @@ When complete:
---
## 10. Guardrails
## 11. Guardrails
- **Do not** run the runner as root.
- **Do not** persist the GitHub registration token to disk — memory only.
@ -492,7 +682,7 @@ When complete:
---
## 11. Rollback
## 12. Rollback
```bash
SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'
@ -510,7 +700,7 @@ sudo userdel -r gha-runner
---
## 12. Follow-up prompts (separate tasks)
## 13. Follow-up prompts (separate tasks)
Once this runner is verified end-to-end, the next prompts to issue:
@ -520,9 +710,11 @@ Once this runner is verified end-to-end, the next prompts to issue:
---
## 13. Questions to ask the human BEFORE starting if anything is ambiguous
## 14. Questions to ask the human BEFORE starting if anything is ambiguous
- "Which GitHub repo am I registering this runner for? (default: `saravanakumardb1/learning_ai_common_plat`)"
- "Are we registering this runner at **repo level** (one repo only) or **org level** (after migrating 20+ repos to a GitHub org)? See §1 and §8."
- "If org-level: what is the org name? Has the migration in §8 already happened?"
- "If repo-level: which repo am I registering for? (default: `saravanakumardb1/learning_ai_common_plat`)"
- "Is Docker required on the runner — i.e., does any planned workflow run `docker` commands? (default: no, only Gitea uses Docker)"
- "What user currently owns `~/.gitea_npm_token` on this VM? (pre-flight check #9 will tell us)"
- "Do you have a runner registration token, or should I fetch one via `gh api`?"