docs(devops): Hostinger runner prompt v2 — org migration + monitoring + hardening

Adds the missing pieces revealed during review:

§1 Multi-repo registration decision — choose repo-level vs org-level
   up-front. Default doc remains repo-level, but explicitly calls out
   org-level as the scaling path for 20+ repos.

§2 Pre-flight check additions:
   - Arch detection (x86_64 / aarch64) before downloading runner tarball
   - github.com + objects.githubusercontent.com reachability check
   - gh CLI auth status check (must be saravanakumardb1)

§4 Installation hardening:
   - Step 1 is now idempotent (getent guards on useradd/usermod)
   - Step 3 queries latest runner version via gh api (no more stale pin)
   - Step 3 includes SHA256 verification of the downloaded tarball
     against the release-notes manifest, with explicit STOP-if-mismatch
   - Step 3 has REGISTRATION_URL var with commented Option A/B for
     repo-level vs org-level scope

§5 Smoke test — added explicit git checkout/add/commit/push commands
   for creating the runner/smoke branch (was implicit before).

§8 (renamed) — comprehensive org migration guide:
   - Side-by-side table: personal account today vs under-an-org
   - Bash loop to transfer all 18 repos via gh api
   - git remote set-url commands for each local clone
   - Post-migration org-level registration token fetch
   - Workflow propagation strategies (reusable workflow vs sync script)

§9 (new) — Monitoring + observability:
   - GitHub Actions tab per-repo + per-org workflow views
   - Runner pool health (Settings → Actions → Runners) at repo + org level
   - gh CLI commands for scripted monitoring (run watch, list, view, runners)
   - Host-side journalctl + _diag/ inspection commands

§14 Questions — updated to ask about scope (repo vs org) first.

Section numbering shifted by +1 from §9 onward to make room for the
new Monitoring section.
parent d5e0778af6
commit 6bf15eae7a
@@ -9,10 +9,10 @@

Set up a GitHub Actions self-hosted runner on the Hostinger VM that can:

1. Receive workflow triggers from `saravanakumardb1/learning_ai_common_plat` (and, later, all `@bytelyst/*` repos).
2. Build `@bytelyst/*` npm packages from a tagged release.
1. Receive workflow triggers from **all 20+ `@bytelyst/*` repos** (see §1 for the org-vs-repo registration decision and §8 for the migration steps).
2. Build `@bytelyst/*` npm packages from tagged releases.
3. Publish them to **the local Gitea instance on this VM** (`http://localhost:3300/api/packages/bytelyst/npm/`).
4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via a separate `bytelyst-sync` script described in a follow-up prompt).
4. Upload the same tarballs as **GitHub Release assets** so a corp-network Mac can sync them into its own local Gitea (via the separate `bytelyst-sync` script described in a follow-up prompt).

Self-hosted on Hostinger beats GitHub-hosted runners because:

@@ -20,6 +20,17 @@ Self-hosted on Hostinger beats GitHub-hosted runners because:
- Gitea is on `localhost` from this VM → zero-latency publish, no public TLS needed.
- VM is always on; runner is reachable indefinitely.

### Multi-repo registration decision

**Decide before Step 2** whether this runner serves a single repo or all 20+:

| Approach | Registration scope | Use when |
| ------------------------------------ | ------------------------- | -------------------------------------------------------------------------------------------------------------- |
| **Repo-level** (default in this doc) | Single repo URL | You're validating the runner first, or you only have 1–2 repos publishing packages |
| **Org-level** (recommended at scale) | A GitHub Organization URL | You've migrated 20+ repos under one org — see §8 for migration steps. One registration, all org repos eligible |

The install steps are identical — only the `--url` flag in `config.sh` (Step 3) changes.
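
Because only the `--url` value (and the matching token endpoint) differs between the two scopes, the decision can be captured in a pair of small helpers. This is a minimal sketch: `registration_url` and `registration_token_endpoint` are hypothetical names introduced here for illustration and are not part of the runbook's scripts. The owner, repo, and org values are supplied by the caller.

```bash
# Hypothetical helpers: map the scope decision to the values Step 3 needs.
# Usage: registration_url repo <owner> <repo>   or   registration_url org <org>
registration_url() {
  case "$1" in
    repo) printf 'https://github.com/%s/%s\n' "$2" "$3" ;;
    org)  printf 'https://github.com/%s\n' "$2" ;;
    *)    echo "unknown scope: $1 (expected repo or org)" >&2; return 1 ;;
  esac
}

# REST path that issues a runner registration token for the same scope.
registration_token_endpoint() {
  case "$1" in
    repo) printf '/repos/%s/%s/actions/runners/registration-token\n' "$2" "$3" ;;
    org)  printf '/orgs/%s/actions/runners/registration-token\n' "$2" ;;
    *)    echo "unknown scope: $1 (expected repo or org)" >&2; return 1 ;;
  esac
}
```

With these defined, `REGISTRATION_URL=$(registration_url repo saravanakumardb1 learning_ai_common_plat)` would reproduce Option A, and `gh api -X POST "$(registration_token_endpoint org <your-org-name>)" --jq .token` would fetch an org-level token once the org exists.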

---

## 2. Pre-flight checks (run first, do not skip)
@@ -41,18 +52,34 @@ pnpm --version 2>/dev/null || echo "pnpm not installed (will install in Step 5)"
# 5. Confirm gh CLI exists
gh --version 2>/dev/null || echo "gh CLI not installed (will install in Step 5)"

# 6. Disk free
# 6a. Detect architecture (used for runner tarball selection in Step 3)
ARCH=$(uname -m)
case "$ARCH" in
  x86_64)  export RUNNER_ARCH="linux-x64";;
  aarch64) export RUNNER_ARCH="linux-arm64";;
  *)       echo "Unsupported arch: $ARCH. STOP and report" >&2; unset RUNNER_ARCH ;;
esac
# Prints UNSUPPORTED instead of an empty value when no tarball matches this host
echo "Will install runner for: ${RUNNER_ARCH:-UNSUPPORTED}"

# 6b. Disk free
df -h /    # Need ~5 GB headroom

# 7. Confirm no existing runner
ls -la ~/actions-runner 2>/dev/null && echo "Runner dir exists — STOP and confirm with human" || echo "No existing runner"

# 8. Confirm github.com reachable
curl -s -o /dev/null -w "%{http_code}\n" https://api.github.com/ # Expected: 200
# 8. Confirm github.com reachable (both API and download CDN)
curl -s -o /dev/null -w "api.github.com: %{http_code}\n" https://api.github.com/
curl -s -o /dev/null -w "objects.githubusercontent.com: %{http_code}\n" -L \
  https://objects.githubusercontent.com/ # runner tarball download host
# Expected: 200 / 403 (403 from CDN root is normal; what matters is non-network-error)

# 9. Confirm the Gitea token file exists somewhere on this VM
sudo find /home /root -maxdepth 3 -name ".gitea_npm_token" 2>/dev/null | head -5
# Expected: at least one path. Note the owning user — needed in Step 6.

# 10. Confirm gh CLI is auth'd as saravanakumardb1 (needed for registration token)
gh auth status 2>&1 | grep -E "Logged in to|saravanakumardb1" | head -5
# If not logged in as saravanakumardb1, run: gh auth login (and pick saravanakumardb1)
```

If any check fails or surprises you, **stop and report back** before proceeding.
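
Check 8 accepts any HTTP response and treats only a network-level failure as a problem. curl reports `%{http_code}` as `000` when it never received an HTTP response, so that rule can be scripted. The sketch below is illustrative only; `classify_http_code` is a hypothetical helper name, not something the runbook defines.

```bash
# Hypothetical helper: classify the value printed by curl's %{http_code}.
# curl prints 000 when no HTTP response arrived (DNS failure, timeout,
# TLS error, connection refused), which is the case check 8 cares about.
classify_http_code() {
  case "$1" in
    000|'')          echo "unreachable"; return 1 ;;
    [1-5][0-9][0-9]) echo "reachable";   return 0 ;;
    *)               echo "unexpected";  return 2 ;;
  esac
}
```

A possible use: capture the code with `code=$(curl -s -o /dev/null -w '%{http_code}' https://api.github.com/)` and then run `classify_http_code "$code" || echo "STOP and report"`. A 403 from the CDN root still counts as reachable under this rule.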
@@ -75,11 +102,22 @@ If any check fails or surprises you, **stop and report back** before proceeding.

## 4. Installation

### Step 1 — Create the dedicated runner user
### Step 1 — Create the dedicated runner user (idempotent)

```bash
sudo useradd -m -s /bin/bash gha-runner
sudo usermod -aG docker gha-runner # only if any workflow uses Docker
if ! getent passwd gha-runner >/dev/null; then
  sudo useradd -m -s /bin/bash gha-runner
  echo "Created gha-runner user"
else
  echo "gha-runner user already exists — skipping useradd"
fi

# Add to docker group only if docker is on this host
if getent group docker >/dev/null; then
  sudo usermod -aG docker gha-runner
  echo "Added gha-runner to docker group"
fi

id gha-runner
```

@@ -102,28 +140,54 @@ read -s RUNNER_TOKEN # paste, press Enter (no echo)
### Step 3 — Download, verify, configure the runner

```bash
sudo -iu gha-runner bash <<'EOF'
# Query the latest runner version (don't hardcode)
LATEST=$(gh api /repos/actions/runner/releases/latest --jq '.tag_name' | sed 's/^v//')
echo "Latest runner version: $LATEST"
# As of writing this doc: 2.319.1. If LATEST is wildly different, STOP and confirm with human.

sudo -iu gha-runner bash <<EOF
mkdir -p ~/actions-runner && cd ~/actions-runner
RUNNER_VERSION="2.319.1"
RUNNER_VERSION="$LATEST"
RUNNER_ARCH="${RUNNER_ARCH}" # from pre-flight 6a
TARBALL="actions-runner-\${RUNNER_ARCH}-\${RUNNER_VERSION}.tar.gz"

# Download
curl -fSL -o "actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz" \
  "https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
# Download tarball
curl -fSL -o "\$TARBALL" \
  "https://github.com/actions/runner/releases/download/v\${RUNNER_VERSION}/\$TARBALL"

# Verify SHA against the GitHub release page (https://github.com/actions/runner/releases/tag/v2.319.1).
# If the sha doesn't match, STOP and report.
# Download checksum manifest and verify (GitHub publishes SHA256 alongside each release)
EXPECTED_SHA=\$(gh api /repos/actions/runner/releases/tags/v\${RUNNER_VERSION} \
  --jq ".body" | grep -oE "\b[0-9a-f]{64}\s+\$TARBALL\b" | awk '{print \$1}')
ACTUAL_SHA=\$(sha256sum "\$TARBALL" | awk '{print \$1}')

tar xzf "./actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz"
if [ "\$EXPECTED_SHA" != "\$ACTUAL_SHA" ]; then
  echo "FAIL: SHA mismatch"
  echo " Expected: \$EXPECTED_SHA"
  echo " Actual: \$ACTUAL_SHA"
  exit 1
fi
echo "PASS: tarball SHA verified"

tar xzf "./\$TARBALL"
EOF
```

Register:
Note: if `gh api` parsing of the SHA from the release body fails (GitHub sometimes changes release-note formatting), fall back to the official hashes page:
`https://github.com/actions/runner/releases/tag/v<version>`. If you can't verify the SHA, STOP and report — don't run unverified binaries.
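
If the `grep` in Step 3 comes back empty, one possible cause is that the release notes list the file name before the hash rather than after it. A more tolerant parse, sketched below, reads the release body on stdin and accepts either order on the line that names the tarball. `extract_sha` and `shas_match` are hypothetical helper names, and the layout of the release body is an assumption to confirm against the actual release page. The sketch also assumes the body is fetched by a user whose `gh` is authenticated.

```bash
# Hypothetical helper: print the first 64-hex-digit string found on the
# line(s) of stdin that mention the given tarball name. Works whether the
# hash comes before or after the file name. Prints nothing if not found.
extract_sha() {
  grep -F -- "$1" | grep -oE '[0-9a-f]{64}' | head -n 1
}

# Hypothetical helper: succeed only when both hashes are non-empty and equal,
# so an unparsed (empty) expected hash is never treated as a match.
shas_match() {
  [ -n "$1" ] && [ -n "$2" ] && [ "$1" = "$2" ]
}
```

In Step 3 this could look like `EXPECTED_SHA=$(gh api "/repos/actions/runner/releases/tags/v$RUNNER_VERSION" --jq .body | extract_sha "$TARBALL")` followed by `shas_match "$EXPECTED_SHA" "$ACTUAL_SHA" || exit 1`. The STOP-if-unverified rule above still applies when `extract_sha` prints nothing.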

Register the runner. **Choose the URL based on your scope decision (§1):**

```bash
# OPTION A — repo-level (default during validation)
REGISTRATION_URL="https://github.com/saravanakumardb1/learning_ai_common_plat"

# OPTION B — org-level (once you've migrated to an org per §8)
# REGISTRATION_URL="https://github.com/<your-org-name>"

sudo -u gha-runner -E -i bash -c "
  cd ~/actions-runner && \
  ./config.sh \
    --url https://github.com/saravanakumardb1/learning_ai_common_plat \
    --url $REGISTRATION_URL \
    --token $RUNNER_TOKEN \
    --name hostinger-bytelyst-1 \
    --labels self-hosted,linux,x64,hostinger,bytelyst \
@@ -189,7 +253,17 @@ sudo -u gha-runner bash -c 'wc -c < ~/.gitea_npm_token && stat -c "%a %U:%G" ~/.

## 5. Smoke test (basic — runner picks up jobs)

Create branch `runner/smoke` in `learning_ai_common_plat` with this file:
Create branch `runner/smoke` in `learning_ai_common_plat` with the workflow below.

```bash
cd ~/code/mygh/learning_ai_common_plat
git checkout -b runner/smoke
mkdir -p .github/workflows
# (paste workflow below into .github/workflows/runner-smoke.yml)
git add .github/workflows/runner-smoke.yml
git commit -m "ci: add self-hosted runner smoke-test workflow"
git push origin runner/smoke
```

```yaml
# .github/workflows/runner-smoke.yml
@@ -447,18 +521,134 @@ The runner's `GITHUB_TOKEN` (provided by GitHub Actions automatically) is scoped

---

## 8. Scaling to more repos later
## 8. Scaling to all 20+ repos — GitHub Organization migration

A single runner installation can serve multiple repos **only if** registered at org level. For your personal-account setup:
A single self-hosted runner can serve all 20+ `@bytelyst/*` repos **only if** registered at **GitHub Organization level**. The personal-account path (repo-level registration) doesn't scale beyond 1–3 repos.

- **Recommended:** Move all 20+ repos to a free GitHub organization. Register the runner once at org level. Single runner serves everyone.
- **Workaround for now:** Add the same runner to additional repos by re-running `config.sh` with each repo's URL and a fresh token (creates separate registrations sharing the same physical binary). Acceptable up to 2–3 repos.
### Why migrate to an org

Recommend evaluating the org migration before scaling beyond 2 actively-publishing repos.
| Concern | Personal account today | Under an org |
| -------------------------------- | ------------------------------------------ | -------------------------------------------------- |
| Self-hosted runner reuse | One registration per repo | One registration covers all org repos |
| Secrets management | Per-repo (duplicated) | Org-level secrets inherited by all repos |
| Visibility | Per-repo Actions tabs (no cross-repo view) | Org-level Actions dashboard across all repos |
| Permissions / team collaboration | Limited | Teams, code owners, etc. |
| Cost | Free | Free for unlimited public + private repos |
| Move cost | — | ~1–2 hours total for 20 repos (mostly automatable) |

### Migration steps (do these BEFORE Step 3 if going org-level from day 1)

```bash
# 1. Create the org via the GitHub UI:
#    https://github.com/organizations/plan
#    Choose "Free" plan. Suggested name: bytelyst-platform (or whatever fits).

# 2. Transfer each repo to the org (one-time, preserves all history + issues + stars)
for repo in learning_ai_common_plat learning_ai_clock learning_ai_notes \
            learning_ai_flowmonk learning_ai_trails learning_ai_jarvis_jr \
            learning_ai_fastgap learning_ai_peakpulse learning_ai_efforise \
            learning_ai_auth_app learning_voice_ai_agent learning_multimodal_memory_agents \
            learning_ai_local_memory_gpt learning_ai_local_llms learning_ai_talk2obsidian \
            learning_ai_mac_tooling learning_ai_productivity_web learning_ai_smart_auth; do
  echo "Transferring $repo..."
  gh api -X POST "/repos/saravanakumardb1/$repo/transfer" -f new_owner="<your-org-name>"
done

# 3. Update your local clones to point to the new owner
#    (run on each machine, in each repo dir)
cd ~/code/mygh/<repo>
git remote set-url origin https://github.com/<your-org-name>/<repo>.git
```

GitHub automatically sets up redirects from the old URLs, so external links won't break immediately — but you should update CI references, README badges, and any inter-repo URL references.
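
Updating each clone by hand gets tedious across 18 repos. The helper below is a sketch that rewrites only the owner segment of a GitHub remote URL, for both the HTTPS and SSH forms. `rewrite_owner` is a hypothetical name introduced here, and the old owner is assumed to be `saravanakumardb1` as in the transfer loop.

```bash
# Hypothetical helper: swap the owner segment of a github.com remote URL.
# Usage: rewrite_owner <url> <old-owner> <new-owner>
# Handles https://github.com/<owner>/<repo>.git and git@github.com:<owner>/<repo>.git.
# URLs that do not belong to <old-owner> are printed unchanged.
rewrite_owner() {
  printf '%s\n' "$1" | sed "s#github\.com\([:/]\)$2/#github.com\1$3/#"
}
```

Inside a clone, this could be applied as `git remote set-url origin "$(rewrite_owner "$(git remote get-url origin)" saravanakumardb1 <your-org-name>)"`. Printing the rewritten URL first and checking it by eye is a reasonable precaution before changing the remote.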

### After migration

- Get a runner registration token at the **org level**:
  ```bash
  gh api -X POST /orgs/<your-org-name>/actions/runners/registration-token --jq .token
  ```
- Use the org URL in Step 3's `config.sh` (Option B above).
- The runner now picks up jobs from any repo in the org that targets `runs-on: [self-hosted, hostinger, bytelyst]`.

### Workflow propagation across 20+ repos

Once the runner is org-level, the next problem is propagating the `publish-packages.yml` workflow file to every repo that publishes packages. Two strategies:

1. **Reusable workflow** (preferred) — define `publish-packages.yml` once as a `workflow_call` reusable workflow in `learning_ai_common_plat/.github/workflows/`, then each consuming repo has a tiny stub that calls it.
2. **Per-repo copy maintained by a sync script** — follow the same pattern as the existing `sync-npmrc.sh` in `scripts/`. Less elegant but works fine for a small repo count.

Deliver as a separate follow-up prompt.
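
Until that follow-up exists, strategy 2 can be pictured roughly as follows. This is a sketch only: `sync_workflow` is a hypothetical function, it does not reproduce the actual `sync-npmrc.sh`, and it assumes every repo is cloned side by side under one parent directory such as `~/code/mygh`.

```bash
# Hypothetical sketch of strategy 2: copy one canonical workflow file into
# each local clone. Usage: sync_workflow <source-file> <clones-root> <repo>...
# Repos that are not cloned under <clones-root> are skipped, not created.
sync_workflow() {
  local src="$1" root="$2" repo
  shift 2
  [ -f "$src" ] || { echo "missing source workflow: $src" >&2; return 1; }
  for repo in "$@"; do
    if [ ! -d "$root/$repo" ]; then
      echo "skip $repo (no clone under $root)"
      continue
    fi
    mkdir -p "$root/$repo/.github/workflows"
    cp "$src" "$root/$repo/.github/workflows/$(basename "$src")"
    echo "synced $repo"
  done
}
```

The function only copies files. Committing and pushing the change in each repo is deliberately left as a separate, reviewable step.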

---

## 9. Deliverables — report back to the human
## 9. Monitoring + observability — how to track this runner

Runner state can be tracked at four levels, from the GitHub UI down to the host:

### a. Per-repo (or per-org) workflow runs

`https://github.com/<owner>/<repo>/actions` (or `/orgs/<org>/actions/` after migration) shows every workflow run with live-streaming logs. The "Set up job" step always logs:

```
Runner name: 'hostinger-bytelyst-1'
Runner group name: 'Default'
Machine name: 'hostinger-vm'
```

This is how you confirm the right runner picked up the job.

### b. Runner pool health

- **Repo level:** `Settings → Actions → Runners`
- **Org level:** `Org settings → Actions → Runners`

Shows: status (`Idle` / `Active` / `Offline`), labels, OS, last connection time. This is where you debug "is my runner alive?".

### c. Scripted monitoring via `gh` CLI

```bash
# Watch a specific run live (omit <run-id> to pick one interactively)
gh run watch <run-id> --repo <owner>/<repo>

# List recent runs
gh run list --repo <owner>/<repo> --limit 10

# View finished run with full logs
gh run view <run-id> --log --repo <owner>/<repo>

# List runners + their status (admin scope required)
gh api /repos/<owner>/<repo>/actions/runners \
  --jq '.runners[] | {name, status, busy, labels: [.labels[].name]}'

# Or at org level:
gh api /orgs/<org>/actions/runners \
  --jq '.runners[] | {name, status, busy}'
```

### d. Host-side observability (on the Hostinger VM)

```bash
SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'

# Live tail
sudo journalctl -u "$SVC_NAME" -f

# Last 100 lines
sudo journalctl -u "$SVC_NAME" -n 100 --no-pager

# Per-run diagnostic logs
ls -la /home/gha-runner/actions-runner/_diag/

# Current systemd state
sudo systemctl status "$SVC_NAME"
```

Use host-side logs when the runner shows "Offline" in GitHub UI but the VM is reachable — typically a daemon crash, expired registration, or network blip.
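
The `SVC_NAME` shown above embeds the repo-level owner and repo, so it is likely to differ once the runner is re-registered at org level. Rather than hardcoding it, the name can be read back from the runner directory. The sketch below assumes that `svc.sh install` recorded the unit name in a `.service` file inside the runner directory, which is worth confirming on the VM with `ls -la`. `runner_service_name` is a hypothetical helper.

```bash
# Hypothetical helper: print the systemd unit name recorded in the runner
# directory. Assumes svc.sh wrote it to <runner-dir>/.service at install time.
runner_service_name() {
  local dir="${1:-/home/gha-runner/actions-runner}"
  if [ ! -r "$dir/.service" ]; then
    echo "no readable .service file in $dir (service not installed?)" >&2
    return 1
  fi
  # Drop any carriage return and keep only the first line of the file.
  tr -d '\r' < "$dir/.service" | head -n 1
}
```

A possible use is `SVC_NAME=$(runner_service_name) && sudo systemctl status "$SVC_NAME"`. If the `gha-runner` home directory is not readable by the current user, the file can be read with `sudo cat` instead.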

---

## 10. Deliverables — report back to the human

When complete:

@@ -480,7 +670,7 @@ When complete:

---

## 10. Guardrails
## 11. Guardrails

- **Do not** run the runner as root.
- **Do not** persist the GitHub registration token to disk — memory only.
@@ -492,7 +682,7 @@ When complete:

---

## 11. Rollback
## 12. Rollback

```bash
SVC_NAME='actions.runner.saravanakumardb1-learning_ai_common_plat.hostinger-bytelyst-1.service'
@@ -510,7 +700,7 @@ sudo userdel -r gha-runner

---

## 12. Follow-up prompts (separate tasks)
## 13. Follow-up prompts (separate tasks)

Once this runner is verified end-to-end, the next prompts to issue:

@@ -520,9 +710,11 @@ Once this runner is verified end-to-end, the next prompts to issue:

---

## 13. Questions to ask the human BEFORE starting if anything is ambiguous
## 14. Questions to ask the human BEFORE starting if anything is ambiguous

- "Which GitHub repo am I registering this runner for? (default: `saravanakumardb1/learning_ai_common_plat`)"
- "Are we registering this runner at **repo level** (one repo only) or **org level** (after migrating 20+ repos to a GitHub org)? See §1 and §8."
- "If org-level: what is the org name? Has the migration in §8 already happened?"
- "If repo-level: which repo am I registering for? (default: `saravanakumardb1/learning_ai_common_plat`)"
- "Is Docker required on the runner — i.e., does any planned workflow run `docker` commands? (default: no, only Gitea uses Docker)"
- "What user currently owns `~/.gitea_npm_token` on this VM? (pre-flight check #9 will tell us)"
- "Do you have a runner registration token, or should I fetch one via `gh api`?"
