Gitea Cloud VM Setup — Runbook
Status: Active runbook · Last verified: 2026-05-28
Use this when: You have provisioned a cloud VM (Azure / wherever), Gitea is installed and running on :3300, repos are cloned, and you need to wire the npm registry end-to-end with your laptop.
Assumes you're SSH'd into the VM (or running commands on the VM) and the
sibling learning_ai_common_plat repo is at ~/code/mygh/learning_ai_common_plat/
on both the VM and your laptop. Adjust paths as needed.
Prerequisites checklist
Before starting, confirm all of these on the VM:
# 1. Gitea container is running and healthy
sudo docker ps | grep gitea
curl -fsS http://localhost:3300/api/v1/version
# Expected: {"version":"1.X.X"}
# 2. Port 3300 is reachable from your laptop
# (run this FROM YOUR LAPTOP, not the VM)
# curl -fsS http://<VM_HOST>:3300/api/v1/version
# 3. Repos cloned on the VM
ls ~/code/mygh/learning_ai_common_plat
# Expected: packages/ services/ scripts/ ...
If any of these fail, fix them first. Common gotchas:
- Port 3300 blocked: Azure NSG → VM → Networking → "Add inbound port rule" → TCP 3300 from your home IP
- Gitea registry disabled: Edit app.ini inside /var/lib/gitea/conf/, add [packages]\nENABLED = true, then sudo docker restart gitea
- hostname resolves to localhost in the VM but you reach it via public DNS: note both; the API only needs localhost from inside the VM
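After a sudo docker restart gitea, the version endpoint may take a few seconds to answer again. The sketch below is one way to poll it instead of re-running curl by hand. The retry helper is a hypothetical name and not one of the repo scripts.

```shell
# Hypothetical helper (not part of scripts/gitea): run a command up to
# $1 times, sleeping $2 seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    # Sleep only if another attempt is coming.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage on the VM (needs a running Gitea, so left commented out):
# retry 10 3 curl -fsS http://localhost:3300/api/v1/version
```

The attempt count and delay are arbitrary; pick values that match how long the container takes to start on your VM.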
Step 1 — Create Gitea admin user (skip if you already have one)
Run on the VM:
# Check if any admin exists
sudo docker exec gitea gitea admin user list
# If empty (or you don't have admin creds), create one:
ADMIN_USER="gitea-admin"
ADMIN_PASS="$(openssl rand -base64 24 | tr -dc 'A-Za-z0-9' | head -c 24)"
echo " Admin password (save this!): $ADMIN_PASS"
sudo docker exec gitea gitea admin user create \
--username "$ADMIN_USER" \
--password "$ADMIN_PASS" \
--email "admin@bytelyst.local" \
--admin \
--must-change-password=false
SAVE the admin password somewhere safe (1Password / Bitwarden / macOS Keychain). You'll need it only for token rotation and bootstrapping; day-to-day work uses the npm token.
Optionally store in macOS Keychain on your laptop:
# Run on your LAPTOP (Mac)
security add-generic-password \
-s 'gitea-admin' \
-a 'gitea-admin' \
-w '<paste-the-password>'
Then scripts/gitea/token.sh rotate can auto-discover it later.
Step 2 — Create the npm owner user
The npm registry is namespaced by owner. The canonical owner is learning_ai_user.
Run on the VM:
ADMIN_USER="gitea-admin"
ADMIN_PASS="<paste-admin-password-from-step-1>"
NPM_USER="learning_ai_user"
NPM_PASS="$(openssl rand -base64 24 | tr -dc 'A-Za-z0-9' | head -c 24)"
# Create the user via admin API
curl -fsS -u "$ADMIN_USER:$ADMIN_PASS" \
-X POST "http://localhost:3300/api/v1/admin/users" \
-H 'Content-Type: application/json' \
-d "{
\"username\": \"$NPM_USER\",
\"email\": \"npm@bytelyst.local\",
\"password\": \"$NPM_PASS\",
\"must_change_password\": false
}"
echo ""
echo " NPM user '$NPM_USER' created"
echo " NPM password (needed only to mint tokens): $NPM_PASS"
Save NPM_PASS temporarily — you need it for Step 3. After that the npm
token replaces it for all day-to-day use.
If you see {"message":"user already exists"} — that's fine, skip ahead.
Use the existing NPM_PASS (or reset it via sudo docker exec gitea gitea admin user change-password --username learning_ai_user --password <new>).
Step 3 — Mint the npm token
Run on the VM:
NPM_USER="learning_ai_user"
NPM_PASS="<paste-from-step-2>"
TOKEN_NAME="npm-$(date +%Y%m%d-%H%M%S)-$(hostname -s)"
RESPONSE=$(curl -fsS -u "$NPM_USER:$NPM_PASS" \
-X POST "http://localhost:3300/api/v1/users/$NPM_USER/tokens" \
-H 'Content-Type: application/json' \
-d "{
\"name\": \"$TOKEN_NAME\",
\"scopes\": [\"write:package\", \"read:package\"]
}")
echo "$RESPONSE"
# Extract token (works for both newer "token" and older "sha1" field names)
TOKEN=$(echo "$RESPONSE" | grep -oE '"(sha1|token)":"[^"]+' | head -1 | sed 's/.*":"//')
echo ""
echo "════════════════════════════════════════"
echo " NPM TOKEN (copy this NOW):"
echo " $TOKEN"
echo "════════════════════════════════════════"
Copy the token immediately. Gitea never displays a token's secret value after the first response.
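Because the secret is shown only once, it can help to make the extraction step fail loudly when no token field is present (for example when the API returned an error body). The following is a minimal sketch of the same grep/sed approach wrapped in a function. extract_token is a hypothetical name, and the sample JSON in the usage line is made up for illustration.

```shell
# Hypothetical helper: read a token-creation response on stdin and print
# the value of the "sha1" or "token" field. Returns 1 if neither is found.
extract_token() {
  local tok
  # Tolerates optional whitespace around the colon.
  tok=$(grep -oE '"(sha1|token)"[[:space:]]*:[[:space:]]*"[^"]+' \
        | head -n 1 \
        | sed 's/.*"//')
  [ -n "$tok" ] || return 1
  printf '%s\n' "$tok"
}

# Usage with a made-up response body:
# printf '%s' '{"id":7,"name":"npm-x","sha1":"deadbeef"}' | extract_token
```

A field such as "token_last_eight" is not picked up, because the pattern requires the closing quote directly after sha1 or token.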
Step 4 — Wire the token into your laptop
Run on your LAPTOP (not the VM):
# Paste the token from Step 3 here:
TOKEN="<paste-token-from-step-3>"
# Write to the home-network-specific token file
echo -n "$TOKEN" > ~/.gitea_npm_token_home
chmod 600 ~/.gitea_npm_token_home
# Also update the catch-all file (used by some scripts as fallback)
echo -n "$TOKEN" > ~/.gitea_npm_token
chmod 600 ~/.gitea_npm_token
ls -la ~/.gitea_npm_token*
Expected output: two files, both -rw-------, 40 chars.
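A quick sanity check before moving on can catch a truncated paste. The sketch below checks only what this runbook states (40 characters) plus one assumption: that the token is purely alphanumeric. looks_like_token is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the argument is exactly 40 characters.
# ASSUMPTION: tokens are alphanumeric; the runbook only states the length.
looks_like_token() {
  local t="$1"
  [ "${#t}" -eq 40 ] || return 1
  case "$t" in
    *[!A-Za-z0-9]*) return 1 ;;
  esac
  return 0
}

# Usage on the laptop:
# looks_like_token "$(cat ~/.gitea_npm_token_home)" && echo "token file looks sane"
```

A stray newline or space in the file makes the check fail, which is the case the echo -n in the commands above is meant to avoid.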
Step 5 — Tell switch-network.sh about the VM hostname
Run on your LAPTOP:
# Replace with your VM's public DNS or IP
echo "<VM_HOST>" > ~/.gitea_vm_host
# Example: echo "bytelyst-vm.eastus.cloudapp.azure.com" > ~/.gitea_vm_host
cat ~/.gitea_vm_host
Then refresh your shell environment so GITEA_NPM_HOST is exported:
source ~/.zshrc
echo "NETWORK=$NETWORK HOST=$GITEA_NPM_HOST OWNER=$GITEA_NPM_OWNER"
If you're on the home network, expected output:
NETWORK=home HOST=<VM_HOST> OWNER=learning_ai_user
If NETWORK=corp, the host will be localhost (SSH tunnel mode). That is
expected on the corp network; the VM workflow in this runbook assumes you are on the home network.
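switch-network.sh itself is not reproduced in this runbook, so the following is only a sketch of the mapping described above: home reads the host from ~/.gitea_vm_host, corp uses localhost. The function name resolve_gitea_host and its error handling are assumptions, not the script's actual code.

```shell
# Hypothetical sketch of the NETWORK -> host mapping described above.
# $1 = network name, $2 = host file (defaults to ~/.gitea_vm_host).
resolve_gitea_host() {
  local network="$1" host_file="${2:-$HOME/.gitea_vm_host}"
  case "$network" in
    corp)
      # SSH tunnel mode: the registry is reached through localhost.
      printf 'localhost\n'
      ;;
    home)
      [ -s "$host_file" ] || { echo "missing or empty: $host_file" >&2; return 1; }
      # Strip whitespace and newlines that echo or an editor may have added.
      tr -d '[:space:]' < "$host_file"
      printf '\n'
      ;;
    *)
      echo "unknown NETWORK: $network" >&2
      return 2
      ;;
  esac
}

# Usage:
# resolve_gitea_host "$NETWORK"
```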
Step 6 — Pre-flight verification
Run on your LAPTOP:
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/doctor.sh --probe @bytelyst/errors
Expected output:
✓ NETWORK=home
✓ GITEA_NPM_HOST=<VM_HOST>
✓ GITEA_NPM_OWNER=learning_ai_user
✓ Token consistent (env matches ~/.gitea_npm_token_home, 40 chars)
✓ Registry HTTP 200 on @bytelyst/errors
✗ @bytelyst/errors not found in registry (HTTP 404)
The 404 on @bytelyst/errors is EXPECTED at this point — the registry is
empty. We fix that in Step 7.
If you see any other failure (token rejected, registry unreachable, owner 404), debug before moving on:
| Failure | Likely cause | Fix |
|---|---|---|
| Registry unreachable | Port 3300 not open to your laptop | Open NSG inbound rule for TCP 3300 |
| Token rejected (HTTP 401) | Token typo or scope missing | Re-run Step 3 |
| Owner 'learning_ai_user' not found | Step 2 was skipped or used wrong name | Re-run Step 2 |
| DNS does not resolve | ~/.gitea_vm_host has typo | Re-check value |
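When doctor.sh is not at hand, the same triage can be approximated from a single HTTP status code. This is a rough sketch, not what doctor.sh does internally. diagnose_status is a hypothetical name, and the hints simply restate the failure table. curl reports 000 through --write-out when it could not connect or resolve the host.

```shell
# Hypothetical helper: map an HTTP status code (as printed by
# curl -w '%{http_code}') to a hint from the failure table.
diagnose_status() {
  case "$1" in
    200) echo "ok" ;;
    000) echo "registry unreachable or DNS failure: check NSG rule and ~/.gitea_vm_host" ;;
    401) echo "token rejected: re-run Step 3" ;;
    404) echo "owner or package not found: re-run Step 2, or publish first (Step 7)" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

# Usage on the laptop (needs network, so left commented out):
# code=$(curl -s -o /dev/null -w '%{http_code}' "http://$GITEA_NPM_HOST:3300/api/v1/version")
# diagnose_status "$code"
```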
Step 7 — Publish @bytelyst/* packages to the new VM
Run on the VM (so we publish from canonical sources, not your laptop):
cd ~/code/mygh/learning_ai_common_plat
# Build all packages first
pnpm install --frozen-lockfile
pnpm build
# Set env so publish targets the local Gitea
export GITEA_NPM_HOST=localhost
export GITEA_NPM_OWNER=learning_ai_user
export GITEA_NPM_TOKEN="<paste-token-from-step-3>"
# Publish every @bytelyst/* package
bash scripts/gitea/publish-local-packages.sh
Takes ~2-3 min for ~60 packages. You'll see one + @bytelyst/<name>@<version>
line per package.
If you see npm ERR! 409 Conflict lines, that's fine — those packages were
already published (idempotent).
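If you wrap the publish step in your own automation, the 409 case needs to count as success, otherwise a re-run fails on the first package that already exists. A minimal sketch under that assumption follows. publish_ok is a hypothetical helper and is not taken from publish-local-packages.sh.

```shell
# Hypothetical helper: decide whether one publish attempt should count as OK.
# $1 = exit status of the publish command, $2 = its captured output.
publish_ok() {
  local status="$1" log="$2"
  [ "$status" -eq 0 ] && return 0
  case "$log" in
    # Version already in the registry: treat as success (idempotent re-run).
    *"409 Conflict"*) return 0 ;;
  esac
  return 1
}

# Usage inside a package directory (needs the registry, so left commented out):
# out=$(npm publish 2>&1); publish_ok "$?" "$out" || echo "publish failed: $out" >&2
```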
Step 8 — End-to-end verification
Run on your LAPTOP:
# Re-run doctor — package probe should now succeed
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/doctor.sh --probe @bytelyst/errors
Expected:
✓ @bytelyst/errors resolvable (latest versions: 0.1.10)
✅ All Gitea pre-flight checks passed
Then smoke-test a real product install:
cd ~/code/mygh/learning_ai_notes
rm -rf node_modules backend/node_modules web/node_modules
pnpm install
Should complete in ~15-30s. If you see ERR_PNPM_NO_MATCHING_VERSION, you've
hit the historical-version gap — proceed to Step 9.
Step 9 — (Optional) Backfill historical versions
Some @bytelyst/* packages pin older versions transitively (e.g.
@bytelyst/auth@0.1.5 pins @bytelyst/errors@0.1.5). The publish script
only publishes the current version of each package; older versions need
backfilling.
Run on the VM:
cd ~/code/mygh/learning_ai_common_plat
export GITEA_NPM_HOST=localhost
export GITEA_NPM_OWNER=learning_ai_user
export GITEA_NPM_TOKEN="<paste-token-from-step-3>"
bash scripts/gitea/publish-outdated-packages.sh
This walks pnpm view <pkg> versions --json for every @bytelyst/* package,
checks out the matching git tag, builds, and publishes any version not yet in
the registry. Slow (~10-15 min for full backfill) but only runs once.
After this, the learning_ai_notes install should complete without errors.
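The core of a backfill is a set difference: versions that should exist minus versions already in the registry. The sketch below shows that step in isolation with sort and comm. It is not the logic of publish-outdated-packages.sh, and missing_versions and the two input files (one version per line) are hypothetical.

```shell
# Hypothetical helper: print versions listed in $1 (wanted) that are absent
# from $2 (already published). Both files hold one version per line.
missing_versions() {
  local wanted="$1" published="$2" a b
  a=$(mktemp)
  b=$(mktemp)
  # comm needs both inputs sorted the same way.
  sort -u "$wanted" > "$a"
  sort -u "$published" > "$b"
  # -23 keeps only lines unique to the first file.
  comm -23 "$a" "$b"
  rm -f "$a" "$b"
}

# Usage:
# missing_versions wanted.txt published.txt
```

The output order is lexicographic rather than semver order, which is fine for deciding what to publish but not for picking the latest version.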
Step 10 — Persist environment for future shells
On your laptop, confirm these variables are exported in a fresh shell
(switch-network.sh, sourced from ~/.zshrc, should already set them):
# Already in switch-network.sh, but verify they're picked up
echo $GITEA_NPM_HOST # should be your VM hostname on home network
echo $GITEA_NPM_OWNER # should be learning_ai_user
echo $GITEA_NPM_TOKEN # should be the 40-char token
If any are missing, ensure your ~/.zshrc has:
export NETWORK=home # or corp; switch-network.sh keys on this
source "$HOME/code/mygh/learning_ai_common_plat/scripts/switch-network.sh"
Step 11 — Gitea Actions runner (CI)
The npm registry (Steps 1–10) is independent of CI. To run the docker-lint
job (and the rest of each repo's .gitea/workflows/*.yml) you need an
Actions runner registered against Gitea. This section makes that
reproducible — the original runner was registered by hand.
11.1 — Enable Actions on the instance
In app.ini (inside the Gitea container/conf dir):
[actions]
ENABLED = true
Then run sudo docker restart gitea. Confirm Gitea is back up with
curl -fsS http://localhost:3300/api/v1/version, and check that the repo Settings →
Actions toggle is available.
11.2 — Install and register the runner
# macOS host runner (laptop) — install
brew install act_runner
# Register reproducibly (fetches a registration token via the admin API,
# registers with the agreed labels + capacity). Host mode is the default.
GITEA_ADMIN_USER=gitea-admin GITEA_ADMIN_PASS='<admin-pass>' \
bash scripts/gitea/register-runner.sh --name bytelyst-mac --capacity 2
# Containerized runner (better isolation; requires Docker on the host):
GITEA_ADMIN_USER=gitea-admin GITEA_ADMIN_PASS='<admin-pass>' \
bash scripts/gitea/register-runner.sh --mode docker --capacity 2
register-runner.sh is idempotent: if a runner is already registered it
prints the current identity and exits. Pass --force to re-register (this
invalidates the old runner row in Gitea).
11.3 — Host mode vs. containerized mode
| | Host mode (ubuntu-latest:host) | Docker mode (ubuntu-latest:docker://…) |
|---|---|---|
| Isolation | None — jobs run directly on macOS | Each job in a fresh container |
| Speed | Fast (no image pull) | Slower first run (pulls catthehacker/ubuntu) |
| Reproducibility | Depends on host toolchain | Pinned image, matches GitHub closely |
| Best for | Single-operator laptop / corp proxy | Shared/VM runners, untrusted PRs |
We run host mode on the laptop because the corp proxy + Docker-in-Docker is fragile, and the jobs are trusted (own repos). Prefer docker mode on the VM.
11.4 — Secrets: never inline the token
The npm token must not live inline in config.yaml. Externalise it into a
gitignored runner.env referenced by runner.env_file:
# /opt/homebrew/etc/act_runner/config.yaml
runner:
capacity: 2 # parallel jobs
env_file: '/opt/homebrew/etc/act_runner/runner.env'
envs:
# GITEA_NPM_TOKEN is loaded from env_file — never inline here.
NODE_ENV: test
# /opt/homebrew/etc/act_runner/runner.env (chmod 600, never committed)
GITEA_NPM_TOKEN=<token>
After editing config, reload the daemon:
brew services restart act_runner # or, if brew name mismatches:
launchctl kickstart -k gui/$(id -u)/homebrew.mxcl.act_runner
tail -f /opt/homebrew/var/log/act_runner.log # expect "declare successfully"
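The rule from 11.4 is easy to break by accident when editing config.yaml, so a small guard can be worth running before each reload. The following is a sketch only. config_inlines_token is a hypothetical name, and it matches a YAML key at the start of a line, so the explanatory comment in the example config does not trigger it.

```shell
# Hypothetical check: exit 0 if the given YAML file sets GITEA_NPM_TOKEN
# inline as a key, exit 1 otherwise. Commented-out mentions are ignored
# because the key must be the first non-blank text on the line.
config_inlines_token() {
  grep -Eq '^[[:space:]]*GITEA_NPM_TOKEN[[:space:]]*:' "$1"
}

# Usage before reloading the daemon:
# if config_inlines_token /opt/homebrew/etc/act_runner/config.yaml; then
#   echo "move GITEA_NPM_TOKEN into runner.env" >&2
# fi
```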
11.5 — Concurrency
runner.capacity controls parallel jobs on one runner. With capacity: 1
the lightweight docker-lint job queues behind slow backend/web/mobile/E2E
jobs (observed: ~13 min wait). capacity: 2 lets docker-lint run alongside
one heavy job. For more parallelism, register additional runners rather
than pushing capacity high on a single laptop — each runner gets its own
workdir and process, so failures/timeouts stay isolated.
Add more host runners (reproducible):
# Stand up runners #2 and #3 (each capacity 2) as their own launchd services.
# Shares the canonical runner.env token; separate config/.runner/workdir.
bash scripts/gitea/add-host-runner.sh 2 2
bash scripts/gitea/add-host-runner.sh 3 2
add-host-runner.sh <N> [capacity]:
- derives a per-runner config.yaml from the canonical one (preserves proxy env + env_file), overriding runner.file, runner.capacity, and a unique host.workdir_parent (~/.cache/act-<N>)
- fetches a one-time registration token via the admin API (~/.gitea_c5_pat)
- registers as $(hostname -s)-<N> with host-mode labels
- writes + loads ~/Library/LaunchAgents/com.bytelyst.act_runner-<N>.plist (RunAtLoad + KeepAlive)
- idempotent: re-running just reloads the service
The Homebrew act_runner service is runner #1; add-host-runner.sh adds
#2, #3, … Three runners × capacity 2 ≈ 6 parallel job slots. Verified:
pushing a multi-job workflow lights up all three runners simultaneously.
List + prune the fleet:
PAT=$(cat ~/.gitea_c5_pat)
curl -s -H "Authorization: token $PAT" \
http://localhost:3300/api/v1/admin/actions/runners \
| python3 -c "import json,sys; [print(r['id'], r['name'], r['status']) for r in json.load(sys.stdin)['runners']]"
# Delete a stale/offline runner by id:
curl -s -X DELETE -H "Authorization: token $PAT" \
http://localhost:3300/api/v1/admin/actions/runners/<id>
Remove an extra runner entirely:
launchctl bootout "gui/$(id -u)/com.bytelyst.act_runner-2" 2>/dev/null || true
rm -f ~/Library/LaunchAgents/com.bytelyst.act_runner-2.plist
rm -rf "$HOME/Library/Application Support/act_runner-2" ~/.cache/act-2
# then DELETE its row via the admin API (above)
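The removal commands above hard-code runner 2. To avoid typos when removing a different runner, the per-runner names can be derived from the number. The sketch below only prints them, using the label and paths that appear in this section. runner_paths is a hypothetical helper and does not touch launchd or the filesystem.

```shell
# Hypothetical helper: print the launchd label and on-disk paths for extra
# runner N, using the naming shown in this section.
runner_paths() {
  local n="$1"
  case "$n" in
    ''|*[!0-9]*) echo "runner number must be numeric" >&2; return 1 ;;
  esac
  printf 'label=%s\n'   "com.bytelyst.act_runner-$n"
  printf 'plist=%s\n'   "$HOME/Library/LaunchAgents/com.bytelyst.act_runner-$n.plist"
  printf 'state=%s\n'   "$HOME/Library/Application Support/act_runner-$n"
  printf 'workdir=%s\n' "$HOME/.cache/act-$n"
}

# Usage: review the output, then run the removal commands with these values.
# runner_paths 3
```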
11.6 — Runner token rotation
The registration token (used once at register time) is separate from the npm token (used by jobs). To rotate:
# Registration token — just re-register; old token is single-use anyway:
bash scripts/gitea/register-runner.sh --force --name bytelyst-mac
# npm token — rotate via the existing helper, then update runner.env:
bash scripts/gitea/token.sh rotate
TOKEN=$(cat ~/.gitea_npm_token)
printf 'GITEA_NPM_TOKEN=%s\n' "$TOKEN" > /opt/homebrew/etc/act_runner/runner.env
chmod 600 /opt/homebrew/etc/act_runner/runner.env
brew services restart act_runner
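Writing runner.env with a plain redirect and then running chmod leaves a short window in which a newly created file has default permissions. One way to close that window, sketched here under the assumption that a temp file next to the destination is acceptable, is to create the file under a restrictive umask and move it into place. write_runner_env is a hypothetical helper, not part of token.sh.

```shell
# Hypothetical helper: write GITEA_NPM_TOKEN=<token> to $2 with mode 600,
# without the file ever existing with wider permissions.
# $1 = token, $2 = destination path.
write_runner_env() {
  local token="$1" dest="$2" tmp
  tmp="${dest}.tmp.$$"
  # The subshell keeps the umask change from leaking into the caller.
  ( umask 077 && printf 'GITEA_NPM_TOKEN=%s\n' "$token" > "$tmp" ) || return 1
  # Rename within the same directory replaces the old file in one step.
  mv -f "$tmp" "$dest"
}

# Usage after rotating the npm token:
# write_runner_env "$(cat ~/.gitea_npm_token)" /opt/homebrew/etc/act_runner/runner.env
# brew services restart act_runner
```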
11.7 — Verify CI end-to-end
Push any repo that has a docker-lint job, then:
PAT=$(cat ~/.gitea_c5_pat)
R=learning_ai_clock
RID=$(curl -s -H "Authorization: token $PAT" \
"http://localhost:3300/api/v1/repos/learning_ai_user/$R/actions/runs?limit=1" \
| python3 -c "import json,sys; print(json.load(sys.stdin)['workflow_runs'][0]['id'])")
curl -s -H "Authorization: token $PAT" \
"http://localhost:3300/api/v1/repos/learning_ai_user/$R/actions/runs/$RID/jobs" \
| python3 -c "import json,sys; [print(j['status'], j.get('conclusion'), j['name']) for j in json.load(sys.stdin)['jobs']]"
# Expect: completed success Docker lint — gitea-doctor + docker-doctor
Troubleshooting
Doctor reports STALE TOKEN: env GITEA_NPM_TOKEN ≠ file
Your shell has an old token cached. Fix:
source ~/.zshrc
# Or to refresh just the token without sourcing everything:
eval "$(bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/token.sh print --export)"
pnpm install fails with EAI_AGAIN or ETIMEDOUT
DNS or network. Verify:
ping <VM_HOST>
nslookup <VM_HOST>
curl -v http://<VM_HOST>:3300/api/v1/version
Need to rotate the token
# On laptop (assumes Keychain entry from Step 1):
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/token.sh rotate
Or manually re-run Step 3 on the VM.
Need to start over (nuke the Gitea data)
On the VM (destructive):
sudo docker stop gitea
sudo rm -rf /var/lib/gitea/* # or wherever your data volume is
sudo docker start gitea
# Then re-run Steps 1-7
What's persistent vs. ephemeral
| Item | Where | Survives VM reboot? | Survives VM rebuild? |
|---|---|---|---|
| Gitea database | /var/lib/gitea/data/gitea.db | ✅ | ❌ (snapshot the disk) |
| Published packages | /var/lib/gitea/data/packages/ | ✅ | ❌ (re-publish via Step 7) |
| Admin/npm users | inside Gitea DB | ✅ | ❌ (re-run Steps 1-2) |
| NPM tokens | inside Gitea DB + your ~/.gitea_npm_token_home | ✅ | ❌ (re-run Step 3) |
| ~/.gitea_vm_host | your laptop | ✅ | n/a |
| ~/.gitea_npm_token_home | your laptop | ✅ | n/a |
| Actions runner registration | Gitea DB + .runner file | ✅ | ❌ (re-run register-runner.sh) |
| Runner secrets | act_runner/runner.env (chmod 600) | ✅ | ❌ (recreate from token) |
For VM rebuilds: snapshot /var/lib/gitea to Azure Disk Snapshot weekly,
restore on rebuild. Avoids re-running Steps 1-7.
See also
- scripts/gitea/doctor.sh: pre-flight validation (run before every deploy)
- scripts/gitea/token.sh: token rotation helper
- scripts/gitea/register-runner.sh: reproducible Actions runner registration (Step 11)
- scripts/gitea/bootstrap-vm.sh: automates Steps 1-3 on a fresh VM
- scripts/switch-network.sh: exports GITEA_NPM_* env vars per network
- docker-build-optimization-roadmap.md (in learning_ai_devops_tools/docs/): ecosystem-wide Docker build hardening that depends on this setup