docs(runbooks): GITEA_VM_SETUP.md — step-by-step cloud VM wiring

Copy-pasteable runbook for the case where:
- VM is already provisioned
- Gitea is already installed and running on :3300
- Repos are already cloned on the VM
- User needs to wire admin + npm-user + token + laptop end-to-end

10 numbered steps with expected outputs and troubleshooting:
  1. Create Gitea admin user (idempotent skip if exists)
  2. Create npm owner user (learning_ai_user)
  3. Mint npm-scoped token via API
  4. Write token to ~/.gitea_npm_token_home on laptop
  5. Update ~/.gitea_vm_host with VM hostname
  6. Pre-flight verification via doctor.sh (expects 404 on probe)
  7. Publish @bytelyst/* via publish-local-packages.sh
  8. End-to-end verification (re-run doctor + smoke-test pnpm install)
  9. Optional: backfill historical versions
  10. Persist environment in ~/.zshrc

Includes troubleshooting table, persistence map (what survives VM reboot
vs rebuild), and Azure NSG/firewall guidance.

Companion to scripts/gitea/{bootstrap-vm,doctor,token}.sh.
saravanakumardb1 2026-05-27 03:46:09 -07:00
parent 130883a7db
commit 925c081ce3


@ -0,0 +1,405 @@
# Gitea Cloud VM Setup — Runbook
> **Status:** Active runbook · **Last verified:** 2026-05-27
> **Use this when:** You have provisioned a cloud VM (Azure / wherever), Gitea
> is installed and running on `:3300`, repos are cloned, and you need to wire
> the npm registry end-to-end with your laptop.
Assumes you're SSH'd into the VM (or running commands on the VM) and the
sibling `learning_ai_common_plat` repo is at `~/code/mygh/learning_ai_common_plat/`
on both the VM and your laptop. Adjust paths as needed.
---
## Prerequisites checklist
Before starting, confirm all of these on the **VM**:
```bash
# 1. Gitea container is running and healthy
sudo docker ps | grep gitea
curl -fsS http://localhost:3300/api/v1/version
# Expected: {"version":"1.X.X"}
# 2. Port 3300 is reachable from your laptop
# (run this FROM YOUR LAPTOP, not the VM)
# curl -fsS http://<VM_HOST>:3300/api/v1/version
# 3. Repos cloned on the VM
ls ~/code/mygh/learning_ai_common_plat
# Expected: packages/ services/ scripts/ ...
```
If any of these fail, fix them first. Common gotchas:
- **Port 3300 blocked:** Azure NSG → VM → Networking → "Add inbound port rule" → TCP 3300 from your home IP
- **Gitea registry disabled:** Edit `app.ini` inside `/var/lib/gitea/conf/`, add a `[packages]` section containing the line `ENABLED = true`, then run `sudo docker restart gitea`
- **`hostname` resolves to `localhost`** on the VM while you reach it from outside through public DNS: note both names. API calls made from inside the VM only need `localhost`
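After a `docker restart`, Gitea can take a few seconds to answer again. A minimal sketch of a retry helper for the health check above; the `retry` name and its argument order are illustrative, not part of the repo's scripts:

```bash
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the last one
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage on the VM (not run here):
# retry 10 3 curl -fsS http://localhost:3300/api/v1/version
```

The helper returns non-zero if Gitea never comes up, so it can gate the remaining steps in a script.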
---
## Step 1 — Create Gitea admin user (skip if you already have one)
Run **on the VM**:
```bash
# Check whether an admin already exists.
# -u git: the standard (rootful) Gitea image refuses to run its CLI as root.
sudo docker exec -u git gitea gitea admin user list --admin
# If empty (or you don't have admin creds), create one:
ADMIN_USER="gitea-admin"
# 48 random bytes leave well over 24 characters once non-alphanumerics are stripped
ADMIN_PASS="$(openssl rand -base64 48 | tr -dc 'A-Za-z0-9' | head -c 24)"
echo "Admin password (save this!): $ADMIN_PASS"
sudo docker exec -u git gitea gitea admin user create \
  --username "$ADMIN_USER" \
  --password "$ADMIN_PASS" \
  --email "admin@bytelyst.local" \
  --admin \
  --must-change-password=false
```
**SAVE the admin password somewhere safe** (1Password / Bitwarden / macOS Keychain).
You'll need it only for token rotation and bootstrapping; day-to-day work uses the npm token.
Optionally store in macOS Keychain on your laptop:
```bash
# Run on your LAPTOP (Mac)
security add-generic-password \
-s 'gitea-admin' \
-a 'gitea-admin' \
-w '<paste-the-password>'
```
Then `scripts/gitea/token.sh rotate` can auto-discover it later.
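To make the "skip if you already have one" check scriptable, the listing can be parsed instead of read by eye. This is a sketch only: `has_admin_user` is a hypothetical helper, and it assumes the `gitea admin user list` columns are ID, Username, Email, IsActive, IsAdmin (check your Gitea version's output first).

```bash
# Succeeds if <username> appears with IsAdmin=true in the output of
# `gitea admin user list`, read from stdin.
# ASSUMPTION: column 2 is Username and column 5 is IsAdmin.
has_admin_user() {
  awk -v user="$1" '
    NR > 1 && $2 == user && $5 == "true" { found = 1 }
    END { exit !found }
  '
}

# Usage on the VM (not run here):
# sudo docker exec -u git gitea gitea admin user list \
#   | has_admin_user gitea-admin && echo "admin exists, skipping Step 1"
```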
---
## Step 2 — Create the npm owner user
The npm registry is namespaced by owner. The canonical owner is `learning_ai_user`.
Run **on the VM**:
```bash
ADMIN_USER="gitea-admin"
ADMIN_PASS="<paste-admin-password-from-step-1>"
NPM_USER="learning_ai_user"
# 48 random bytes leave well over 24 characters once non-alphanumerics are stripped
NPM_PASS="$(openssl rand -base64 48 | tr -dc 'A-Za-z0-9' | head -c 24)"
# Create the user via admin API
curl -fsS -u "$ADMIN_USER:$ADMIN_PASS" \
-X POST "http://localhost:3300/api/v1/admin/users" \
-H 'Content-Type: application/json' \
-d "{
\"username\": \"$NPM_USER\",
\"email\": \"npm@bytelyst.local\",
\"password\": \"$NPM_PASS\",
\"must_change_password\": false
}"
echo ""
echo "NPM user '$NPM_USER' created"
echo "NPM password (needed only to mint tokens): $NPM_PASS"
```
Save `NPM_PASS` temporarily — you need it for Step 3. After that the npm
token replaces it for all day-to-day use.
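If the user already exists, Gitea answers with HTTP 422 and a `user already exists` message. Because the command uses `curl -f`, you will see `curl: (22) The requested URL returned error: 422` rather than the JSON body. That's fine, skip ahead.
Use the existing `NPM_PASS`, or reset it with `sudo docker exec -u git gitea gitea
admin user change-password --username learning_ai_user --password <new>`.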
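When scripting this step, it can help to branch on the HTTP status instead of relying on `curl -f`. A minimal sketch, assuming the admin API returns 201 on success and 422 when the user is rejected (typically because it already exists); `describe_create_status` and `payload.json` are illustrative names:

```bash
# Map the HTTP status of POST /api/v1/admin/users to a short outcome label.
# ASSUMPTION: 201 = created, 422 = rejected (usually "user already exists").
describe_create_status() {
  case "$1" in
    201) echo "created" ;;
    422) echo "exists-or-invalid" ;;
    401|403) echo "bad-admin-credentials" ;;
    *) echo "unexpected:$1" ;;
  esac
}

# Usage on the VM (not run here): keep the body in a file, capture only the status.
# status=$(curl -sS -o /tmp/create-user.json -w '%{http_code}' \
#   -u "$ADMIN_USER:$ADMIN_PASS" -X POST "http://localhost:3300/api/v1/admin/users" \
#   -H 'Content-Type: application/json' -d @payload.json)
# describe_create_status "$status"
```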
---
## Step 3 — Mint the npm token
Run **on the VM**:
```bash
NPM_USER="learning_ai_user"
NPM_PASS="<paste-from-step-2>"
TOKEN_NAME="npm-$(date +%Y%m%d-%H%M%S)-$(hostname -s)"
RESPONSE=$(curl -fsS -u "$NPM_USER:$NPM_PASS" \
-X POST "http://localhost:3300/api/v1/users/$NPM_USER/tokens" \
-H 'Content-Type: application/json' \
-d "{
\"name\": \"$TOKEN_NAME\",
\"scopes\": [\"write:package\", \"read:package\"]
}")
echo "$RESPONSE"
# Extract token (works for both newer "token" and older "sha1" field names)
TOKEN=$(echo "$RESPONSE" | grep -oE '"(sha1|token)":"[^"]+' | head -1 | sed 's/.*":"//')
echo ""
echo "════════════════════════════════════════"
echo " NPM TOKEN (copy this NOW):"
echo " $TOKEN"
echo "════════════════════════════════════════"
```
**Copy the token immediately.** Gitea never displays a token's secret value
after the first response.
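Since the secret is shown only once, it is worth validating the extracted value before moving on. The sketch below wraps the same `grep`/`sed` extraction in a hypothetical `extract_token` function and assumes tokens are 40 lowercase hex characters, which matches the 40-char length this runbook expects:

```bash
# Read a token-creation JSON response on stdin and print the secret.
# Fails if the value is not 40 lowercase hex characters (ASSUMED token format).
extract_token() {
  local token
  token=$(grep -oE '"(sha1|token)":"[^"]+' | head -n 1 | sed 's/.*":"//')
  if printf '%s' "$token" | grep -Eq '^[0-9a-f]{40}$'; then
    printf '%s\n' "$token"
  else
    echo "no valid token found in response" >&2
    return 1
  fi
}

# Usage on the VM (not run here):
# TOKEN=$(echo "$RESPONSE" | extract_token) || echo "Step 3 failed, inspect \$RESPONSE"
```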
---
## Step 4 — Wire the token into your laptop
Run **on your LAPTOP** (not the VM):
```bash
# Paste the token from Step 3 here:
TOKEN="<paste-token-from-step-3>"
# Write to the home-network-specific token file
# (printf '%s' writes no trailing newline in any shell, unlike echo -n)
printf '%s' "$TOKEN" > ~/.gitea_npm_token_home
chmod 600 ~/.gitea_npm_token_home
# Also update the catch-all file (used by some scripts as fallback)
printf '%s' "$TOKEN" > ~/.gitea_npm_token
chmod 600 ~/.gitea_npm_token
ls -la ~/.gitea_npm_token*
```
Expected output: two files, each with mode `-rw-------` and a size of 40 bytes.
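Writing the file and then tightening its mode leaves a brief window in which the token may be readable by others. One way to close it, sketched here with a hypothetical `write_token_file` helper, is to set `umask` before the write:

```bash
# write_token_file <path> <token>
# Creates the file with mode 600 from the start and writes no trailing newline.
write_token_file() {
  local path="$1" token="$2"
  (
    umask 077                       # files created in this subshell get mode 600
    printf '%s' "$token" > "$path"
  )
  chmod 600 "$path"                 # also corrects a pre-existing, looser file
}

# Usage on the laptop (not run here):
# write_token_file ~/.gitea_npm_token_home "$TOKEN"
# write_token_file ~/.gitea_npm_token "$TOKEN"
```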
---
## Step 5 — Tell switch-network.sh about the VM hostname
Run **on your LAPTOP**:
```bash
# Replace with your VM's public DNS or IP
echo "<VM_HOST>" > ~/.gitea_vm_host
# Example: echo "bytelyst-vm.eastus.cloudapp.azure.com" > ~/.gitea_vm_host
cat ~/.gitea_vm_host
```
Then refresh your shell environment so `GITEA_NPM_HOST` is exported:
```bash
source ~/.zshrc
echo "NETWORK=$NETWORK HOST=$GITEA_NPM_HOST OWNER=$GITEA_NPM_OWNER"
```
If you're on home network, expected output:
```
NETWORK=home HOST=<VM_HOST> OWNER=learning_ai_user
```
If `NETWORK=corp`, the host will be `localhost` (SSH tunnel mode). That's
expected for corp; the VM workflow assumes you're on home network.
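The home/corp behaviour described above can be pictured roughly as follows. This is a hypothetical sketch, not the contents of `switch-network.sh`, which may resolve the host differently:

```bash
# resolve_gitea_host <network> [host-file]
# HYPOTHETICAL: corp uses the SSH tunnel on localhost, home reads ~/.gitea_vm_host.
resolve_gitea_host() {
  local network="$1" host_file="${2:-$HOME/.gitea_vm_host}"
  case "$network" in
    corp)
      echo "localhost"
      ;;
    home)
      if [ -s "$host_file" ]; then
        # Strip stray whitespace and newlines from the stored hostname
        tr -d '[:space:]' < "$host_file"
        echo
      else
        echo "missing or empty $host_file" >&2
        return 1
      fi
      ;;
    *)
      echo "unknown NETWORK '$network'" >&2
      return 1
      ;;
  esac
}

# Usage (not run here):
# export GITEA_NPM_HOST="$(resolve_gitea_host "$NETWORK")"
```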
---
## Step 6 — Pre-flight verification
Run **on your LAPTOP**:
```bash
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/doctor.sh --probe @bytelyst/errors
```
Expected output:
```
✓ NETWORK=home
✓ GITEA_NPM_HOST=<VM_HOST>
✓ GITEA_NPM_OWNER=learning_ai_user
✓ Token consistent (env matches ~/.gitea_npm_token_home, 40 chars)
✓ Registry HTTP 200 on @bytelyst/errors
@bytelyst/errors not found in registry (HTTP 404)
```
**The 404 on `@bytelyst/errors` is EXPECTED** at this point — the registry is
empty. We fix that in Step 7.
If you see any other failure (token rejected, registry unreachable, owner
404), debug before moving on:
| Failure | Likely cause | Fix |
| ------------------------------------ | ------------------------------------- | ---------------------------------- |
| `Registry unreachable` | Port 3300 not open to your laptop | Open NSG inbound rule for TCP 3300 |
| `Token rejected (HTTP 401)` | Token typo or scope missing | Re-run Step 3 |
| `Owner 'learning_ai_user' not found` | Step 2 was skipped or used wrong name | Re-run Step 2 |
| `DNS does not resolve` | `~/.gitea_vm_host` has typo | Re-check value |
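If `doctor.sh` itself is the suspect, the same probe can be made by hand. The sketch below assumes Gitea's npm registry lives under `/api/packages/<owner>/npm/` and is served over plain HTTP on port 3300, as elsewhere in this runbook; `package_url` is an illustrative helper, not part of the repo:

```bash
# package_url <host> <owner> <package>
# Prints the metadata URL for a package in Gitea's npm registry.
# ASSUMPTION: plain HTTP on port 3300.
package_url() {
  local host="$1" owner="$2" pkg="$3"
  printf 'http://%s:3300/api/packages/%s/npm/%s\n' "$host" "$owner" "$pkg"
}

# Usage on the laptop (not run here): prints only the HTTP status code.
# curl -s -o /dev/null -w '%{http_code}\n' \
#   -H "Authorization: Bearer $GITEA_NPM_TOKEN" \
#   "$(package_url "$GITEA_NPM_HOST" "$GITEA_NPM_OWNER" @bytelyst/errors)"
```

A 404 here before Step 7 matches the expected doctor output; a 401 points at the token.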
---
## Step 7 — Publish `@bytelyst/*` packages to the new VM
Run **on the VM** (so we publish from canonical sources, not your laptop):
```bash
cd ~/code/mygh/learning_ai_common_plat
# Build all packages first
pnpm install --frozen-lockfile
pnpm build
# Set env so publish targets the local Gitea
export GITEA_NPM_HOST=localhost
export GITEA_NPM_OWNER=learning_ai_user
export GITEA_NPM_TOKEN="<paste-token-from-step-3>"
# Publish every @bytelyst/* package
bash scripts/gitea/publish-local-packages.sh
```
Takes ~2-3 min for ~60 packages. You'll see one `+ @bytelyst/<name>@<version>`
line per package.
If you see `npm ERR! 409 Conflict` lines, that's fine — those packages were
already published (idempotent).
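For reference, pointing npm or pnpm at a Gitea registry usually comes down to two `.npmrc` lines: a scope-to-registry mapping and an auth token keyed by the registry path. The sketch below prints them. `render_npmrc` is a hypothetical helper, and `publish-local-packages.sh` may configure this differently.

```bash
# render_npmrc <host> <owner> <token>
# Prints .npmrc lines mapping the @bytelyst scope to the Gitea npm registry.
# ASSUMPTION: plain HTTP on port 3300, as elsewhere in this runbook.
render_npmrc() {
  local host="$1" owner="$2" token="$3"
  local path="${host}:3300/api/packages/${owner}/npm/"
  printf '@bytelyst:registry=http://%s\n' "$path"
  printf '//%s:_authToken=%s\n' "$path" "$token"
}

# Usage (not run here): inspect the lines before putting them in a project .npmrc.
# render_npmrc "$GITEA_NPM_HOST" "$GITEA_NPM_OWNER" "$GITEA_NPM_TOKEN"
```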
---
## Step 8 — End-to-end verification
Run **on your LAPTOP**:
```bash
# Re-run doctor — package probe should now succeed
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/doctor.sh --probe @bytelyst/errors
```
Expected:
```
@bytelyst/errors resolvable (latest versions: 0.1.10)
✅ All Gitea pre-flight checks passed
```
Then smoke-test a real product install:
```bash
cd ~/code/mygh/learning_ai_notes
rm -rf node_modules backend/node_modules web/node_modules
pnpm install
```
Should complete in ~15-30s. If you see `ERR_PNPM_NO_MATCHING_VERSION`, you've
hit the historical-version gap — proceed to Step 9.
---
## Step 9 — (Optional) Backfill historical versions
Some `@bytelyst/*` packages pin older versions transitively (e.g.
`@bytelyst/auth@0.1.5` pins `@bytelyst/errors@0.1.5`). The publish script
only publishes the current version of each package; older versions need
backfilling.
Run **on the VM**:
```bash
cd ~/code/mygh/learning_ai_common_plat
export GITEA_NPM_HOST=localhost
export GITEA_NPM_OWNER=learning_ai_user
export GITEA_NPM_TOKEN="<paste-token-from-step-3>"
bash scripts/gitea/publish-outdated-packages.sh
```
This walks `pnpm view <pkg> versions --json` for every `@bytelyst/*` package,
checks out the matching git tag, builds, and publishes any version not yet in
the registry. Slow (~10-15 min for full backfill) but only runs once.
After this, the `learning_ai_notes` install should complete without errors.
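The core of the backfill is a set difference: versions that exist as git tags minus versions already in the registry. A minimal sketch of that comparison, assuming both lists are available as files with one version per line (`missing_versions` is an illustrative name, not a function from `publish-outdated-packages.sh`):

```bash
# missing_versions <wanted-file> <published-file>
# Prints every version listed in <wanted-file> that is absent from <published-file>.
missing_versions() {
  local wanted="$1" published="$2"
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
  # grep exits 1 when nothing is missing, so mask that with `|| true`.
  grep -Fxv -f "$published" "$wanted" || true
}

# Usage (not run here), with hypothetical input files:
# missing_versions tags.txt registry.txt | while read -r v; do echo "would publish $v"; done
```

Whole-line matching matters here: without `-x`, `0.1.1` would be treated as already published whenever `0.1.10` is.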
---
## Step 10 — Persist environment for future shells
On your laptop, confirm that the variables exported by `switch-network.sh`
(sourced from `~/.zshrc`) are present in a fresh shell:
```bash
# Exported by switch-network.sh; verify they're picked up
echo "$GITEA_NPM_HOST"       # should be your VM hostname on home network
echo "$GITEA_NPM_OWNER"      # should be learning_ai_user
echo "${#GITEA_NPM_TOKEN}"   # should print 40 (the token's length, not the token itself)
```
If any are missing, ensure your `~/.zshrc` has:
```bash
export NETWORK=home # or corp; switch-network.sh keys on this
source "$HOME/code/mygh/learning_ai_common_plat/scripts/switch-network.sh"
```
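The same verification can be wrapped in a function that reports problems without printing the token. A sketch, with `check_gitea_env` as a hypothetical name and the 40-character length taken from this runbook:

```bash
# Succeeds only if all three variables look sane; never prints the token.
check_gitea_env() {
  local token="${GITEA_NPM_TOKEN:-}" status=0
  if [ -z "${GITEA_NPM_HOST:-}" ]; then
    echo "GITEA_NPM_HOST is not set" >&2
    status=1
  fi
  if [ -z "${GITEA_NPM_OWNER:-}" ]; then
    echo "GITEA_NPM_OWNER is not set" >&2
    status=1
  fi
  if [ "${#token}" -ne 40 ]; then
    echo "GITEA_NPM_TOKEN has length ${#token}, expected 40" >&2
    status=1
  fi
  return "$status"
}

# Usage in a new shell (not run here):
# check_gitea_env && echo "Gitea env looks good"
```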
---
## Troubleshooting
### Doctor reports `STALE TOKEN: env GITEA_NPM_TOKEN ≠ file`
Your shell has an old token cached. Fix:
```bash
source ~/.zshrc
# Or to refresh just the token without sourcing everything:
eval "$(bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/token.sh print --export)"
```
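To confirm the mismatch yourself, compare the exported value with the file directly. This sketch only approximates what the doctor check does; `token_matches_file` is a hypothetical helper:

```bash
# token_matches_file [file]
# Succeeds if $GITEA_NPM_TOKEN equals the contents of the token file.
token_matches_file() {
  local file="${1:-$HOME/.gitea_npm_token_home}"
  # $(cat ...) drops trailing newlines, so a file written with echo still matches
  [ -r "$file" ] && [ "${GITEA_NPM_TOKEN:-}" = "$(cat "$file")" ]
}

# Usage (not run here):
# token_matches_file || echo "stale token in this shell; run: source ~/.zshrc"
```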
### `pnpm install` fails with `EAI_AGAIN` or `ETIMEDOUT`
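This is a DNS or network failure. Verify from your laptop:
```bash
nslookup <VM_HOST>
curl -v http://<VM_HOST>:3300/api/v1/version
# ICMP is often blocked by cloud firewalls, so a failed ping alone is not conclusive
ping -c 3 <VM_HOST>
```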
### Need to rotate the token
```bash
# On laptop (assumes Keychain entry from Step 1):
bash ~/code/mygh/learning_ai_common_plat/scripts/gitea/token.sh rotate
```
Or manually re-run Step 3 on the VM.
### Need to start over (nuke the Gitea data)
**On the VM** (destructive):
```bash
sudo docker stop gitea
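# Delete everything under the data directory (or wherever your data volume is).
# find runs entirely under sudo, so it works even when your user cannot list the
# directory (a bare /var/lib/gitea/* glob is expanded by your own shell first).
sudo find /var/lib/gitea -mindepth 1 -delete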
sudo docker start gitea
# Then re-run Steps 1-7
```
---
## What's persistent vs. ephemeral
| Item | Where | Survives VM reboot? | Survives VM rebuild? |
| ------------------------- | ------------------------------------------------ | ------------------- | -------------------------- |
| Gitea database | `/var/lib/gitea/data/gitea.db` | ✅ | ❌ (snapshot the disk) |
| Published packages | `/var/lib/gitea/data/packages/` | ✅ | ❌ (re-publish via Step 7) |
| Admin/npm users | inside Gitea DB | ✅ | ❌ (re-run Steps 1-2) |
| NPM tokens | inside Gitea DB + your `~/.gitea_npm_token_home` | ✅ | ❌ (re-run Step 3) |
| `~/.gitea_vm_host` | your laptop | ✅ | n/a |
| `~/.gitea_npm_token_home` | your laptop | ✅ | n/a |
For VM rebuilds: take a weekly Azure Disk Snapshot of the disk that holds
`/var/lib/gitea` and restore it on rebuild. This avoids re-running Steps 1-7.
---
## See also
- `scripts/gitea/doctor.sh` — pre-flight validation (run before every deploy)
- `scripts/gitea/token.sh` — token rotation helper
- `scripts/gitea/bootstrap-vm.sh` — automates Steps 1-3 on a fresh VM
- `scripts/switch-network.sh` — exports `GITEA_NPM_*` env vars per network
- `docker-build-optimization-roadmap.md` (in `learning_ai_devops_tools/docs/`) —
ecosystem-wide Docker build hardening that depends on this setup