feat(gitea): docker-mode env hygiene + document containerized job migration

- add-host-runner.sh docker mode now strips host-specific envs (HOME, PATH,
  PNPM_HOME) that leak macOS paths into Linux containers and override workflow
  env (broke $HOME-relative writes)
- GITEA_VM_SETUP.md 11.5: reference pattern + 5 gotchas for migrating a real
  job (docker-lint) onto the docker runner: Actions secret (not token file),
  doctor.sh token-file requirement, host-env leakage, env_file token override,
  proxy bypass. Validated green on M-…-4.
saravanakumardb1 2026-05-28 19:16:52 -07:00
parent 6429436935
commit 77b074f3c0
2 changed files with 61 additions and 1 deletions

@@ -476,6 +476,61 @@ so it does not hijack the host-mode `ubuntu-latest` jobs. Opt a job in with
Validated end-to-end: a `runs-on: docker` job runs in an `Ubuntu 24.04`
container and reaches Gitea (`GET /api/v1/version` → `{"version":"…"}`).

**Migrating a real job onto the docker runner.** The host-clone CI model
(`working-directory: /Users/…` + `git pull`) does not work in a container.
A containerized job must instead clone what it needs and reach Gitea via
`host.docker.internal`. Reference pattern (the `docker-lint` job in
`learning_ai_clock`):
```yaml
docker-lint:
  runs-on: docker
  env:
    GITEA_NPM_HOST: host.docker.internal
    GITEA_NPM_OWNER: learning_ai_user
    GITEA_NPM_TOKEN: ${{ secrets.NPM_REGISTRY_TOKEN }} # GITEA_-prefixed names are reserved
    NO_PROXY: host.docker.internal,localhost,127.0.0.1
    no_proxy: host.docker.internal,localhost,127.0.0.1
  steps:
    - name: Fetch repo + canonical doctor scripts
      run: |
        G="http://host.docker.internal:3300/learning_ai_user"
        git clone --depth 1 "$G/${GITHUB_REPOSITORY##*/}.git" repo
        git clone --depth 1 "$G/learning_ai_common_plat.git" common-plat
    - name: gitea-doctor
      run: |
        printf '%s' "$GITEA_NPM_TOKEN" > "$HOME/.gitea_npm_token" # doctor.sh needs a token file
        bash common-plat/scripts/gitea/doctor.sh --quiet
    - name: docker-doctor
      run: bash common-plat/scripts/docker-doctor.sh --repo "$PWD/repo" --quiet
```
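
The `${GITHUB_REPOSITORY##*/}` expansion in the fetch step strips everything up to the last `/`, so `owner/repo` becomes `repo`. As a rough illustration only, the same URL derivation might look like this in Python; the function name `clone_urls` is hypothetical, and the base URL is copied from the step above:

```python
def clone_urls(github_repository: str,
               base: str = "http://host.docker.internal:3300/learning_ai_user") -> dict:
    """Sketch of the fetch step's URL logic (hypothetical helper, not part of the repo)."""
    # "owner/repo" -> "repo". A value without "/" is returned unchanged,
    # which matches what the shell's ##*/ expansion does.
    repo_name = github_repository.rsplit("/", 1)[-1]
    return {
        # Target directory name -> clone URL, as in the two `git clone` lines.
        "repo": f"{base}/{repo_name}.git",
        "common-plat": f"{base}/learning_ai_common_plat.git",
    }
```

Note that the owner segment comes from the fixed base URL, not from `GITHUB_REPOSITORY`, so this pattern assumes both repositories live under the same owner.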
Gotchas that cost real debugging time (all now handled by
`add-host-runner.sh` docker mode + this pattern):
1. **Secret, not file.** Job containers do **not** see the runner's
`~/.gitea_npm_token`. Provide the registry token as a Gitea Actions secret
(`NPM_REGISTRY_TOKEN`, set at repo or user level; the repo-level call is
`PUT /api/v1/repos/{owner}/{repo}/actions/secrets/{name}`). Secret names with a
`GITEA_`/`GITHUB_` prefix are reserved, so creating one returns HTTP 400.
2. **`doctor.sh` requires a token _file_.** Its stale-shell check errors if
`~/.gitea_npm_token` is absent even when the env token is set — so the job
writes the secret to the file first.
3. **Host envs leak into the container.** The runner injects `HOME`, `PATH`,
`PNPM_HOME` (macOS paths) which override workflow `env:` and break
`$HOME`-relative writes. `add-host-runner.sh` docker mode strips these so the
container uses its image defaults (`/root`).
4. **`env_file` token overrides the secret.** The runner's `runner.env`
`GITEA_NPM_TOKEN` is injected into jobs and wins over `${{ secrets.* }}`. Keep
`runner.env` in sync with the current token (see §11.6) — a stale value there
causes registry **HTTP 401** even with a correct secret.
5. **Proxy bypass.** `host.docker.internal` must be in `NO_PROXY`/`no_proxy`
(handled by the runner config) or the corp proxy returns **HTTP 504**.
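
For gotcha 1, the secret can be created from a script rather than the UI. The sketch below only builds the request for the repo-level endpoint quoted above and does not send it. It assumes the JSON body uses a `data` field and that the API accepts an `Authorization: token …` header, so check both against your Gitea version. The helper name and the local prefix check are hypothetical and only approximate the server-side rule.

```python
import json
import urllib.request

# Prefixes the doc above reports as reserved for secret names.
RESERVED_PREFIXES = ("GITEA_", "GITHUB_")


def build_secret_request(base_url, owner, repo, name, value, api_token):
    """Build (but do not send) a PUT request that creates or updates a repo secret."""
    # Fail early instead of waiting for the server's HTTP 400.
    if name.upper().startswith(RESERVED_PREFIXES):
        raise ValueError(f"secret name {name!r} uses a reserved prefix")
    url = f"{base_url}/api/v1/repos/{owner}/{repo}/actions/secrets/{name}"
    # Assumption: the request body carries the secret value under "data".
    body = json.dumps({"data": value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"token {api_token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; building and sending are kept separate here so the request can be inspected first.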
Validated: `docker-lint` runs green on the docker runner (`M-…-4`) —
`git clone` over `host.docker.internal`, `gitea-doctor` registry probe → 200,
`docker-doctor` lint pass.
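
Several of the gotchas above show up as confusing failures later in a job, so a cheap preflight at the top of a containerized job may save time. This is a minimal sketch, not something `doctor.sh` or the runner provides: the function name `preflight` and its messages are hypothetical, and the `/Users/` test is only a heuristic for a leaked macOS `HOME`.

```python
def preflight(env):
    """Return human-readable problems found in a job's environment mapping."""
    problems = []

    # Gotcha 3: a macOS home directory inside a Linux container means host envs leaked.
    if env.get("HOME", "").startswith("/Users/"):
        problems.append("HOME looks like a macOS path; host envs may be leaking")

    # Gotcha 5: both spellings should bypass the proxy for the Gitea host.
    for key in ("NO_PROXY", "no_proxy"):
        hosts = [h.strip() for h in env.get(key, "").split(",")]
        if "host.docker.internal" not in hosts:
            problems.append(f"{key} does not include host.docker.internal")

    # Gotchas 1 and 4: the token must be present (staleness cannot be checked offline).
    if not env.get("GITEA_NPM_TOKEN"):
        problems.append("GITEA_NPM_TOKEN is empty or unset")

    return problems
```

In a job step this could be called as `preflight(dict(os.environ))` after `import os`, failing the step when the returned list is non-empty.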
List + prune the fleet:
```bash

@@ -98,9 +98,14 @@ if mode == "docker":
    c["privileged"] = False
    # Let job containers reach the host's Gitea on Docker Desktop.
    c["options"] = "--add-host=host.docker.internal:host-gateway"
    envs = cfg["runner"].setdefault("envs", {})
    # Drop host-specific envs that are invalid inside a Linux container — these
    # leak the macOS HOME/PATH/PNPM_HOME and override workflow env, breaking
    # $HOME-relative writes. Let the container use its image defaults (/root).
    for k in ("HOME", "PATH", "PNPM_HOME"):
        envs.pop(k, None)
    # Containerized jobs inherit the corp proxy env; without this they route
    # host.docker.internal:3300 through the proxy and get a 504. Bypass it.
    envs = cfg["runner"].setdefault("envs", {})
    for k in ("NO_PROXY", "no_proxy", "NPM_CONFIG_NOPROXY"):
        v = envs.get(k, "")
        if "host.docker.internal" not in v:
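
The hunk ends before the body of the final `if`, so the script's actual append logic is not visible here. As a standalone sketch of the same two steps (strip host-only envs, then make sure the proxy bypass lists the Gitea host), something like the following would work. The function name and the comma-joining rule are assumptions, not taken from `add-host-runner.sh`.

```python
def apply_docker_env_hygiene(envs):
    """Mutate and return a runner `envs` mapping for docker mode (illustrative only)."""
    # Step 1: remove host-specific values so the container keeps its image defaults.
    for k in ("HOME", "PATH", "PNPM_HOME"):
        envs.pop(k, None)

    # Step 2: ensure each no-proxy variable mentions host.docker.internal.
    for k in ("NO_PROXY", "no_proxy", "NPM_CONFIG_NOPROXY"):
        v = envs.get(k, "")
        if "host.docker.internal" not in v:
            # Assumed behaviour: append to an existing list, or start a new one.
            envs[k] = f"{v},host.docker.internal" if v else "host.docker.internal"
    return envs
```

Running it twice leaves the mapping unchanged the second time, which matters if the script is re-run against an already configured runner.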