# ADR-0001: Docker build lockfile policy

> **Status:** Accepted (decision); Deferred (implementation) · **Date:** 2026-05-27
> **Context:** docker-build-optimization-roadmap §A3 · **Supersedes:** None
> **Authors:** Platform DevOps

---

## 1. Context

The pilot Phase A work in `docker-build-optimization-roadmap` standardized on `pnpm install --lockfile=false` inside Docker for both `learning_ai_clock` (web + backend) and `learning_ai_peakpulse` (backend). That choice unblocked Phase A by sidestepping a structural mismatch:

- `pnpm-lock.yaml` is generated against the **outer pnpm workspace**, which includes `../learning_ai_common_plat/packages/*` as workspace members (sibling-repo path).
- Inside the Docker build context, the sibling repo doesn't exist (a single-repo build context is intentionally used for hermeticity).
- `--frozen-lockfile` therefore fails immediately with workspace resolution errors (finding F2 in the roadmap audit).

`--lockfile=false` skips lockfile validation entirely and re-resolves all dependencies against the registry on every `pnpm install`. This is correct for the workspace-mismatch problem but introduces non-determinism: the **same Dockerfile + same source tree can produce a different lockset** across two builds if upstream `@bytelyst/*` versions move between them. Phase A2's BuildKit cache mount mitigates the *speed* cost of re-resolution but not the *determinism* cost.

This ADR records the decision on which long-term policy to adopt for Docker builds. Implementation is deferred to a future Phase A3 sprint.

---

## 2. Options considered

### Option A: Keep `--lockfile=false` (status quo)

**How it works.** Docker `pnpm install` re-resolves on every cold build. Cache mount preserves the pnpm content-addressed store across builds, so warm rebuilds don't pay re-resolution cost.

**Pros:**

- Zero churn: already shipped in Phase A.
- Tolerates sibling-repo workspace mismatch for free.
- Tolerates `*` semver across all `@bytelyst/*` deps without rework.
- Compatible with the F17 fix (Gitea `host.docker.internal` URLs).

**Cons:**

- **Non-deterministic builds.** Same Dockerfile + same source can produce different `node_modules` if a dependency was published between two cold builds. CI runs days apart can ship divergent images for the same commit.
- No supply-chain pinning. Any compromised upstream auto-rolls forward.
- `pnpm audit` on the host can disagree with what's actually inside the image.

### Option B: Generate a Docker-only flat lockfile during build

**How it works.** Add a build step that runs `pnpm install --lockfile-only` in a temp dir against a flattened `pnpm-workspace.yaml` that excludes sibling-repo paths, then runs `pnpm install --frozen-lockfile` against that generated lock.

**Pros:**

- Deterministic *within a single build*: the same registry state at the moment of the build always produces the same lockset.
- Doesn't require changes to the source tree's `pnpm-workspace.yaml`.

**Cons:**

- Still non-deterministic across builds (the lock is regenerated each time unless cached separately).
- Adds Dockerfile complexity and a non-trivial new failure mode (workspace-flattening logic).
- Marginal value over Option A given the cache mount.

### Option C: Vendor a Docker-flattened lockfile in the repo

**How it works.** Commit a `pnpm-lock.docker.yaml` (or similar) per repo that's generated against a flattened workspace. The Dockerfile copies it into place as `pnpm-lock.yaml` and runs `pnpm install --frozen-lockfile`, as in the implementation sketch in §4.

**Pros:**

- Fully deterministic. Same commit → same lockset → same image.
- Supply chain pins enforced.
- `pnpm audit` matches image contents.

**Cons:**

- Two lockfiles to maintain (the workspace one + the Docker one).
- Drift risk between the two, solved only by a CI gate that regenerates the Docker lockfile on every PR that touches `package.json`.
- Requires a tested regenerate-on-CI workflow per repo.
- Workspace flattening logic must be encoded somewhere (script in `common-plat/scripts/regen-docker-lockfile.sh`).

### Option D: Restructure to single-repo workspace (eliminate sibling)

**How it works.** Inline the consumed `@bytelyst/*` packages into each product repo (vendor them) so there is no sibling-workspace dependency. Then `--frozen-lockfile` works trivially.

**Pros:**

- Cleanest from a Docker-build-determinism standpoint.

**Cons:**

- Massive churn across 14+ product repos.
- Defeats the entire `learning_ai_common_plat` shared-package model.
- Multiplies maintenance cost of `@bytelyst/*` updates by the number of consumers.
- Out of scope; would supersede the entire ecosystem architecture.

---

## 3. Decision

**Adopt Option A (`--lockfile=false`) as the official short-term policy.**

**Plan to migrate to Option C (`pnpm-lock.docker.yaml`) when supply-chain determinism becomes a hard requirement** (e.g., before any production deployment of a Docker-built image, or before SOC2-style attestation).

**Reasoning:**

1. **Phase A is already shipped on Option A** with verified speed wins (warm rebuilds 2.7–5.4 s across all surfaces). Switching policies mid-rollout would invalidate the collected metrics and add risk.
2. **The cache mount (Phase A2) addresses the speed concern** that Option A creates. The remaining concern is determinism, which is a correctness concern, but the actual blast radius is limited because:
   - All `@bytelyst/*` deps are first-party and pinned in source repos.
   - Third-party deps already have fixed semver in `package.json` (no loose `*` ranges to public registries).
   - The Gitea registry is the only `@bytelyst/*` source, so there is no public supply-chain risk for the in-house deps.
3. **Option C is the right end state** but requires CI infrastructure that doesn't exist yet (auto-regen-on-PR). Building it inside this roadmap is scope creep.
4. **Option B is dominated by Option C:** same complexity, weaker guarantees.
5. **Option D is a non-starter:** it would require redesigning the ByteLyst shared-package model.

---

## 4. Consequences

### Positive

- Phase A speed wins are preserved with zero policy churn.
- `pnpm-lock.yaml` continues to live in source repos for host development; it stays in `.dockerignore` for Docker builds.
- The decision is reversible: switching to Option C in the future is additive (add a Docker lockfile + change the install step in each Dockerfile).

### Negative

- Same commit can produce different Docker images on different days. CI must not assume image hash stability for a given commit.
- `pnpm audit` results from the host don't match Docker image contents. Workaround: run `pnpm audit` inside the built container as a separate CI job (cheap; no rebuild needed).
- Supply-chain attestation (SOC2, SLSA) cannot be produced for these images today. Acceptable while there is no production traffic.

### Migration trigger

Switch to Option C when **any** of the following becomes true:

1. A production environment (paid customers, real PII) deploys a Docker-built image from this codebase.
2. A regulatory/audit requirement demands reproducible builds.
3. A supply-chain incident occurs (compromised upstream package) and we need rollback granularity finer than "rebuild from current `*`".
4. The cache-mount speed win disappears (e.g., CI runner switch removes BuildKit cache persistence).

### Implementation sketch (when triggered)

1. In `learning_ai_common_plat`, add `scripts/regen-docker-lockfile.sh`:
   - Reads each product repo's `package.json`.
   - Generates a flattened `pnpm-workspace.yaml` (no sibling paths).
   - Runs `pnpm install --lockfile-only` against the Gitea registry.
   - Writes `pnpm-lock.docker.yaml` back to the product repo.
2. Each product repo gets a `.gitea/workflows/regen-docker-lockfile.yml` that runs the script on PR-touch of `package.json` and either:
   - commits the regenerated lockfile (auto-PR), or
   - fails the PR with a "run regen-docker-lockfile.sh and commit" message.
3. Each product Dockerfile changes its install step:

   ```dockerfile
   # before: no lockfile is read, dependencies are re-resolved on every build
   RUN pnpm install --ignore-scripts --lockfile=false

   # after: copy the vendored Docker lockfile into place and install strictly from it
   COPY pnpm-lock.docker.yaml ./pnpm-lock.yaml
   RUN pnpm install --ignore-scripts --frozen-lockfile
   ```

4. `.dockerignore` drops the `pnpm-lock.yaml` exclusion (or adds an explicit include for `pnpm-lock.docker.yaml`).

This work is **not scoped** in the current roadmap and should be its own small ADR-driven sprint.

---

## 5. Status tracking

| Phase | State | Notes |
|---|---|---|
| Decision | ✅ Accepted | This ADR |
| Implementation | ⏸ Deferred | Triggered by §4 conditions |
| Trigger monitor | ⚳ Open | Re-evaluate when Phase D rollout begins |

---

## 6. References

- `docker-build-optimization-roadmap.md` §0 F1, F2 (lockfile findings)
- `docker-build-optimization-roadmap.md` §A3 (deferred phase)
- `docker-build-optimization-roadmap.md` §A2 (BuildKit cache mount that mitigates the speed concern of Option A)
- `learning_ai_common_plat/AGENTS.md` (canonical pnpm workspace config)
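
## 7. Appendix: workspace-flattening sketch

Option C and the §4 implementation sketch both depend on a workspace-flattening step that this ADR does not specify. As a rough illustration only, and assuming sibling-repo entries are exactly those list items in `pnpm-workspace.yaml` whose path starts with `../`, the filter might look like the following. The function name `flatten_workspace` and the sample `backend` entry are hypothetical; the remaining steps of `regen-docker-lockfile.sh` (running `pnpm install --lockfile-only` and writing `pnpm-lock.docker.yaml`) are deliberately left out.

```shell
#!/usr/bin/env sh
# Sketch only. Assumption: a sibling-repo entry is any list item in
# pnpm-workspace.yaml whose path starts with "../" (optionally quoted).

# Print the workspace file given as $1 with sibling-repo entries removed.
flatten_workspace() {
  grep -Ev "^[[:space:]]*-[[:space:]]*[\"']?\.\./" "$1"
}

# Demo on a throwaway file; 'backend' is a hypothetical local package entry.
sample=$(mktemp)
cat > "$sample" <<'EOF'
packages:
  - 'backend'
  - '../learning_ai_common_plat/packages/*'
EOF
flatten_workspace "$sample"   # prints the file without the ../ entry
rm -f "$sample"
```

A line-based filter like this ignores other YAML layouts (flow-style lists, for example), so a real script would probably want a YAML-aware tool. If every entry is a sibling path, the output is left with an empty `packages:` list, which the script would also need to handle.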