Per § 10 steps 9 + 10.
Step 9: Peakpulse backend Phase A complete.
cold 72.2 s, warm 2.7 s (96.3% reduction). Pattern from clock applied
verbatim plus .docker-deps/.gitkeep discoverability fix back-ported
to clock. Commits:
peakpulse@11a6bc5 feat(docker): Phase A on peakpulse backend
peakpulse@6523a1a fix(docker): track .docker-deps/.gitkeep
clock@1465e06b1 fix(docker): track .docker-deps/.gitkeep
clock@d69003c1f chore: dedupe .docker-deps in .gitignore
Step 10: A3 ADR accepted.
New file: docs/adr/0001-docker-build-lockfile-policy.md
Decision: short-term Option A (--lockfile=false) — already shipped in
Phase A; long-term Option C (vendored pnpm-lock.docker.yaml). Migration
triggered by production deployment, audit requirement, supply-chain
incident, or loss of BuildKit cache. Implementation sketch in ADR § 4.
Roadmap doc updates:
- § A7 metrics table: peakpulse row populated (72.2 s / 2.7 s).
- § A3: collapsed bullet list into decision-record summary linking ADR.
- § 10: steps 9 + 10 marked ✅; status banner v7 → v8.
Next per § 10: step 11 (Phase B docker-prep hardening) or step 12
(Phase E docker-doctor.sh linter). Phase E is higher-value as durable
insurance against F11/F13/F16/F17/F18 regressions across the ecosystem.
ADR-0001: Docker build lockfile policy
Status: Accepted (decision); Deferred (implementation) · Date: 2026-05-27
Context: docker-build-optimization-roadmap §A3 · Supersedes: None
Authors: Platform DevOps
1. Context
The pilot Phase A work in docker-build-optimization-roadmap standardized
on pnpm install --lockfile=false inside Docker for both
learning_ai_clock (web + backend) and learning_ai_peakpulse (backend).
That choice unblocked Phase A by sidestepping a structural mismatch:
- pnpm-lock.yaml is generated against the outer pnpm workspace, which includes ../learning_ai_common_plat/packages/* as workspace members (sibling-repo path).
- Inside the Docker build context, the sibling repo doesn't exist (a single-repo build context is intentionally used for hermeticity).
- --frozen-lockfile therefore fails immediately with workspace resolution errors (finding F2 in the roadmap audit).
--lockfile=false skips lockfile validation entirely and re-resolves all
dependencies against the registry on every pnpm install. This is
correct for the workspace-mismatch problem but introduces non-determinism:
the same Dockerfile + same source tree can produce a different lockset
across two builds if upstream @bytelyst/* versions move between them.
Phase A2's BuildKit cache mount mitigates the speed cost of re-resolution but not the determinism cost.
This ADR records the decision on which long-term policy to adopt for Docker builds. Implementation is deferred to a future Phase A3 sprint.
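The determinism cost described above can be made visible by fingerprinting the resolved dependency set of two images built from the same commit. The following is a minimal sketch, not part of the roadmap: the function names are hypothetical, the input format (one name@version entry per line) is an assumption, and capturing that listing from inside an image is left out. It also assumes sha256sum from GNU coreutils is available.

```shell
# Hypothetical helpers (not part of the roadmap) for comparing the resolved
# dependency sets of two builds of the same commit.

# lockset_digest FILE
# FILE is assumed to hold one "name@version" entry per line. Sorting first
# makes the digest independent of the order in which entries were listed.
lockset_digest() {
  LC_ALL=C sort -u "$1" | sha256sum | cut -d' ' -f1
}

# same_lockset FILE_A FILE_B
# Exits 0 only when both listings contain exactly the same entries.
same_lockset() {
  [ "$(lockset_digest "$1")" = "$(lockset_digest "$2")" ]
}
```

Under Option A, two cold builds that straddle an upstream publish would be expected to fail same_lockset; under a frozen lockfile they should always pass.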
2. Options considered
Option A — Keep --lockfile=false (status quo)
How it works. Docker pnpm install re-resolves on every cold build.
Cache mount preserves the pnpm content-addressed store across builds, so
warm rebuilds don't pay re-resolution cost.
Pros:
- Zero churn — already shipped in Phase A.
- Tolerates sibling-repo workspace mismatch for free.
- Tolerates * semver across all @bytelyst/* deps without rework.
- Compatible with the F17 fix (Gitea host.docker.internal URLs).
Cons:
- Non-deterministic builds. Same Dockerfile + same source can produce different node_modules if a dependency was published between two cold builds. CI runs days apart can ship divergent images for the same commit.
- No supply-chain pinning. Any compromised upstream auto-rolls forward.
- pnpm audit on the host can disagree with what's actually inside the image.
Option B — Generate a Docker-only flat lockfile during build
How it works. Add a build step that runs pnpm install --lockfile-only
in a temp dir against a flattened pnpm-workspace.yaml that excludes
sibling-repo paths, then --frozen-lockfile against that generated lock.
Pros:
- Deterministic within a single build — same registry state at the moment of the build always produces the same lockset.
- Doesn't require changes to the source tree's pnpm-workspace.yaml.
Cons:
- Still non-deterministic across builds (the lock is regenerated each time unless cached separately).
- Adds Dockerfile complexity and a non-trivial new failure mode (workspace-flattening logic).
- Marginal value over Option A given the cache mount.
Option C — Vendor a Docker-flattened lockfile in the repo
How it works. Commit a pnpm-lock.docker.yaml (or similar) per repo
that's generated against a flattened workspace. The Dockerfile copies it
into place as pnpm-lock.yaml and runs pnpm install --frozen-lockfile,
as in the §4 implementation sketch.
Pros:
- Fully deterministic. Same commit → same lockset → same image.
- Supply chain pins enforced.
- pnpm audit matches image contents.
Cons:
- Two lockfiles to maintain (the workspace one + the Docker one).
- Drift risk between the two, solved only by a CI gate that regenerates the Docker lockfile on every PR that touches package.json.
- Requires a tested regenerate-on-CI workflow per repo.
- Workspace flattening logic must be encoded somewhere (script in common-plat/scripts/regen-docker-lockfile.sh).
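The CI gate that Option C depends on could, in its simplest form, be a byte-for-byte comparison between the committed Docker lockfile and a freshly regenerated one. The sketch below is an assumption about how such a gate might look, not an existing script: the function name and exit codes are hypothetical, and producing the regenerated file is assumed to happen in an earlier step.

```shell
# Hypothetical drift gate (not an existing script).
# check_docker_lock_drift COMMITTED REGENERATED
#   exit 0: the committed lockfile matches the regenerated one
#   exit 1: the files differ (drift), so the PR should fail
#   exit 2: the committed lockfile is missing
check_docker_lock_drift() {
  if [ ! -f "$1" ]; then
    echo "missing committed lockfile: $1" >&2
    return 2
  fi
  # cmp -s compares silently and reports the result via its exit status.
  if cmp -s "$1" "$2"; then
    echo "docker lockfile is up to date"
    return 0
  fi
  echo "docker lockfile drift: run regen-docker-lockfile.sh and commit" >&2
  return 1
}
```

A workflow step would call it as check_docker_lock_drift pnpm-lock.docker.yaml /tmp/regenerated.yaml (the second path is illustrative) and let a non-zero exit fail the job.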
Option D — Restructure to single-repo workspace (eliminate sibling)
How it works. Inline the consumed @bytelyst/* packages into each
product repo (vendor them) so there is no sibling-workspace dependency.
Then --frozen-lockfile works trivially.
Pros:
- Cleanest from a Docker-build-determinism standpoint.
Cons:
- Massive churn across 14+ product repos.
- Defeats the entire learning_ai_common_plat shared-package model.
- Multiplies maintenance cost of @bytelyst/* updates by the number of consumers.
- Out of scope; would supersede the entire ecosystem architecture.
3. Decision
Adopt Option A (--lockfile=false) as the official short-term policy.
Plan to migrate to Option C (pnpm-lock.docker.yaml) when supply-chain
determinism becomes a hard requirement (e.g., before any production
deployment of a Docker-built image, or before SOC2-style attestation).
Reasoning:
- Phase A is already shipped on Option A with verified speed wins (warm rebuilds 2.7–5.4 s across all surfaces). Switching policies mid-rollout would invalidate metrics + add risk.
- The cache mount (Phase A2) addresses the speed concern that
Option A creates. The remaining concern is determinism, which is a
correctness concern — but the actual blast radius is limited because:
  - All @bytelyst/* deps are first-party and pinned in source repos.
  - Third-party deps already have fixed semver in package.json (no loose * ranges to public registries).
  - The Gitea registry is the only @bytelyst/* source, so there is no public supply-chain risk for the in-house deps.
- Option C is the right end state but requires CI infrastructure that doesn't exist yet (auto-regen-on-PR). Building it inside this roadmap is scope creep.
- Option B is dominated by Option C — same complexity, weaker guarantees.
- Option D is a non-starter: it would require redesigning the ByteLyst shared-package model.
4. Consequences
Positive
- Phase A speed wins are preserved with zero policy churn.
- pnpm-lock.yaml continues to live in source repos for host development; it stays in .dockerignore for Docker builds.
- The decision is reversible: switching to Option C in the future is additive (add a Docker lockfile + change one Dockerfile line).
Negative
- Same commit can produce different Docker images on different days. CI must not assume image hash stability for a given commit.
- pnpm audit results from the host don't match Docker image contents. Workaround: run pnpm audit inside the built container as a separate CI job (cheap; no rebuild needed).
- Supply-chain attestation (SOC2, SLSA) cannot be produced for these images today. Acceptable while there is no production traffic.
Migration trigger
Switch to Option C when any of the following becomes true:
- A production environment (paid customers, real PII) deploys a Docker-built image from this codebase.
- A regulatory/audit requirement demands reproducible builds.
- A supply-chain incident occurs (compromised upstream package) and we need rollback granularity finer than "rebuild from current *".
- The cache-mount speed win disappears (e.g., CI runner switch removes BuildKit cache persistence).
Implementation sketch (when triggered)
- In learning_ai_common_plat, add scripts/regen-docker-lockfile.sh:
  - Reads each product repo's package.json.
  - Generates a flattened pnpm-workspace.yaml (no sibling paths).
  - Runs pnpm install --lockfile-only against the Gitea registry.
  - Writes pnpm-lock.docker.yaml back to the product repo.
- Each product repo gets a .gitea/workflows/regen-docker-lockfile.yml that runs the script on PR-touch of package.json and either:
  - commits the regenerated lockfile (auto-PR), or
  - fails the PR with a "run regen-docker-lockfile.sh and commit" message.
- Each product Dockerfile changes one line:

      # before
      RUN pnpm install --ignore-scripts --lockfile=false
      # after
      COPY pnpm-lock.docker.yaml ./pnpm-lock.yaml
      RUN pnpm install --ignore-scripts --frozen-lockfile

- .dockerignore removes pnpm-lock.yaml exclusion (or adds explicit include for pnpm-lock.docker.yaml).
This work is not scoped in the current roadmap and should be its own small ADR-driven sprint.
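As a rough illustration of the regen script's flattening and lockfile-only steps, the sketch below may help scope that future sprint. It is not the planned implementation: both function names are hypothetical, it assumes sibling-repo members are the only workspace entries containing ../, it assumes @bytelyst/* packages resolve from the registry rather than via workspace links, and it assumes registry configuration (for example an .npmrc pointing at Gitea) is already present in the environment.

```shell
# Hypothetical sketch of scripts/regen-docker-lockfile.sh internals.

# flatten_workspace FILE
# Prints FILE with every line that references a parent-relative path
# removed. Assumption: only sibling-repo members contain "../".
flatten_workspace() {
  sed '/\.\.\//d' "$1"
}

# regen_docker_lockfile REPO_DIR
# Resolves dependencies in a scratch directory against the flattened
# workspace and writes the result back as pnpm-lock.docker.yaml.
regen_docker_lockfile() {
  repo=$1
  work=$(mktemp -d) || return 1
  cp "$repo/package.json" "$work/" &&
    flatten_workspace "$repo/pnpm-workspace.yaml" > "$work/pnpm-workspace.yaml" &&
    (cd "$work" && pnpm install --lockfile-only) &&
    cp "$work/pnpm-lock.yaml" "$repo/pnpm-lock.docker.yaml"
  status=$?
  # Remove the scratch directory whether or not resolution succeeded.
  rm -rf "$work"
  return "$status"
}
```

Resolving in a scratch directory keeps the host's own pnpm-lock.yaml and node_modules untouched, which matters because the workspace lockfile continues to serve host development.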
5. Status tracking
| Phase | State | Notes |
|---|---|---|
| Decision | ✅ Accepted | This ADR |
| Implementation | ⏸ Deferred | Triggered by §4 conditions |
| Trigger monitor | ⚳ Open | Re-evaluate when Phase D rollout begins |
6. References
- docker-build-optimization-roadmap.md §0 F1, F2 (lockfile findings)
- docker-build-optimization-roadmap.md §A3 (deferred phase)
- docker-build-optimization-roadmap.md §A2 (BuildKit cache mount that mitigates the speed concern of Option A)
- learning_ai_common_plat/AGENTS.md (canonical pnpm workspace config)