Wires bash scripts/gitea/doctor.sh --quiet as a pre-build step in all 4
jobs (backend, web, mobile, e2e). Catches the F15 class of failures —
stale env tokens, registry unreachable, owner-rename drift — in ~2s
instead of waiting for pnpm install to fail mid-build.
continue-on-error: true for now so warnings don't block the pipeline
during rollout. Flip to required once the doctor pattern is established
ecosystem-wide.
Pattern documented in:
learning_ai_devops_tools/docs/docker-build-optimization-roadmap.md § E0
Synced from learning_ai_common_plat npmrc.template. The Gitea owner is
now resolved at install time from ${GITEA_NPM_OWNER:-learning_ai_user},
so future owner renames don't require touching this file.
Without postcss.config.mjs, @tailwindcss/postcss never ran during
'next build', producing a CSS bundle with only @font-face rules (33KB)
and zero Tailwind utility classes. The UI rendered as unstyled HTML
(black background, white text, no spacing).
- Add web/postcss.config.mjs wiring @tailwindcss/postcss (matches the
pattern used by sibling repos like ChronoMind)
- Copy postcss.config.mjs into the web Docker build stage so 'pnpm run
build' can resolve it
The companion commit 47ee228 untracked the .last-run.json file but
the .gitignore line was not actually committed (same staging hiccup
hit in the parallel chronomind/mindlyst/fastgap cleanup commits).
Landing 'test-results/' alongside the existing 'playwright-report/'
entry here.
The web/.gitignore already excluded playwright-report/ but not the
test-results/ companion directory, so each 'npx playwright test'
run produced a one-line diff to web/test-results/.last-run.json
(timestamps + retry counts).
Changes (single atomic commit):
+ web/.gitignore: add 'test-results/' line
+ git rm --cached web/test-results/.last-run.json (keep on disk)
Verification: 'npx playwright test' → 'git status' is clean.
The parent web/tsconfig.json explicitly excludes the e2e folder because
Next.js doesn't compile Playwright specs. As a result, IDE TypeScript
language servers had no project context for e2e/*.spec.ts files and
reported false positives on Node globals like Buffer, process.env, and the
NodeJS namespace — which several specs use to sign fake JWTs
(Buffer.from(...).toString('base64url')) or to read NOTELETT_E2E_*
override env vars.
The new web/e2e/tsconfig.json:
- Extends the parent web/tsconfig.json so all path aliases and
react-jsx config stay consistent.
- Adds 'types': ['node', '@playwright/test'] so Node globals and
Playwright fixtures resolve.
- Resets exclude: [] so the parent's e2e exclusion doesn't recurse
in and re-exclude the very directory this config is meant to
cover (which would otherwise yield TS18003 'No inputs found').
Verified:
- npx tsc --noEmit -p web/e2e/tsconfig.json → no output (clean)
- pnpm --filter @notelett/web run typecheck → still passes
(e2e remains out of the main typecheck as before)
- Playwright run unaffected (it transpiles specs itself at runtime and
does not invoke tsc)
The IntakeUrlBar URL field was a raw <input> with ~10 lines of inline
styling carrying its own border/radius/background/font-size. This was
the last component on the dashboard surface still using a raw input
after UI5–UI7, so the ratchet caught it as remaining drift.
Migrated:
- Raw <input> + inline style block → common <Input> primitive.
- Preserved the absolute-positioned content-type badge overlay by
keeping the wrapper <div style={{ position: 'relative' }}> and
using Input's style prop to right-pad when a badge is present.
- All attributes (type=url, value, onChange, onKeyDown, placeholder,
aria-label) preserved as-is so no behavioral change.
Ratchet impact:
raw interactive controls: 14 → 13 (-1)
Lowered scripts/ui-drift-baseline.json from 14 to 13 with this commit
so the CI gate now enforces that bound. The remaining 13 raw controls
are intentional and tracked:
- NoteEditor toolbar buttons (9): icon-tight, deliberately raw
- ArtifactPanel hidden file input (1): must remain <input hidden>
- Search-mode radios (2): would change UX to migrate to Radix RadioGroup
- NoteVersionsPanel disclosure button (1): tight inline styling
Verified:
- pnpm --filter @notelett/web run typecheck: ok
- pnpm --filter @notelett/web run test: 96/96 pass
- bash scripts/ui-drift-ratchet.sh: all categories at new baseline
Root cause: docker-compose.yml hardcoded NEXT_PUBLIC_NOTES_API_URL to
https://api.bytelyst.com/notelett — a production URL that doesn't
exist on this network — as the *build arg* for the web image. The
docker-compose.override.yml correctly set localhost:4016/api but only
on the runtime environment, which has no effect because NEXT_PUBLIC_*
values are baked into the Next.js bundle at build time (pnpm run build
inside the Dockerfile), not read at runtime.
Symptom: every authenticated client-side fetch from the deployed web
container went to https://api.bytelyst.com/notelett/... which the
corporate proxy intercepted with a blockpage. The saved-views client
in particular fired on every (app)/ layout mount, surfacing a
'Failed to fetch' toast on dashboard load. 4 release-flows.spec
tests failed because page.route('**/api/**') couldn't match the
api.bytelyst.com URLs at all.
Discovery: inspected the deployed bundle inside the running container.
'grep -oE "api.bytelyst.com" /app/web/web/.next/static/chunks/*.js'
returned multiple hits across the (app)/ layout, (auth)/ pages, and
share page. The string was absent from the source tree, which proved
it had been injected at build time via the broken arg default.
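The bundle check described above can also be expressed as a small scan over chunk contents. This is a minimal sketch, not the command that was actually run: findBakedHosts, the sample chunk names, and the sample request paths are hypothetical, and the chunk contents are passed in as strings rather than read from /app/web/web/.next/static/chunks.

```typescript
// Hypothetical helper: report which built chunks contain a host that must
// never be baked into the bundle.
type ChunkMap = Record<string, string>; // file name -> file contents

function findBakedHosts(chunks: ChunkMap, forbiddenHosts: string[]): Map<string, string[]> {
  const hits = new Map<string, string[]>();
  for (const [file, contents] of Object.entries(chunks)) {
    const found = forbiddenHosts.filter((host) => contents.includes(host));
    if (found.length > 0) hits.set(file, found);
  }
  return hits;
}

// Illustrative inputs only; real chunk names and request paths will differ.
const sampleChunks: ChunkMap = {
  "layout-abc.js": 'fetch("https://api.bytelyst.com/notelett/example")',
  "login-def.js": 'fetch("http://localhost:4016/api/example")',
};
const bakedHits = findBakedHosts(sampleChunks, ["api.bytelyst.com"]);
console.log([...bakedHits.keys()]);
```

A check along these lines could run after 'docker compose build' to catch a production URL baked in through a build-arg default before any browser test runs.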
Discovery debug pattern (kept for future use):
page.on('requestfailed', r => console.log(r.method(), r.url()));
page.route('**/api/**', route => route.fulfill({status:200,body:'{}'}));
await page.goto('/dashboard');
// FAILED REQUESTS will list any URL not under /api/** that the SPA
// attempted, exposing baked-in production URLs immediately.
Fix (three layers, defense in depth):
1. docker-compose.yml — replace hardcoded
'NEXT_PUBLIC_NOTES_API_URL: https://api.bytelyst.com/notelett'
in the build.args block with
'${NEXT_PUBLIC_NOTES_API_URL:-http://localhost:4016/api}'.
Same treatment for the runtime environment block. Add build args
for the other NEXT_PUBLIC_* values (extraction, MCP, diagnostics,
product name/id, telemetry transport) so a single env var on the
host controls both build and runtime layers.
2. web/Dockerfile — declare ARG and ENV lines for all seven
NEXT_PUBLIC_* values so the build args reach 'pnpm run build'.
Previously only NOTES_API_URL and PLATFORM_SERVICE_URL were
declared, which meant overriding extraction/MCP/diagnostics via
docker compose silently had no effect on the bundle.
3. docker-compose.override.yml — add a build.args block mirroring the
four URL overrides so the local-only override also reaches build
time, not just runtime. Comment block explains the bake-time vs
runtime distinction so future contributors don't repeat the bug.
Verified end-to-end after the fix:
- docker compose build --no-cache web + up -d → grep of bundle now
shows 'localhost:4016/api', api.bytelyst.com fully gone.
- Debug interception test: zero requestfailed events on /dashboard.
- Playwright release-flows.spec.ts: 4 failed → 4 passed (after URL
fix; no test code changed for these four tests).
- Full Playwright suite (--ignore-snapshots): 43 passed.
- scripts/e2e-docker-test.sh: 9/9 backend API lifecycle steps pass.
- pnpm run verify: backend 380/380, web 96/96, mobile 97/97.
Root cause of bug: web Dockerfile copied .next/static to the wrong path
in the runtime stage. The Next.js 16 standalone server (CMD 'node
web/server.js' from /app/web) runs from /app/web/web/server.js because
'standalone' wraps the source directory. It serves /_next/static/* from
'./web/.next/static' (relative to the standalone server's location),
not from './.next/static' (which is what the previous COPY produced).
Symptom: in the deployed Docker stack at http://localhost:3050 every
client-side JS chunk under /_next/static/chunks/* returned HTTP 404
with content-type text/plain. The browser refused to execute the
chunks (strict MIME), so the SPA never hydrated. All Playwright tests
that ask for any dynamic UI text on a (app)/ page would time out
because AuthGuard never ran in the browser.
Discovery path: deployed compose stack via 'docker compose up -d
--build' + 'scripts/e2e-docker-test.sh' (backend API 9/9 ✓), then ran
Playwright against NOTELETT_WEB_PORT=3050. settings.spec failed with
'product configuration section' not visible. Page snapshot showed
just <skip-to-content link> + toast region — no other content. Console
logs revealed every /_next/static/chunks/* was 404 with text/plain.
'docker exec ls' showed BUILD_ID at /app/web/web/.next/BUILD_ID and
static at /app/web/.next/static — wrong path. Moved static into the
standalone tree and chunks now serve 200 with application/javascript.
Fix:
web/Dockerfile: change
COPY --from=builder /app/web/.next/static ./.next/static
to
COPY --from=builder /app/web/.next/static ./web/.next/static
with explanatory comment so this doesn't regress.
Test hardening (these tests were dev-server-only by accident — they
worked locally because Next.js dev did not enforce the same static
path layout; the bug above hid them in production builds too):
web/e2e/accessibility.spec.ts — 'focus-visible ring appears on tab
navigation' was navigating to /dashboard which AuthGuard correctly
redirects when unauthenticated, leaving the DOM empty (AuthGuard
returns null until verifySessionAndReadiness completes) so Tab
presses focused nothing. Switched to /login which is unauthenticated
by design and has known focusable form inputs.
web/e2e/settings.spec.ts — 'shows product configuration section'
expected /settings to render content without auth. Now obtains real
tokens from platform-service via API, seeds them via addInitScript,
and falls back to test.skip with a clear message if platform-service
is not reachable.
Verified:
- All 31 Playwright tests across navigation/accessibility/dashboard/
search/settings/smart-actions/reviews specs PASS against the
deployed Docker stack at :3050.
- 'pnpm run verify': backend 380/380, web 96/96, mobile 97/97.
- 'bash scripts/e2e-docker-test.sh': 9/9 backend API CRUD steps pass.
- 'curl -sI http://localhost:3050/_next/static/chunks/app/error-*.js'
now returns 200 + application/javascript.
Not migrated: e2e/release-flows.spec.ts and e2e/visual-regression.spec.ts
intentionally remain dev-server-targeted. release-flows.spec uses
page.route() to mock backend responses and is meant to test the UI in
isolation against a dev server. visual-regression.spec needs baseline
regeneration after the UI5-UI8 migration; this is a separate workstream
tracked in docs/UI_UX_PLATFORM_CORE_ROADMAP.md.
UI8 closes the migration cycle started by UI0. The four legacy global
classes (.surface-card, .surface-muted, .badge, .input-shell) are
removed from web/src/app/globals.css and the CI ratchet now enforces
zero new occurrences across three of the four drift categories.
Changes:
1. Audit regex precision (scripts/ui-drift-audit.sh, scripts/ui-drift-ratchet.sh)
The previous pattern 'className="[^"]*(badge|surface-card|surface-muted|input-shell)'
matched the literal token anywhere inside className, which caused 21
false positives against Tailwind arbitrary values like
'bg-[color:var(--nl-surface-muted)]' where the legacy name appears
inside a 'var(--nl-...)' reference.
New pattern requires the legacy class to be a whole class token —
either at the start of className, or preceded by a space, and
followed by a space or closing quote. Result: 21 false positives
eliminated; the ratchet now reports an honest 0 for the legacy
category.
2. globals.css cleanup (web/src/app/globals.css)
Removed .surface-card, .surface-muted, .badge, .input-shell rules.
Only truly global utilities remain (typography, focus-visible,
sr-only, skip-link, motion preferences, layout grids). A header
comment documents that re-introductions should be solved at the
call-site with a primitive, not by restoring the global rule.
3. Ratchet baseline (scripts/ui-drift-baseline.json)
Final counts after UI5–UI8 across the session:
raw interactive controls 14 (was 38 at start)
legacy global surface classes 0 (was 92 at start)
hardcoded color literals 0 (no change, was already 0)
direct @bytelyst/ui imports 0 (no change, was already 0)
The 14 remaining raw controls are intentional and tracked:
NoteEditor toolbar buttons (10)
ArtifactPanel hidden file input (1)
search/page radio inputs (2)
NoteVersionsPanel disclosure button (1)
4. CI gate (.github/workflows/ci.yml release-guards job)
Documented that the ratchet is the canonical gate post-UI8: because
legacy/colors/imports baselines are 0, any new occurrence in those
three categories now fails CI. The strict-audit script is kept as
a local diagnostic tool but not wired as a gate (would fail on the
14 intentional raw controls).
5. Roadmap (docs/UI_UX_PLATFORM_CORE_ROADMAP.md)
Marked UI5, UI6, UI7, UI8 all complete with per-phase commit hashes
and explicit deliverables.
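The regex change in item 1 can be illustrated with a TypeScript transliteration. This is a sketch under assumptions: the real patterns live in the bash audit scripts and are not reproduced here, so tokenPattern is only one way to express "whole class token", and the sample markup strings are made up.

```typescript
const LEGACY = "badge|surface-card|surface-muted|input-shell";

// Loose form: the legacy name may appear anywhere inside the className value.
const loosePattern = new RegExp(`className="[^"]*(${LEGACY})`);

// Whole-token form: the name must start the attribute value or follow
// whitespace, and must be followed by whitespace or the closing quote.
const tokenPattern = new RegExp(`className="(?:[^"]*\\s)?(${LEGACY})(\\s|")`);

// Illustrative inputs (hypothetical markup).
const arbitraryValue = '<div className="bg-[color:var(--nl-surface-muted)] p-2">';
const realUsage = '<div className="p-4 surface-card">';
const prefixedName = '<span className="badge-lg">';

for (const sample of [arbitraryValue, realUsage, prefixedName]) {
  console.log(loosePattern.test(sample), tokenPattern.test(sample), sample);
}
```

The loose form flags all three samples, while the whole-token form flags only the genuine surface-card usage, which is the behavior the false-positive count above depends on.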
Cumulative migration impact (from initial baseline):
raw interactive controls 38 → 14 (-24, -63%)
legacy global surface classes 92 → 0 (-92, -100%)
Verified:
- pnpm run verify: backend 380/380, web 96/96, mobile 97/97
- bash scripts/ui-drift-ratchet.sh: all four categories at baseline
- bash scripts/ui-drift-audit.sh: only "Raw interactive controls"
category has matches (intentional, tracked above)
- Live Docker stack at http://localhost:3050 still serves 200,
backend health 200
Implements the full E2E flow against the deployed docker stack and
documents it as a repeatable test playbook.
Surfaced and fixed three real issues while building the E2E:
1. JWT secret mismatch — docker-compose.override.yml backend was using
a NoteLett-only JWT_SECRET that platform-service did not share, so
every Authorization: Bearer call returned 'Invalid or expired token'.
Aligned the override to use platform-service's actual secret
(dev-ecosystem-secret-do-not-use-in-production).
2. CORS preflight missing PATCH/DELETE — @bytelyst/fastify-core registers
@fastify/cors with only { origin }, which leaves Access-Control-Allow-
Methods at the @fastify/cors default of 'GET,HEAD,POST'. Real browser
PATCH/DELETE preflights would fail. Added an onSend hook in
backend/src/server.ts that rewrites the header to
'GET,HEAD,POST,PATCH,PUT,DELETE,OPTIONS' on CORS preflight responses.
3. Product 'notelett' wasn't registered with platform-service — auth
register/login both error with 'Unknown or disabled product: notelett'.
The seed script now POSTs to /api/products idempotently.
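The header rewrite in item 2 reduces to a small pure function. The sketch below is not the hook in backend/src/server.ts: withWidenedMethods and isCorsPreflight are hypothetical names, headers are modeled as plain lowercase-keyed maps, and the Fastify wiring is left out so the logic can be read in isolation.

```typescript
type HeaderMap = Record<string, string>;

const ALLOWED_METHODS = "GET,HEAD,POST,PATCH,PUT,DELETE,OPTIONS";

// A CORS preflight is an OPTIONS request that carries
// Access-Control-Request-Method (header names assumed lowercased).
function isCorsPreflight(method: string, requestHeaders: HeaderMap): boolean {
  return method.toUpperCase() === "OPTIONS" && "access-control-request-method" in requestHeaders;
}

// Return response headers with the allowed-methods list widened, but only
// for preflight responses; every other response passes through untouched.
function withWidenedMethods(
  method: string,
  requestHeaders: HeaderMap,
  responseHeaders: HeaderMap,
): HeaderMap {
  if (!isCorsPreflight(method, requestHeaders)) return responseHeaders;
  return { ...responseHeaders, "access-control-allow-methods": ALLOWED_METHODS };
}

const preflightResult = withWidenedMethods(
  "OPTIONS",
  { "access-control-request-method": "PATCH" },
  { "access-control-allow-methods": "GET,HEAD,POST" },
);
console.log(preflightResult["access-control-allow-methods"]);
```

Restricting the rewrite to preflights keeps ordinary responses byte-for-byte unchanged, which makes the workaround easy to remove if the shared core later passes a methods list to @fastify/cors directly.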
Deliverables:
- scripts/e2e-docker-seed.sh — idempotent: registers the notelett product
and creates two test users (admin@notelett.app with role=admin who can
write, user@notelett.app with role=user who is read-only). Re-runs are
no-ops once seeded.
- scripts/e2e-docker-test.sh — 9-step E2E that drives the deployed stack
via HTTP only (no browser): login → CORS preflight for PATCH →
workspace create → note create → note read → note PATCH (status:
draft→active) → note list → note delete → workspace delete.
- docs/testing/E2E_DOCKER_TESTING.md — full playbook covering prereqs,
seed, automated E2E, manual UI smoke, stack architecture diagram,
troubleshooting (JWT mismatch, unknown product, role rejection,
CORS, port conflict, data loss), tear-down, CI wiring guidance.
- package.json — pnpm e2e:docker:seed and pnpm e2e:docker:test
shortcuts.
Verified live on this host's deployed stack:
$ bash scripts/e2e-docker-seed.sh
↷ product 'notelett' already exists
↷ admin user already registered + login works
✓ user created
🟢 Seed complete.
$ bash scripts/e2e-docker-test.sh
✓ user=usr_e094e0c2-... role=admin
✓ CORS allows PATCH
✓ workspace created
✓ note created
✓ note read matches
✓ note patched (status: draft → active)
✓ note list returned (1 item)
✓ note deleted (HTTP 204)
✓ workspace deleted (HTTP 204)
🟢 All 9 E2E steps passed.
Backend regression suite still green: 380/380.
Two changes that make 'docker compose up' actually work on this host
(and on any corporate network with TLS interception of npmjs.org):
1. backend/Dockerfile gains the same NODE_TLS_REJECT_UNAUTHORIZED=0 +
NPM_CONFIG_STRICT_SSL=false envs and 'npm config set strict-ssl false'
step that web/Dockerfile already had. Without this, the 'npm install
-g pnpm@10.6.5' step failed with UNABLE_TO_GET_ISSUER_CERT_LOCALLY
on corp networks. Build-time-only; production runtime image is
unaffected.
2. docker-compose.override.yml (new) is picked up automatically by
'docker compose up' and:
- remaps the web container's host port from 3000 to 3050 (port 3000
on this host is held by Grafana). Uses 'ports: !override' so the
base port mapping is replaced rather than appended.
- points the backend at the sibling platform-service (4003),
extraction-service (4005), and mcp-server (4007) running on the
host network via host.docker.internal.
- sets DB_PROVIDER=memory and a 32+ char JWT_SECRET so the backend
starts in dev mode without Cosmos credentials.
Verified live on this host:
docker compose up -d → both notelett-backend (healthy) and
notelett-web running.
curl http://localhost:4016/health → {status:ok,service:notelett-backend}
curl http://localhost:3050/dashboard → HTTP 200, '<title>NoteLett</title>'
Audit of the full E2E suite (43 specs) surfaced four issues that were
hidden behind 'all 96/96 web unit tests pass' while the browser-level
coverage was actually broken end-to-end. All four are fixed and the
suite now passes 43/43.
1. Port conflict silently testing wrong app. playwright.config.ts hard-
coded baseURL=http://localhost:3000 with reuseExistingServer:true on
non-CI hosts. When the dev host had ANY service on :3000 (Grafana,
chronomind, etc), Playwright happily ran the entire E2E suite
against the wrong app and reported the unrelated failures as
'real'. Now honors NOTELETT_WEB_PORT env (default 3000) so a
contributor can opt into any free port and Playwright drives both
baseURL and the dev-server PORT consistently.
2. Missing test dependency. web/e2e/accessibility.spec.ts imports
@axe-core/playwright but web/package.json never declared it.
The accessibility coverage was DOA — every CI run that included
this spec would module-not-found-error before a single check ran.
Added @axe-core/playwright to devDependencies.
3. Mock that never fires. smart-actions.spec.ts 'history API mock
returns items' used page.route() to mock /api/note-prompts/history
then bypassed the mock entirely with page.request.get() (which uses
Playwright's separate request context, not the browser context that
page.route intercepts). The request went to the dev server and got
404. Replaced with page.goto + page.evaluate(fetch(...)) so the
browser-side fetch hits the page.route mock as intended.
4. Missing visual-regression baselines. visual-regression.spec.ts had
no committed baseline screenshots for dashboard / workspaces /
search. First run on a clean host always reported 'snapshot doesn't
exist, writing actual'. Generated and committed darwin baselines.
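The port handling in item 1 amounts to deriving both values from one env var. The following is a minimal sketch, assuming a helper named resolveWebTarget (hypothetical, not the code in playwright.config.ts) that takes the environment as a plain object so it can be exercised without Playwright.

```typescript
interface WebTarget {
  port: number;
  baseURL: string;
}

// Derive the dev-server port and the Playwright baseURL from the same
// variable so the two can never drift apart. Defaults to 3000.
function resolveWebTarget(env: Record<string, string | undefined>): WebTarget {
  const raw = env["NOTELETT_WEB_PORT"];
  const port = raw === undefined || raw === "" ? 3000 : Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`NOTELETT_WEB_PORT must be a valid port, got "${raw}"`);
  }
  return { port, baseURL: `http://localhost:${port}` };
}

console.log(resolveWebTarget({ NOTELETT_WEB_PORT: "3050" }));
```

In a config file the result would typically feed use.baseURL and the PORT passed to the webServer command; failing loudly on a malformed value avoids silently testing whatever happens to listen on the default port.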
Verified end-to-end (NOTELETT_WEB_PORT=3050 against this host's free
port):
43 passed (34.8s)
Total test-tier counts on main now:
backend unit + integration (memory) 380/380
backend cosmos emulator (live) 4/4
web vitest 96/96
mobile vitest 97/97
web playwright e2e 43/43
---
TOTAL 620/620
Previously P10.5 was marked complete with a deferral note because the
sibling services (platform-service 4003, extraction-service 4005,
mcp-server 4007) were not running on the audit host. Today they are
all running, so I executed the smoke and confirmed it passes.
Command:
JWT_SECRET="dev-secret-change-me-at-least-32-characters-long" \
bash scripts/local-smoke.sh
Output (exit 0, 11 ok lines):
info: starting NoteLett backend in memory mode
ok: NoteLett backend started at http://localhost:4016
ok: NoteLett health
ok: NoteLett bootstrap
ok: platform-service health
ok: extraction-service health
ok: mcp-server health
ok: authenticated workspace create
ok: authenticated note create
ok: authenticated note read
ok: smoke cleanup attempted
ok: local production-readiness smoke passed
Updates:
- §Post-Sprint-A Re-verification: replaces the blanket deferral note
with the actual verification details for live shared-service smoke
and a separate, narrower deferral note for Docker compose smoke
(which still fails on corp-network hosts due to TLS interception in
the backend/Dockerfile npm install step but succeeds on CI).
- §P10.5: replaces the historical deferral text with today's
end-to-end verification result.
The existing 380-test backend suite runs entirely against the in-memory
datastore provider, which treats every partition-key value as equivalent.
This hid one entire class of bug — partition-key mismatches — until
production. D7 closes that gap.
Implementation:
- backend/src/test-helpers.ts adds useCosmosDatastore() that swaps the
active provider for CosmosDatastoreProvider using COSMOS_ENDPOINT /
COSMOS_KEY / COSMOS_DATABASE. Throws synchronously when env is missing
so a misconfigured run fails loudly instead of silently falling back
to in-memory.
- backend/vitest.config.ts now excludes src/**/*.cosmos.test.ts so the
default 'pnpm test' run stays green for contributors without Docker.
- backend/vitest.cosmos.config.ts (new) includes ONLY *.cosmos.test.ts,
bumps testTimeout to 30s / hookTimeout to 60s for the real client
round-trips, and locks DB_PROVIDER=cosmos in test env.
- backend/src/cosmos.smoke.cosmos.test.ts (new) covers the four most
important partition-key contracts in NoteLett:
workspaces /userId
notes /workspaceId
note_tasks /workspaceId
note_shares /workspaceId (full create → resolve → delete → null)
Each test also asserts that a wrong-partition-key lookup returns null,
which is the failure mode the in-memory provider cannot simulate.
- backend/package.json adds 'test:cosmos' script.
- .github/workflows/ci.yml gains a backend-cosmos job that boots the
official mcr.microsoft.com/cosmosdb/linux/azure-cosmos-emulator
container as a service, waits for it to be ready (60 × 5s polls of
/_explorer/emulator.pem), then runs pnpm test:cosmos against it.
The job depends on the existing backend job so the emulator only
spins up after unit tests pass.
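The class of bug this targets can be shown with two toy stores. This is an illustrative sketch only: LaxStore and PartitionedStore are hypothetical stand-ins, not the in-memory provider or CosmosDatastoreProvider, and the ids are invented.

```typescript
interface Doc {
  id: string;
  [field: string]: unknown;
}

// Ignores the partition key entirely, the way the in-memory provider is
// described above: any key finds the document.
class LaxStore {
  private docs = new Map<string, Doc>();
  put(_partitionKey: string, doc: Doc): void {
    this.docs.set(doc.id, doc);
  }
  get(_partitionKey: string, id: string): Doc | null {
    return this.docs.get(id) ?? null;
  }
}

// Treats (partitionKey, id) as the lookup key, so a read with the wrong
// partition key misses even though the id exists.
class PartitionedStore {
  private docs = new Map<string, Doc>();
  private key(partitionKey: string, id: string): string {
    return JSON.stringify([partitionKey, id]);
  }
  put(partitionKey: string, doc: Doc): void {
    this.docs.set(this.key(partitionKey, doc.id), doc);
  }
  get(partitionKey: string, id: string): Doc | null {
    return this.docs.get(this.key(partitionKey, id)) ?? null;
  }
}

const note: Doc = { id: "note_1", workspaceId: "ws_1" };
const laxStore = new LaxStore();
const strictStore = new PartitionedStore();
laxStore.put("ws_1", note);
strictStore.put("ws_1", note);

// Buggy caller: reads a note using the userId as the partition key.
console.log(laxStore.get("usr_1", "note_1") !== null);
console.log(strictStore.get("usr_1", "note_1") === null);
```

A test suite backed only by the lax store passes with the buggy read, which is why the cosmos smoke tests assert the wrong-partition-key case explicitly.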
Verified locally:
- pnpm --filter @notelett/backend test: 380/380 (cosmos suite excluded)
- vitest list --config vitest.cosmos.config.ts: 4 tests under the cosmos
smoke suite, as designed
- pnpm run verify: end-to-end green (backend 380/380, web 96/96,
mobile 97/97)
- ci.yml passes Python yaml.safe_load
CI verification: the new job will execute on the next push. Local
verification against the emulator requires Docker on the dev host.
Web lint warnings reduced from 20 → 15 by fixing the categories that
flag real architectural smells rather than the canonical
fetch-on-mount setState pattern.
Real fixes:
1. web/src/lib/use-theme.ts — replace useEffect + setState mount-sync
pattern with React.useSyncExternalStore. The hook now subscribes to
browser storage events, returns a stable snapshot for SSR, and uses
a manual storage-event dispatch so same-document setters refresh
correctly. Eliminates the cascading-render advisory and gains free
cross-tab theme sync.
2. web/src/lib/use-keyboard-shortcuts.ts — move ref assignment from
render time into a useEffect. Fixes the 'Cannot access refs during
render' advisory without behavior change.
3. web/src/components/NoteEditor.tsx — move onSaveRef.current = onSave
from render time into a useEffect for the same reason.
4. web/src/app/(app)/reviews/page.tsx — wrap handleDecision and
handleBatchDecision in useCallback so the useEffect that depends
on them no longer re-subscribes the keydown listener on every
render. Fixes both react-hooks/exhaustive-deps warnings and the
underlying perf bug they pointed at.
5. web/src/app/(app)/prompts/page.tsx — wrap loadTemplates in
useCallback declared before the useEffect that calls it. Fixes
the 'Cannot access variable before it is declared' advisory.
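The store shape behind item 1 can be sketched without React. Everything here is an assumption made for illustration: createThemeStore, the "theme" storage key, and the in-process listener set are hypothetical, whereas the real hook in web/src/lib/use-theme.ts is described as using browser storage events. The returned subscribe / getSnapshot / getServerSnapshot trio matches the arguments React.useSyncExternalStore accepts.

```typescript
type Theme = "light" | "dark";

interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function createThemeStore(storage: StorageLike, key = "theme", fallback: Theme = "light") {
  const listeners = new Set<() => void>();
  return {
    // Registers a listener and returns the matching unsubscribe function.
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    // Client snapshot: read straight from storage, ignoring unknown values.
    getSnapshot(): Theme {
      const stored = storage.getItem(key);
      return stored === "light" || stored === "dark" ? stored : fallback;
    },
    // Server snapshot: stable value so SSR output does not depend on storage.
    getServerSnapshot(): Theme {
      return fallback;
    },
    // Same-document writes notify subscribers explicitly.
    setTheme(next: Theme): void {
      storage.setItem(key, next);
      listeners.forEach((notify) => notify());
    },
  };
}

// In-memory storage stand-in so the sketch runs outside a browser.
const backing = new Map<string, string>();
const memoryStorage: StorageLike = {
  getItem: (k) => backing.get(k) ?? null,
  setItem: (k, v) => {
    backing.set(k, v);
  },
};
const themeStore = createThemeStore(memoryStorage);
```

Because the snapshot is read from the store rather than copied into component state on mount, there is no effect-driven setState for the linter to flag.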
Remaining 15 warnings are React-compiler runtime hints about
fetchData().then(setData) patterns inside useEffect, which is the
canonical fetch-on-mount pattern shown in React's own docs. Resolving
them properly requires Suspense + use() or risky startTransition
wraps; both are out of scope and tracked under future tech debt.
Verified:
- pnpm --filter @notelett/web run typecheck: passes
- pnpm --filter @notelett/web run lint: 0 errors, 15 warnings (down 5)
- pnpm run verify: backend 380/380, web 96/96, mobile 97/97
UI8 deferred deleting the legacy global classes (.surface-card,
.surface-muted, .input-shell, .badge) because 69+ call sites in UI6/UI7
territory (dashboard, search, workspaces, notes detail, chat, palace)
still depend on them. Removing the globals before those screens migrate
would visually break the app.
Instead, ship a one-way ratchet that solves the actually-important
problem: prevent NEW legacy usage from creeping in while existing
sites get migrated.
- scripts/ui-drift-ratchet.sh — reads scripts/ui-drift-baseline.json
and FAILS if any of the four UI drift categories regress above the
tracked baseline. Pure bash, no jq required, works with grep or
ripgrep. Uses the same patterns as scripts/ui-drift-audit.sh.
- scripts/ui-drift-baseline.json — checked-in baseline captured today:
raw controls 38, legacy classes 92, hardcoded colors 0, direct imports 0.
- package.json — adds pnpm run audit:ui:ratchet and
audit:ui:ratchet:update scripts.
- .github/workflows/ci.yml release-guards job — runs the ratchet as a
required step plus the existing audit in report mode.
- docs/UI_UX_PLATFORM_CORE_ROADMAP.md — marks the CI-guard checklist
item complete, documents the path to fully strict mode (drive
baseline to zero, then delete globals.css legacy classes, then flip
audit:ui:strict from advisory to required).
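The ratchet rule itself is small. The sketch below restates it in TypeScript for clarity; the real check is the bash script above, and checkRatchet plus the category key names are hypothetical because the JSON schema of scripts/ui-drift-baseline.json is not shown here.

```typescript
type DriftCounts = Record<string, number>;

interface RatchetResult {
  ok: boolean;
  regressions: string[];
}

// Fail only when a category rises above its baseline; equal or lower
// counts pass, so the gate can tighten but never loosen on its own.
function checkRatchet(baseline: DriftCounts, current: DriftCounts): RatchetResult {
  const regressions: string[] = [];
  for (const [category, allowed] of Object.entries(baseline)) {
    const actual = current[category] ?? 0;
    if (actual > allowed) {
      regressions.push(`${category}: ${actual} > ${allowed} (+${actual - allowed})`);
    }
  }
  return { ok: regressions.length === 0, regressions };
}

// Counts taken from the baseline captured in this commit; key names assumed.
const baselineCounts: DriftCounts = {
  rawControls: 38,
  legacyClasses: 92,
  hardcodedColors: 0,
  directImports: 0,
};

console.log(checkRatchet(baselineCounts, { ...baselineCounts, rawControls: 39 }));
```

Lowering the baseline after each migration is what turns this from a snapshot into a one-way ratchet.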
Verified:
- Ratchet at baseline: exits 0
- Synthetic regression (added a file with surface-card + raw <input>):
ratchet correctly exits 1, reporting +1 in each affected category
- pnpm run verify: backend 380/380, web 96/96, mobile 97/97 (no
behavior change)
Completes the high-leverage half of UI5 by migrating the most form-heavy
authenticated screens off the legacy 'input-shell' / inline-style pattern
onto Input, Textarea, Select, and AlertBanner primitives.
Migrated:
- web/src/app/(app)/settings/page.tsx — change-password form, feedback
form, MCP/API-tokens/offline-queue cards. Replaces 'surface-card'
sections with Card components, 'input-shell' inputs/selects/textareas
with Input/Select/Textarea, and inline error/success divs with
AlertBanner.
- web/src/components/CreateNoteModal.tsx — template/workspace/title/body/tags
fields. Select primitive uses options=[{value,label}].
- web/src/components/LinkNoteModal.tsx — search input + relationship-type
select + alert banner for errors.
- web/src/components/ShareDialog.tsx — user-id input, permission select,
collaborator/public-link rows now use AlertBanner (tone='neutral') for
the muted-surface look. Web Share API unsupported message is now a
proper tone='warning' banner.
- web/src/components/PromptTemplateEditor.tsx — full form (name, slug,
description, 3 selects, 2 textareas) migrated.
All existing tests continue to pass without modification because
@testing-library queries (getByLabelText, getByPlaceholderText, getByText) are
robust against the underlying HTML structure changes.
Verified:
- pnpm --filter @notelett/web run typecheck: passes
- pnpm --filter @notelett/web run test: 96/96 (existing CreateNoteModal,
LinkNoteModal, ShareDialog suites all green)
- pnpm run verify: end-to-end (backend 380/380, web 96/96, mobile 97/97)
- Legacy class matches in web/src dropped from 89 to 69 over the UI5
slice; remaining matches are in UI6/UI7 territory (dashboard, search,
workspaces list, notes detail, chat, palace, NoteEditor).
While migrating CreateNoteModal to use @bytelyst/ui Input/Select/Textarea
(which internally call React.useId), Vitest tests failed with:
TypeError: Cannot read properties of null (reading 'useId')
Root cause: the web package pins react@19.2.0 but @bytelyst/ui declared
react: '^19.0.0' as a peer, so pnpm resolved 19.2.6 for it from the
common-platform side. Two React copies coexisted (19.2.0 and 19.2.6),
the @bytelyst/ui components linked against one and react-dom test-rendered
against the other, and useId failed because the dispatcher belonged to
a different React instance than the consumer.
Fix: declare pnpm.overrides in the workspace root so the entire monorepo
resolves to a single react@19.2.0 / react-dom@19.2.0 pair. Verified via
'pnpm why react' (all transitive references now point at 19.2.0) and the
on-disk symlinks (web/node_modules/@bytelyst/ui/node_modules/react and
common-plat/packages/ui/node_modules/react both link to
.pnpm/react@19.2.0).
Three mechanical lint warnings in the web package are resolved with
zero behavior change:
- web/src/app/(app)/notes/[noteId]/page.tsx — rename onTagsAccepted
callback param to '_tags' to match the no-unused-vars allowlist
(the param is intentionally unused; we trigger a re-save regardless).
- web/src/lib/feedback-client.ts — drop the unused PRODUCT_ID import.
- web/src/lib/notes-client.ts — delete the dead toWorkspaceSummary()
helper. Workspace summaries are produced by listWorkspaceSummaries()
on the backend response now; the local helper had no callers.
Web lint goes from 23 → 20 warnings. Remaining 20 are React-compiler
advisories about setState-in-effect patterns; those require careful
per-component refactoring (useReducer, derive-from-props, or
startTransition) and are tracked under Sprint D / Q1 tech debt rather
than fixed mechanically.
The audit script silently passed on hosts without ripgrep installed
because 'rg -n ...' would fail, '|| true' swallowed the failure,
'matches' would be empty, and report() would print 'ok: no matches'.
This hid genuine UI drift from local 'pnpm run audit:ui' runs.
Changes:
- Detect ripgrep availability at startup and emit a stderr note when
falling back.
- Add a grep-based fallback that translates rg '--glob !path' exclusions
into 'grep --exclude=<basename>' so caller-side exclusions (e.g. the
@bytelyst/ui adapter file at Primitives.tsx) still apply.
- Guard the optional 'extra_excludes' array expansion against 'set -u'
when no exclusions are configured.
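The exclusion translation in the second bullet can be sketched as follows. This is a TypeScript restatement for illustration, not the bash in the audit script; toGrepExcludes is a hypothetical name, and it handles only the '--glob !path' form named above.

```typescript
// Convert ripgrep-style negated globs into grep --exclude flags.
// grep's --exclude matches against file names, so only the basename
// of each excluded path is kept.
function toGrepExcludes(rgArgs: string[]): string[] {
  const excludes: string[] = [];
  for (let i = 0; i < rgArgs.length; i++) {
    const next = rgArgs[i + 1];
    if (rgArgs[i] === "--glob" && next !== undefined && next.startsWith("!")) {
      const path = next.slice(1);
      const basename = path.split("/").filter(Boolean).pop() ?? path;
      excludes.push(`--exclude=${basename}`);
      i++; // skip the glob value that was just consumed
    }
  }
  return excludes;
}

console.log(toGrepExcludes(["-n", "--glob", "!web/src/components/ui/Primitives.tsx"]));
```

One trade-off of basename matching is that it is coarser than the ripgrep glob: any file with the same name elsewhere in the tree is excluded as well.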
Result: on this host (no rg) the audit now correctly reports
2 categories with matches — raw interactive controls and legacy global
surface classes — instead of the false 'all green' it produced before.
'pnpm run audit:ui:strict' exits non-zero when matches remain, ready to
wire into CI once UI5–UI8 finish migrating the remaining call sites.
Sprint C / UI5 — migrate the highest-leverage user-facing forms off the
legacy 'input-shell' / inline-style pattern onto the @bytelyst/ui Input,
Textarea, and AlertBanner primitives via the local Primitives.tsx adapter.
Adapter additions (web/src/components/ui/Primitives.tsx):
- Re-export AlertBanner, FormSection, and FieldGrid from @bytelyst/ui so
product code never imports from the underlying package directly.
Migrated screens:
- web/src/app/(auth)/login/page.tsx
- web/src/app/(auth)/register/page.tsx
- web/src/app/(auth)/forgot-password/page.tsx
- web/src/components/CreateWorkspaceModal.tsx
Each migration replaces the ad-hoc 'input-shell' inputs and manual
label/error/success divs with the Input (label + hint props), Textarea,
and AlertBanner (tone='error'|'success') primitives. Inline style blocks
are replaced with Tailwind utility classes that read from the existing
--nl-* CSS custom properties so the visual tokens remain unchanged.
The 3 auth pages alone remove 9 input-shell call sites; the
CreateWorkspaceModal removes 2 more.
Verified:
- pnpm --filter @notelett/web run typecheck: passes
- pnpm --filter @notelett/web run test: 96/96 pass
- pnpm run verify: end-to-end green (backend 380/380, web 96/96, mobile 97/97)
Sprint B — closes audit item B7 (doc consolidation).
- docs/AGENT_TASK_ROADMAP.md, docs/ARCHITECTURE_REVIEW_AND_REUSE_ROADMAP.md,
docs/GAP_ANALYSIS.md were each self-marked as historical snapshots
but kept polluting the top of docs/. Moved them under docs/archive/
in the previous commit; this commit:
- Adds docs/archive/README.md explaining what's archived vs active
- Repoints cross-doc links in docs/IMPLEMENTATION_TRACKER.md,
docs/WEB_AI_FAST_ROADMAP.md, and docs/roadmaps/*.md to the new
archive paths
- Fixes relative links inside the archived files themselves so
historical readers can still navigate back to active docs
- AGENTS.md §1.1 refreshed: reflects the May 22 re-verified state
(382/96/97 tests), links the two new runbooks, and points readers
away from docs/archive/ as a work source.
Sprint B — closes audit items B4 and B5.
- docs/runbooks/MEK_ROTATION.md: step-by-step procedure for rotating
the field-encrypt master key in Azure Key Vault, including pre-flight
checks, rewrapAllDeks usage, verification queries, rollback, and lost-MEK
recovery. Replaces the previous gap where MEK rotation had no
documented operator path.
- docs/runbooks/SECRET_MANAGEMENT.md: inventory of every secret consumed
by NoteLett with its production source (AKV), two production-grade
patterns (workload identity vs K8s CSI), the compose-host pattern,
rotation flow per secret type, verification commands, and red-flag
triage.
Both docs cross-link each other and call out concrete open items
(automation, dual-JWT support, audit-log emission) for later sprints
rather than overstating current capabilities.
Sprint B — closes audit items B6 (event-bus completeness) and B3
(public-share revocation regression).
Event bus:
- note-tasks/repository.ts createNoteTask now emits task.created with
taskId, noteId, workspaceId, userId, title
- workspaces/repository.ts createWorkspace now emits workspace.created
with workspaceId, userId, name
The event-bus already declared these event types (event-bus.ts) and
webhook subscribers can target them, but they were never emitted —
making the contract dead. Emissions follow the same .catch(() => {})
pattern used by note.created/updated/deleted in notes/repository.ts so
a subscriber failure cannot break the create flow.
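A minimal sketch of the fire-and-forget emission pattern described above. SketchEventBus and createNoteTaskSketch are hypothetical stand-ins, not the real event-bus.ts or repository.ts APIs, and the create function is kept synchronous for brevity.

```typescript
// Payload fields mirror the ones listed for task.created above.
type TaskCreated = {
  taskId: string;
  noteId: string;
  workspaceId: string;
  userId: string;
  title: string;
};
type TaskCreatedHandler = (payload: TaskCreated) => Promise<void>;

// Hypothetical in-process bus; the real event-bus.ts may differ.
class SketchEventBus {
  private handlers: TaskCreatedHandler[] = [];

  onTaskCreated(handler: TaskCreatedHandler): void {
    this.handlers.push(handler);
  }

  // Rejects if any subscriber rejects.
  async emitTaskCreated(payload: TaskCreated): Promise<void> {
    for (const handler of this.handlers) {
      await handler(payload);
    }
  }
}

let nextTaskNumber = 0;

function createNoteTaskSketch(
  bus: SketchEventBus,
  input: Omit<TaskCreated, "taskId">,
): TaskCreated {
  const task: TaskCreated = { taskId: `task_${++nextTaskNumber}`, ...input };
  // Emission is not awaited and rejections are swallowed, so a failing
  // subscriber cannot break the create flow.
  void bus.emitTaskCreated(task).catch(() => {});
  return task;
}
```

The cost of this choice is that subscriber failures are silent at the call site; any logging would have to live in the bus or the subscriber.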
Regression tests:
- note-tasks/repository.test.ts and workspaces/repository.test.ts
exercise the emission paths end-to-end through the in-memory
datastore.
- note-shares/repository.integration.test.ts adds a 5-test integration
suite for the public-share revocation path: token resolves before
revocation; token returns null after deleteShare (hard delete);
expired token returns null; cross-product token rejected;
listSharesForNote does not include revoked shares.
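The revocation cases above can be illustrated with a small in-memory model. The store below is a hypothetical stand-in, not the real note-shares repository; only the names deleteShare and listSharesForNote come from the description, and their signatures here are assumptions.

```typescript
type Share = {
  token: string;
  noteId: string;
  productId: string;
  expiresAt?: number; // epoch milliseconds; undefined means no expiry
};

// Hypothetical in-memory store covering the five cases listed above.
class SketchShareStore {
  private shares = new Map<string, Share>();

  createShare(share: Share): void {
    this.shares.set(share.token, share);
  }

  // Hard delete: the row is removed, not flagged.
  deleteShare(token: string): void {
    this.shares.delete(token);
  }

  resolveToken(token: string, productId: string, now: number): Share | null {
    const share = this.shares.get(token);
    if (!share) return null; // unknown or revoked
    if (share.productId !== productId) return null; // cross-product token
    if (share.expiresAt !== undefined && share.expiresAt <= now) return null;
    return share;
  }

  listSharesForNote(noteId: string): Share[] {
    return Array.from(this.shares.values()).filter((s) => s.noteId === noteId);
  }
}
```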
Verified:
- pnpm --filter @notelett/backend run test: 380/380 (was 373, +7 new)
- pnpm run verify end-to-end green
- Commit previously untracked docs/NEXT_SPRINT_ROADMAP.md with refreshed
May 22 status; mark Sprint 1 (backend build) and Sprint 2 (lint) as
resolved by Sprint A workspace-path fix
- Add post-Sprint-A re-verification section to
docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md documenting the
workspace-path regression and the re-verified gates
- Update README quick-start to reference the canonical common-platform
checkout path with BYTELYST_COMMON_PLAT_ROOT override note
Restores green build after the May 12 Docker/UI regression.
Root cause: pnpm-workspace.yaml referenced a sibling path
(../learning_ai/learning_ai_common_plat/...) that did not exist on
dev/CI hosts. .pnpmfile.cjs fell back to ../learning_ai_common_plat for
some packages but missed others, so @bytelyst/ui was pulled from a
stale Gitea 0.1.0 tarball with zero exports (breaking web typecheck +
26 tests) and @bytelyst/monitoring was never linked into node_modules
(breaking backend typecheck + 2 test suites).
Changes:
- pnpm-workspace.yaml now references ../learning_ai_common_plat/packages/* directly
- .pnpmfile.cjs swaps DEFAULT/LEGACY common-plat roots so the canonical
path is the default and the older nested path is the fallback
- scripts/docker-prep.sh, scripts/local-smoke.sh, scripts/release-guard-audit.sh
follow the same canonical-first / legacy-fallback pattern
- .github/workflows/ci.yml symlinks directly to ../learning_ai_common_plat
- pnpm-lock.yaml regenerated with @bytelyst/ui@0.1.9 and
@bytelyst/monitoring@0.1.5 linked to the local common-plat checkout
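A rough sketch of the canonical-first / legacy-fallback lookup described in the changes above. The function name resolveCommonPlatRoot and the injected exists predicate are hypothetical, and giving the BYTELYST_COMMON_PLAT_ROOT override top priority is an assumption; the real .pnpmfile.cjs and scripts may order things differently.

```typescript
// Paths taken from the commit description; everything else is illustrative.
const CANONICAL_ROOT = "../learning_ai_common_plat";
const LEGACY_ROOT = "../learning_ai/learning_ai_common_plat";

// `exists` is injected so the sketch needs no filesystem access; a real
// caller would typically pass fs.existsSync.
function resolveCommonPlatRoot(
  exists: (path: string) => boolean,
  override?: string,
): string | undefined {
  const candidates: string[] = [];
  // Assumed precedence: explicit override, then canonical, then legacy.
  if (override) candidates.push(override);
  candidates.push(CANONICAL_ROOT, LEGACY_ROOT);
  // The first candidate that exists wins; undefined means none was found.
  return candidates.find((candidate) => exists(candidate));
}
```

Returning undefined rather than a default path lets the caller fail loudly instead of silently resolving packages from a registry tarball.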
Verified:
- pnpm run verify: backend 373/373, web 96/96, mobile 97/97
- pnpm run audit:release-guards: passes
- backend, web, mobile lint all exit 0 (advisory warnings retained)
- Fixed NEXT_PUBLIC_NOTES_API_URL to use the public API endpoint
- Updated docker-compose.yml environment format to proper YAML
- Updated Dockerfiles to remove Gitea secrets and use .docker-deps
- Added docker-prep.sh script for dependency packaging
- Changed NODE_ENV back to development for compatibility with the in-memory DB
Generated with [Devin](https://cli.devin.ai/docs)
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
The base image approach is too complex for the current pnpm workspace structure.
Products cannot easily use the base image's workspace because pnpm expects all
workspace packages to be present during install. Reverting to the proven
docker-prep.sh tarball approach for now.
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
The base image only includes production dependencies, so we need to install
all dependencies (including devDependencies) in the builder stage to have
TypeScript and Next.js available for building.
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Update Dockerfiles to use bytelyst-common-base-backend and bytelyst-common-base-web
images instead of installing @bytelyst/* packages via tarballs.
Benefits:
- Smaller final images (~50MB vs ~250MB)
- Faster builds (base image cached)
- Consistent package versions across products
- No need for docker-prep.sh tarball packing
Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>