Compare commits
4 Commits
9e8d0bd048
...
59c4638f85
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
59c4638f85 | ||
|
|
930f97ff63 | ||
|
|
a6cd7fe965 | ||
|
|
0f2956884c |
171
dashboards/tracker-web/docs/IMPLEMENTATION_TRACKER.md
Normal file
171
dashboards/tracker-web/docs/IMPLEMENTATION_TRACKER.md
Normal file
@ -0,0 +1,171 @@
|
||||
# Tracker Dashboard — Implementation Tracker
|
||||
|
||||
**Created:** 2026-05-25
|
||||
**Source:** [`docs/ROADMAP.md`](./ROADMAP.md) · [`docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md)
|
||||
**Companion docs:** [`docs/PRD.md`](./PRD.md), `AGENTS.md`
|
||||
|
||||
---
|
||||
|
||||
## How to Use This Document
|
||||
|
||||
- Each task has a checkbox. **Check it off and add the commit SHA in parentheses** when done.
|
||||
- Run the verification commands listed in each phase before marking it done.
|
||||
- **Dependency chain:** Phase 1.A → 1.B → 1.C → 1.D → 1.E → 1.F (Phase 1 is sequential).
|
||||
- **After Phase 1:** Phases 2 and 3 can be parallelised by different agents; Phase 4 depends on Phase 3 webhooks; Phases 5 and 6 can be parallelised after Phase 4.
|
||||
|
||||
---
|
||||
|
||||
## Baseline (May 25, 2026)
|
||||
|
||||
| Surface | Typecheck | Tests | Build | Container health |
|
||||
| ---------------- | ---------- | ---------- | ---------- | ---------------------------- |
|
||||
| tracker-web | unverified | unverified | unverified | ⚠️ unhealthy (B-001/B-002) |
|
||||
| platform-service | unverified | unverified | unverified | ⚠️ unhealthy (B-001 cascade) |
|
||||
| valkey | n/a | n/a | n/a | ⚠️ unhealthy (B-001 root) |
|
||||
|
||||
**Action:** Establish baseline by running `pnpm run verify` in `dashboards/tracker-web` and recording results here before starting Phase 1.A.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.A — Container Health Restoration
|
||||
|
||||
**Goal:** All tracker-related containers report `(healthy)`.
|
||||
**Estimated effort:** 2–4 hours.
|
||||
**Dependencies:** None — start here.
|
||||
|
||||
- [ ] **1.A.1** Diagnose `valkey` unhealthy status (`______`)
|
||||
- [ ] **1.A.2** Restart `platform-service` after valkey is green (`______`)
|
||||
- [ ] **1.A.3** Implement real `/health` route in tracker-web (`______`)
|
||||
- [ ] **1.A.4** Add `restart: unless-stopped` to all tracker services (`______`)
|
||||
- [ ] **1.A.5** Add 8 GB swap to VM (`______`)
|
||||
- [ ] **1.A.6** Cap Gitea CI runner concurrency to 2 (`______`)
|
||||
|
||||
**Verification:** `docker ps | grep tracker-web` shows `(healthy)`. Record output below.
|
||||
|
||||
```
|
||||
[paste docker ps output here after 1.A is complete]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.B — Workspace Health
|
||||
|
||||
**Goal:** Green `pnpm run verify` with no React-instance warnings.
|
||||
**Estimated effort:** 1–2 hours.
|
||||
**Dependencies:** None — can run in parallel with 1.A.
|
||||
|
||||
- [ ] **1.B.1** Pin React + React-DOM via `pnpm.overrides` (`______`)
|
||||
- [ ] **1.B.2** Verify `pnpm-workspace.yaml` canonical paths (`______`)
|
||||
- [ ] **1.B.3** Clean `pnpm install` to relink (`______`)
|
||||
- [ ] **1.B.4** Typecheck passes (`______`)
|
||||
- [ ] **1.B.5** Lint passes (`______`)
|
||||
- [ ] **1.B.6** Test passes (`______`)
|
||||
- [ ] **1.B.7** Build passes (`______`)
|
||||
- [ ] **1.B.8** Resolve React-compiler advisories (`______`)
|
||||
- [ ] **1.B.9** Remove dead code from lint pass (`______`)
|
||||
- [ ] **1.B.10** Add dedicated `e2e/tsconfig.json` (`______`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.C — Docker Hardening
|
||||
|
||||
**Goal:** Self-contained Docker image, healthy in compose, correct runtime URLs.
|
||||
**Estimated effort:** 3–5 hours.
|
||||
**Dependencies:** 1.A complete (valkey + platform-service healthy).
|
||||
|
||||
- [ ] **1.C.1** Adopt `scripts/docker-prep.sh` (`______`)
|
||||
- [ ] **1.C.2** Add `.docker-deps/` to `.gitignore` (`______`)
|
||||
- [ ] **1.C.3** Bake `NEXT_PUBLIC_*` at build time (`______`)
|
||||
- [ ] **1.C.4** Fix standalone static-chunks 404 (`______`)
|
||||
- [ ] **1.C.5** IPv4 healthcheck (`______`)
|
||||
- [ ] **1.C.6** Corp-proxy build args (`______`)
|
||||
- [ ] **1.C.7** `docker-compose.override.yml` for sibling services (`______`)
|
||||
- [ ] **1.C.8** Verify `docker compose up -d --build` brings tracker-web up healthy (`______`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.D — UI Drift Ratchet
|
||||
|
||||
**Goal:** One-way CI gate preventing UI regressions.
|
||||
**Estimated effort:** 2–3 hours.
|
||||
**Dependencies:** 1.B complete.
|
||||
|
||||
- [ ] **1.D.1** Add `scripts/ui-drift-audit.sh` (`______`)
|
||||
- [ ] **1.D.2** Add `scripts/ui-drift-ratchet.sh` (`______`)
|
||||
- [ ] **1.D.3** Hard-zero categories enforced (`______`)
|
||||
- [ ] **1.D.4** Add `audit:release-guards` script (`______`)
|
||||
- [ ] **1.D.5** Wire into Gitea CI workflow (`______`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.E — Test Hardening
|
||||
|
||||
**Goal:** Tests that catch real regressions.
|
||||
**Estimated effort:** 6–10 hours.
|
||||
**Dependencies:** 1.C complete (compose stack up).
|
||||
|
||||
- [ ] **1.E.1** Vitest unit ≥ 80 % on `src/lib/` (`______`)
|
||||
- [ ] **1.E.2** Playwright E2E happy path (`______`)
|
||||
- [ ] **1.E.3** Playwright E2E public flow (`______`)
|
||||
- [ ] **1.E.4** Docker compose E2E test script (`______`)
|
||||
- [ ] **1.E.5** Seed scripts (`______`)
|
||||
- [ ] **1.E.6** Port-conflict-proof Playwright (`______`)
|
||||
- [ ] **1.E.7** E2E cleanup traps (`______`)
|
||||
- [ ] **1.E.8** Playwright `BASE_URL` switch (`______`)
|
||||
- [ ] **1.E.9** `@axe-core/playwright` installed + wired (`______`)
|
||||
- [ ] **1.E.10** API contract tests (`______`)
|
||||
- [ ] **1.E.11** Cosmos emulator CI job (`______`)
|
||||
- [ ] **1.E.12** Live shared-service smoke (`______`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.F — Observability + Security
|
||||
|
||||
**Goal:** Visible signals + locked-down ingress.
|
||||
**Estimated effort:** 4–6 hours.
|
||||
**Dependencies:** 1.A complete; can run in parallel with 1.D / 1.E.
|
||||
|
||||
- [ ] **1.F.1** Global React error boundary (`______`)
|
||||
- [ ] **1.F.2** Structured logging via `@bytelyst/logger` (`______`)
|
||||
- [ ] **1.F.3** Loki log forwarding (`______`)
|
||||
- [ ] **1.F.4** Prometheus `/metrics` endpoint (`______`)
|
||||
- [ ] **1.F.5** Grafana alert wired (`______`)
|
||||
- [ ] **1.F.6** Client error tracking (`______`)
|
||||
- [ ] **1.F.7** Security headers audit (`______`)
|
||||
- [ ] **1.F.8** CSRF tokens on mutating routes (`______`)
|
||||
- [ ] **1.F.9** Audit log on item mutations (`______`)
|
||||
- [ ] **1.F.10** PII scrubbing in logs (`______`)
|
||||
- [ ] **1.F.11** MEK rotation runbook (`______`)
|
||||
- [ ] **1.F.12** Secret management runbook (`______`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 Exit Criteria
|
||||
|
||||
All boxes checked + verifications recorded above + the following live-system checks pass:
|
||||
|
||||
- [ ] `docker ps | grep tracker` — all `(healthy)` for > 24 h
|
||||
- [ ] `pnpm run verify` exits 0 with no warnings
|
||||
- [ ] `pnpm run audit:release-guards` exits 0
|
||||
- [ ] `pnpm test:e2e:ci` exits 0 against deployed stack
|
||||
- [ ] `pnpm smoke:local` exits 0 end-to-end
|
||||
- [ ] Bugs B-001, B-002, B-005, B-009, B-010, B-019, B-020, B-021 marked closed
|
||||
- [ ] Grafana synthetic-error test fires alert within 5 min
|
||||
|
||||
**Phase 1 sign-off commit:** `______`
|
||||
|
||||
---
|
||||
|
||||
## Phases 2–6
|
||||
|
||||
Detailed task lists for Phases 2–6 will be expanded into per-phase trackers as each phase begins.
|
||||
See [`docs/ROADMAP.md`](./ROADMAP.md) for current scope and [`docs/roadmaps/`](./roadmaps/) for
|
||||
focused execution plans.
|
||||
|
||||
| Phase | Tracker file | Status |
|
||||
| ----- | ----------------------------------- | ---------- |
|
||||
| 2 | _to be created when Phase 2 starts_ | 🔲 Planned |
|
||||
| 3 | _to be created when Phase 3 starts_ | 🔲 Planned |
|
||||
| 4 | _to be created when Phase 4 starts_ | 🔲 Planned |
|
||||
| 5 | _to be created when Phase 5 starts_ | 🔲 Planned |
|
||||
| 6 | _to be created when Phase 6 starts_ | 🔲 Planned |
|
||||
151
dashboards/tracker-web/docs/PRD.md
Normal file
151
dashboards/tracker-web/docs/PRD.md
Normal file
@ -0,0 +1,151 @@
|
||||
# Tracker Dashboard — Product Requirements Document
|
||||
|
||||
**Version:** 1.0
|
||||
**Date:** 2026-05-25
|
||||
**Status:** Draft for implementation
|
||||
**Related:** [`docs/ROADMAP.md`](./ROADMAP.md) · [`docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md)
|
||||
|
||||
---
|
||||
|
||||
## 1. Product Identity
|
||||
|
||||
| Field | Value |
|
||||
| ------------------------- | -------------------------------------------------------------------- |
|
||||
| **Product name** | Tracker Dashboard |
|
||||
| **Internal codename** | `tracker-web` |
|
||||
| **Canonical `productId`** | `tracker` |
|
||||
| **Reserved port** | `3003` (web) · proxied through `platform-service` `4003` for backend |
|
||||
| **Public domain** | `tracker.bytelyst.com` (deployed via Caddy) |
|
||||
| **Token prefix** | `--tk-*` CSS custom properties |
|
||||
| **Repo location** | `learning_ai_common_plat/dashboards/tracker-web` |
|
||||
| **Bundle identifier** | n/a (web-only for v1; mobile in Phase 6) |
|
||||
|
||||
---
|
||||
|
||||
## 2. Vision
|
||||
|
||||
A **product-agnostic feature-request, bug-tracking, and roadmap dashboard** that serves every
|
||||
ByteLyst product (ChronoMind, NoteLett, FlowMonk, NomGap, JarvisJr, LysnrAI, LocalMemGPT,
|
||||
MindLyst, InvtTrdg, etc.) from one codebase, switching context via `x-product-id` header.
|
||||
|
||||
Item submissions must flow from **four equally first-class sources**:
|
||||
|
||||
1. **Public users** (no login) — via the public roadmap page
|
||||
2. **Internal team / PMs / developers** — via the authenticated dashboard
|
||||
3. **Coding agents** (Claude Code, Codex, Copilot Workspace, custom) — via the REST agent API
|
||||
4. **External systems** — CI failures, Slack, email, GitHub/Gitea issues, webhooks
|
||||
|
||||
Every item carries the same shape regardless of source, with `source` field distinguishing origin.
|
||||
|
||||
---
|
||||
|
||||
## 3. Target Users
|
||||
|
||||
| Persona | Primary needs | Main views |
|
||||
| --------------------------- | ---------------------------------------------------- | ---------------------------------------------------- |
|
||||
| **Public user** 🌐 | Submit an idea, vote on others, see what's coming | `/roadmap`, submission status page |
|
||||
| **Internal contributor** 🏢 | Triage incoming, plan sprints, link work | `/dashboard`, `/dashboard/items`, `/dashboard/board` |
|
||||
| **PM / Admin** 🏢 | Set priorities, manage milestones, configure routing | All of internal + `/dashboard/settings`, analytics |
|
||||
| **Coding agent** 🤖 | Pull assigned items, claim, work, link PR, close | `/api/agent/v1/*` |
|
||||
| **External system** | Push events into tracker; pull state out | `/api/webhooks/*`, outbound webhooks |
|
||||
|
||||
---
|
||||
|
||||
## 4. Non-Goals
|
||||
|
||||
- **Not a general-purpose project management tool** — focused on item tracking + roadmap, not full Kanban-style work-OS replacement
|
||||
- **Not a code review tool** — links to PRs but does not replace GitHub/Gitea PR UI
|
||||
- **Not a chat tool** — comments are async only; no real-time chat
|
||||
- **Not a time-tracking-billing tool** — tracks hours for estimation accuracy, not invoicing
|
||||
- **Not a CRM** — does not store customer relationships beyond reporter email
|
||||
|
||||
---
|
||||
|
||||
## 5. Core Concepts
|
||||
|
||||
### 5.1 Item
|
||||
|
||||
The atomic unit. Every bug, feature request, task, improvement, or chore is an **item** with:
|
||||
|
||||
- **Identity:** `id`, `productId`
|
||||
- **Classification:** `type` (`bug` · `feature` · `task` · `improvement` · `chore`), `priority`, `labels`
|
||||
- **State:** `status` (configurable workflow per product, default 5 statuses)
|
||||
- **Provenance:** `source` (`internal` · `user_submitted` · `auto_detected`), `reportedBy`
|
||||
- **Assignment:** `assignee` (user or agent ID)
|
||||
- **Visibility:** `internal` vs `public`
|
||||
- **Rich content:** `description` (markdown), acceptance criteria checklist, attachments
|
||||
- **Relationships:** linked items, sub-tasks, milestone, PR links
|
||||
- **Engagement:** `voteCount`, `commentCount`, watchers
|
||||
- **Metadata:** `targetRelease`, `affectedVersion`, `fixedInVersion`, custom fields, agent `metadata` map
|
||||
- **Audit:** `createdAt`, `updatedAt`, full activity log with actor + before/after
|
||||
|
||||
### 5.2 Comment
|
||||
|
||||
Thread of activity attached to an item. Markdown supported. Mentions trigger notifications.
|
||||
Reactions allowed. Authors can edit within 15 min, admins delete any.
|
||||
|
||||
### 5.3 Vote
|
||||
|
||||
A signal that someone wants this item shipped. Anonymous (public) or attributed (logged-in).
|
||||
Server-deduplicated per `(email, productId)` pair. Cap of N per email per product.
|
||||
|
||||
### 5.4 Milestone
|
||||
|
||||
A named grouping with a target date. Items can belong to one milestone. Release notes auto-generated
|
||||
from closed items in a milestone.
|
||||
|
||||
### 5.5 PR Link
|
||||
|
||||
A reference to a GitHub or Gitea pull request. Updated by webhook on PR lifecycle events.
|
||||
Multiple PRs can be linked to one item. PR status badges shown live on item detail.
|
||||
|
||||
### 5.6 Agent
|
||||
|
||||
A non-human actor with an API key. Identified by `name` and `role`. Subject to per-key rate limits
|
||||
and product scope. Can claim items atomically to prevent races.
|
||||
|
||||
---
|
||||
|
||||
## 6. Success Metrics
|
||||
|
||||
| Metric | Target |
|
||||
| ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------- |
|
||||
| **Items closed per sprint** (per product) | > baseline by Phase 3 ship |
|
||||
| **Time-to-first-response** on public submissions | < 24 h p95 |
|
||||
| **Agent-closed item ratio** | > 30 % by Phase 5 ship |
|
||||
| **Spam submissions** | < 1 % of total (after Phase 1.4 Turnstile) |
|
||||
| **Container health** | 100 % healthy for `tracker-web` + `platform-service` after Phase 1.1 |
|
||||
| **Vitest coverage on `src/lib/`** | ≥ 80 % |
|
||||
| **WCAG 2.1 AA violations** in CI | 0 (after Phase 6.2) |
|
||||
| **UI drift counters** (hardcoded colors, legacy classes, direct `@bytelyst/ui` imports outside adapter) | 0 (after Phase 2.1) |
|
||||
|
||||
---
|
||||
|
||||
## 7. Constraints
|
||||
|
||||
- **Shared infrastructure** — must reuse `platform-service`, `valkey`, Cosmos DB; no new infra stack
|
||||
- **Single VM deployment** — 4-core, 15 GB RAM shared with ~30 other containers; cannot dominate resources
|
||||
- **Corp-network builds** — Gitea CI runner cannot reach public CDNs (fonts, Google APIs); use `@fontsource` + bundled assets
|
||||
- **No external SaaS dependencies** for core flows — Cloudflare Turnstile and PostHog are opt-in; product must function without them
|
||||
- **JWT secret shared** with `platform-service` — no separate auth subsystem
|
||||
|
||||
---
|
||||
|
||||
## 8. Open Questions
|
||||
|
||||
These need PM / stakeholder decisions before the relevant phase starts.
|
||||
|
||||
- **Q-001 (Phase 4.4):** When a CI-failure auto-item lands for `learning_ai_clock`, should it route to `tracker.productId = chronomind`, or to a generic `infrastructure` product? — affects auto-detection rules
|
||||
- **Q-002 (Phase 5.1):** Agent productivity metrics — do we expose per-agent leaderboards to humans, or only aggregate?
|
||||
- **Q-003 (Phase 6.1):** Mobile parity — full feature set on phone, or read-mostly with limited create/edit?
|
||||
- **Q-004 (Phase 3.6):** AI auto-triage — automatic application of LLM suggestions, or always human-confirm?
|
||||
- **Q-005 (Phase 4.2):** Email-to-tracker — use SendGrid / Postmark inbound, or self-host with Postfix?
|
||||
|
||||
---
|
||||
|
||||
## 9. Glossary
|
||||
|
||||
- **`productId`** — Canonical short string for a product (`chronomind`, `notelett`, `flowmonk`, etc.); used as Cosmos partition key and `x-product-id` header
|
||||
- **MEK** — Master Encryption Key for field-level encryption of sensitive data (NoteLett pattern)
|
||||
- **UI drift ratchet** — One-way CI gate that ensures count of legacy patterns (hardcoded colors, direct imports) can only decrease, never increase
|
||||
- **Primitives adapter** — `src/components/ui/Primitives.tsx` that re-exports `@bytelyst/ui` components, providing a single insertion point for design-token theming
|
||||
@ -0,0 +1,242 @@
|
||||
# Tracker Dashboard — Production Readiness Handoff Roadmap
|
||||
|
||||
**Date:** 2026-05-25
|
||||
**Repo:** `learning_ai_common_plat/dashboards/tracker-web`
|
||||
**Status:** Ready for incremental implementation
|
||||
**Source playbook:** Adapted from `learning_ai_notes/docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md` (NoteLett) + `learning_ai_flowmonk/docs/HANDOFF_PLAYBOOK_FROM_NOTELETT.md` (FlowMonk)
|
||||
**Companions:** [`docs/ROADMAP.md`](./ROADMAP.md) · [`docs/PRD.md`](./PRD.md) · [`docs/IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md)
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
Use this document as the **source of truth** for finishing tracker-web end-to-end and making it
|
||||
production-ready. It reconciles the existing roadmaps, current code, and the reusable
|
||||
capabilities in `learning_ai_common_plat/packages/*`.
|
||||
|
||||
**Implementation rule:** complete one checklist item (or one small cluster) at a time, run the
|
||||
stated verification, commit with the repo convention, push, and record the commit hash in
|
||||
[`IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md) before moving on.
|
||||
|
||||
---
|
||||
|
||||
## Current Baseline (May 25, 2026)
|
||||
|
||||
| Area | Current state |
|
||||
| --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| **Web** | Next.js 16 App Router with login, dashboard (overview/items/board), item detail, public roadmap, health endpoint |
|
||||
| **Backend** | Proxied through `platform-service` (port `4003`); no dedicated `tracker-service` repo yet |
|
||||
| **Mobile** | None — Phase 6 deliverable |
|
||||
| **Tests present** | Vitest scaffolding + ~6 unit tests · Playwright scaffold · no `@axe-core/playwright` yet |
|
||||
| **DevOps present** | `Dockerfile` (standalone Next.js) · `docker-compose.yml` · `vercel.json` · no `scripts/docker-prep.sh` yet (must adopt from NoteLett/FlowMonk) |
|
||||
| **Common platform packages used** | `@bytelyst/api-client` · `@bytelyst/dashboard-components` · `@bytelyst/react-auth` · `@bytelyst/logger` · `@bytelyst/telemetry-client` |
|
||||
| **Deployed status** | `tracker-web` container `unhealthy` as of 2026-05-25 (B-001/B-002 root cause: valkey cascade — see Phase 1.A) |
|
||||
|
||||
---
|
||||
|
||||
## Critical Observations
|
||||
|
||||
- **Container `unhealthy`** — `tracker-web` reports unhealthy because `/health` route does not actually probe deps; root cause is the `valkey` cascade that brought down platform-service (B-001 → B-002).
|
||||
- **No UI drift ratchet yet** — direct `@bytelyst/ui` imports likely scattered through pages instead of routed through a `Primitives.tsx` adapter (NoteLett / FlowMonk pattern).
|
||||
- **Docker images not built with `NEXT_PUBLIC_*` baked in** — runtime API URLs hardcoded; same regression NoteLett (`91b8597`) and FlowMonk (`5ecd829`) fixed last week.
|
||||
- **Cosmos partition keys not exercised** in CI — NoteLett added a Cosmos emulator smoke job (`79e936b`) that catches partition-key bugs before prod.
|
||||
- **No `MEK rotation` runbook** — sensitive data fields will need field-level encryption keys; NoteLett established the rotation pattern in `bcad7d3`.
|
||||
- **React + React-DOM not pinned** via `pnpm.overrides` — risk of multi-instance React bugs when transitive deps pull a different version (fixed in NoteLett `a83e60a`, FlowMonk `59bc63f`).
|
||||
- **No live shared-service smoke** — NoteLett's `pnpm run smoke:local` end-to-end verification against live `platform-service` + `extraction-service` + `mcp-server` is the only reliable signal that the cross-service contract holds; adopt the same pattern.
|
||||
|
||||
---
|
||||
|
||||
## Mapping to ROADMAP.md Topic Sections
|
||||
|
||||
This playbook groups work by **execution day**. The master roadmap groups the same work by
|
||||
**topic**. Cross-reference when assigning tickets or marking progress:
|
||||
|
||||
| Day milestone (this doc) | Topic (ROADMAP.md §) |
|
||||
| -------------------------------- | --------------------------------------------------------------------- |
|
||||
| 1.A Container Health Restoration | 1.1 Infrastructure Health |
|
||||
| 1.B Workspace Health | 1.8 Workspace Health |
|
||||
| 1.C Docker Hardening | 1.2 Docker Hardening |
|
||||
| 1.D UI Drift Ratchet | 1.3 UI Drift Ratchet |
|
||||
| 1.E Test Hardening | 1.5 Test Coverage |
|
||||
| 1.F Observability + Security | 1.6 Error Handling & Observability + 1.7 Security + 1.4 Rate Limiting |
|
||||
|
||||
Mark progress under **both** numbering schemes when checking off in
|
||||
[`IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md) — agents reading either doc
|
||||
must be able to find the same checkboxes.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.A — Container Health Restoration (Day 1)
|
||||
|
||||
**Goal:** Get all tracker-related containers healthy. Block on this — everything else cascades.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.A.1** Diagnose `valkey` container `unhealthy` status
|
||||
- Check: `docker logs learning_ai_common_plat-valkey-1 --tail=50`
|
||||
- Likely: stale snapshot, AOF corruption, or memory pressure
|
||||
- Verification: `docker exec learning_ai_common_plat-valkey-1 redis-cli ping` returns `PONG`
|
||||
- [ ] **1.A.2** Restart `platform-service` after valkey is green
|
||||
- `docker restart learning_ai_common_plat-platform-service-1`
|
||||
- Verification: `curl http://localhost:4003/health` returns `200`
|
||||
- [ ] **1.A.3** Implement real `/health` route in tracker-web
|
||||
- File: `src/app/health/page.tsx` and/or `src/app/api/health/route.ts`
|
||||
- Must probe: platform-service `/health`, optional Cosmos ping
|
||||
- Return non-200 if any dep is down
|
||||
- [ ] **1.A.4** Add `restart: unless-stopped` to all tracker services in `docker-compose.yml`
|
||||
- [ ] **1.A.5** Add 8 GB swap to VM
|
||||
- `fallocate -l 8G /swapfile && chmod 600 /swapfile && mkswap /swapfile && swapon /swapfile`
|
||||
- Persist in `/etc/fstab`
|
||||
- [ ] **1.A.6** Limit concurrent Gitea CI runners
|
||||
- File: `act_runner` config or systemd unit
|
||||
- Cap to max 2 concurrent `next build` / `tsc` jobs
|
||||
|
||||
**Verification gate:** All containers healthy. `docker ps | grep tracker` shows `(healthy)`.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.B — Workspace Health (Day 1–2)
|
||||
|
||||
**Goal:** Get `pnpm run verify` passing cleanly with no React-instance warnings or stale package issues.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.B.1** Pin React + React-DOM via `pnpm.overrides` in root `package.json`
|
||||
- Pattern: NoteLett `a83e60a`, FlowMonk `59bc63f`
|
||||
- Block: `"pnpm": { "overrides": { "react": "19.2.4", "react-dom": "19.2.4" } }`
|
||||
- [ ] **1.B.2** Verify `pnpm-workspace.yaml` references canonical `packages/*` paths (no relative `../` indirection issues)
|
||||
- [ ] **1.B.3** Clean `pnpm install` to relink `@bytelyst/*` packages from source, not stale registry tarballs
|
||||
- [ ] **1.B.4** Run `pnpm --filter @bytelyst/tracker-web run typecheck`
|
||||
- [ ] **1.B.5** Run `pnpm --filter @bytelyst/tracker-web run lint`
|
||||
- [ ] **1.B.6** Run `pnpm --filter @bytelyst/tracker-web run test`
|
||||
- [ ] **1.B.7** Run `pnpm --filter @bytelyst/tracker-web run build`
|
||||
- [ ] **1.B.8** Resolve any actionable React-compiler lint warnings (NoteLett `3c4d46f` pattern)
|
||||
- [ ] **1.B.9** Remove dead code surfaced by lint (NoteLett `f4564d7`)
|
||||
- [ ] **1.B.10** Add dedicated `e2e/tsconfig.json` for Playwright specs (NoteLett `7ea2c48`)
|
||||
|
||||
**Verification gate:** Green typecheck + lint + test + build with no warnings about multiple React copies.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.C — Docker Hardening (Day 2–3)
|
||||
|
||||
**Goal:** Ship a self-contained Docker image that builds on the corp network, runs healthy in
|
||||
the existing compose stack, and has correct runtime URLs.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.C.1** Adopt `scripts/docker-prep.sh` from NoteLett / FlowMonk pattern
|
||||
- Packs `@bytelyst/*` tarballs into `.docker-deps/`
|
||||
- Rewrites `web/package.json` refs to `file:../.docker-deps/<tarball>`
|
||||
- Restore mode: `--restore`
|
||||
- [ ] **1.C.2** Add `.docker-deps/` to `.gitignore` (NoteLett `70623d9`)
|
||||
- [ ] **1.C.3** Bake `NEXT_PUBLIC_*` values at build time via Dockerfile `ARG` + `ENV` (drop runtime hardcoding)
|
||||
- Pattern: NoteLett `91b8597`, FlowMonk `5ecd829`
|
||||
- [ ] **1.C.4** Fix Next.js standalone static-chunks 404 — copy `.next/static` to standalone output in Dockerfile stage-1 (NoteLett `131b73c`, FlowMonk `5ecd829`)
|
||||
- [ ] **1.C.5** Change Docker healthcheck to IPv4 `127.0.0.1` (avoids IPv6 resolution edge cases) (FlowMonk `5ecd829`)
|
||||
- [ ] **1.C.6** Add corp-proxy build args: `NODE_TLS_REJECT_UNAUTHORIZED=0` + `NPM_CONFIG_STRICT_SSL=false` in builder stage only (NoteLett `e5221af`)
|
||||
- [ ] **1.C.7** Add `docker-compose.override.yml` wiring backend to sibling `platform-service` (`4003`), `extraction-service` (`4005`), `mcp-server` (`4007`) with shared JWT secret
|
||||
- [ ] **1.C.8** Verify `docker compose up -d --build` brings tracker-web up healthy
|
||||
|
||||
**Verification gate:** `docker ps | grep tracker-web` shows `(healthy)`; `curl http://localhost:3003/health` returns `200`; static chunks load (no 404 in browser DevTools).
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.D — UI Drift Ratchet (Day 3–4)
|
||||
|
||||
**Goal:** Install the one-way CI gate that prevents UI regressions from creeping back in.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.D.1** Add `scripts/ui-drift-audit.sh` counting:
|
||||
- Raw form controls (`<input>`, `<textarea>`, `<button>` outside Primitives)
|
||||
- Legacy global classes (`.surface-card`, `.surface-muted`, `.badge`, `.input-shell` — match NoteLett naming or define tracker-specific set)
|
||||
- Hardcoded color literals (regex for `#[0-9a-fA-F]{3,8}` and `rgba?\(`)
|
||||
- Direct `@bytelyst/ui` imports outside `src/components/ui/Primitives.tsx`
|
||||
- [ ] **1.D.2** Add `scripts/ui-drift-ratchet.sh` that:
|
||||
- Reads previous counts from `.ui-drift-ratchet.json`
|
||||
- Compares to current audit output
|
||||
- Fails CI if any count increased
|
||||
- Updates file when counts decrease
|
||||
- [ ] **1.D.3** Hard-zero categories: legacy globals, hardcoded colors, direct imports (any new occurrence fails CI)
|
||||
- [ ] **1.D.4** Add `audit:release-guards` script combining ratchet + secret scan + token audit
|
||||
- [ ] **1.D.5** Wire `audit:release-guards` into Gitea CI workflow as required check
|
||||
|
||||
**Verification gate:** `pnpm run audit:release-guards` passes; CI blocks PRs that increase drift counters.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.E — Test Hardening (Day 4–6)
|
||||
|
||||
**Goal:** Tests that catch real regressions before prod, not just paper-thin smoke.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.E.1** Vitest unit tests ≥ 80 % on `src/lib/`
|
||||
- Target files: `tracker-client.ts`, `auth-context.tsx`, `utils.ts`
|
||||
- [ ] **1.E.2** Playwright E2E happy path: login → create item → claim → link PR → close
|
||||
- [ ] **1.E.3** Playwright E2E public flow: submit + vote + status check
|
||||
- [ ] **1.E.4** Docker compose E2E test script `scripts/e2e-docker-test.sh` (NoteLett `d5e857d`, FlowMonk `e48e75b`)
|
||||
- [ ] **1.E.5** Seed scripts in `scripts/seed-test-data.sh` for bootstrapping the deployed stack
|
||||
- [ ] **1.E.6** Port-conflict-proof Playwright config — random ports per run (NoteLett `7103660`)
|
||||
- [ ] **1.E.7** Cleanup traps — `trap cleanup EXIT` to prevent orphan items between runs (FlowMonk `033c2c9`)
|
||||
- [ ] **1.E.8** Playwright `BASE_URL` switch — local dev vs deployed container (FlowMonk `e48e75b`)
|
||||
- [ ] **1.E.9** Install + wire `@axe-core/playwright` — `test:e2e:ci` script (FlowMonk `77f9bd9`)
|
||||
- [ ] **1.E.10** API contract tests against platform-service OpenAPI schema
|
||||
- [ ] **1.E.11** Cosmos emulator smoke job in CI exercising partition-key paths (NoteLett `79e936b`)
|
||||
- [ ] **1.E.12** Live shared-service smoke `pnpm run smoke:local` (NoteLett `34cb219` pattern)
|
||||
|
||||
**Verification gate:** `pnpm test`, `pnpm test:e2e:ci`, `pnpm smoke:local` all green; CI's Cosmos emulator job passes.
|
||||
|
||||
---
|
||||
|
||||
## Phase 1.F — Observability + Security (Day 6–7)
|
||||
|
||||
**Goal:** Visible signals when things break, locked-down ingress.
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **1.F.1** Global React error boundary (`src/app/error.tsx` — already scaffolded; needs friendly content + logging)
|
||||
- [ ] **1.F.2** Structured server-side logging via `@bytelyst/logger` on all API routes
|
||||
- [ ] **1.F.3** Forward Next.js server logs to Loki (already deployed in compose stack)
|
||||
- [ ] **1.F.4** Expose `/metrics` Prometheus endpoint; scrape from existing Prometheus container
|
||||
- [ ] **1.F.5** Grafana alert: health-check failure + error rate > 1 %
|
||||
- [ ] **1.F.6** Client-side error tracking via Sentry or `@bytelyst/diagnostics-client`
|
||||
- [ ] **1.F.7** Security headers audit — CSP, HSTS, X-Frame-Options, Referrer-Policy, Permissions-Policy
|
||||
- [ ] **1.F.8** CSRF tokens on all mutating routes (`POST` / `PATCH` / `DELETE`)
|
||||
- [ ] **1.F.9** Audit log (`{ actor, action, field, before, after, timestamp }`) on every item mutation
|
||||
- [ ] **1.F.10** PII scrubbing in logs — emails, names redacted from raw lines
|
||||
- [ ] **1.F.11** MEK rotation runbook `docs/runbooks/MEK_ROTATION.md` (NoteLett `bcad7d3` template)
|
||||
- [ ] **1.F.12** Secret management runbook `docs/runbooks/SECRET_MANAGEMENT.md` (NoteLett `bcad7d3` template)
|
||||
|
||||
**Verification gate:** Grafana dashboard shows tracker-web metrics; trigger a synthetic error and confirm alert fires; secret scan + token audit pass.
|
||||
|
||||
---
|
||||
|
||||
## Exit Criteria for Production Readiness (end of Phase 1)
|
||||
|
||||
All of the following must hold before declaring Phase 1 done:
|
||||
|
||||
- [ ] All tracker-related containers report `(healthy)` for > 24 h continuous
|
||||
- [ ] `pnpm run verify` passes with zero warnings (typecheck, lint, test, build)
|
||||
- [ ] `pnpm run audit:release-guards` passes (secret scan + UI drift ratchet + token audit)
|
||||
- [ ] `pnpm test:e2e:ci` passes against deployed Docker stack with cleanup verified
|
||||
- [ ] `pnpm smoke:local` passes end-to-end against live platform-service
|
||||
- [ ] Cosmos emulator smoke job is green in CI
|
||||
- [ ] Grafana alert wired and tested (synthetic error fires alert within 5 min)
|
||||
- [ ] MEK rotation + secret management runbooks committed
|
||||
- [ ] 8 GB swap configured + Gitea runner concurrency capped
|
||||
- [ ] All `B-001`, `B-002`, `B-005`, `B-009`, `B-010`, `B-019`, `B-020`, `B-021` items closed
|
||||
|
||||
---
|
||||
|
||||
## After Phase 1
|
||||
|
||||
Phases 2–6 unblock once Phase 1 exits cleanly. See [`docs/ROADMAP.md`](./ROADMAP.md) for full
|
||||
detail and [`docs/roadmaps/`](./roadmaps/) for focused per-area plans.
|
||||
|
||||
| Next phase | Theme | Why now |
|
||||
| ------------- | ------------------------------ | ------------------------------------------------ |
|
||||
| Phase 2.1 | UI Primitives migration | Highest ratchet payoff; unblocks Phase 6 theming |
|
||||
| Phase 3.1–3.2 | Agent API auth + core item ops | Highest agent productivity payoff |
|
||||
| Phase 4.4 | Cross-product routing | Required as more products onboard |
|
||||
688
dashboards/tracker-web/docs/ROADMAP.md
Normal file
688
dashboards/tracker-web/docs/ROADMAP.md
Normal file
@ -0,0 +1,688 @@
|
||||
# Tracker Dashboard — Master Roadmap
|
||||
|
||||
**Version:** 3.0
|
||||
**Date:** 2026-05-25
|
||||
**Status:** Phase 0 shipped · Phase 1 in progress
|
||||
**Parent docs:** [`docs/PRD.md`](./PRD.md) · [`docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md) · [`docs/IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md)
|
||||
**Companion roadmaps:** [`docs/roadmaps/`](./roadmaps/)
|
||||
|
||||
> **Living document.** Coding agents, developers, PMs, and public contributors can submit items
|
||||
> via the [public roadmap](https://tracker.bytelyst.com/roadmap) or the [Agent API](#-agent--automation-api).
|
||||
> Deployed at **https://tracker.bytelyst.com**.
|
||||
|
||||
---
|
||||
|
||||
## ⚡ Where to Start (Fresh-Agent Quick Pointer)
|
||||
|
||||
You just opened this repo and want to make progress. Do this in order:
|
||||
|
||||
1. **Read** [`PRD.md`](./PRD.md) (5 min) — what tracker is and why
|
||||
2. **Read** [`roadmaps/01_FOUNDATIONS_AND_DECISIONS.md`](./roadmaps/01_FOUNDATIONS_AND_DECISIONS.md) (5 min) — locked decisions you must not silently change
|
||||
3. **Read** [`PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md) (10 min) — Day-by-day Phase 1 playbook
|
||||
4. **Pick the first unchecked task** in [`IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md) following the dependency chain (1.A → 1.B → 1.C → 1.D → 1.E → 1.F)
|
||||
5. **Do the task**, run its verification command, **commit with format** `<type>(<scope>): <subject>`, **push to origin**, then **paste the commit SHA** into the corresponding `(______)` placeholder in `IMPLEMENTATION_TRACKER.md`
|
||||
6. **Repeat** — one slice, one commit, one tracker update at a time
|
||||
|
||||
> **Do not skip ahead** to Phase 2+ until Phase 1 exit criteria all pass (see [`PRODUCTION_READINESS_HANDOFF_ROADMAP.md` § Exit Criteria](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md#exit-criteria-for-production-readiness-end-of-phase-1)).
|
||||
|
||||
---
|
||||
|
||||
## 1. Purpose
|
||||
|
||||
This is the **master execution tracker** for the tracker-web dashboard. Detailed implementation
|
||||
plans live under [`docs/roadmaps/`](./roadmaps/). Per-slice progress with commit SHAs lives in
|
||||
[`docs/IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md).
|
||||
|
||||
Use this document to:
|
||||
|
||||
- track phase status and dependencies,
|
||||
- understand sequencing across the 6 phases,
|
||||
- assign workstreams to coding agents,
|
||||
- link out to smaller execution roadmaps with focused detail.
|
||||
|
||||
---
|
||||
|
||||
## 2. Status Legend
|
||||
|
||||
Granular per-feature status — adapted from NoteLett to give more nuance than a binary
|
||||
"shipped vs planned" flag.
|
||||
|
||||
| Status | Meaning |
|
||||
| ------------------ | ---------------------------------------------------------------- |
|
||||
| 🟦 **Scaffolded** | Package / route / component exists and boots, no behaviour yet |
|
||||
| 🟨 **Mock-backed** | UI exists but runs on mock or fallback data |
|
||||
| 🟧 **Local-only** | Behaviour lives in client/local state without server persistence |
|
||||
| 🟩 **Integrated** | Wired to the intended backend / platform dependency |
|
||||
| ✅ **Verified** | Exercised by build + typecheck + tests + smoke checks |
|
||||
| ⚠️ **Bug / Gap** | Known issue tracked in [§ Known Bugs](#known-bugs--gaps-) |
|
||||
| 🔗 **Depends** | Blocked by another phase item |
|
||||
| 🤖 | Agent-targeted feature |
|
||||
| 🌐 | Public-facing feature |
|
||||
| 🏢 | Internal / team feature |
|
||||
|
||||
---
|
||||
|
||||
## 3. Confirmed Stack
|
||||
|
||||
- **Web:** Next.js 16 · React 19 · TypeScript 5 · Tailwind CSS 4
|
||||
- **Backend:** `tracker-service` proxied via `platform-service` (Fastify 5, port `4003`)
|
||||
- **Cache / Sessions:** valkey (Redis-compatible)
|
||||
- **Data store:** Azure Cosmos DB (production) · in-memory provider (dev / smoke)
|
||||
- **Shared packages:** `@bytelyst/api-client` · `@bytelyst/dashboard-components` · `@bytelyst/ui` (Phase 2) · `@bytelyst/react-auth` · `@bytelyst/logger` · `@bytelyst/telemetry-client` · `@bytelyst/blob` (Phase 2 attachments)
|
||||
- **Container:** Standalone Next.js build · Caddy reverse proxy at `tracker.bytelyst.com`
|
||||
- **Tests:** Vitest (unit) · Playwright (E2E) · `@axe-core/playwright` (a11y) — Phase 1
|
||||
- **CI:** Gitea Actions + UI drift ratchet (Phase 1)
|
||||
|
||||
---
|
||||
|
||||
## 4. Permissions Matrix
|
||||
|
||||
| Action | Public (no login) | Auth User | PM / Admin | Agent (API key) |
|
||||
| -------------------- | :---------------: | :-------: | :--------: | :-------------: |
|
||||
| View public roadmap | ✅ | ✅ | ✅ | ✅ |
|
||||
| Submit public idea | ✅ | ✅ | ✅ | ✅ |
|
||||
| Vote on public item | ✅ | ✅ | ✅ | ✅ |
|
||||
| View internal items | ❌ | ✅ | ✅ | ✅ (scoped) |
|
||||
| Create internal item | ❌ | ✅ | ✅ | ✅ (write key) |
|
||||
| Edit any item | ❌ | own only | ✅ | ✅ (write key) |
|
||||
| Delete item | ❌ | ❌ | ✅ | ❌ |
|
||||
| Change status | ❌ | ✅ | ✅ | ✅ (write key) |
|
||||
| Claim item | ❌ | ❌ | ❌ | ✅ (write key) |
|
||||
| Link PR to item | ❌ | ✅ | ✅ | ✅ (write key) |
|
||||
| Manage API keys | ❌ | ❌ | ✅ | ❌ |
|
||||
| Configure webhooks | ❌ | ❌ | ✅ | ❌ |
|
||||
| Access analytics | ❌ | ❌ | ✅ | 🔲 (read key) |
|
||||
|
||||
---
|
||||
|
||||
## 5. Master Phase Tracker
|
||||
|
||||
| Phase | Theme | Status | Target |
|
||||
| ----- | -------------------------------------- | -------------- | ---------- |
|
||||
| 0 | Foundation | ✅ Shipped | — |
|
||||
| 1 | Production hardening | 🔄 In progress | 2026-06-14 |
|
||||
| 2 | Rich item details (Linear/Jira parity) | 🔲 Planned | 2026-07-12 |
|
||||
| 3 | Agent & automation API | 🔲 Planned | 2026-07-26 |
|
||||
| 4 | Multi-source intake | 🔲 Planned | 2026-08-09 |
|
||||
| 5 | Analytics & intelligence | 🔲 Planned | 2026-08-30 |
|
||||
| 6 | Mobile, accessibility & i18n | 🔲 Planned | 2026-09-13 |
|
||||
|
||||
---
|
||||
|
||||
## Phase 0 — Foundation (Shipped) ✅
|
||||
|
||||
Everything checked here is already shipped and running at **https://tracker.bytelyst.com**.
|
||||
|
||||
### 0.1 Core Item Management ✅
|
||||
|
||||
- [x] CRUD on tracker items
|
||||
- [x] Item types: `bug` · `feature` · `task`
|
||||
- [x] Statuses: `open` → `in_progress` → `done` → `closed` · `wont_fix`
|
||||
- [x] Priority: `critical` · `high` · `medium` · `low`
|
||||
- [x] Visibility: `internal` vs `public`
|
||||
- [x] Labels, assignee, reporter, target release
|
||||
- [x] Source tracking: `internal` · `user_submitted` · `auto_detected`
|
||||
- [x] Vote count, comment count per item
|
||||
|
||||
### 0.2 Views ✅
|
||||
|
||||
- [x] Dashboard overview with stats by type / status / priority
|
||||
- [x] Items list (search, filter, paginate)
|
||||
- [x] Kanban board (4-column, button-only transitions — see B-003)
|
||||
- [x] Item detail with inline edits
|
||||
|
||||
### 0.3 Public Roadmap ✅ 🌐
|
||||
|
||||
- [x] `/roadmap` — no auth
|
||||
- [x] Board / list view toggle
|
||||
- [x] Submit form (name + email + type + description)
|
||||
- [x] Email-based voting (localStorage — see B-004)
|
||||
- [x] Stats bar + search + filter
|
||||
|
||||
### 0.4 Authentication ✅
|
||||
|
||||
- [x] Email + password via platform-service
|
||||
- [x] MFA · Google OAuth · JWT refresh
|
||||
- [x] Product switcher (multi-product via `x-product-id` header)
|
||||
|
||||
### 0.5 Infrastructure ✅
|
||||
|
||||
- [x] Standalone Next.js Docker build
|
||||
- [x] Caddy reverse proxy (`tracker.bytelyst.com`)
|
||||
- [x] PostHog analytics
|
||||
- [x] Vitest + Playwright scaffolding
|
||||
- [x] ESLint + Prettier + Husky
|
||||
|
||||
---
|
||||
|
||||
## Phase 1 — Production Hardening 🔄
|
||||
|
||||
> **Goal:** Make everything shipped actually reliable in production, applying the lessons from
|
||||
> the NoteLett + FlowMonk production-readiness sprints of 2026-05-22/23.
|
||||
> **Target:** Sprint ending 2026-06-14
|
||||
> **Detailed plan:** [`docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md)
|
||||
|
||||
> **Two numbering schemes — same content.** This file groups Phase 1 by **topic** (1.1–1.8).
|
||||
> The handoff playbook groups the same work by **execution day** (1.A–1.F). Mapping:
|
||||
|
||||
| Topic (this doc) | Day milestone (handoff) |
|
||||
| ---------------------------------- | ----------------------------------------------- |
|
||||
| 1.1 Infrastructure Health | 1.A Container Health Restoration |
|
||||
| 1.2 Docker Hardening | 1.C Docker Hardening |
|
||||
| 1.3 UI Drift Ratchet | 1.D UI Drift Ratchet |
|
||||
| 1.4 Rate Limiting & Spam | _(in 1.F security cluster — execute alongside)_ |
|
||||
| 1.5 Test Coverage | 1.E Test Hardening |
|
||||
| 1.6 Error Handling & Observability | 1.F Observability + Security (obs half) |
|
||||
| 1.7 Security | 1.F Observability + Security (sec half) |
|
||||
| 1.8 Workspace Health | 1.B Workspace Health |
|
||||
|
||||
> **Execution order** (sequential): 1.A → 1.B → 1.C → 1.D → 1.E → 1.F. Use this doc for
|
||||
> _scope and acceptance_; use the handoff playbook for _step-by-step day-by-day execution_.
|
||||
|
||||
### 1.1 Infrastructure Health ⚠️
|
||||
|
||||
- [ ] **Fix valkey (Redis) container health** — currently `unhealthy`; root cause of most downstream container failures
|
||||
- [ ] **Fix platform-service health check** — reports `unhealthy` due to valkey connectivity; depends on valkey fix
|
||||
- [ ] **Fix tracker-web `/health` route** — must probe DB + platform-service reachability, not just return HTTP 200
|
||||
- [ ] **Add 8 GB swap on VM** — currently 0 B; build spikes OOM-kill running services
|
||||
- [ ] **Limit concurrent Gitea CI runner jobs** — cap to 1–2 parallel `next build` + `tsc` jobs (4-core VM cannot survive 4+ parallel builds)
|
||||
- [ ] **Ensure `restart: unless-stopped`** on all docker-compose services
|
||||
|
||||
### 1.2 Docker Hardening — From NoteLett+FlowMonk learnings 🆕
|
||||
|
||||
- [ ] **Bake `NEXT_PUBLIC_*` values at build time** — drop hardcoded `api.bytelyst.com`; thread through `docker-compose.yml` build args (NoteLett `91b8597`, FlowMonk `5ecd829`)
|
||||
- [ ] **Fix Next.js standalone static-chunks 404** in Docker — copy `.next/static` to standalone output (NoteLett `131b73c`, FlowMonk `5ecd829`)
|
||||
- [ ] **IPv4 healthcheck** — Docker healthcheck must use `127.0.0.1`, not `localhost` (avoids IPv6 resolution issues) (FlowMonk `5ecd829`)
|
||||
- [ ] **Corp-proxy build support** — `NODE_TLS_REJECT_UNAUTHORIZED=0` + `NPM_CONFIG_STRICT_SSL=false` in build stage only (NoteLett `e5221af`)
|
||||
- [ ] **Adopt tarball-based Docker builds** via `scripts/docker-prep.sh` (already used by NoteLett, FlowMonk, ChronoMind)
|
||||
- [ ] **Local docker-compose.override.yml** wires backend to sibling `platform-service`, `extraction-service`, `mcp-server` with shared JWT secret
|
||||
|
||||
### 1.3 UI Drift Ratchet (CI Gate) — From NoteLett UI8 🆕
|
||||
|
||||
- [ ] **Add `scripts/ui-drift-audit.sh`** — counts: raw form controls, legacy global classes, hardcoded color literals, direct `@bytelyst/ui` imports outside adapter
|
||||
- [ ] **Add `scripts/ui-drift-ratchet.sh`** — one-way CI gate: numbers can only decrease, never increase
|
||||
- [ ] **Hard-zero enforcement** for: legacy global classes, hardcoded color literals, direct `@bytelyst/ui` imports outside adapter
|
||||
- [ ] **Wire into Gitea Actions CI** — `audit:release-guards` job runs ratchet + secret scan + token audit (NoteLett pattern)
|
||||
|
||||
### 1.4 Rate Limiting & Spam Protection 🌐
|
||||
|
||||
- [ ] **Rate-limit `POST /public/submit`** — minimum 10 req/min per IP
|
||||
- [ ] **Cloudflare Turnstile or hCaptcha** on public submission form
|
||||
- [ ] **Server-side vote deduplication per email** (current dedup is localStorage-only)
|
||||
- [ ] **Sanitise all public inputs server-side** — XSS / injection guard
|
||||
|
||||
### 1.5 Test Coverage
|
||||
|
||||
- [ ] **Vitest unit tests ≥ 80 % on `src/lib/`**
|
||||
- [ ] **Playwright E2E happy path:** login → create item → claim → link PR → close
|
||||
- [ ] **Playwright E2E public flow:** submit + vote + status check
|
||||
- [ ] **Docker compose E2E test script** — `scripts/e2e-docker-test.sh` that exercises full CRUD against live containers (NoteLett `d5e857d`, FlowMonk `e48e75b`)
|
||||
- [ ] **Seed scripts** — bootstrap test data into the deployed stack
|
||||
- [ ] **Port-conflict-proof E2E** — Playwright picks random ports per test run (NoteLett `7103660`)
|
||||
- [ ] **E2E cleanup traps** — `trap cleanup EXIT` to prevent orphan items/comments/votes accumulating between runs (FlowMonk `033c2c9`)
|
||||
- [ ] **Playwright deployed-stack support** — `BASE_URL` switch between local dev and deployed container (FlowMonk `e48e75b`)
|
||||
- [ ] **`@axe-core/playwright` accessibility tests** — install + wire into Playwright config; `test:e2e:ci` script (FlowMonk `77f9bd9`)
|
||||
- [ ] **API contract tests** — verify proxy routes match platform-service OpenAPI schema
|
||||
- [ ] **Cosmos emulator smoke job** in CI — exercise partition-key paths (NoteLett `79e936b`)
|
||||
- [ ] **Live shared-service smoke** — `pnpm run smoke:local` end-to-end against live platform-service / extraction-service / mcp-server (NoteLett pattern verified `34cb219`)
|
||||
|
||||
### 1.6 Error Handling & Observability
|
||||
|
||||
- [ ] **Global React error boundary** with friendly fallback — no raw stack traces leaked
|
||||
- [ ] **Structured server-side logging** via `@bytelyst/logger` on all Next.js API routes
|
||||
- [ ] **Loki log aggregation** — forward Next.js logs to deployed Loki instance
|
||||
- [ ] **Prometheus `/metrics` endpoint** — request count, latency p50/p95, error rate
|
||||
- [ ] **Grafana alert** on health-check failure and error rate > 1 %
|
||||
- [ ] **Client-side error tracking** via Sentry or `@bytelyst/diagnostics-client`
|
||||
|
||||
### 1.7 Security
|
||||
|
||||
- [ ] **Security headers audit** — CSP, HSTS, X-Frame-Options, Referrer-Policy, Permissions-Policy
|
||||
- [ ] **CSRF tokens on all mutating routes**
|
||||
- [ ] **API key rotation mechanism** — prerequisite for Phase 3 agent keys
|
||||
- [ ] **Audit log on every item mutation** — `{ actor, action, field, before, after, timestamp }` append-only log
|
||||
- [ ] **PII scrubbing in logs** — emails and names must not appear in plaintext log lines
|
||||
- [ ] **MEK rotation runbook** — operator playbook for rotating the master encryption key (NoteLett `bcad7d3`)
|
||||
- [ ] **Secret management runbook** — KeyVault → env → process; rotation procedures (NoteLett `bcad7d3`)
|
||||
|
||||
### 1.8 Workspace Health 🆕
|
||||
|
||||
- [ ] **Pin React + React-DOM to single version** via `pnpm.overrides` — prevents multi-instance React bugs (NoteLett `a83e60a`, FlowMonk `59bc63f`)
|
||||
- [ ] **Canonicalize `../learning_ai_common_plat` path** in `pnpm-workspace.yaml` + `.pnpmfile.cjs` (NoteLett `b2d824c`)
|
||||
- [ ] **Resolve React-compiler lint advisories** in tracker-web pages (NoteLett `3c4d46f`)
|
||||
- [ ] **Remove dead code** surfaced by lint pass (NoteLett `f4564d7`)
|
||||
- [ ] **Dedicated Playwright `tsconfig.json`** for E2E specs (NoteLett `7ea2c48`)
|
||||
|
||||
---
|
||||
|
||||
## Phase 2 — Rich Item Details (Linear / Jira parity) 🔲
|
||||
|
||||
> **Goal:** Items rich enough for developers, PMs, and agents to fully spec, reproduce, and track
|
||||
> work without leaving the tool.
|
||||
> **Target:** Sprint ending 2026-07-12
|
||||
> **Detailed plan:** `docs/roadmaps/03_RICH_ITEMS_ROADMAP.md` _(to be created when Phase 2 begins)_
|
||||
|
||||
### 2.1 UI Primitives Migration to `@bytelyst/ui` 🆕
|
||||
|
||||
> Adapted from NoteLett UI5/UI6/UI7/UI8 migration (May 22–23, 2026, `9c65899` … `0c982de`)
|
||||
> and FlowMonk equivalent (`7083cf0` … `e0de740`).
|
||||
|
||||
- [ ] **Add Primitives adapter** — `src/components/ui/Primitives.tsx` re-exports `@bytelyst/ui` components (matches NoteLett + FlowMonk + ChronoMind pattern)
|
||||
- [ ] **Migrate auth pages + create-item modal** (UI5)
|
||||
- [ ] **Migrate settings page + all modals** (UI5)
|
||||
- [ ] **Migrate dashboard, items list, board pages** (UI6)
|
||||
- [ ] **Migrate item detail, comments, public roadmap pages** (UI7)
|
||||
- [ ] **Remove legacy global classes** — `.surface-card`, `.surface-muted`, `.badge`, `.input-shell` deleted from `globals.css` (UI8)
|
||||
- [ ] **Replace all hardcoded color literals** with `--tk-*` CSS variables (tracker design tokens via `@bytelyst/design-tokens`)
|
||||
- [ ] **Lock direct `@bytelyst/ui` imports outside adapter** in CI via ratchet (Phase 1.3)
|
||||
- [ ] **Tighten audit regex** + lock CI gate to hard-zero for legacy globals (UI8)
|
||||
|
||||
### 2.2 Expanded Item Types & Statuses
|
||||
|
||||
- [ ] **New types:** `improvement` · `chore` (fixes B-017)
|
||||
- [ ] **Custom status workflows** — products define extra statuses beyond the default 5 (e.g., `needs_review`, `blocked`, `in_qa`)
|
||||
- [ ] **`wont_fix` reason** — mandatory free-text when closing as wont_fix
|
||||
- [ ] **Reopen flow** — explicit action with required comment; audit-logged
|
||||
|
||||
### 2.3 Rich Text & Markdown
|
||||
|
||||
- [ ] **Markdown description editor** — live preview, toolbar, keyboard shortcuts
|
||||
- [ ] **Acceptance criteria checklist** — `- [ ]` items inside description, individually checkable
|
||||
- [ ] **Steps to reproduce / Expected vs Actual** blocks (bug type only)
|
||||
- [ ] **Code blocks** — Shiki syntax highlighting in descriptions + comments
|
||||
- [ ] **`@username` mentions** in comments → in-app + email notification
|
||||
- [ ] **`source: auto_detected` UI badge** (fixes B-014)
|
||||
|
||||
### 2.4 Attachments & Media
|
||||
|
||||
- [ ] **File uploads via `@bytelyst/blob`** — screenshots, logs, designs up to 25 MB
|
||||
- [ ] **Clipboard paste → auto-upload → inline embed**
|
||||
- [ ] **Video embeds** — Loom / YouTube paste detection
|
||||
- [ ] **Attachment list** on item detail with filename, size, uploader, download, delete
|
||||
|
||||
### 2.5 Relationships & Linking
|
||||
|
||||
- [ ] **Linked items:** `blocks` / `is blocked by` / `relates to` / `duplicate of` — bidirectional
|
||||
- [ ] **Sub-tasks** — child items with progress chip on parent (`3/5 done`)
|
||||
- [ ] **Milestones** — named groupings with target date
|
||||
- [ ] **PR / commit links** — GitHub or Gitea URL with live status badge 🔗 _(prerequisite for Phase 3 webhook auto-linking)_
|
||||
- [ ] **Branch name chip** — auto-suggest `feat/tracker-{id}-{slug}` with copy button
|
||||
- [ ] **External links** — Notion, Figma, Confluence, CI run URLs
|
||||
|
||||
### 2.6 Metadata & Custom Fields
|
||||
|
||||
- [ ] **Effort estimate** — Fibonacci points (1 2 3 5 8 13 21) or T-shirt (XS S M L XL)
|
||||
- [ ] **Time tracking** — log hours; show logged vs estimate; sprint burndown
|
||||
- [ ] **Due date** — overdue items highlighted red
|
||||
- [ ] **Environment** — `production` · `staging` · `dev` · `all`
|
||||
- [ ] **Affected version / Fixed in version**
|
||||
- [ ] **Watchers / stakeholders** — subscribe to all updates
|
||||
- [ ] **Custom fields per product** — text/number/date/select; stored in `metadata` map
|
||||
- [ ] **Colour-coded labels** — hex colour per label rendered as chips
|
||||
- [ ] **`metadata` map for agent data** — `{ testRunId, commitSha, ciJobUrl }` without polluting core fields
|
||||
|
||||
### 2.7 Activity, History & Notifications
|
||||
|
||||
- [ ] **Full activity log** per item — every field change, status transition, comment, attachment, PR link recorded with actor + timestamp
|
||||
- [ ] **Comment reactions** — 👍 ✅ 🔥 💡 ❓
|
||||
- [ ] **Comment edit + delete** — authors edit within 15 min, admins delete any
|
||||
- [ ] **Item history diff view** — before/after for description edits
|
||||
- [ ] **Notification preferences** — per user, per item: all · mentions · status only · none
|
||||
- [ ] **In-app notification centre** — bell icon, unread count, mark all read
|
||||
|
||||
### 2.8 Real-Time Updates
|
||||
|
||||
- [ ] **Server-Sent Events (SSE) on item detail** — status, comments, activity log refresh live
|
||||
- [ ] **Kanban live updates** — card moves and new cards appear in real-time for all viewers
|
||||
- [ ] **Optimistic UI** — instant client-side apply with toast rollback on server error
|
||||
|
||||
### 2.9 Views, Filters & Search
|
||||
|
||||
- [ ] **Kanban drag-and-drop** (fixes B-003)
|
||||
- [ ] **Saved filter views** — name, save, pin to sidebar
|
||||
- [ ] **Bulk actions** — multi-select → status / assign / label / milestone / delete
|
||||
- [ ] **Group by** — assignee · label · milestone · priority · type
|
||||
- [ ] **Timeline / Gantt view**
|
||||
- [ ] **My items view** — assigned to me · reported by me · watching · mentioned
|
||||
- [ ] **Global search Ctrl+K** — full-text across all items (per-product for members, all for admin)
|
||||
- [ ] **Export** — CSV + JSON of any filtered view, all metadata included
|
||||
|
||||
---
|
||||
|
||||
## Phase 3 — Agent & Automation API 🤖
|
||||
|
||||
> **Goal:** First-class REST API for coding agents (Claude Code, Codex, Copilot Workspace) to
|
||||
> consume, update, and create tracker items — closing the loop between AI-assisted development
|
||||
> and project management.
|
||||
> **Target:** Sprint ending 2026-07-26
|
||||
> **Detailed plan:** `docs/roadmaps/04_AGENT_API_ROADMAP.md` _(to be created when Phase 3 begins)_
|
||||
> **Dependency:** Phase 2 acceptance-criteria checklist + PR link fields must ship first.
|
||||
|
||||
### 3.1 Agent Authentication
|
||||
|
||||
- [ ] **API key management UI** (admin) — generate, revoke, rotate; set name + role + product scope + IP allowlist
|
||||
- [ ] **Agent roles:** `agent-read` · `agent-write` · `ci` · `webhook`
|
||||
- [ ] **Key usage log** — last-used, request count, error count
|
||||
- [ ] **Per-key rate limits** — configurable RPM; 429 with `Retry-After`
|
||||
- [ ] **Key expiry** — optional expiry date, auto-revocation
|
||||
- [ ] **API versioning** — `/api/agent/v1/`; breaking changes bump version; 6-month support window with `Deprecation` + `Sunset` headers
|
||||
|
||||
### 3.2 Agent Item Operations
|
||||
|
||||
All routes: `Authorization: Bearer <agent-key>` + `X-Product-Id: {productId}`.
|
||||
|
||||
**Pull & Claim**
|
||||
|
||||
- [ ] **`GET /api/agent/v1/items`** — list with filters; cursor pagination; `since` for incremental sync
|
||||
- [ ] **`PATCH /api/agent/v1/items/:id/claim`** — atomic assign + transition to `in_progress`; `409 Conflict` if already claimed
|
||||
|
||||
**Create & Update**
|
||||
|
||||
- [ ] **`POST /api/agent/v1/items`** — create item; `source` auto-set to `auto_detected`; supports `metadata` map for `{ testRun, commitSha, ciJobUrl }`
|
||||
- [ ] **`PATCH /api/agent/v1/items/:id/status`** — update with mandatory `reason` and optional `evidenceUrl`
|
||||
- [ ] **`PATCH /api/agent/v1/items/:id/checklist`** — check/uncheck acceptance-criteria items 🔗 _(requires Phase 2.3)_
|
||||
- [ ] **`POST /api/agent/v1/items/:id/comments`** — post implementation notes, test results, error logs
|
||||
|
||||
**PR Integration**
|
||||
|
||||
- [ ] **`PATCH /api/agent/v1/items/:id/pr`** — link or update PR; callable repeatedly as `prStatus` evolves (`open` → `merged` / `closed` / `draft`) and `ciStatus` (`pending` / `success` / `failure` / `cancelled`)
|
||||
|
||||
**Context**
|
||||
|
||||
- [ ] **`GET /api/agent/v1/items/:id/context`** — full item as LLM-ready markdown; ideal for system-prompt injection
|
||||
|
||||
### 3.3 Backend Domain Events 🆕
|
||||
|
||||
> Pattern from NoteLett `task.created` + `workspace.created` event emission (`1258d49`).
|
||||
> Enables outbound webhooks, replay, and event sourcing.
|
||||
|
||||
- [ ] **Emit domain events** from `tracker-service` on every mutation: `item.created` · `item.updated` · `item.status_changed` · `comment.added` · `pr.linked` · `pr.status_changed` · `checklist.checked` · `vote.added` · `item.closed`
|
||||
- [ ] **Event log table** — append-only `events` collection in Cosmos with partition key `productId`
|
||||
- [ ] **Event replay endpoint** — `POST /api/admin/events/replay?since=...` — admin can replay events to recover state or hydrate new subscribers
|
||||
- [ ] **Replay preview mode** — dry-run that shows what would happen without applying changes (pattern from FlowMonk runtime `8176395`)
|
||||
|
||||
### 3.4 Inbound Webhooks
|
||||
|
||||
- [ ] **GitHub webhook receiver** — `POST /api/webhooks/github`
|
||||
- PR opened → auto-link to item if branch matches `tracker-{id}` or `feat/tracker-{id}-*`; status → `in_progress`
|
||||
- PR merged → status → `done`; post commit SHA + PR URL as comment
|
||||
- PR closed without merge → comment noting closure; status unchanged
|
||||
- CI check failed → post failure summary + job URL as comment
|
||||
- [ ] **Gitea webhook receiver** — `POST /api/webhooks/gitea` (identical handling targeting `localhost:3300`)
|
||||
- [ ] **HMAC-SHA256 signature verification** — reject unsigned requests
|
||||
- [ ] **Webhook event log** — last 100 inbound events per product; each replayable
|
||||
|
||||
### 3.5 Outbound Webhooks
|
||||
|
||||
- [ ] **Configuration UI** — register target URLs per product; choose event types
|
||||
- [ ] **Retry with exponential backoff** — up to 5 retries over 24 h
|
||||
- [ ] **Delivery log UI** — timestamp, target, event, HTTP status, duration, response snippet
|
||||
- [ ] **Built-in Slack integration** — formatted item cards on configurable events
|
||||
|
||||
### 3.6 Agent SDK & Tooling
|
||||
|
||||
- [ ] **`@bytelyst/tracker-client` npm package** — typed Node.js client; auto-pagination, retry, rate-limit backoff
|
||||
- [ ] **Claude Code hook template** — `PostToolUse` hook that files a tracker bug on test failure
|
||||
- [ ] **CI integration guide** — GitHub Actions + Gitea Actions example steps
|
||||
- [ ] **OpenAPI spec** — auto-generated; browsable at `/api-docs`
|
||||
|
||||
### 3.7 AI-Assisted Triage
|
||||
|
||||
- [ ] **Auto-classify new submissions** — LLM suggests type + priority + labels (human confirms)
|
||||
- [ ] **Duplicate detection** — embedding similarity against open items; surface "Possible duplicate of #N" if > 0.85
|
||||
- [ ] **Auto-assign rules** — label-based routing table editable by PM
|
||||
- [ ] **Sentiment analysis** on public submissions — flag urgent/angry for fast-lane triage
|
||||
- [ ] **Auto-generate acceptance criteria** 🔗 _(requires Phase 2.3 checklist)_ — LLM suggests starter `- [ ]` list for `feature` + `improvement` items
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Multi-Source Intake 🌐🏢
|
||||
|
||||
> **Goal:** Every stakeholder — public users, internal team, developers, agents — has a
|
||||
> frictionless native path to submit and track items.
|
||||
> **Target:** Sprint ending 2026-08-09
|
||||
> **Detailed plan:** `docs/roadmaps/05_INTAKE_ROADMAP.md` _(to be created when Phase 4 begins)_
|
||||
|
||||
### 4.1 Public Submission Enhancements 🌐
|
||||
|
||||
- [ ] **Optional public account** — email-only signup; track your own submissions; no internal access
|
||||
- [ ] **Submission status page** — `/submissions/{token}` without login; token emailed on submit
|
||||
- [ ] **Email notifications** to submitters on status changes
|
||||
- [ ] **Public changelog** at `/changelog` — auto-generated from `done` + `visibility: public` items grouped by milestone
|
||||
- [ ] **Vote cap per email per product** — max 5; server-enforced (proper fix for B-004)
|
||||
|
||||
### 4.2 Internal Team Intake 🏢
|
||||
|
||||
- [ ] **Quick-capture widget** — floating "Report issue" button embeddable in any bytelyst dashboard via `<script>` tag
|
||||
- [ ] **Browser extension** — one-click bug capture with auto-screenshot + URL
|
||||
- [ ] **Email-to-tracker** — `tracker+{productId}@bytelyst.com` → creates item; reply threading = comments
|
||||
- [ ] **Slack `/tracker` slash command** — submit / list from Slack
|
||||
- [ ] **Microsoft Teams bot** — equivalent for Teams
|
||||
|
||||
### 4.3 Developer Intake 🏢
|
||||
|
||||
- [ ] **GitHub Issues bidirectional sync** — issues ↔ items; status ↔ labels
|
||||
- [ ] **Gitea Issues bidirectional sync** — same for `localhost:3300`
|
||||
- [ ] **`npx @bytelyst/tracker` CLI** — create / list / show from terminal
|
||||
- [ ] **VS Code extension** — sidebar panel for assigned items, file bugs from editor
|
||||
- [ ] **CI test-failure auto-item** — Vitest / Jest / Playwright reporter plugin attaches full output
|
||||
|
||||
### 4.4 Cross-Product Item Routing 🆕
|
||||
|
||||
> NoteLett, FlowMonk, ChronoMind, NomGap, JarvisJr, LysnrAI, LocalMemGPT, MindLyst, Notelett,
|
||||
> InvtTrdg all use this tracker. Items from CI failures or public submissions may need to be
|
||||
> routed to the right product.
|
||||
|
||||
- [ ] **Product auto-detection** — infer `productId` from PR branch name, CI job URL, or submitter email domain
|
||||
- [ ] **Re-route action** — admin can move item between products with audit trail
|
||||
- [ ] **Cross-product duplicate detection** — flag when similar items exist in other products
|
||||
|
||||
### 4.5 Import Wizard 🆕
|
||||
|
||||
- [ ] **CSV import from Jira / Linear / GitHub Issues** with field-mapping UI — avoids cold-start when migrating
|
||||
|
||||
### 4.6 PM / Stakeholder Features 🏢
|
||||
|
||||
- [ ] **Roadmap presentation mode** — full-screen, grouped by milestone, shareable read-only link
|
||||
- [ ] **Sprint planning board** — drag items into sprints; velocity chart
|
||||
- [ ] **Release notes generator** — from `done` + `public` items in a milestone → draft markdown
|
||||
- [ ] **Weekly digest email** — per-product opened/closed/blocked + top-voted ideas; Monday delivery
|
||||
|
||||
---
|
||||
|
||||
## Phase 5 — Analytics & Intelligence 🔲
|
||||
|
||||
> **Target:** Sprint ending 2026-08-30
|
||||
|
||||
### 5.1 Item Analytics
|
||||
|
||||
- [ ] **Cycle time** — time in each status per item; p50/p95/p99 across all items and per-label
|
||||
- [ ] **Throughput chart** — items closed per week over rolling 12 weeks
|
||||
- [ ] **Bug burn-down** — open bug count over time with configurable target line
|
||||
- [ ] **Feature request leaderboard** — ranked by votes; trending velocity this week vs all-time
|
||||
- [ ] **Agent productivity dashboard** — claimed / closed / abandoned per agent; PR merge rate; avg cycle time vs human
|
||||
|
||||
### 5.2 SLA & Alerting
|
||||
|
||||
- [ ] **SLA rules** — configurable per priority: `critical` > 4 h, `high` > 24 h, `medium` > 7 days
|
||||
- [ ] **SLA breach alert** — assignee + PM via in-app + email + webhook
|
||||
- [ ] **Stale item detector** — no activity for N days → auto-ping assignee
|
||||
- [ ] **Blocked escalation** — > 3 days blocked → escalate to team lead
|
||||
|
||||
### 5.3 Reporting
|
||||
|
||||
- [ ] **CSV / PDF export** of any filtered view including custom fields
|
||||
- [ ] **Scheduled email reports** — daily / weekly / monthly cadence per product
|
||||
- [ ] **Embeddable public status widget** — `<script>` snippet
|
||||
- [ ] **Data retention / archival** — archive items > N months; searchable but excluded from live views
|
||||
|
||||
---
|
||||
|
||||
## Phase 6 — Mobile, Accessibility & i18n 🔲
|
||||
|
||||
> **Target:** Sprint ending 2026-09-13
|
||||
|
||||
### 6.1 Responsive & Mobile
|
||||
|
||||
- [ ] **Fully responsive layout** — all views usable on < 768 px
|
||||
- [ ] **Touch-friendly Kanban** — drag-and-drop via touch events 🔗 _(requires Phase 2.9 Kanban)_
|
||||
- [ ] **PWA** — installable; offline read-only access to cached items
|
||||
- [ ] **Web push notifications** — opt-in browser push for item updates / mentions / SLA breaches
|
||||
|
||||
### 6.2 Accessibility — `@axe-core/playwright` in CI 🆕
|
||||
|
||||
> Pattern from FlowMonk `77f9bd9`.
|
||||
|
||||
- [ ] **WCAG 2.1 AA compliance audit** — full keyboard nav, ARIA roles, contrast ≥ 4.5 : 1
|
||||
- [ ] **Screen reader tested** — VoiceOver (Safari/macOS) + NVDA (Chrome/Windows)
|
||||
- [ ] **Focus management** — modal/sheet close returns focus to trigger
|
||||
- [ ] **Reduced motion** — all animations respect `prefers-reduced-motion`
|
||||
- [ ] **`@axe-core/playwright` runs in CI** on every PR — fails on any new violation
|
||||
|
||||
### 6.3 Theming & Dark Mode 🆕
|
||||
|
||||
> Adapted from FlowMonk theme/colors.ts centralization (`81d699c`).
|
||||
|
||||
- [ ] **Centralised palette** — single `theme/colors.ts` (or `--tk-*` token set) source of truth
|
||||
- [ ] **Zero hardcoded hex** in components — enforced by UI drift ratchet (Phase 1.3)
|
||||
- [ ] **Dark mode** — system-preference aware; manual toggle in user settings
|
||||
- [ ] **High-contrast theme** for accessibility
|
||||
|
||||
### 6.4 Internationalisation
|
||||
|
||||
- [ ] **i18n scaffold** — `next-intl` or equivalent; English baseline
|
||||
- [ ] **RTL support** — Arabic + Hebrew layout flip
|
||||
- [ ] **Locale-aware date formatting** in activity log and timelines
|
||||
|
||||
### 6.5 Native Apps (Stretch)
|
||||
|
||||
- [ ] **React Native wrapper** — iOS + Android with push notifications and offline queue
|
||||
|
||||
---
|
||||
|
||||
## Known Bugs & Gaps ⚠️
|
||||
|
||||
Confirmed issues ordered by severity. File new bugs via the [public roadmap](https://tracker.bytelyst.com/roadmap).
|
||||
|
||||
| ID | Severity | Description | Affects | Phase |
|
||||
| ----- | ----------- | ------------------------------------------------------------------------------------------ | ------------ | ----- |
|
||||
| B-001 | 🔴 Critical | `valkey` (Redis) container `unhealthy` — root cause of most downstream failures | All services | 1.1 |
|
||||
| B-002 | 🔴 Critical | `platform-service` container `unhealthy` — caused by B-001 | All apps | 1.1 |
|
||||
| B-003 | 🟠 High | Kanban has no drag-and-drop — status changes via buttons only | Board view | 2.9 |
|
||||
| B-004 | 🟠 High | Vote deduplication is localStorage-only — server has no per-email enforcement | Roadmap | 4.1 |
|
||||
| B-005 | 🟠 High | No rate limiting on `POST /public/submit` — open to bot spam | Roadmap | 1.4 |
|
||||
| B-006 | 🟡 Medium | Description is plain text — no markdown rendering in detail view | Item detail | 2.3 |
|
||||
| B-007 | 🟡 Medium | Comment edit/delete not implemented — all comments are permanent | Comments | 2.7 |
|
||||
| B-008 | 🟡 Medium | No `@mention` support in comments — no notifications | Comments | 2.7 |
|
||||
| B-009 | 🟡 Medium | `/health` route returns 200 without checking dependencies | Infra | 1.1 |
|
||||
| B-010 | 🟡 Medium | No audit log — no record of field changes with actor + timestamp | Item detail | 1.7 |
|
||||
| B-011 | 🟡 Medium | No real-time updates — item detail requires manual refresh for new activity | Item detail | 2.8 |
|
||||
| B-012 | 🟢 Low | Kanban scroll position lost on page refresh | Board view | 2.9 |
|
||||
| B-013 | 🟢 Low | Product switcher selection lost on hard refresh | Nav | 2.9 |
|
||||
| B-014 | 🟢 Low | `source: auto_detected` items have no UI badge to distinguish them | Items list | 2.3 |
|
||||
| B-015 | 🟢 Low | Item detail has no loading skeleton — blank flash before data loads | Item detail | 2.3 |
|
||||
| B-016 | 🟢 Low | Public roadmap stats don't refresh after submitting a new idea | Roadmap | 4.1 |
|
||||
| B-017 | 🟢 Low | No `improvement` or `chore` item types — everything shoehorned into bug/feature/task | Item create | 2.2 |
|
||||
| B-018 | 🟢 Low | No global search across items — only per-page search bar | Items list | 2.9 |
|
||||
| B-019 | 🟡 Medium | Hardcoded color literals present in components — should be `--tk-*` tokens | UI | 2.1 |
|
||||
| B-020 | 🟡 Medium | Direct `@bytelyst/ui` imports outside an adapter — should route through `Primitives.tsx` | UI | 2.1 |
|
||||
| B-021 | 🟡 Medium | `NEXT_PUBLIC_*` env vars hardcoded to `api.bytelyst.com` at runtime — should bake at build | Docker | 1.2 |
|
||||
| B-022 | 🟡 Medium | No mobile/responsive layout — desktop-only | Mobile | 6.1 |
|
||||
| B-023 | 🟢 Low | No i18n — English only | All | 6.4 |
|
||||
|
||||
---
|
||||
|
||||
## Submission Guide — For All Audiences
|
||||
|
||||
### 🌐 Public Users
|
||||
|
||||
Visit **[https://tracker.bytelyst.com/roadmap](https://tracker.bytelyst.com/roadmap)** → click
|
||||
**"Submit an idea"**. No account required. After submitting you receive a link to track your
|
||||
item without logging in.
|
||||
|
||||
### 🏢 Company Team / PMs / Developers
|
||||
|
||||
Log in at **[https://tracker.bytelyst.com/login](https://tracker.bytelyst.com/login)**. Go to
|
||||
**Dashboard → Items → Create**. Set `visibility: internal` to keep it off the public roadmap.
|
||||
|
||||
### 🤖 Coding Agents (REST API)
|
||||
|
||||
> Agent API keys are managed by admin at `/dashboard/settings/api-keys` _(ships Phase 3)_.
|
||||
|
||||
```http
|
||||
### 1. Pull items labelled for agent work
|
||||
GET /api/agent/v1/items?status=open&label=agent-ready&limit=10&since=2026-05-20T00:00:00Z
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
X-Product-Id: chronomind
|
||||
|
||||
### 2. Claim atomically
|
||||
PATCH /api/agent/v1/items/{id}/claim
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
|
||||
### 3. Link your PR
|
||||
PATCH /api/agent/v1/items/{id}/pr
|
||||
Content-Type: application/json
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
|
||||
{
|
||||
"prUrl": "https://github.com/org/repo/pull/42",
|
||||
"prNumber": 42,
|
||||
"prTitle": "fix: null-check avatar in UserCard",
|
||||
"prStatus": "open",
|
||||
"branch": "fix/tracker-{id}-null-avatar",
|
||||
"commitSha": "abc123def456",
|
||||
"ciStatus": "pending"
|
||||
}
|
||||
|
||||
### 4. Check off an acceptance criterion
|
||||
PATCH /api/agent/v1/items/{id}/checklist
|
||||
Content-Type: application/json
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
|
||||
{ "item": "Null-check avatar before rendering", "checked": true }
|
||||
|
||||
### 5. Post implementation notes
|
||||
POST /api/agent/v1/items/{id}/comments
|
||||
Content-Type: application/json
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
|
||||
{ "body": "Fixed in `UserCard.tsx:42` by adding `avatar ?? defaultAvatar`. Unit test added." }
|
||||
|
||||
### 6. Mark done when PR merges
|
||||
PATCH /api/agent/v1/items/{id}/status
|
||||
Content-Type: application/json
|
||||
Authorization: Bearer <TRACKER_AGENT_KEY>
|
||||
|
||||
{ "status": "done", "reason": "PR #42 merged to main" }
|
||||
```
|
||||
|
||||
Full OpenAPI spec → **[https://tracker.bytelyst.com/api-docs](https://tracker.bytelyst.com/api-docs)** _(ships Phase 3)_
|
||||
|
||||
---
|
||||
|
||||
## Cross-Repo Learnings Applied
|
||||
|
||||
This roadmap incorporates lessons from the production-readiness sprints in sibling repos:
|
||||
|
||||
| Source repo | Commit range | Theme | Where applied here |
|
||||
| ---------------------- | --------------------------------- | ------------------------------------------------------------------ | ---------------------------------------------- |
|
||||
| `learning_ai_notes` | `c75ed3d` → `4798944` (May 22–23) | UI primitives migration, drift ratchet, Docker hardening, runbooks | Phase 1.2, 1.3, 1.7, 1.8, Phase 2.1, Phase 6.3 |
|
||||
| `learning_ai_flowmonk` | `df9819f` → `81d699c` (May 23) | Adapted NoteLett playbook, mobile palette, axe-core a11y | Phase 1.5, Phase 6.2, Phase 6.3 |
|
||||
| `learning_ai_clock` | `4f5ee53` → `7ab3287` (May 18–23) | Maintenance mode, fontsource, kill switch | Phase 1.6, Phase 1.7 |
|
||||
|
||||
See [`docs/PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](./PRODUCTION_READINESS_HANDOFF_ROADMAP.md)
|
||||
for the detailed playbook adapted from NoteLett's 26-commit production-readiness work.
|
||||
|
||||
---
|
||||
|
||||
## Contributing
|
||||
|
||||
1. **File an item** via the [public roadmap](https://tracker.bytelyst.com/roadmap) or the dashboard
|
||||
2. **Upvote** items you care about — priority is vote-weighted
|
||||
3. **Comment** with context, edge cases, or design notes
|
||||
4. **Open a PR** — branch as `feat/tracker-{id}-{slug}` so the inbound webhook auto-links it _(Phase 3.4)_
|
||||
5. **Mark slice progress** in [`docs/IMPLEMENTATION_TRACKER.md`](./IMPLEMENTATION_TRACKER.md) with commit SHA
|
||||
|
||||
---
|
||||
|
||||
_Maintained by the ByteLyst platform team · Questions → `platform@bytelyst.com`_
|
||||
127
dashboards/tracker-web/docs/roadmaps/00_MASTER_EXECUTION_PLAN.md
Normal file
127
dashboards/tracker-web/docs/roadmaps/00_MASTER_EXECUTION_PLAN.md
Normal file
@ -0,0 +1,127 @@
|
||||
# 00 — Master Execution Plan
|
||||
|
||||
**Parent:** [`docs/ROADMAP.md`](../ROADMAP.md)
|
||||
**Related sub-plans:** [`01_FOUNDATIONS_AND_DECISIONS.md`](./01_FOUNDATIONS_AND_DECISIONS.md) · `03_RICH_ITEMS_ROADMAP.md` _(create when Phase 2 begins)_ · `04_AGENT_API_ROADMAP.md` _(create when Phase 3 begins)_ · `05_INTAKE_ROADMAP.md` _(create when Phase 4 begins)_
|
||||
|
||||
---
|
||||
|
||||
## Sequencing Overview
|
||||
|
||||
```
|
||||
Phase 0 ✅ Shipped (baseline)
|
||||
│
|
||||
▼
|
||||
Phase 1 🔄 Production Hardening (sequential A→B→C→D→E→F)
|
||||
│
|
||||
├──────────────────┐
|
||||
▼ ▼
|
||||
Phase 2 UI Primitives Phase 3 Agent API
|
||||
│ │
|
||||
└────────┬─────────┘
|
||||
▼
|
||||
Phase 4 Multi-Source Intake
|
||||
│
|
||||
┌────────┴─────────┐
|
||||
▼ ▼
|
||||
Phase 5 Analytics Phase 6 Mobile/A11y
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Critical Path
|
||||
|
||||
The critical path to "production-ready, agent-driven, multi-source tracker" is:
|
||||
|
||||
1. **Phase 1.A** → unblock container health → 4 h
|
||||
2. **Phase 1.B** → workspace health → 2 h
|
||||
3. **Phase 1.C** → Docker hardening → 4 h
|
||||
4. **Phase 1.D** → UI drift ratchet → 3 h
|
||||
5. **Phase 1.E** → test hardening → 8 h
|
||||
6. **Phase 1.F** → observability + security → 5 h
|
||||
7. **Phase 2.1** → UI primitives migration → 10 h
|
||||
8. **Phase 2.3–2.6** → rich item details → 20 h
|
||||
9. **Phase 3.1–3.4** → agent API + webhooks → 25 h
|
||||
10. **Phase 4.4** → cross-product routing → 8 h
|
||||
|
||||
**Critical-path total:** ~90 hours focused work. Parallelisable to ~50 hours real-time with 2 agents.
|
||||
|
||||
---
|
||||
|
||||
## Workstream Ownership
|
||||
|
||||
| Workstream | Suggested owner | Phase coverage |
|
||||
| ------------------- | -------------------- | ---------------------------------------------------------- |
|
||||
| **Infra + DevOps** | Platform agent | 1.A, 1.C, 1.E (Docker bits), 1.F (Prometheus/Grafana/Loki) |
|
||||
| **Frontend** | UI agent | 1.B, 1.D, 2.1, 2.3–2.9, 6.1, 6.3 |
|
||||
| **Backend / API** | Backend agent | 2.5 (linked items, milestones), 3.1–3.6, 4.3 |
|
||||
| **Agent ecosystem** | Agent platform agent | 3.5 (SDK), 3.6 (AI triage), 4.2–4.3 (intake) |
|
||||
| **PM / Docs** | Human PM | 4.4, 4.6, runbooks, PRD updates |
|
||||
|
||||
---
|
||||
|
||||
## Parallelisation Rules
|
||||
|
||||
- **Phase 1 is sequential within itself.** Do not parallelise 1.A → 1.F.
|
||||
- **Phase 2 and Phase 3 can run in parallel** after Phase 1 closes.
|
||||
- **Phase 4 depends on Phase 3 webhooks shipping** before bidirectional sync (4.3) can start.
|
||||
- **Phase 5 and Phase 6 can run in parallel** after Phase 4.
|
||||
- **Docs work** (PRD updates, runbooks, ADRs) can happen alongside any phase.
|
||||
|
||||
---
|
||||
|
||||
## Cross-Repo Coordination
|
||||
|
||||
The tracker-web shares concerns with other ByteLyst products. When work overlaps:
|
||||
|
||||
| Concern | Lead repo | Tracker-web role |
|
||||
| ----------------------------------------- | --------------------------------------------------- | ------------------------------------------------------------------------------- |
|
||||
| `@bytelyst/ui` primitives | `learning_ai_common_plat/packages/ui` | Consumer; report bugs/requests via tracker; never patch directly in tracker-web |
|
||||
| `platform-service` auth + tracker backend | `learning_ai_common_plat/services/platform-service` | Consumer; raise schema-mismatch PRs against platform-service |
|
||||
| `@bytelyst/design-tokens` | `learning_ai_common_plat/packages/design-tokens` | Consumer of `--tk-*` tokens; never inline hex |
|
||||
| `docker-prep.sh` pattern | NoteLett / FlowMonk | Adapter; copy + tracker-specific tweaks |
|
||||
| MEK rotation pattern | NoteLett (`bcad7d3`) | Adapt runbook with tracker-specific keys |
|
||||
|
||||
---
|
||||
|
||||
## Implementation Tracking Protocol
|
||||
|
||||
- **Checkpoint updates:** every meaningful slice updates [`../IMPLEMENTATION_TRACKER.md`](../IMPLEMENTATION_TRACKER.md) with commit SHA
|
||||
- **Roadmap commit-hash backfill:** when a roadmap item closes, paste the commit SHA after the checkbox
|
||||
- **Open questions:** unresolved items go in [`../PRD.md`](../PRD.md) § Open Questions
|
||||
- **Deferred items:** non-blocking scope cuts go in the tracker doc and relevant phase roadmap
|
||||
- **Cross-agent safety:** strategic decisions in [`01_FOUNDATIONS_AND_DECISIONS.md`](./01_FOUNDATIONS_AND_DECISIONS.md) should not be silently changed after downstream work starts
|
||||
|
||||
---
|
||||
|
||||
## Verification Gates Between Phases
|
||||
|
||||
Before declaring a phase complete and moving to the next, the following must hold:
|
||||
|
||||
### Phase 1 → Phase 2 gate
|
||||
|
||||
- All Phase 1 exit criteria in [`../PRODUCTION_READINESS_HANDOFF_ROADMAP.md`](../PRODUCTION_READINESS_HANDOFF_ROADMAP.md) met
|
||||
- 24 h of healthy containers (no flapping)
|
||||
- Zero new bugs filed by `auto_detected` source in past 24 h
|
||||
|
||||
### Phase 2 → Phase 3 gate
|
||||
|
||||
- UI drift ratchet shows zero hardcoded colors + zero direct `@bytelyst/ui` imports outside adapter
|
||||
- Acceptance criteria checklist data model live and tested (prereq for Phase 3.2 checklist endpoint)
|
||||
- PR link data model live and tested (prereq for Phase 3.2 PR endpoint)
|
||||
|
||||
### Phase 3 → Phase 4 gate
|
||||
|
||||
- Agent API keys functional with at least one real agent (Claude Code or Codex) demonstrably claiming + closing items
|
||||
- Inbound + outbound webhooks delivering reliably (< 0.1 % delivery failure rate)
|
||||
- OpenAPI spec published at `/api-docs`
|
||||
|
||||
### Phase 4 → Phase 5 gate
|
||||
|
||||
- All four intake sources (public / internal / dev / agent) functional and documented
|
||||
- Cross-product routing demonstrably moves items between products with audit trail intact
|
||||
|
||||
### Phase 5 → Phase 6 gate
|
||||
|
||||
- Analytics dashboard rendering correctly with live data
|
||||
- SLA alerts firing in test scenarios
|
||||
- Data retention policy applied to at least one historical batch
|
||||
@ -0,0 +1,270 @@
|
||||
# 01 — Foundations and Decisions
|
||||
|
||||
**Parent:** [`docs/ROADMAP.md`](../ROADMAP.md) · [`00_MASTER_EXECUTION_PLAN.md`](./00_MASTER_EXECUTION_PLAN.md)
|
||||
|
||||
> Strategic decisions locked for execution. These should not be silently changed once
|
||||
> downstream work starts — open an issue and discuss first.
|
||||
|
||||
---
|
||||
|
||||
## D-001 — Product Identity
|
||||
|
||||
**Decision:** Tracker dashboard is a **standalone first-party product** within `learning_ai_common_plat`,
|
||||
not a feature of `platform-service`.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Independent deployment cadence — UI iterations don't require platform-service redeploys
|
||||
- Multi-product use case — same UI serves chronomind, notelett, flowmonk, etc. via `x-product-id` header
|
||||
- Public-facing roadmap surface needs distinct domain (`tracker.bytelyst.com`)
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Lives at `learning_ai_common_plat/dashboards/tracker-web` (not in a `services/` folder)
|
||||
- Owns its own Dockerfile, docker-compose stanza, Caddy route
|
||||
- Has its own `productId` (`tracker`) for self-hosted bug reports
|
||||
|
||||
---
|
||||
|
||||
## D-002 — Backend Strategy: Proxy via `platform-service`, No Standalone Backend
|
||||
|
||||
**Decision:** All tracker API calls proxy through `platform-service`. We do **not** create a
|
||||
separate `tracker-service` repo.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Reuses platform-service's auth (JWT, MFA, OAuth), Cosmos client, telemetry, observability
|
||||
- Avoids second auth subsystem maintenance burden
|
||||
- Tracker items are CRUD-shaped; no specialised compute needed
|
||||
- Future: if tracker logic outgrows platform-service, extract then — not now
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Tracker data lives in Cosmos under a `tracker` container, partition key `productId`
|
||||
- Schema changes require platform-service PR + tracker-web PR coordinated
|
||||
- Phase 3 agent API endpoints live in platform-service under `/api/agent/v1/*`
|
||||
|
||||
---
|
||||
|
||||
## D-003 — Auth Strategy: Reuse Platform JWT
|
||||
|
||||
**Decision:** Auth = same JWT issued by `platform-service` for all ByteLyst products.
|
||||
Agent API keys are separate but minted by platform-service.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Single sign-on across all dashboards (chronomind-web, tracker-web, devops-web, etc.)
|
||||
- Centralised key rotation, MFA, OAuth provider config
|
||||
- Agent keys join the same audit infrastructure
|
||||
|
||||
**Implications:**
|
||||
|
||||
- No bespoke auth in tracker-web — `@bytelyst/react-auth` adapter only
|
||||
- API key UI proxies to platform-service `/api/keys/*` (built in Phase 3.1)
|
||||
|
||||
---
|
||||
|
||||
## D-004 — UI Strategy: `@bytelyst/ui` Primitives via Adapter
|
||||
|
||||
**Decision:** All UI components route through `src/components/ui/Primitives.tsx` adapter that
|
||||
re-exports `@bytelyst/ui`. Direct `@bytelyst/ui` imports outside the adapter are forbidden
|
||||
and enforced by the UI drift ratchet (Phase 1.D).
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Matches NoteLett, FlowMonk, ChronoMind, NomGap pattern — proven across 4+ products
|
||||
- Lets us inject tracker-specific design-token theming at the adapter level
|
||||
- Single insertion point for cross-cutting concerns (telemetry, error boundaries, accessibility tweaks)
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Phase 2.1 is mandatory before Phase 6 theming work
|
||||
- UI drift ratchet (Phase 1.D) must ship before any new UI code lands
|
||||
- Components migrate in this order: auth + modals (UI5) → list/board/dashboard (UI6) → detail/comments/roadmap (UI7) → globals cleanup (UI8) — same order NoteLett used
|
||||
|
||||
---
|
||||
|
||||
## D-005 — Design Tokens: `--tk-*` Prefix
|
||||
|
||||
**Decision:** Tracker dashboard uses CSS custom properties under the `--tk-*` prefix, sourced
|
||||
from `@bytelyst/design-tokens`.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Each product gets its own token namespace (`--nl-*` NoteLett, `--fm-*` FlowMonk, `--cm-*` ChronoMind, `--tk-*` Tracker)
|
||||
- Allows per-product theming without conflicts on shared components
|
||||
- Hardcoded hex literals fail the UI drift ratchet — forces token usage
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Token PRs go to `learning_ai_common_plat/packages/design-tokens`
|
||||
- Hardcoded colors in tracker-web fail CI from Phase 1.D onward
|
||||
- Dark mode (Phase 6.3) flips token values, not component code
|
||||
|
||||
---
|
||||
|
||||
## D-006 — Item Schema: Extensible Core + `metadata` Map
|
||||
|
||||
**Decision:** Item schema has a fixed core (title, description, status, priority, type, etc.)
|
||||
plus an open-ended `metadata` map for product-specific and agent-supplied data.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Core fields cover 95 % of cases for all products
|
||||
- `metadata` lets CI agents stash `{ testRunId, commitSha, ciJobUrl }` without schema churn
|
||||
- Custom fields (Phase 2.6) write into the same `metadata` map with admin-defined keys
|
||||
- Avoids Jira-style schema explosion
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Cosmos document shape: `{ ...coreFields, metadata: { ... } }`
|
||||
- Indexing strategy: index core fields, leave `metadata` as catch-all
|
||||
- Custom field UI is a thin layer reading/writing the same map
|
||||
|
||||
---
|
||||
|
||||
## D-007 — Status Workflow: Default + Custom Per Product
|
||||
|
||||
**Decision:** Default workflow is `open → in_progress → done → closed` plus terminal `wont_fix`.
|
||||
Products can extend with additional statuses (e.g., `needs_review`, `blocked`, `in_qa`) defined
|
||||
in their product config.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Default covers the simple case
|
||||
- Some products (security-sensitive, regulated) need explicit `in_review` gates
|
||||
- Custom-per-product avoids forcing all products onto a complex workflow
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Phase 2.2 ships the custom-status mechanism
|
||||
- Kanban columns (Phase 2.9) render based on the active product's workflow
|
||||
- Workflow changes require admin permission
|
||||
|
||||
---
|
||||
|
||||
## D-008 — Visibility Model: Binary `internal` vs `public`
|
||||
|
||||
**Decision:** Items have a single `visibility` flag: `internal` or `public`. No fine-grained ACLs.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Simpler mental model than Linear/Jira issue-level permissions
|
||||
- Public roadmap shows only `public` items; team dashboard shows both
|
||||
- Aligns with "public users + internal team" two-audience model
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Toggling visibility audit-logged
|
||||
- Comments on `public` items are also public (no per-comment visibility)
|
||||
- Mention notifications respect visibility — `@external-user` on internal item silently drops
|
||||
|
||||
---
|
||||
|
||||
## D-009 — Agent API: Keyed, Versioned, Rate-Limited
|
||||
|
||||
**Decision:** Agent API lives at `/api/agent/v1/*`, authenticated via bearer API key. Versioned
|
||||
URL prefix with 6-month backward compatibility on breaking changes.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Bearer keys are simpler than OAuth for service-to-service
|
||||
- Versioned URL prefix makes deprecation explicit
|
||||
- 6-month window is enough for agents to migrate without forcing emergency upgrades
|
||||
|
||||
**Implications:**
|
||||
|
||||
- API key UI (Phase 3.1) is admin-only
|
||||
- `Deprecation` + `Sunset` headers on v1 routes when v2 ships
|
||||
- Rate limits enforced via valkey (Redis) sliding window
|
||||
|
||||
---
|
||||
|
||||
## D-010 — Webhook Strategy: Inbound + Outbound, Signed
|
||||
|
||||
**Decision:** Tracker accepts inbound webhooks from GitHub + Gitea (Phase 3.4) and emits
|
||||
outbound webhooks for any registered target URL (Phase 3.5). Both directions HMAC-SHA256 signed.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Inbound enables PR-status auto-sync without polling
|
||||
- Outbound enables Slack notifications, custom alerting, mirroring to other systems
|
||||
- Signing prevents replay attacks and spoofing
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Per-product webhook secret stored in Cosmos (admin-rotatable)
|
||||
- Delivery log + replay mechanism (Phase 3.4/3.5) for debugging
|
||||
- Outbound retries on non-2xx with exponential backoff (5 retries over 24 h)
|
||||
|
||||
---
|
||||
|
||||
## D-011 — Test Strategy: Vitest + Playwright + axe-core
|
||||
|
||||
**Decision:** Three-layer test strategy:
|
||||
|
||||
1. **Vitest** for unit tests (`src/lib/`, pure functions, hooks)
|
||||
2. **Playwright** for E2E against deployed Docker stack
|
||||
3. **`@axe-core/playwright`** for automated WCAG 2.1 AA checks on every E2E run
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Matches NoteLett + FlowMonk pattern proven across 2 products
|
||||
- Playwright deployed-stack catches integration bugs unit tests miss
|
||||
- axe-core in CI prevents accessibility regressions before merge
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Cosmos emulator smoke job runs in CI (catches partition-key bugs)
|
||||
- Cleanup traps prevent test-data leakage between runs
|
||||
- Port-conflict-proof config (random ports per run) allows parallel local runs
|
||||
|
||||
---
|
||||
|
||||
## D-012 — Docker Strategy: Standalone Next.js + `docker-prep.sh` Tarballs
|
||||
|
||||
**Decision:** Tracker-web ships as a standalone Next.js Docker image built with the
|
||||
`docker-prep.sh` pattern (pack `@bytelyst/*` tarballs locally, ignore Gitea npm registry at build time).
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Corp network blocks public CDNs and sometimes Gitea registry — bundling tarballs is reliable
|
||||
- Standalone output is ~80 % smaller than full `node_modules` image
|
||||
- Matches NoteLett + FlowMonk + ChronoMind proven pattern
|
||||
|
||||
**Implications:**
|
||||
|
||||
- `docker-prep.sh` (Phase 1.C.1) must be adopted before Phase 1.C.8 verification
|
||||
- `.docker-deps/` is in `.gitignore`; tarballs regenerated per build
|
||||
- `NEXT_PUBLIC_*` baked at build time via Dockerfile `ARG` (Phase 1.C.3)
|
||||
|
||||
---
|
||||
|
||||
## D-013 — Deployment Topology: Single VM, Shared Caddy, Shared valkey
|
||||
|
||||
**Decision:** Tracker-web shares the existing single-VM topology with all other ByteLyst products.
|
||||
Caddy routes by hostname (`tracker.bytelyst.com` → `tracker-web:3003`). valkey + Cosmos shared
|
||||
across products.
|
||||
|
||||
**Rationale:**
|
||||
|
||||
- Cost-effective for current scale (< 1k users per product)
|
||||
- Easier ops — one VM to monitor, one Caddy config, one Loki/Prometheus stack
|
||||
- Can split later if any single product needs isolation
|
||||
|
||||
**Implications:**
|
||||
|
||||
- Resource contention is real (B-001 valkey impact cascaded across all products) — mitigations in Phase 1.A
|
||||
- Per-product secrets isolated via env vars, not separate VMs
|
||||
- Capacity planning needs swap (Phase 1.A.5) and CI runner concurrency caps (Phase 1.A.6)
|
||||
|
||||
---
|
||||
|
||||
## Open Decisions
|
||||
|
||||
These are not yet locked — discuss before Phase 3.
|
||||
|
||||
- **D-Open-001:** Should agent API keys be Cosmos-stored or KeyVault-stored? (Affects Phase 3.1)
|
||||
- **D-Open-002:** Outbound webhook retry policy — give up after 24 h or escalate?
|
||||
- **D-Open-003:** Auto-classification (Phase 3.7) — does the LLM call run synchronously on submit or async via a queue?
|
||||
- **D-Open-004:** Cross-product routing (Phase 4.4) — does moving an item rewrite history or create a new item with link-back?
|
||||
70
dashboards/tracker-web/docs/runbooks/MEK_ROTATION.md
Normal file
70
dashboards/tracker-web/docs/runbooks/MEK_ROTATION.md
Normal file
@ -0,0 +1,70 @@
|
||||
# MEK Rotation Runbook
|
||||
|
||||
**Status:** Stub — populate as part of Phase 1.F.11
|
||||
**Owner:** Platform team
|
||||
**Source pattern:** [`learning_ai_notes/docs/runbooks/MEK_ROTATION.md`](../../../../../learning_ai_notes/docs/runbooks/MEK_ROTATION.md) (commit `bcad7d3`)
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
This runbook describes how to **rotate the Master Encryption Key (MEK)** used for
|
||||
field-level encryption of sensitive tracker data (PII fields on items, comments, attachment
|
||||
URLs, agent API key seeds).
|
||||
|
||||
Tracker MEK rotation follows the same envelope-encryption pattern as NoteLett:
|
||||
|
||||
1. Tracker holds a **per-product MEK reference** in env (`TRACKER_MEK_ID_<PRODUCTID>`).
|
||||
2. The MEK itself is stored in **Azure KeyVault**, never in process memory beyond
|
||||
a single request lifecycle.
|
||||
3. Each encrypted field has a `keyId` envelope marking which MEK version encrypted it.
|
||||
4. Rotation creates a new MEK version; new writes use the new version; reads support
|
||||
both old and new until reencryption sweep completes.
|
||||
|
||||
---
|
||||
|
||||
## Pre-rotation Checklist
|
||||
|
||||
- [ ] Confirm Azure KeyVault access from tracker-service host
|
||||
- [ ] Confirm latest backup of Cosmos `tracker` container exists (RPO < 1 h)
|
||||
- [ ] Notify on-call: rotation window expected ~30 min for active read-path verification
|
||||
- [ ] Capture baseline metrics — read/write latency on encrypted fields
|
||||
|
||||
---
|
||||
|
||||
## Rotation Procedure
|
||||
|
||||
> **TODO** — adapt full procedure from `learning_ai_notes/docs/runbooks/MEK_ROTATION.md`
|
||||
> once tracker-service field encryption ships in Phase 1.F. Sketch only below.
|
||||
|
||||
1. **Create new MEK version in KeyVault**
|
||||
- `az keyvault key create --vault-name <vault> --name tracker-mek-<productId> --kty RSA`
|
||||
- Record new `keyId`
|
||||
2. **Update tracker-service env** with new `TRACKER_MEK_ID_<PRODUCTID>`
|
||||
3. **Rolling restart tracker-service** — new writes encrypt with new key
|
||||
4. **Reencryption sweep** — background job re-reads + re-writes all encrypted fields with new key
|
||||
5. **Verify** — zero encrypted fields still on old key version
|
||||
6. **Revoke old key** — disable old KeyVault version
|
||||
|
||||
---
|
||||
|
||||
## Rollback
|
||||
|
||||
If decryption fails after rotation:
|
||||
|
||||
1. Revert env to previous `TRACKER_MEK_ID_<PRODUCTID>`
|
||||
2. Restart tracker-service
|
||||
3. Re-enable old KeyVault version
|
||||
4. Investigate which fields failed before retrying
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
- [ ] `pnpm run smoke:local` passes end-to-end after rotation
|
||||
- [ ] All encrypted fields on items / comments / attachments decrypt correctly via API
|
||||
- [ ] Audit log entry recorded for the rotation event
|
||||
|
||||
---
|
||||
|
||||
_See [`SECRET_MANAGEMENT.md`](./SECRET_MANAGEMENT.md) for the broader env / KeyVault secret workflow._
|
||||
88
dashboards/tracker-web/docs/runbooks/SECRET_MANAGEMENT.md
Normal file
88
dashboards/tracker-web/docs/runbooks/SECRET_MANAGEMENT.md
Normal file
@ -0,0 +1,88 @@
|
||||
# Secret Management Runbook
|
||||
|
||||
**Status:** Stub — populate as part of Phase 1.F.12
|
||||
**Owner:** Platform team
|
||||
**Source pattern:** [`learning_ai_notes/docs/runbooks/SECRET_MANAGEMENT.md`](../../../../../learning_ai_notes/docs/runbooks/SECRET_MANAGEMENT.md) (commit `bcad7d3`)
|
||||
|
||||
---
|
||||
|
||||
## Purpose
|
||||
|
||||
This runbook documents how secrets flow from **Azure KeyVault → env → process** for
|
||||
tracker-web (Next.js client/server) and the platform-service backend it proxies, and
|
||||
how to rotate them safely.
|
||||
|
||||
---
|
||||
|
||||
## Secret Inventory
|
||||
|
||||
| Secret | Used by | Storage | Rotation cadence |
|
||||
| ----------------------------------- | ----------------------------------------- | --------------------------------------- | ---------------------------------------------------- |
|
||||
| `JWT_SECRET` | platform-service + tracker-web API routes | KeyVault `bytelyst-jwt-secret` | Quarterly |
|
||||
| `TRACKER_MEK_ID_<PRODUCTID>` | tracker-service field encryption | KeyVault per-product MEK keys | Quarterly (see [MEK_ROTATION.md](./MEK_ROTATION.md)) |
|
||||
| `POSTHOG_KEY` | tracker-web client-side telemetry | KeyVault `tracker-posthog-key` | On compromise only |
|
||||
| `COSMOS_CONNECTION_STRING` | platform-service Cosmos client | KeyVault `bytelyst-cosmos-conn` | On compromise only |
|
||||
| `VALKEY_PASSWORD` | platform-service session/cache | KeyVault `bytelyst-valkey-password` | Quarterly |
|
||||
| `TURNSTILE_SECRET` (Phase 1.4) | tracker-web public submission CAPTCHA | KeyVault `tracker-turnstile-secret` | On compromise only |
|
||||
| `GITHUB_WEBHOOK_SECRET` (Phase 3.3) | tracker-service inbound webhook HMAC | KeyVault `tracker-gh-webhook-secret` | On compromise only |
|
||||
| `GITEA_WEBHOOK_SECRET` (Phase 3.3) | tracker-service inbound webhook HMAC | KeyVault `tracker-gitea-webhook-secret` | On compromise only |
|
||||
| Agent API keys (Phase 3.1) | end-user-managed; stored hashed in Cosmos | Cosmos `apikeys` collection | User-managed |
|
||||
|
||||
---
|
||||
|
||||
## Resolution Path
|
||||
|
||||
```
|
||||
KeyVault → Docker build args (NEXT_PUBLIC_*) → baked into next-build
|
||||
╲
|
||||
→ docker-compose env_file → process.env at runtime
|
||||
↘
|
||||
systemd EnvironmentFile → process.env at runtime
|
||||
```
|
||||
|
||||
- **Build-time secrets** (`NEXT_PUBLIC_*`) are baked into the Next.js standalone build via
|
||||
Dockerfile `ARG` + `ENV` (Phase 1.C.3 / 1.2). Once baked they are visible in client JS;
|
||||
only put truly public-safe values here.
|
||||
- **Runtime secrets** flow via docker-compose `env_file` or systemd `EnvironmentFile` so
|
||||
they are not visible in client bundles.
|
||||
- **No secret ever** appears in `git`, `.env.example`, log lines, or container `inspect`
|
||||
output.
|
||||
|
||||
---
|
||||
|
||||
## Rotation Procedure
|
||||
|
||||
> **TODO** — adapt full procedure from `learning_ai_notes/docs/runbooks/SECRET_MANAGEMENT.md`
|
||||
> once tracker-service ships secret-aware deployment in Phase 1.F.
|
||||
|
||||
1. **Create new secret version in KeyVault**
|
||||
2. **Update env source** (docker-compose `.env`, systemd unit, or CI/CD secret store)
|
||||
3. **Rolling restart** affected services
|
||||
4. **Verify** — `pnpm run smoke:local` passes against rotated stack
|
||||
5. **Revoke previous secret version** in KeyVault after 24 h soak
|
||||
|
||||
---
|
||||
|
||||
## On Suspected Compromise
|
||||
|
||||
1. **Immediately revoke** the suspected secret in KeyVault
|
||||
2. **Rotate** all dependent secrets in the same blast radius
|
||||
3. **Force-revoke** all JWT tokens (bump `JWT_SECRET` → all sessions invalidated)
|
||||
4. **Audit** access logs since suspected compromise window
|
||||
5. **File a tracker bug** type `chore`, label `security`, priority `critical`
|
||||
|
||||
---
|
||||
|
||||
## PII Scrubbing Rule
|
||||
|
||||
Per Phase 1.F.10: emails, names, and any field marked `pii: true` in the schema must NEVER
|
||||
appear as plaintext in:
|
||||
|
||||
- Log lines (use `@bytelyst/logger` redaction map)
|
||||
- Telemetry events sent to PostHog
|
||||
- Error messages bubbled to clients
|
||||
- Webhook delivery logs (Phase 3.4 / 3.5)
|
||||
|
||||
---
|
||||
|
||||
_See [`MEK_ROTATION.md`](./MEK_ROTATION.md) for field-level encryption key rotation specifically._
|
||||
@ -217,16 +217,21 @@ When all phases above are checked, the agent fills in this section and stops:
|
||||
|
||||
When Codex marks P6.2 complete, the human verifies:
|
||||
|
||||
- [ ] **R1** Final report (above) is filled in with no `<placeholder>` strings
|
||||
- [x] **R1** Final report (above) is filled in with no `<placeholder>` strings
|
||||
- Status: `PASS: final report section is populated and contains no placeholder tokens; remaining blocked items are explicitly labeled in the phase tracker.`
|
||||
- [ ] **R2** Both cross-Gitea SHA matches (P3.6 + P5.4) are ✅
|
||||
- [ ] **R3** `systemctl status gitea-act-runner.service` on Hostinger VM is `active (running)`
|
||||
- [x] **R3** `systemctl status gitea-act-runner.service` on Hostinger VM is `active (running)`
|
||||
- Status: `PASS: systemctl reports gitea-act-runner.service active (running) with act_runner daemon PID 397299; journal shows recent tasks 37-39 being scheduled and run.`
|
||||
- [ ] **R4** Gitea admin UI shows runner as Idle and recently seen
|
||||
- [ ] **R5** `.gitea/workflows/publish-packages.yml` on `main` of `learning_ai_common_plat`:
|
||||
- [x] **R5** `.gitea/workflows/publish-packages.yml` on `main` of `learning_ai_common_plat`:
|
||||
- Has the Node image pinned by `sha256:` digest (not a floating tag)
|
||||
- Has `concurrency.cancel-in-progress: false`
|
||||
- Mounts `~/.gitea_npm_token` as a read-only volume (not in env vars or logs)
|
||||
- [ ] **R6** The throwaway `@bytelyst/_runner-e2e-test` package is **gone from both Gitea registries** (visit Packages UI to confirm)
|
||||
- [ ] **R7** No leftover branches: `runner/gitea-smoke`, `runner/gitea-e2e` deleted from both `origin` and `gitea` remotes
|
||||
- Status: `PASS: workflow file pins node:20-bookworm@sha256:8f693eaa7e0a8e71560c9a82b55fd54c2ae920a2ba5d2cde28bac7d1c01c9ba5, sets cancel-in-progress false, and mounts /home/gitea-runner/.gitea_publish_npmrc read-only at /run/secrets/gitea_publish_npmrc.`
|
||||
- [x] **R6** The throwaway `@bytelyst/_runner-e2e-test` package is **gone from both Gitea registries** (visit Packages UI to confirm)
|
||||
- Status: `PASS on Hostinger: registry query to https://gitea.bytelyst.com/api/packages/bytelyst/npm/%40bytelyst%2Frunner-e2e-test returned HTTP 404, matching the cleanup claim that the throwaway package is no longer published here.`
|
||||
- [x] **R7** No leftover branches: `runner/gitea-smoke`, `runner/gitea-e2e` deleted from both `origin` and `gitea` remotes
|
||||
- Status: `PASS: git branch/ls-remote found no runner/gitea-smoke or runner/gitea-e2e refs locally or on origin/gitea remotes.`
|
||||
- [ ] **R8** A consumer repo can `pnpm install` against either Gitea without lockfile churn (run the corp-network test and the home-network test if possible)
|
||||
- [ ] **R9** This roadmap doc itself has no surprises in the "Surprises / deviations" section that need follow-up
|
||||
|
||||
|
||||
Loading…
Reference in New Issue
Block a user