319 Commits

Author SHA1 Message Date
Saravanakumar D
b061cc47f3 feat(tracker-web): fleet control plane UI — overview, jobs, budget, detail pages (Phase 3 Slice 4)
- Fleet overview page with factory cards + recent jobs polling
- Job table with stage filter tabs
- Job detail page with events timeline, runs, artifacts, DAG subtree, SHIP action
- Budget page with usage bar, pause/resume controls
- API proxy route forwarding /api/fleet/* to platform-service
- Typed fleet-client.ts with graceful 404 degradation
- 16 unit tests for fleet-client (198 total tracker-web tests green)
- Added Fleet nav item to dashboard layout
- Full monorepo build + test green

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 09:49:24 -07:00
Saravanakumar D
f4ea7b4a5b feat(fleet): per-product budgets with pause/resume (Phase 3 Slice 3)
- FleetBudgetDoc: ceilingUsd, window, spentUsd, status (active/paused)
- FLEET_BUDGETS flag (default OFF = no enforcement, unchanged behavior)
- Enforcement in claimNextJob: paused or ceiling-exceeded → null
- accrueSpend(): monotonic spend accumulation, auto-pause at ceiling
- Budget routes: GET/PUT /fleet/budgets/:productId, pause, resume
- UpsertBudgetSchema for route validation
- 7 new coordinator tests (ceiling, auto-pause, manual pause/resume,
  flag OFF bypass, monotonic accounting, cross-product isolation)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 09:49:23 -07:00
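The budget rules in f4ea7b4a5b (monotonic accrual, auto-pause at the ceiling, flag OFF means no enforcement) can be sketched as below. This is a minimal illustration, not the module's code: the `FleetBudget` shape is trimmed to three fields, `canClaim` is a hypothetical helper name, and treating "at ceiling" as `>=` and ignoring non-positive deltas are assumptions.

```typescript
// Trimmed, hypothetical version of FleetBudgetDoc (the commit also lists `window`).
interface FleetBudget {
  ceilingUsd: number;
  spentUsd: number;
  status: "active" | "paused";
}

// Monotonic accrual: spend never decreases. Non-positive or non-finite
// deltas are ignored (assumption), and reaching the ceiling auto-pauses.
function accrueSpend(budget: FleetBudget, deltaUsd: number): FleetBudget {
  const delta = Number.isFinite(deltaUsd) && deltaUsd > 0 ? deltaUsd : 0;
  const spentUsd = budget.spentUsd + delta;
  const status = spentUsd >= budget.ceilingUsd ? "paused" : budget.status;
  return { ...budget, spentUsd, status };
}

// Hypothetical claim gate: with the flag OFF (or no budget doc) nothing is enforced.
function canClaim(budget: FleetBudget | undefined, budgetsEnabled: boolean): boolean {
  if (!budgetsEnabled || budget === undefined) return true;
  return budget.status === "active" && budget.spentUsd < budget.ceilingUsd;
}
```

In this sketch a paused budget stays paused until something else resets `status`, which matches the manual resume route the commit describes.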
Saravanakumar D
484ed05c1f feat(fleet): DAG job decomposition — parent/child + fan-out (Phase 3 Slice 2)
- SubmitJobSchema accepts inline children[] for atomic fan-out creation
- Parent blocked until all children reach dep-satisfying terminal stage
- patchJobFenced triggers maybeUnblockParent on child stage transitions
- submitChildren() for POST /fleet/jobs/:id/children (add children later)
- getDagSubtree() for GET /fleet/jobs/:id/dag (recursive subtree query)
- listChildrenByParent() repository helper
- SubmitChildrenSchema for route validation
- 8 new coordinator tests (fan-out, blocking, unblocking, cycle, DAG query)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 09:49:23 -07:00
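The parent/child gating in 484ed05c1f can be pictured with the sketch below. It assumes the dep-satisfying rule from the coordinator commit (shipped, or testing in soft mode) also applies to children, and that an unblocked parent returns to `queued`; the `JobNode` shape and the in-memory `Map` are illustrative only.

```typescript
type Stage =
  | "queued" | "blocked" | "assigned" | "building" | "review"
  | "testing" | "shipped" | "failed" | "dead_letter";

// Hypothetical minimal job shape for the illustration.
interface JobNode {
  id: string;
  parentId?: string;
  stage: Stage;
}

// Assumed rule: shipped always satisfies; testing satisfies only in soft mode.
function isDepSatisfying(stage: Stage, soft: boolean): boolean {
  return stage === "shipped" || (soft && stage === "testing");
}

// Returns true only when this call moved the parent from blocked to queued.
function maybeUnblockParent(
  jobs: Map<string, JobNode>,
  parentId: string,
  soft = false,
): boolean {
  const parent = jobs.get(parentId);
  if (parent === undefined || parent.stage !== "blocked") return false;
  const children = Array.from(jobs.values()).filter((j) => j.parentId === parentId);
  if (children.length === 0) return false;
  if (!children.every((c) => isDepSatisfying(c.stage, soft))) return false;
  parent.stage = "queued";
  return true;
}
```

Calling it after every child stage transition is safe because a second call on an already unblocked parent returns false without changing anything.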
Saravanakumar D
4468a69526 feat(fleet): tunable scoring weights + preemption (Phase 3 Slice 1)
- Add FleetWeightRegistry + resolveWeights() for per-product/per-request
  weight tunability with defaults fallback (backward compatible)
- Add selectPreemptionVictim() pure function: only critical jobs may
  trigger, never evicts equal/higher priority, picks lowest-priority victim
- Wire preemption into coordinator behind FLEET_PREEMPTION flag (default OFF)
- Seat-limit enforcement: at seatLimit, factories skip normal selection and
  attempt preemption of lower-priority running jobs for critical newcomers
- Eviction preserves checkpoint, bumps leaseEpoch (fences zombie), requeues
- 18 new tests (pure scheduler + coordinator integration)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-30 09:49:23 -07:00
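The three preemption rules listed in 4468a69526 (only critical jobs trigger, never evict equal or higher priority, pick the lowest-priority victim) fit in a small pure function. The sketch below is an assumption-laden illustration: the four-level priority scale, the `RunningJob` shape, and the first-seen tie rule are hypothetical, and the real signature of `selectPreemptionVictim` may differ.

```typescript
// Hypothetical priority scale; the commit does not list the real values.
type Priority = "low" | "normal" | "high" | "critical";

const RANK: Record<Priority, number> = { low: 0, normal: 1, high: 2, critical: 3 };

interface RunningJob {
  id: string;
  priority: Priority;
}

// Pure function: returns the job to evict, or null when preemption is not allowed.
function selectPreemptionVictim(
  incoming: { priority: Priority },
  running: readonly RunningJob[],
): RunningJob | null {
  // Only critical newcomers may trigger preemption.
  if (incoming.priority !== "critical") return null;

  let victim: RunningJob | null = null;
  for (const job of running) {
    // Never evict a job of equal or higher priority.
    if (RANK[job.priority] >= RANK[incoming.priority]) continue;
    // Keep the lowest-priority candidate; the first one seen wins ties.
    if (victim === null || RANK[job.priority] < RANK[victim.priority]) victim = job;
  }
  return victim;
}
```

Keeping the function free of I/O is what lets the commit test it separately from the coordinator integration.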
saravanakumardb1
b2ce22d81c feat(platform-service): direct tracker -> fleet module wiring (§10 round-trip)
In-process tracker<->fleet bridge — no shell hop. Closes the §10 "direct
tracker->module calls" box.

- tracker-bridge.ts (new):
  * ingestItemAsJob(productId, itemId, opts?) — reads the Item via the items
    repository (foreign/unknown → NotFoundError), maps title/description → bodyMd
    (verbatim) + labels (engine-class:/profile:/priority:/cap:) → manifest hints,
    sets trackerItemId + a stable idempotency-key `tracker-<itemId>`, and submits
    through coordinator.submitJob — so re-ingest dedupes and the job is scheduled by
    the §7 router via the unchanged claim path.
  * echoJobToItem(productId, jobId, log?) — mirrors stage → Item status
    (queued/assigned/building/review/testing → in_progress; shipped → done;
    failed/dead_letter → wont_fix) + a metrics-ONLY comment (attempts/duration/
    tokens/cost — never the prompt body/secrets). Idempotent via the job's
    `trackerEchoedStatus`; best-effort + non-fatal (items-write failure →
    { echoed: null, error }, never thrown into the job lifecycle). productId-scoped.
- Auto-echo wired into the PATCH + lease/release transitions, GATED by
  FLEET_TRACKER_ECHO (default OFF → behavior byte-for-byte unchanged); never blocks
  or fails the transition.
- Routes (additive): POST /fleet/tracker/ingest, POST /fleet/tracker/echo
  (auth + getRequestProductId, productId-scoped).
- types.ts: optional FleetJobDoc.trackerEchoedStatus (reuses the existing
  trackerItemId field; no parallel schema) + Ingest/Echo request schemas.
- repository.ts: setTrackerEchoedStatus (no rev bump — never interferes with the
  fenced claim CAS).

Reuses the items + comments contracts directly (no HTTP). Does not touch
claimNextJob or the scheduler. productId on every doc; no any/console.log.
2026-05-30 01:32:12 -07:00
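The stage to Item status mirror described for echoJobToItem in b2ce22d81c can be sketched as a lookup plus an idempotency check. The names `STAGE_TO_STATUS` and `planEcho` are hypothetical, and mapping `blocked` to "no echo" is an assumption because the commit does not list that stage.

```typescript
type Stage =
  | "queued" | "blocked" | "assigned" | "building" | "review"
  | "testing" | "shipped" | "failed" | "dead_letter";

type ItemStatus = "in_progress" | "done" | "wont_fix";

// Mapping as listed in the commit; `blocked` is not listed, so null here (assumption).
const STAGE_TO_STATUS: Record<Stage, ItemStatus | null> = {
  queued: "in_progress",
  assigned: "in_progress",
  building: "in_progress",
  review: "in_progress",
  testing: "in_progress",
  shipped: "done",
  failed: "wont_fix",
  dead_letter: "wont_fix",
  blocked: null,
};

// Returns the status to write, or null when there is nothing to echo:
// either the stage has no mapping or the same status was already echoed.
function planEcho(stage: Stage, trackerEchoedStatus?: ItemStatus): ItemStatus | null {
  const next = STAGE_TO_STATUS[stage];
  if (next === null || next === trackerEchoedStatus) return null;
  return next;
}
```

Comparing against the last echoed status is one simple way to get the idempotency the commit attributes to `trackerEchoedStatus`.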
saravanakumardb1
32328247ad test(platform-service): update fleet artifact tests for productId-scoped listing
listArtifactsByJob now requires productId; thread it through the existing
repository/artifacts test callers (signature update, assertions unchanged).
2026-05-30 00:06:10 -07:00
saravanakumardb1
e06b730161 feat(platform-service): fleet factory enrollment + scoped rotatable tokens (§12)
Adds factory enrollment + a scoped, rotatable credential model for the fleet
coordinator (trust boundary, §12/§18). Tokens are stored HASHED at rest (sha256 —
the same primitive the auth module uses for verify/magic-link tokens); the
high-entropy plaintext is returned exactly once at enroll/rotate and never persisted.

- enrollment.ts: enrollFactory (create/link factory + issue token), rotateToken
  (new active token; prior marked `rotating` with a grace overlap so an in-flight
  worker isn't cut off), revokeToken (immediate), verifyToken (constant-time hash
  compare; revoked/expired-grace → null; updates lastUsedAt). Scope = {productId,
  factoryId, capabilities[]}.
- Gated enforcement: enforceFactoryToken() on POST /fleet/factories/heartbeat and
  POST /fleet/claim, active only when FLEET_REQUIRE_FACTORY_TOKEN is on (default
  OFF — existing behavior/tests unchanged). When on: missing/invalid/revoked → 401;
  out-of-scope productId/capability/factory → 403; and the claim is CONSTRAINED to
  the verified token scope. Does not touch scheduler scoring or the claim CAS.
- types.ts: FleetFactoryTokenDoc + Enroll/Rotate/Revoke request schemas.
- repository.ts: fleet_factory_tokens collection + CRUD + findByHash.
- routes.ts (additive): POST /fleet/factories/enroll, /:id/token/rotate,
  /:id/token/revoke (user auth + productId + Zod).
- cosmos-init.ts: register fleet_factory_tokens (/productId).

Also hardens the artifact routes (review fixes): listArtifactsByJob is now
productId-scoped (GET /fleet/jobs/:id/artifacts threads the request productId), and
artifact upload uses the request/auth productId authoritatively (a spoofed
body.productId no longer overrides it).

Tokens hashed at rest; plaintext shown once; no new crypto schemes; productId on
every doc; no any/console.log; enforcement default OFF.
2026-05-30 00:05:52 -07:00
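The "hashed at rest, plaintext shown once, constant-time compare" model in e06b730161 can be illustrated with Node's built-in crypto module. This is a sketch under assumptions: `issueToken` and the hex encoding are hypothetical choices, and the real verifyToken also handles rotation grace, revocation, and lastUsedAt, which are omitted here.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// sha256 digest of the plaintext token; only this hash would be persisted.
function hashToken(plaintext: string): Buffer {
  return createHash("sha256").update(plaintext, "utf8").digest();
}

// Hypothetical issuer: high-entropy plaintext returned once, hash kept for storage.
function issueToken(): { plaintext: string; hashHex: string } {
  const plaintext = randomBytes(32).toString("hex");
  return { plaintext, hashHex: hashToken(plaintext).toString("hex") };
}

// Constant-time comparison of the presented token's hash with the stored hash.
function verifyToken(presented: string, storedHashHex: string): boolean {
  const a = hashToken(presented);
  const b = Buffer.from(storedHashHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

A caller would store only `hashHex` and hand `plaintext` to the factory operator at enroll or rotate time.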
saravanakumardb1
b65e818f3d feat(platform-service): fleet artifacts + blob wiring (§13)
Artifact pointers in fleet_artifacts; large outputs in @bytelyst/blob (never
Cosmos). Routes: POST/GET /fleet/jobs/:id/artifacts, GET/DELETE
/fleet/artifacts/:id with short-lived SAS. 7 artifact tests.
2026-05-29 23:11:45 -07:00
saravanakumardb1
7930e8b0bd feat(platform-service): Phase 2 scheduler/router core (§7) + wire into atomic claim
Add a pure, fixed-weight scoring engine that decides WHICH queued job a claiming
factory gets, and wire it into coordinator.claimNextJob (the atomic rev-CAS claim
in tryClaimJob is unchanged).

scheduler.ts (pure, synchronous, no I/O):
- scoreCandidate(job, factory, ctx, weights?) -> { score, breakdown }
  score = w1*capabilityFit + w2*affinity + w3*(1/(1+load)) + w4*costFit(budget)
        + w5*health - w6*starvationPenalty(age); breakdown is per-weighted-term
        and sums to score (explainability / Phase-3 readiness).
- selectJob(candidates, factory, ctx, weights?) -> FleetJobDoc | null
  filters to stage-eligible + deps-satisfied (injected pure predicate) +
  capability-subset (+ down-health floor), ranks by score, deterministic
  tie-break: higher priority -> older createdAt -> lower cost class.
- Fixed default weights + bucketed anti-starvation aging (Phase 3 = tunable
  weights + preemption; intentionally NOT built here).

coordinator.ts (candidate-ranking section only):
- claimNextJob now resolves deps (store-backed) into a pure predicate, builds the
  factory view + authoritative now, and selects via selectJob; tryClaimJob CAS /
  lease / fence logic untouched. ClaimContext gains additive optional scheduler
  inputs (health/load/seatLimit/factoryEngines/warmScopes/costCeilingUsd). The
  pure capability-subset predicate moved into scheduler.ts and is re-exported.

Tests: scheduler.test.ts (16) covers capability hard-filter, priority/age
tie-breaks, load, health (+ down floor), starvation, cost fit, affinity, breakdown
sum, determinism, empty/no-eligible. coordinator.test.ts adds score-driven
selection, health floor, and ordered drain; all prior fleet tests stay green.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-29 23:03:40 -07:00
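The scoring formula in 7930e8b0bd, with a breakdown whose terms sum to the score, can be sketched as follows. The `Signals` and `Weights` shapes and the all-ones default weights are placeholders for illustration; the real scoreCandidate takes `(job, factory, ctx, weights?)` and derives these signals itself.

```typescript
// Hypothetical pre-computed inputs; the real function derives them from job/factory/ctx.
interface Signals {
  capabilityFit: number;
  affinity: number;
  load: number;
  costFit: number;
  health: number;
  starvationPenalty: number;
}

interface Weights {
  w1: number; w2: number; w3: number; w4: number; w5: number; w6: number;
}

// Placeholder defaults; the commit does not state the actual fixed weights.
const DEFAULT_WEIGHTS: Weights = { w1: 1, w2: 1, w3: 1, w4: 1, w5: 1, w6: 1 };

function scoreCandidate(s: Signals, w: Weights = DEFAULT_WEIGHTS) {
  // One entry per weighted term, signed the same way as in the formula.
  const breakdown = {
    capabilityFit: w.w1 * s.capabilityFit,
    affinity: w.w2 * s.affinity,
    load: w.w3 * (1 / (1 + s.load)),
    costFit: w.w4 * s.costFit,
    health: w.w5 * s.health,
    starvation: -w.w6 * s.starvationPenalty,
  };
  const score =
    breakdown.capabilityFit + breakdown.affinity + breakdown.load +
    breakdown.costFit + breakdown.health + breakdown.starvation;
  return { score, breakdown };
}
```

Storing the signed per-term values is what makes the breakdown sum to the score by construction.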
saravanakumardb1
33c1d8d5fa fix(platform-service): make fleet job claim truly atomic via datastore updateIfMatch
The foundation's revUpdateJob/revUpdateLease did a read -> rev-check -> write with
await points between them, so two CONCURRENT claims could both read the same rev,
both pass the check, and both write — a double-assignment the old (sequential) race
test could not catch.

Rewire revUpdateJob/revUpdateLease to delegate to the datastore's updateIfMatch,
which performs the compare and the write as one indivisible operation (Cosmos
If-Match; synchronous compare-set on memory). The coordinator's tryClaimJob keeps
identical external behavior (ok/conflict) but is now genuinely single-winner.

Upgrades the coordinator tests to prove atomicity under TRUE concurrency:
- two contenders via Promise.all -> exactly one ok, one conflict; assigned once;
  one run; one lease; leaseEpoch 1.
- N-claimer (15) stress via Promise.all -> one ok, N-1 conflicts, no double-assignment.
- N concurrent claimNextJob for one job -> exactly one non-null claim.
- N concurrent lease renewals -> exactly one wins.

Verified these concurrent tests FAIL against the old read-check-write (double-assign)
and pass after the fix.
2026-05-29 20:59:08 -07:00
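The race fixed in 33c1d8d5fa, and why a single indivisible compare-and-write closes it, can be shown with an in-memory sketch. `MemoryStore` and `tryClaim` below are hypothetical stand-ins for the datastore's memory provider and the coordinator; the `await Promise.resolve()` only simulates the I/O gap in which a second claimer can interleave.

```typescript
interface Versioned<T> {
  rev: number;
  value: T;
}

class MemoryStore<T> {
  private docs = new Map<string, Versioned<T>>();

  put(id: string, value: T): void {
    this.docs.set(id, { rev: 1, value });
  }

  get(id: string): Versioned<T> | undefined {
    return this.docs.get(id);
  }

  // Compare and write happen in one synchronous step, with no await between them.
  updateIfMatch(id: string, expectedRev: number, next: T): boolean {
    const current = this.docs.get(id);
    if (current === undefined || current.rev !== expectedRev) return false;
    this.docs.set(id, { rev: current.rev + 1, value: next });
    return true;
  }
}

interface JobState {
  assignedTo: string | null;
}

async function tryClaim(
  store: MemoryStore<JobState>,
  jobId: string,
  factoryId: string,
): Promise<"ok" | "conflict"> {
  const current = store.get(jobId);
  if (current === undefined || current.value.assignedTo !== null) return "conflict";
  await Promise.resolve(); // simulated I/O gap: other claimers read the same rev here
  return store.updateIfMatch(jobId, current.rev, { assignedTo: factoryId })
    ? "ok"
    : "conflict";
}
```

Running several `tryClaim` calls through `Promise.all` lets every contender read the same rev, yet only the first `updateIfMatch` succeeds because it bumps the rev the others still expect.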
saravanakumardb1
95dd7aa1d0 docs(platform-service): fleet module README (containers, claim/lease/fence protocol, reaper) 2026-05-29 20:20:54 -07:00
saravanakumardb1
8eb02c48aa feat(platform-service): fleet REST routes + module registration (P2 foundation)
Guarded REST under /api (auth + productId, like items): POST /fleet/jobs (idempotent
submit), GET /fleet/jobs (by stage/idempotencyKey), GET /fleet/jobs/:id, PATCH
/fleet/jobs/:id (fenced transition), POST /fleet/claim, lease renew/release,
factories/heartbeat, and runs/events streams. Every body validated with the Zod
schemas; fenced/conflict map to 409, missing to 404, invalid to 400. Registers
fleetRoutes in server.ts next to itemRoutes. Routes tested via Fastify inject on
the memory provider (real coordinator).
2026-05-29 20:20:46 -07:00
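The status mapping stated in 8eb02c48aa (fenced/conflict to 409, missing to 404, invalid to 400) amounts to a small table. The `FleetOutcome` union and `httpStatusFor` are hypothetical names, and 200 for the success case is an assumption.

```typescript
// Hypothetical outcome labels for a coordinator call.
type FleetOutcome = "ok" | "fenced" | "conflict" | "missing" | "invalid";

const STATUS_BY_OUTCOME: Record<FleetOutcome, number> = {
  ok: 200, // assumption: the commit does not state the success code
  fenced: 409,
  conflict: 409,
  missing: 404,
  invalid: 400,
};

function httpStatusFor(outcome: FleetOutcome): number {
  return STATUS_BY_OUTCOME[outcome];
}
```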
saravanakumardb1
8f51570da7 feat(platform-service): fleet coordinator — claim/lease/fence/heartbeat/reaper (P2 foundation)
The concurrency core (§4/§7/§8/§18/§25):
- claimNextJob: priority+age selection over queued/dep-satisfied jobs whose caps
  are a subset of the factory's, then tryClaimJob does a rev CAS to flip to
  assigned + acquire the lease — exactly one contender wins, no double-assignment.
- leases + fencing: acquire/reclaim bumps leaseEpoch; patchJobFenced/renew/release
  reject a call whose leaseEpoch < job.leaseEpoch (zombie worker can't overwrite).
- heartbeat + isFactoryStale for factory liveness.
- reapExpiredLeases: returns expired-lease jobs to queued/blocked, bumps the epoch
  (fencing the dead holder), preserves the checkpoint pointer (resume), marks the
  lease expired; idempotent. Documents why Cosmos TTL cannot do this.
- submit: idempotent (dedup/supersede/409) + submit-time dependency cycle
  detection; deps gating (shipped, or testing when depsMode:soft).

Tests drive the atomic-claim race, fencing, and reaper deterministically via the
rev CAS (no real threads).
2026-05-29 20:20:30 -07:00
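The fencing and reaper behaviour in 8f51570da7 can be sketched with two pure helpers. The `LeasedJob` shape, the numeric timestamps, and always returning to `queued` (the commit says queued/blocked) are simplifying assumptions.

```typescript
// Hypothetical minimal view of a job with an active lease.
interface LeasedJob {
  stage: string;
  leaseEpoch: number;
  leaseExpiresAt: number | null; // epoch millis, null when no lease is held
  checkpoint?: string;
}

// A caller holding an older epoch is a zombie and must be rejected.
function isFenced(callerEpoch: number, job: LeasedJob): boolean {
  return callerEpoch < job.leaseEpoch;
}

// Returns the job unchanged unless its lease has expired; on expiry it is
// requeued, the epoch is bumped (fencing the dead holder), and the
// checkpoint pointer is preserved via the spread.
function reapIfExpired(job: LeasedJob, now: number): LeasedJob {
  if (job.leaseExpiresAt === null || job.leaseExpiresAt > now) return job;
  return { ...job, stage: "queued", leaseEpoch: job.leaseEpoch + 1, leaseExpiresAt: null };
}
```

Clearing `leaseExpiresAt` on reap is what makes a second pass a no-op, which mirrors the idempotency the commit calls out.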
saravanakumardb1
fada354df8 feat(platform-service): fleet repositories with rev compare-and-swap (P2 foundation)
One repository per fleet_* container on the @bytelyst/datastore abstraction
(memory + cosmos): create/getById/list (by productId, stage, idempotencyKey),
partition-aware single-partition queries, ordered append-only appendEvent, and
runs/leases/factories/profiles/artifacts CRUD. Adds revUpdateJob/revUpdateLease —
a `rev`-token compare-and-swap that writes only when the stored rev still matches
(the optimistic-concurrency primitive for atomic claim + fenced transitions;
maps to Cosmos _etag/If-Match in production).
2026-05-29 20:20:15 -07:00
saravanakumardb1
721d3fcb48 feat(platform-service): fleet data model + container registration (P2 foundation)
Adds the agent-gigafactory fleet data model (modules/fleet/types.ts): Zod schemas
as the source of truth with inferred types (no `any`) for the 7 durable containers
— FleetJobDoc, FleetRunDoc, FleetLeaseDoc, FleetFactoryDoc, FleetProfileDoc,
FleetEventDoc, FleetArtifactDoc — each carrying productId. Lifecycle stages mirror
the agent-queue gigafactory spec (queued|blocked|assigned|building|review|testing|
shipped|failed|dead_letter). Registers fleet_* containers with their partition keys
(/productId for jobs/factories/profiles, /jobId for runs/leases/events/artifacts).
2026-05-29 20:19:59 -07:00
saravanakumardb1
9d405952e2 fix(platform-service): TODO-4 — typed cast for request.auth augmentation
The DevOps admin preHandler read 'auth' as '(request as any).auth'.
The proper Fastify pattern is 'declare module' augmentation in
@bytelyst/fastify-auth, but the inline cast through 'unknown' is
sufficient for now and avoids touching the shared auth package.

Changed:
  - 'const auth = (request as any).auth;' →
    'const auth = (request as unknown as { auth?: { role?: string } }).auth;'

Inline comment notes the cleaner 'declare module' alternative.

Final ecosystem state:
  scripts/check-rule-violations.sh: 0 findings across all rules ✓

  web-hardcoded-hex:         0  ✓
  b5-hardcoded-product-id:   0  ✓
  b4-console-log:            0  ✓
  b4-swift-print:            0  ✓
  b4-python-print:           0  ✓
  ts-any-type:               0  ✓
  b7-emoji-in-code:          0  ✓
2026-05-23 19:29:26 -07:00
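The cast through `unknown` used in 9d405952e2 can be shown in isolation. `AuthInfo` and `readAuth` are hypothetical names, and a plain `object` stands in for the Fastify request so the sketch has no dependencies.

```typescript
interface AuthInfo {
  role?: string;
}

// Reads an optional `auth` property without `any`: widen to unknown first,
// then narrow to the one shape this handler cares about.
function readAuth(request: object): AuthInfo | undefined {
  return (request as unknown as { auth?: AuthInfo }).auth;
}

// Hypothetical admin check built on top of the typed read.
function isAdmin(request: object): boolean {
  return readAuth(request)?.role === "admin";
}
```

The cast still asserts a shape the compiler cannot verify, which is why the commit notes module augmentation as the cleaner long-term option.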
saravanakumardb1
cde1a0b73c test(platform-service): align getAllProducts test with invttrdg fallback
CI run 67 surfaced a real test failure:

  src/modules/products/cache.test.ts:104
    getAllProducts > returns all cached products
    expected [ { id: 'lysnrai', …(11) }, …(2) ] to have a length of 2
    but got 3

Root cause: cache.ts has a TEMPORARY_FALLBACK_PRODUCTS map (currently
just 'invttrdg') that getAllProducts() merges into its return value
on top of the loaded cache. The test fixture loads 2 products
(lysnrai, mindlyst), so the actual return is 3 — the test was
written before the fallback shim landed and never got updated.

Two ways to reconcile: (a) make the test reflect today's behaviour,
or (b) gut the fallback. The cache.ts comment explicitly marks
the fallback as 'TODO(platform): remove after creating the real
product …', so the right move is (a): keep the shim in place and
make the test enforce the documented contract.

  - assertion now: toHaveLength(3) + .toContain('invttrdg')
  - inline comment ties the expectation back to cache.ts so a
    future cleanup removing the fallback will obviously need to
    drop it back to 2

Verified locally:
  pnpm vitest run cache.test.ts   -> 8/8 pass
2026-05-23 17:23:16 -07:00
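The off-by-one in cde1a0b73c follows directly from the merge. The sketch below assumes a simplified `Product` shape and that a cached entry wins over a fallback with the same id; neither detail is stated in the commit.

```typescript
interface Product {
  id: string;
}

// Stand-in for the temporary shim described in the commit.
const TEMPORARY_FALLBACK_PRODUCTS = new Map<string, Product>([
  ["invttrdg", { id: "invttrdg" }],
]);

// Merges the fallback map with the loaded cache.
function getAllProducts(cache: Map<string, Product>): Product[] {
  const merged = new Map(TEMPORARY_FALLBACK_PRODUCTS);
  // Assumption: a real cached product replaces the fallback with the same id.
  cache.forEach((product, id) => merged.set(id, product));
  return Array.from(merged.values());
}
```

With a fixture of two cached products the result has three entries, which is the length the updated assertion expects.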
saravanakumardb1
191b81756d fix(platform-service): resolve 3 TS errors in /devops/info handler
The platform-service build was failing with 3 unrelated TS errors,
surfaced while running the Gitea outdated-package detector earlier
in this session:

  src/server.ts(18,8):   Cannot find module '@bytelyst/devops/server'
  src/server.ts(318,61): Property 'cosmosEndpoint' does not exist on type 'ProductIdentity'
  src/server.ts(321,42): Property 'platformServiceUrl' does not exist on type 'ProductIdentity'

Root causes (two distinct bugs):

1. Stale install. '@bytelyst/devops' was already declared as
   'workspace:*' in services/platform-service/package.json (line 24),
   but node_modules/@bytelyst/devops/ did not exist. Re-running
   'pnpm install' at the workspace root materialised the symlink.

2. Variable shadowing. In the GET /devops/info handler the code
   declared a local 'const config' from loadProductIdentity() that
   shadowed the module-level 'config' (env vars) imported from
   './lib/config.js' at line 112. The author then tried to read
   'config.cosmosEndpoint' and 'config.platformServiceUrl' off the
   ProductIdentity, where those keys never exist:

     ProductIdentity = {
       productId, displayName, licensePrefix, configDirName,
       envVarPrefix, bundleIdSuffix, packageName
     }

   The intended values live on the env config:
     config.COSMOS_ENDPOINT  (Zod-validated, required at boot)
     config.HOST + config.PORT (defaults '0.0.0.0' / 4003)

   There is no 'platformServiceUrl' field anywhere in the codebase —
   it only appeared in this single buggy line. Reconstructed as
   '${HOST}:${PORT}' which is the URL admins would use to reach
   this service for the devops/info diagnostic dashboard.

Fix (services/platform-service/src/server.ts:310-339):
  - Rename local 'const config' to 'const productIdentity' to break
    the shadowing.
  - Use productIdentity.productId for the devops productId field.
  - Use config.COSMOS_ENDPOINT (the env config) for the cosmos
    dependency health check URL.
  - Use `http://${config.HOST}:${config.PORT}` for the extra
    platformServiceUrl field.
  - Add a doc comment block explaining the two-config distinction
    so future contributors don't reintroduce the shadow.

Verified:
  pnpm --filter @lysnrai/platform-service build    OK (0 errors)
  pnpm --filter @lysnrai/platform-service test     1511/1512 pass

The 1 remaining failure (src/modules/products/cache.test.ts line 104,
'returns all cached products' expects 2 products but got 3) is a
PRE-EXISTING product-registry test drift on main, verified by
stashing this commit's changes and re-running the same test against
the unmodified tree. It will be addressed separately.
2026-05-23 12:46:10 -07:00
root
c39da91588 feat(platform): add /devops page with platform common devops package
- Add @bytelyst/devops backend endpoints to platform-service
- Add /api/devops/version (public) and /api/devops/info (admin) endpoints
- Add /devops page to admin-web using @bytelyst/devops/ui DevopsPanel
- Add devops link to admin web sidebar navigation
- Add build metadata and runtime information display
- Follow trading web devops pattern

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-11 03:38:06 +00:00
root
8f449a5ee5 fix(platform): add temporary invttrdg product fallback 2026-05-05 21:07:26 +00:00
663dcdecab chore(platform): replace runtime console diagnostics
What changed:
- Replaced platform-service runtime console diagnostics with structured stderr logging helpers.
- Preserved push notification, AI diagnostics, auto-trigger, and diagnostics repository error handling semantics.

Warning impact:
- Platform runtime no-console warnings: 12 -> 0.
- Workspace lint: 12 -> 0 warnings.

Verification:
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint <runtime files> --ext .ts
- pnpm lint
2026-05-04 16:48:25 -07:00
2c9dc1870d chore(platform): document script CLI output
What changed:
- Added explicit no-console policy for platform-service CLI/codegen scripts.
- Replaced the remaining migrate-referrals catch any with unknown narrowing.
- Added a TODO for making migrate-referrals --help independent of service env loading.

Warning impact:
- Platform script lint warnings: 78 -> 0.
- Workspace lint: 90 -> 12 warnings.

Verification:
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service exec eslint scripts --ext .ts
- pnpm lint

Note:
- migrate-referrals --help still requires service env due to eager config import; TODO added for a follow-up behavior-safe refactor.
2026-05-04 16:45:42 -07:00
b4403300fa chore(nomgap): finalize product deployment config
What changed:
- Remove nomgap-web from the ecosystem Docker stack now that web is Vercel-hosted.
- Add a TODO for deciding whether local Docker smoke tests still need a NomGap web service.
- Update NomGap product containers and feature flags.
- Seed the NomGap push trigger flag without duplicating the common encryption flag.

Safety notes:
- Dropped unrelated pnpm-lock.yaml formatting churn instead of committing it.

Verification:
- node JSON.parse products/nomgap/product.json
- ruby Psych.safe_load docker-compose.ecosystem.yml
- pnpm --filter @bytelyst/admin-web typecheck
- pnpm --filter @bytelyst/admin-web test
- pnpm --filter @bytelyst/admin-web exec eslint . --ext .ts,.tsx
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint . --ext .ts,.tsx
- pnpm typecheck
- pnpm lint
2026-05-04 16:29:20 -07:00
021f053143 chore(platform): type predictive campaign events
What changed:
- Replaced predictive campaign-engine event-bus any casts with typed bus.emit calls.
- Preserved existing fire-and-forget event dispatch behavior with void emit calls.

Warning impact:
- services/platform-service/src/modules/predictive-analytics/campaign-engine.ts: no-explicit-any 5 -> 0.
- Workspace lint baseline: 189 warnings -> 184 warnings.

Verification:
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service exec eslint src/modules/predictive-analytics/campaign-engine.ts
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint . --ext .ts,.tsx
- pnpm lint
2026-05-04 15:58:44 -07:00
9625999ad4 chore(platform-service): remove stale lint disables
Drop unused eslint-disable comments now that the ignored destructured names use the underscore convention.

Verification: pnpm --filter @lysnrai/platform-service build; pnpm --filter @lysnrai/platform-service exec vitest run src/lib/declarative-loader.test.ts --pool forks; pnpm --filter @lysnrai/platform-service exec eslint src/lib/declarative-loader.test.ts --ext .ts,.tsx; pnpm lint.
2026-05-04 15:22:37 -07:00
saravanakumardb1
7d266bfcc0 fix(docker): INFRA-gap-02 unblock full-stack docker compose up
Three coordinated fixes so 'docker compose up cosmos-emulator platform-service
cowork-service --wait' completes end-to-end (pre-existing blocker surfaced by
W1 post-push review).

1. Remove harmful prepare:tsc from @bytelyst/react-native-platform-sdk
   package.json. The hook fires during pnpm install --frozen-lockfile against
   an empty src/ tree (because Dockerfiles COPY package.jsons before
   sources), tsc aborts, install fails. Canonical monorepo build flow is
   pnpm -r build using the existing build:tsc script; prepare only runs for
   git+ URL installs (which this published package doesn't use), so removing
   it is lossless.

2. Add --ignore-scripts to platform-service + mcp-server Dockerfile install
   steps. Mirrors the pattern already used by extraction-service/Dockerfile,
   dashboards/admin-web/Dockerfile, dashboards/tracker-web/Dockerfile.
   Belt-and-braces against future prepare-hook regressions in any workspace
   package.

3. Expand .dockerignore node_modules/dist/.next/coverage to **/ globs.
   Docker's .dockerignore with bare 'node_modules' only matches root-level;
   nested packages/*/node_modules/ were being COPY'd into images, poisoning
   them with host-absolute-path .bin shims (e.g. @bytelyst/storage's tsc
   shim resolved to /learning_voice_ai_agent/node_modules/.pnpm/... which
   doesn't exist in the container → MODULE_NOT_FOUND). The glob fix makes
   COPY packages/ packages/ deliver source only.

Gap: INFRA-gap-02
Verified:
  pnpm install --frozen-lockfile                              
  pnpm --filter @bytelyst/react-native-platform-sdk build     
  pnpm --filter @bytelyst/react-native-platform-sdk typecheck 
  docker compose build platform-service                        (previously failed)
  docker compose build mcp-server                             
  docker compose build extraction-service                     
2026-04-16 15:48:32 -07:00
saravanakumardb1
a954f434ef fix(lint): repair pre-existing baseline lint errors blocking W1 gates
Baseline origin/main pnpm -r lint failed with 90+ errors across
platform-service, extraction-service, and tracker-web. These block the
shared W1 quality gates (prompts/README.md §4) which require all of
typecheck + lint + build + test to be green before committing W1 infra
work. Fixes are strictly scoped to unblock gates:

- eslint.config.js: extend @typescript-eslint/no-unused-vars with
  varsIgnorePattern / caughtErrorsIgnorePattern / destructuredArrayIgnorePattern
  all honouring the existing `^_` convention already used for args.
- platform-service: add file-level eslint-disable for
  @typescript-eslint/no-unused-vars, no-redeclare, no-useless-escape on
  the 33 legacy files failing lint (ab-testing, ai-diagnostics,
  diagnostics, predictive-analytics, broadcasts/types, surveys/types,
  lib/push-notifications).
- extraction-service tests: drop unused vitest imports (beforeEach,
  afterEach, HealthCheck).
- tracker-web tracker-proxy.test.ts: prefix unused url with _.
- Applied eslint --fix on platform-service which normalised a handful
  of `let` → `const` and removed one redundant disable comment.

Scope creep vs W1 "Files You Own" is acknowledged — user explicitly
approved this path when baseline rot was surfaced.

Verified: pnpm -r typecheck, lint, build, test all green.
2026-04-16 13:06:37 -07:00
saravanakumardb1
05594a334f feat(jobs): register devintelli-daily-sync cron job (0 6 * * *)
- Add devintelli-daily-sync handler: POST to DevIntelli backend /api/sync/daily
- Uses x-internal-key header for service-to-service auth
- Add DEVINTELLI_BACKEND_URL + DEVINTELLI_INTERNAL_API_KEY env vars
- Cron: 0 6 * * * (6am UTC daily), timeout: 5 min
- Returns triggered/skipped/totalConnections metrics from DevIntelli response
2026-04-04 23:37:25 -07:00
saravanakumardb1
89e200fa9f feat(flags): seed devintelli feature flags (11 flags)
- devintelli_enabled, devintelli_scan_enabled, devintelli_daily_sync
- 7 analytics panel flags (overview, commit, pr, review, language, productivity, repo)
- devintelli_billing_enabled (disabled by default)

Aligns with backend/src/lib/feature-flags.ts defaults
2026-04-04 23:37:25 -07:00
fdf9286e34 fix(audit): preserve source event timestamps 2026-04-04 11:27:21 -07:00
ff8c5eb704 fix(runtime): add queued agent run state 2026-04-04 11:11:45 -07:00
fe36296196 feat(runtime): add platform runtime projection api 2026-04-04 01:14:37 -07:00
e377351842 feat(timeline): add platform timeline ingest and query api 2026-04-04 00:54:07 -07:00
705f58c5c5 chore(deploy): remove railway deployment artifacts 2026-04-03 17:05:15 -07:00
saravanakumardb1
a87c533fd3 feat(cowork-service): scaffold Fastify bridge + seed clawcowork feature flags (H.1 + H.2)
H.1: Product registration
- Added 12 clawcowork feature flags to platform-service flags/seed.ts
  (sandbox, plugins, mcp, scheduling, computer_use, parallel_agents,
   marketplace, wasm, llm_multi_model, audit, platform_auth, dispatch_api)

H.2: cowork-service scaffold (services/cowork-service/)
- @lysnrai/cowork-service on port 4009, productId clawcowork
- createServiceApp + startService from @bytelyst/fastify-core
- Modules: health (dependency check), tasks (submit/list/get/cancel)
- Zod-validated config, Swagger, readiness endpoint
- 8 tests passing (1 bootstrap + 7 task routes), typecheck clean
2026-04-02 20:39:22 -07:00
saravanakumardb1
46ee14371c fix(ci): add --pool forks to all vitest test scripts to fix kill EPERM on Node v25
Root cause: tinypool worker teardown calls kill() which returns EPERM
in the act_runner host environment on Node.js v25.2.1. Tests pass but
the vitest process crashes during cleanup, causing CI failure.

Fix: --pool forks CLI flag on every package/service test script, plus
pool: 'forks' in all vitest.config.ts files. This uses child_process.fork()
worker management which handles termination cleanly.

60 package.json files updated, 10 vitest.config.ts files updated.
2026-03-27 23:23:38 -07:00
saravanakumardb1
0628f5b3bf test(platform): add 4 impersonation business rule tests (6→10) 2026-03-27 13:22:12 -07:00
saravanakumardb1
3cda7190fb feat(platform): add i18n translations module (P3.20) 2026-03-27 11:32:39 -07:00
saravanakumardb1
85aca5534b fix(docker): sync all 3 service Dockerfiles with complete workspace package.json list
platform-service had 16/60, extraction-service had 14/60, mcp-server had 34/60.
All three now list all 57 packages + 4 services + 2 dashboards + scripts.
Required for pnpm install --frozen-lockfile to resolve the full workspace.
2026-03-24 11:55:47 -07:00
saravanakumardb1
59f6ac1b9a fix(ai-diagnostics): keep cluster filters numeric 2026-03-23 16:21:08 -07:00
saravanakumardb1
cd811114e5 fix(devops): harden local shared-service docker bring-up 2026-03-22 12:34:38 -07:00
saravanakumardb1
67ef6a6068 fix(exports): preserve processing state on async export failures 2026-03-22 11:58:54 -07:00
saravanakumardb1
265599d005 fix(platform-service): harden broadcast metrics and export job lifecycle 2026-03-22 11:57:47 -07:00
saravanakumardb1
dda38aa009 fix(exports): strip data payload from list endpoint + update audit doc
- exports/routes: exclude inline data from GET /exports list response
  to prevent returning megabytes of serialized export data (perf+security)
- Update WORKSPACE_TODO_AUDIT.md: add post-audit review section with
  9 bugs found and fixed across 2 commits (73b07c2, 841cdf3), mark
  all action plan sprints complete
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:23:08 -07:00
saravanakumardb1
841cdf3a16 fix(platform-service+events): 3 more gaps in diagnostics + delivery
- diagnostics/subscribers: wire session.created email notification to
  target user using existing 'diagnostics-session-created' template
  (was just logging instead of sending the email)
- events/types: add missing 'currency' field to payment.failed schema
  (payment.succeeded had it, payment.failed did not — inconsistency)
- delivery/subscribers: use event.payload.currency instead of hardcoded
  empty string in payment-failed email variables
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:20:24 -07:00
saravanakumardb1
73b07c2c3a fix(platform-service): 5 bugs in recent P2/P3 implementations
- diagnostics/subscribers: use correct template IDs
  'diagnostics-session-cancelled' and 'diagnostics-session-completed'
  instead of non-existent 'generic' (would throw at runtime)
- delivery/templates: add missing 'broadcast' email template used by
  broadcast delivery route (dispatchEmail would throw on unknown ID)
- broadcasts/routes: replace broken dot-path 'metrics.sent' update
  with proper updateBroadcastMetrics() call, add productName variable
- exports/routes: store serialized data on job doc, add download
  endpoint GET /exports/:id/download with content-type headers,
  exclude data payload from metadata GET endpoint
- waitlist/routes: store invitation doc ID (inv_...) instead of
  code string (WL-...) in invitationCodeId field
- delivery/delivery.test.ts: update template count 12 -> 13
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:14:55 -07:00
saravanakumardb1
1576b699b0 feat(platform-service): resolve all P3 TODOs — diagnostics notifications + test cleanup
- diagnostics/subscribers: notify admin via email when debug session is
  cancelled (looks up session creator via getSession + getUserById)
- diagnostics/subscribers: email session summary (logs/traces/screenshots)
  to admin when debug session completes
- diagnostics/subscribers: send Slack alert via dispatchSlack for FATAL
  logs ingested during debug sessions (on-call engineer notification)
- feedback-client/integration.test.ts: replace TODO-4 with clear NOTE,
  fix unused var lint errors
- feedback-client/gdpr.test.ts: mark lifecycle policy as accepted,
  remove console.log + unused blobPath variable
- Update WORKSPACE_TODO_AUDIT.md — P3 section: all 5 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:03:51 -07:00
saravanakumardb1
6f03a74a76 feat(platform-service): resolve P2 TODOs — exports, broadcasts, telemetry, waitlist
- telemetry/repository: group upsertEventsBatch by pk — same-partition
  writes sequential, different partitions parallel (reduces contention)
- exports/routes: wire async export processing via process.nextTick —
  queries users/audit/telemetry/usage/subscriptions/licenses, serializes
  to CSV or JSON, updates job status with rowCount and fileSizeBytes
- broadcasts/repository: replace mock estimateTargetReach with real user
  count query from auth module, respects percentageRollout
- broadcasts/routes: wire async broadcast delivery — fetches target users,
  dispatches email per recipient, updates metrics on completion
- waitlist/routes: auto-generate invitation codes via invitations module
  when batch-inviting waitlist entries (WL-XXXXXXXX format, 14-day trial)
- CAPTCHA (item 12) deferred — requires external API keys
- Update WORKSPACE_TODO_AUDIT.md — P2 section: 5/6 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:41:11 -07:00
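The batching strategy described for upsertEventsBatch in 6f03a74a76 (sequential within a partition, parallel across partitions) can be sketched as below. The `TelemetryEvent` shape and the injected `upsert` callback are assumptions made so the example runs without a datastore.

```typescript
interface TelemetryEvent {
  pk: string;
  id: string;
}

async function upsertEventsBatch(
  events: readonly TelemetryEvent[],
  upsert: (event: TelemetryEvent) => Promise<void>,
): Promise<void> {
  // Group events by partition key, preserving input order within each group.
  const byPk = new Map<string, TelemetryEvent[]>();
  for (const event of events) {
    const group = byPk.get(event.pk);
    if (group) group.push(event);
    else byPk.set(event.pk, [event]);
  }

  // Partitions run in parallel; writes inside one partition run one at a time.
  await Promise.all(
    Array.from(byPk.values()).map(async (group) => {
      for (const event of group) await upsert(event);
    }),
  );
}
```

Serialising per partition avoids concurrent writes contending on the same partition while still overlapping work across partitions.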
saravanakumardb1
09525f671f fix(platform-service): 3 bugs in delivery subscribers + survey incentives
- delivery/subscribers: welcome email used raw productId as productName,
  now uses resolveProductName() for proper display name
- delivery/subscribers: remove redundant String(daysLeft) in trial_expiring
- surveys/routes: incentiveClaimed was set outside if(sub) block, marking
  response as claimed even when user has no subscription. Moved inside
  if(sub) so claims are only recorded when incentive is actually granted
2026-03-22 00:19:32 -07:00
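The incentiveClaimed fix in 09525f671f comes down to where the flag is set. The sketch below uses hypothetical `Subscription` and `SurveyResponse` shapes and a millisecond `currentPeriodEnd`; only the placement of the assignment inside the `if (sub)` block reflects the commit.

```typescript
interface Subscription {
  currentPeriodEnd: number; // epoch millis (assumption)
}

interface SurveyResponse {
  incentiveClaimed: boolean;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Grants a pro_days incentive. The claim is recorded only when a
// subscription exists and the extension was actually applied.
function grantProDays(
  response: SurveyResponse,
  sub: Subscription | undefined,
  days: number,
): void {
  if (sub) {
    sub.currentPeriodEnd += days * MS_PER_DAY;
    response.incentiveClaimed = true; // inside the block: no subscription, no claim
  }
}
```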
saravanakumardb1
2f06aacc27 fix(platform-service): resolve P1 TODOs — delivery email subscribers + survey incentives
- delivery/subscribers: add resolveUserEmail() helper using auth getById()
- payment.failed: look up user email, dispatch payment-failed template
- trial_expiring: look up user, compute daysLeft from expiresAt, dispatch
- trial_expired: look up user, dispatch trial-expired template with upgradeUrl
- surveys/routes: wire incentive fulfillment to subscriptions module
  - pro_days: extend currentPeriodEnd by incentive amount
  - credits: add bonus tokensIncluded via subscriptions repo
- Update WORKSPACE_TODO_AUDIT.md — P0+P1 all resolved (7/18)
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:14:41 -07:00