Commit Graph

295 Commits

Author SHA1 Message Date
saravanakumardb1
33c1d8d5fa fix(platform-service): make fleet job claim truly atomic via datastore updateIfMatch
The foundation's revUpdateJob/revUpdateLease did a read -> rev-check -> write with
await points between them, so two CONCURRENT claims could both read the same rev,
both pass the check, and both write — a double-assignment the old (sequential) race
test could not catch.

Rewire revUpdateJob/revUpdateLease to delegate to the datastore's updateIfMatch,
which performs the compare and the write as one indivisible operation (Cosmos
If-Match; synchronous compare-set on memory). The coordinator's tryClaimJob keeps
identical external behavior (ok/conflict) but is now genuinely single-winner.

Upgrades the coordinator tests to prove atomicity under TRUE concurrency:
- two contenders via Promise.all -> exactly one ok, one conflict; assigned once;
  one run; one lease; leaseEpoch 1.
- N-claimer (15) stress via Promise.all -> one ok, N-1 conflicts, no double-assignment.
- N concurrent claimNextJob for one job -> exactly one non-null claim.
- N concurrent lease renewals -> exactly one wins.

Verified these concurrent tests FAIL against the old read-check-write (double-assign)
and pass after the fix.
2026-05-29 20:59:08 -07:00
saravanakumardb1
95dd7aa1d0 docs(platform-service): fleet module README (containers, claim/lease/fence protocol, reaper) 2026-05-29 20:20:54 -07:00
saravanakumardb1
8eb02c48aa feat(platform-service): fleet REST routes + module registration (P2 foundation)
Guarded REST under /api (auth + productId, like items): POST /fleet/jobs (idempotent
submit), GET /fleet/jobs (by stage/idempotencyKey), GET /fleet/jobs/:id, PATCH
/fleet/jobs/:id (fenced transition), POST /fleet/claim, lease renew/release,
factories/heartbeat, and runs/events streams. Every body validated with the Zod
schemas; fenced/conflict map to 409, missing to 404, invalid to 400. Registers
fleetRoutes in server.ts next to itemRoutes. Routes tested via Fastify inject on
the memory provider (real coordinator).
2026-05-29 20:20:46 -07:00
saravanakumardb1
8f51570da7 feat(platform-service): fleet coordinator — claim/lease/fence/heartbeat/reaper (P2 foundation)
The concurrency core (§4/§7/§8/§18/§25):
- claimNextJob: priority+age selection over queued/dep-satisfied jobs whose caps
  are a subset of the factory's, then tryClaimJob does a rev CAS to flip to
  assigned + acquire the lease — exactly one contender wins, no double-assignment.
- leases + fencing: acquire/reclaim bumps leaseEpoch; patchJobFenced/renew/release
  reject a call whose leaseEpoch < job.leaseEpoch (zombie worker can't overwrite).
- heartbeat + isFactoryStale for factory liveness.
- reapExpiredLeases: returns expired-lease jobs to queued/blocked, bumps the epoch
  (fencing the dead holder), preserves the checkpoint pointer (resume), marks the
  lease expired; idempotent. Documents why Cosmos TTL cannot do this.
- submit: idempotent (dedup/supersede/409) + submit-time dependency cycle
  detection; deps gating (shipped, or testing when depsMode:soft).

Tests drive the atomic-claim race, fencing, and reaper deterministically via the
rev CAS (no real threads).
2026-05-29 20:20:30 -07:00
saravanakumardb1
fada354df8 feat(platform-service): fleet repositories with rev compare-and-swap (P2 foundation)
One repository per fleet_* container on the @bytelyst/datastore abstraction
(memory + cosmos): create/getById/list (by productId, stage, idempotencyKey),
partition-aware single-partition queries, ordered append-only appendEvent, and
runs/leases/factories/profiles/artifacts CRUD. Adds revUpdateJob/revUpdateLease —
a `rev`-token compare-and-swap that writes only when the stored rev still matches
(the optimistic-concurrency primitive for atomic claim + fenced transitions;
maps to Cosmos _etag/If-Match in production).
2026-05-29 20:20:15 -07:00
saravanakumardb1
721d3fcb48 feat(platform-service): fleet data model + container registration (P2 foundation)
Adds the agent-gigafactory fleet data model (modules/fleet/types.ts): Zod schemas
as the source of truth with inferred types (no `any`) for the 7 durable containers
— FleetJobDoc, FleetRunDoc, FleetLeaseDoc, FleetFactoryDoc, FleetProfileDoc,
FleetEventDoc, FleetArtifactDoc — each carrying productId. Lifecycle stages mirror
the agent-queue gigafactory spec (queued|blocked|assigned|building|review|testing|
shipped|failed|dead_letter). Registers fleet_* containers with their partition keys
(/productId for jobs/factories/profiles, /jobId for runs/leases/events/artifacts).
2026-05-29 20:19:59 -07:00
saravanakumardb1
9d405952e2 fix(platform-service): TODO-4 \u2014 typed cast for request.auth augmentation
The DevOps admin preHandler read 'auth' as '(request as any).auth'.
The proper Fastify pattern is 'declare module' augmentation in
@bytelyst/fastify-auth, but the inline cast through 'unknown' is
sufficient for now and avoids touching the shared auth package.

Changed:
  - 'const auth = (request as any).auth;' \u2192
    'const auth = (request as unknown as { auth?: { role?: string } }).auth;'

Inline comment notes the cleaner 'declare module' alternative.

Final ecosystem state:
  scripts/check-rule-violations.sh: 0 findings across all rules \u2713

  web-hardcoded-hex:         0  \u2713
  b5-hardcoded-product-id:   0  \u2713
  b4-console-log:            0  \u2713
  b4-swift-print:            0  \u2713
  b4-python-print:           0  \u2713
  ts-any-type:               0  \u2713
  b7-emoji-in-code:          0  \u2713
2026-05-23 19:29:26 -07:00
saravanakumardb1
cde1a0b73c test(platform-service): align getAllProducts test with invttrdg fallback
CI run 67 surfaced a real test failure:

  src/modules/products/cache.test.ts:104
    getAllProducts > returns all cached products
    expected [ { id: 'lysnrai', …(11) }, …(2) ] to have a length of 2
    but got 3

Root cause: cache.ts has a TEMPORARY_FALLBACK_PRODUCTS map (currently
just 'invttrdg') that getAllProducts() merges into its return value
on top of the loaded cache. The test fixture loads 2 products
(lysnrai, mindlyst), so the actual return is 3 — the test was
written before the fallback shim landed and never got updated.

Two ways to reconcile: (a) make the test reflect today's behaviour,
or (b) gut the fallback. The cache.ts comment explicitly marks
the fallback as 'TODO(platform): remove after creating the real
product …', so the right move is (a): keep the shim in place and
make the test enforce the documented contract.

  - assertion now: toHaveLength(3) + .toContain('invttrdg')
  - inline comment ties the expectation back to cache.ts so a
    future cleanup removing the fallback will obviously need to
    drop it back to 2

Verified locally:
  pnpm vitest run cache.test.ts   -> 8/8 pass
2026-05-23 17:23:16 -07:00
saravanakumardb1
191b81756d fix(platform-service): resolve 3 TS errors in /devops/info handler
The platform-service build was failing with 3 unrelated TS errors,
surfaced while running the Gitea outdated-package detector earlier
in this session:

  src/server.ts(18,8):   Cannot find module '@bytelyst/devops/server'
  src/server.ts(318,61): Property 'cosmosEndpoint' does not exist on type 'ProductIdentity'
  src/server.ts(321,42): Property 'platformServiceUrl' does not exist on type 'ProductIdentity'

Root causes (two distinct bugs):

1. Stale install. '@bytelyst/devops' was already declared as
   'workspace:*' in services/platform-service/package.json (line 24),
   but node_modules/@bytelyst/devops/ did not exist. Re-running
   'pnpm install' at the workspace root materialised the symlink.

2. Variable shadowing. In the GET /devops/info handler the code
   declared a local 'const config' from loadProductIdentity() that
   shadowed the module-level 'config' (env vars) imported from
   './lib/config.js' at line 112. The author then tried to read
   'config.cosmosEndpoint' and 'config.platformServiceUrl' off the
   ProductIdentity, where those keys never exist:

     ProductIdentity = {
       productId, displayName, licensePrefix, configDirName,
       envVarPrefix, bundleIdSuffix, packageName
     }

   The intended values live on the env config:
     config.COSMOS_ENDPOINT  (Zod-validated, required at boot)
     config.HOST + config.PORT (defaults '0.0.0.0' / 4003)

   There is no 'platformServiceUrl' field anywhere in the codebase —
   it only appeared in this single buggy line. Reconstructed as
   '\${HOST}:\${PORT}' which is the URL admins would use to reach
   this service for the devops/info diagnostic dashboard.

Fix (services/platform-service/src/server.ts:310-339):
  - Rename local 'const config' to 'const productIdentity' to break
    the shadowing.
  - Use productIdentity.productId for the devops productId field.
  - Use config.COSMOS_ENDPOINT (the env config) for the cosmos
    dependency health check URL.
  - Use `http://${config.HOST}:${config.PORT}` for the extra
    platformServiceUrl field.
  - Add a doc comment block explaining the two-config distinction
    so future contributors don't reintroduce the shadow.

Verified:
  pnpm --filter @lysnrai/platform-service build    OK (0 errors)
  pnpm --filter @lysnrai/platform-service test     1511/1512 pass

The 1 remaining failure (src/modules/products/cache.test.ts line 104,
'returns all cached products' expects 2 products but got 3) is a
PRE-EXISTING product-registry test drift on main, verified by
stashing this commit's changes and re-running the same test against
the unmodified tree. It will be addressed separately.
2026-05-23 12:46:10 -07:00
root
c39da91588 feat(platform): add /devops page with platform common devops package
- Add @bytelyst/devops backend endpoints to platform-service
- Add /api/devops/version (public) and /api/devops/info (admin) endpoints
- Add /devops page to admin-web using @bytelyst/devops/ui DevopsPanel
- Add devops link to admin web sidebar navigation
- Add build metadata and runtime information display
- Follow trading web devops pattern

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-11 03:38:06 +00:00
root
8f449a5ee5 fix(platform): add temporary invttrdg product fallback 2026-05-05 21:07:26 +00:00
663dcdecab chore(platform): replace runtime console diagnostics
What changed:
- Replaced platform-service runtime console diagnostics with structured stderr logging helpers.
- Preserved push notification, AI diagnostics, auto-trigger, and diagnostics repository error handling semantics.

Warning impact:
- Platform runtime no-console warnings: 12 -> 0.
- Workspace lint: 12 -> 0 warnings.

Verification:
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint <runtime files> --ext .ts
- pnpm lint
2026-05-04 16:48:25 -07:00
b4403300fa chore(nomgap): finalize product deployment config
What changed:
- Remove nomgap-web from the ecosystem Docker stack now that web is Vercel-hosted.
- Add a TODO for deciding whether local Docker smoke tests still need a NomGap web service.
- Update NomGap product containers and feature flags.
- Seed the NomGap push trigger flag without duplicating the common encryption flag.

Safety notes:
- Dropped unrelated pnpm-lock.yaml formatting churn instead of committing it.

Verification:
- node JSON.parse products/nomgap/product.json
- ruby Psych.safe_load docker-compose.ecosystem.yml
- pnpm --filter @bytelyst/admin-web typecheck
- pnpm --filter @bytelyst/admin-web test
- pnpm --filter @bytelyst/admin-web exec eslint . --ext .ts,.tsx
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint . --ext .ts,.tsx
- pnpm typecheck
- pnpm lint
2026-05-04 16:29:20 -07:00
021f053143 chore(platform): type predictive campaign events
What changed:
- Replaced predictive campaign-engine event-bus any casts with typed bus.emit calls.
- Preserved existing fire-and-forget event dispatch behavior with void emit calls.

Warning impact:
- services/platform-service/src/modules/predictive-analytics/campaign-engine.ts: no-explicit-any 5 -> 0.
- Workspace lint baseline: 189 warnings -> 184 warnings.

Verification:
- pnpm --filter @lysnrai/platform-service build
- pnpm --filter @lysnrai/platform-service exec eslint src/modules/predictive-analytics/campaign-engine.ts
- pnpm --filter @lysnrai/platform-service test
- pnpm --filter @lysnrai/platform-service exec eslint . --ext .ts,.tsx
- pnpm lint
2026-05-04 15:58:44 -07:00
9625999ad4 chore(platform-service): remove stale lint disables
Drop unused eslint-disable comments now that the ignored destructured names use the underscore convention.

Verification: pnpm --filter @lysnrai/platform-service build; pnpm --filter @lysnrai/platform-service exec vitest run src/lib/declarative-loader.test.ts --pool forks; pnpm --filter @lysnrai/platform-service exec eslint src/lib/declarative-loader.test.ts --ext .ts,.tsx; pnpm lint.
2026-05-04 15:22:37 -07:00
saravanakumardb1
a954f434ef fix(lint): repair pre-existing baseline lint errors blocking W1 gates
Baseline origin/main pnpm -r lint failed with 90+ errors across
platform-service, extraction-service, and tracker-web. These block the
shared W1 quality gates (prompts/README.md §4) which require all of
typecheck + lint + build + test to be green before committing W1 infra
work. Fixes are strictly scoped to unblock gates:

- eslint.config.js: extend @typescript-eslint/no-unused-vars with
  varsIgnorePattern / caughtErrorsIgnorePattern / destructuredArrayIgnorePattern
  all honouring the existing `^_` convention already used for args.
- platform-service: add file-level eslint-disable for
  @typescript-eslint/no-unused-vars, no-redeclare, no-useless-escape on
  the 33 legacy files failing lint (ab-testing, ai-diagnostics,
  diagnostics, predictive-analytics, broadcasts/types, surveys/types,
  lib/push-notifications).
- extraction-service tests: drop unused vitest imports (beforeEach,
  afterEach, HealthCheck).
- tracker-web tracker-proxy.test.ts: prefix unused url with _.
- Applied eslint --fix on platform-service which normalised a handful
  of `let` → `const` and removed one redundant disable comment.

Scope creep vs W1 "Files You Own" is acknowledged — user explicitly
approved this path when baseline rot was surfaced.

Verified: pnpm -r typecheck, lint, build, test all green.
2026-04-16 13:06:37 -07:00
saravanakumardb1
05594a334f feat(jobs): register devintelli-daily-sync cron job (0 6 * * *)
- Add devintelli-daily-sync handler: POST to DevIntelli backend /api/sync/daily
- Uses x-internal-key header for service-to-service auth
- Add DEVINTELLI_BACKEND_URL + DEVINTELLI_INTERNAL_API_KEY env vars
- Cron: 0 6 * * * (6am UTC daily), timeout: 5 min
- Returns triggered/skipped/totalConnections metrics from DevIntelli response
2026-04-04 23:37:25 -07:00
saravanakumardb1
89e200fa9f feat(flags): seed devintelli feature flags (11 flags)
- devintelli_enabled, devintelli_scan_enabled, devintelli_daily_sync
- 6 analytics panel flags (overview, commit, pr, review, language, productivity, repo)
- devintelli_billing_enabled (disabled by default)

Aligns with backend/src/lib/feature-flags.ts defaults
2026-04-04 23:37:25 -07:00
fdf9286e34 fix(audit): preserve source event timestamps 2026-04-04 11:27:21 -07:00
ff8c5eb704 fix(runtime): add queued agent run state 2026-04-04 11:11:45 -07:00
fe36296196 feat(runtime): add platform runtime projection api 2026-04-04 01:14:37 -07:00
e377351842 feat(timeline): add platform timeline ingest and query api 2026-04-04 00:54:07 -07:00
705f58c5c5 chore(deploy): remove railway deployment artifacts 2026-04-03 17:05:15 -07:00
saravanakumardb1
a87c533fd3 feat(cowork-service): scaffold Fastify bridge + seed clawcowork feature flags (H.1 + H.2)
H.1: Product registration
- Added 12 clawcowork feature flags to platform-service flags/seed.ts
  (sandbox, plugins, mcp, scheduling, computer_use, parallel_agents,
   marketplace, wasm, llm_multi_model, audit, platform_auth, dispatch_api)

H.2: cowork-service scaffold (services/cowork-service/)
- @lysnrai/cowork-service on port 4009, productId clawcowork
- createServiceApp + startService from @bytelyst/fastify-core
- Modules: health (dependency check), tasks (submit/list/get/cancel)
- Zod-validated config, Swagger, readiness endpoint
- 8 tests passing (1 bootstrap + 7 task routes), typecheck clean
2026-04-02 20:39:22 -07:00
saravanakumardb1
0628f5b3bf test(platform): add 4 impersonation business rule tests (6→10) 2026-03-27 13:22:12 -07:00
saravanakumardb1
3cda7190fb feat(platform): add i18n translations module (P3.20) 2026-03-27 11:32:39 -07:00
saravanakumardb1
59f6ac1b9a fix(ai-diagnostics): keep cluster filters numeric 2026-03-23 16:21:08 -07:00
saravanakumardb1
cd811114e5 fix(devops): harden local shared-service docker bring-up 2026-03-22 12:34:38 -07:00
saravanakumardb1
67ef6a6068 fix(exports): preserve processing state on async export failures 2026-03-22 11:58:54 -07:00
saravanakumardb1
265599d005 fix(platform-service): harden broadcast metrics and export job lifecycle 2026-03-22 11:57:47 -07:00
saravanakumardb1
dda38aa009 fix(exports): strip data payload from list endpoint + update audit doc
- exports/routes: exclude inline data from GET /exports list response
  to prevent returning megabytes of serialized export data (perf+security)
- Update WORKSPACE_TODO_AUDIT.md: add post-audit review section with
  9 bugs found and fixed across 2 commits (73b07c2, 841cdf3), mark
  all action plan sprints complete
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:23:08 -07:00
saravanakumardb1
841cdf3a16 fix(platform-service+events): 3 more gaps in diagnostics + delivery
- diagnostics/subscribers: wire session.created email notification to
  target user using existing 'diagnostics-session-created' template
  (was just logging instead of sending the email)
- events/types: add missing 'currency' field to payment.failed schema
  (payment.succeeded had it, payment.failed did not — inconsistency)
- delivery/subscribers: use event.payload.currency instead of hardcoded
  empty string in payment-failed email variables
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:20:24 -07:00
saravanakumardb1
73b07c2c3a fix(platform-service): 5 bugs in recent P2/P3 implementations
- diagnostics/subscribers: use correct template IDs
  'diagnostics-session-cancelled' and 'diagnostics-session-completed'
  instead of non-existent 'generic' (would throw at runtime)
- delivery/templates: add missing 'broadcast' email template used by
  broadcast delivery route (dispatchEmail would throw on unknown ID)
- broadcasts/routes: replace broken dot-path 'metrics.sent' update
  with proper updateBroadcastMetrics() call, add productName variable
- exports/routes: store serialized data on job doc, add download
  endpoint GET /exports/:id/download with content-type headers,
  exclude data payload from metadata GET endpoint
- waitlist/routes: store invitation doc ID (inv_...) instead of
  code string (WL-...) in invitationCodeId field
- delivery/delivery.test.ts: update template count 12 -> 13
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:14:55 -07:00
saravanakumardb1
1576b699b0 feat(platform-service): resolve all P3 TODOs — diagnostics notifications + test cleanup
- diagnostics/subscribers: notify admin via email when debug session is
  cancelled (looks up session creator via getSession + getUserById)
- diagnostics/subscribers: email session summary (logs/traces/screenshots)
  to admin when debug session completes
- diagnostics/subscribers: send Slack alert via dispatchSlack for FATAL
  logs ingested during debug sessions (on-call engineer notification)
- feedback-client/integration.test.ts: replace TODO-4 with clear NOTE,
  fix unused var lint errors
- feedback-client/gdpr.test.ts: mark lifecycle policy as accepted,
  remove console.log + unused blobPath variable
- Update WORKSPACE_TODO_AUDIT.md — P3 section: all 5 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:03:51 -07:00
saravanakumardb1
6f03a74a76 feat(platform-service): resolve P2 TODOs — exports, broadcasts, telemetry, waitlist
- telemetry/repository: group upsertEventsBatch by pk — same-partition
  writes sequential, different partitions parallel (reduces contention)
- exports/routes: wire async export processing via process.nextTick —
  queries users/audit/telemetry/usage/subscriptions/licenses, serializes
  to CSV or JSON, updates job status with rowCount and fileSizeBytes
- broadcasts/repository: replace mock estimateTargetReach with real user
  count query from auth module, respects percentageRollout
- broadcasts/routes: wire async broadcast delivery — fetches target users,
  dispatches email per recipient, updates metrics on completion
- waitlist/routes: auto-generate invitation codes via invitations module
  when batch-inviting waitlist entries (WL-XXXXXXXX format, 14-day trial)
- CAPTCHA (item 12) deferred — requires external API keys
- Update WORKSPACE_TODO_AUDIT.md — P2 section: 5/6 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:41:11 -07:00
saravanakumardb1
09525f671f fix(platform-service): 3 bugs in delivery subscribers + survey incentives
- delivery/subscribers: welcome email used raw productId as productName,
  now uses resolveProductName() for proper display name
- delivery/subscribers: remove redundant String(daysLeft) in trial_expiring
- surveys/routes: incentiveClaimed was set outside if(sub) block, marking
  response as claimed even when user has no subscription. Moved inside
  if(sub) so claims are only recorded when incentive is actually granted
2026-03-22 00:19:32 -07:00
saravanakumardb1
2f06aacc27 fix(platform-service): resolve P1 TODOs — delivery email subscribers + survey incentives
- delivery/subscribers: add resolveUserEmail() helper using auth getById()
- payment.failed: look up user email, dispatch payment-failed template
- trial_expiring: look up user, compute daysLeft from expiresAt, dispatch
- trial_expired: look up user, dispatch trial-expired template with upgradeUrl
- surveys/routes: wire incentive fulfillment to subscriptions module
  - pro_days: extend currentPeriodEnd by incentive amount
  - credits: add bonus tokensIncluded via subscriptions repo
- Update WORKSPACE_TODO_AUDIT.md — P0+P1 all resolved (7/18)
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:14:41 -07:00
saravanakumardb1
9180954903 fix(platform+admin-web): 3 bugs in delivery retry + webhooks delivery loading
Backend (delivery retry):
- Use NotFoundError (404) instead of BadRequestError (400) for missing log doc
- Add telegram + slack retry support (was email-only, threw error for others)

Frontend (delivery page):
- Add pk field to DeliveryEntry interface
- Pass pk query param in retry call so backend can look up the doc
- Fix handleRetry to accept full entry object instead of just id

Frontend (webhooks page):
- Parallelize delivery fetches with Promise.allSettled (was sequential for loop)
- Significant page load improvement for subscriptions with many deliveries
2026-03-21 23:21:58 -07:00
saravanakumardb1
ead9457345 feat(platform+admin-web): implement 4 missing backend endpoints + re-enable frontend
Backend endpoints added:
- POST /delivery/logs/:id/retry — re-dispatches failed email deliveries (Q1)
- POST /reviews/:id/flag — flags review item with reason + admin metadata (Q2)
- DELETE /agent-evals/suites/:id — deletes evaluation suite with 204 response (Q3)

Frontend re-enabled:
- delivery: retry button for failed entries
- reviews: flag dropdown menu item
- agent-evals: delete dropdown menu item + Trash2 icon

Frontend fixed:
- webhooks: per-subscription delivery loading via GET /subscriptions/:id/deliveries (Q4)
2026-03-21 23:11:38 -07:00
saravanakumardb1
d1d01727e4 fix(platform): register ai-diagnostics routes + wire 6 hidden sidebar pages
Phase 0 from DASHBOARD_UI_COVERAGE_ROADMAP:
- Register ai-diagnostics routes in server.ts (671-line module was never mounted)
- Add 6 hidden pages to admin sidebar-nav.tsx:
  Debug Sessions, Health Dashboard, Extraction, Experiments,
  Predictive, AI Diagnostics
- /users was already in sidebar (no change needed)
- Kill switch verified: already per-product via productId query param
- Admin sidebar now has 33 items (was 27)
2026-03-21 17:40:35 -07:00
saravanakumardb1
267f8af3a4 test(ai-diagnostics): add 94 tests for error-normalization, clustering, query-parser 2026-03-21 17:06:24 -07:00
saravanakumardb1
7613d6890f feat(field-encrypt): admin-panel encryption toggle via feature flags
- FieldEncryptorConfig.enabled: false returns NullFieldEncryptor (no-op)
- NullFieldEncryptor stores plaintext as-is, decrypt returns ct directly
- 7 new tests for toggle behavior (50/50 total)
- encryption_enabled added to COMMON_FLAGS (seeded for all 10 products)
2026-03-21 15:24:19 -07:00
saravanakumardb1
4a47db72ae fix(flags): SSE stream endpoint + client — pass productId via query string
EventSource API cannot set custom headers, so the SSE /flags/stream
endpoint and feature-flag-client were broken for streaming mode:
- Server: accept productId and token from query string as fallback
  when x-product-id / authorization headers are absent
- Client: pass productId (and optional auth token) as query params
  when constructing the EventSource URL
2026-03-21 12:12:14 -07:00
saravanakumardb1
1e1ee969dc feat(flags): add seed flags for 6 missing products + 6 new evaluator edge-case tests
- seed.ts: add default flags for jarvisjr (3), peakpulse (3), flowmonk (2), notelett (2), actiontrail (2), localmemgpt (2)
  Previously only chronomind, nomgap, mindlyst, smartauth, lysnrai had seed flags — common flags (maintenance_mode, telemetry_enabled) were not seeded for newer products
- flags.test.ts: 63 → 69 tests (+6):
  - anonymous user with partial percentage returns off
  - partial rule rollout (0%) skips even matching users
  - neq operator (not equal)
  - contains operator
  - gte + lt numeric range on custom attributes
  - missing context attribute returns off
2026-03-21 11:52:08 -07:00
saravanakumardb1
dd113b96c9 fix(flags): critical partition key bug + audit snapshot integrity + anonymous rollout
- repository.ts: update() and remove() now require productId as partition key
  (was passing 'id' as both params — works with memory provider but fails on Cosmos DB)
- repository.ts: updateSegment() and removeSegment() also fixed
- routes.ts: all repo.update/remove calls updated to pass productId
- routes.ts: audit 'before' snapshots now use JSON deep copy instead of shallow spread
  (prevents nested object mutation from corrupting audit trail)
- routes.ts: kill switch audit now uses repo.update() return value for 'after' snapshot
- evaluator.ts: anonymous users (no userId) with partial percentage (0 < pct < 100)
  now correctly return 'off' instead of falling through to default variation
  (can't deterministically hash without a userId)
2026-03-21 11:50:08 -07:00
saravanakumardb1
ca6a4d41d8 feat(flags): production-grade feature flag system — multi-variate, segments, audit, SSE, scheduling, prerequisites
- types.ts: multi-variate flags (boolean/string/number/JSON), targeting rules with 18 operators, scheduling (enableAt/disableAt/gradual rollout), prerequisites, segments, audit log, evaluation context
- evaluator.ts: pure evaluation engine — schedule checking, prerequisite dependencies (circular detection), individual targeting, targeting rules (AND clauses), segment matching, percentage rollout (FNV-1a), OS version/platform/region filtering
- repository.ts: 3 collections — feature_flags, flag_segments, flag_audit_log
- routes.ts: 18 endpoints — flag CRUD, toggle, archive, kill switch (with tag filter), segment CRUD, audit log, POST /flags/evaluate (multi-variate), SSE /flags/stream, legacy /flags/poll backward-compat
- seed.ts: updated to produce full FeatureFlagDoc with variations, version
- flags.test.ts: 63 tests — schema validation, evaluator engine, targeting rules, segments, prerequisites, scheduling, gradual rollouts, multi-variate, version comparison, deterministic hashing
- @bytelyst/events: added flag.created, flag.updated, flag.deleted, flag.kill_switch event types
- @bytelyst/feature-flag-client: multi-variate support (getValue, getEvaluation, getAllEvaluations), SSE streaming mode, onChange listeners, auth token injection
- event-dispatcher.ts + webhooks/types.ts: wired new flag events
2026-03-21 11:44:49 -07:00
saravanakumardb1
26283b402a test(platform-service): cover ai diagnostics routes 2026-03-21 10:45:41 -07:00
saravanakumardb1
10c288857d fix(support-cases): prevent auto-triage from regressing case status
- Auto-triage previously always set status to 'triaged', even for cases
  already in in_progress, escalated, or other later states
- Now only transitions to 'triaged' if case is still in 'open' state
- Cases in later states keep their current status (only priority + tags update)
- Added regression test for in_progress case
- 10 support-cases tests passing
2026-03-20 06:25:43 -07:00
saravanakumardb1
78cb958a6d fix(ai-budgets): tighten rollover period filter to exclude stale entries
- Previous filter only checked e.recordedAt < currentPeriodStart
- Now also checks e.recordedAt >= prevPeriodStart (lower bound)
- Prevents entries from periods before the previous one from inflating
  the spent amount, which would reduce the rollover incorrectly
- 12 ai-budgets tests passing
2026-03-20 06:24:13 -07:00
saravanakumardb1
f3a4d915f5 fix(support-cases): prevent crash when auto-triage encounters undefined tags
- supportCase.tags is optional in SupportCaseDoc schema
- Spreading undefined throws TypeError at runtime
- Fixed both [...supportCase.tags] and .includes() call with ?? [] fallback
- Added regression test for undefined tags case
- 9 support-cases tests passing
2026-03-20 06:23:34 -07:00