Commit Graph

374 Commits

Author SHA1 Message Date
saravanakumardb1
ca7c3e571e fix(cowork-service): audit H.7 — fix LLM timeout, harden test mocks, add Ollama config
Bug fixes from systematic review of H.7 LLM router wiring:
- lib/llm-router.ts: remove RUST_RUNTIME_TIMEOUT_MS (300s IPC timeout) override
  — let LlmRouter use its built-in 30s default, appropriate for cloud API calls
- server.test.ts: add debug/error to appMock.log — prevents fragile failures
  if any startup path hits those log levels
- server.test.ts: add OLLAMA_URL + OLLAMA_MODELS to config mock

New feature:
- config.ts: add OLLAMA_URL + OLLAMA_MODELS env vars for local Ollama support
- server.ts: wire Ollama env vars into initLlmRouter() — set
  OLLAMA_MODELS=model1,model2 to auto-add local Ollama as a provider

57 tests passing, 9 test files, typecheck clean
2026-04-02 23:37:50 -07:00
saravanakumardb1
f542160784 feat(cowork-service): H.7 — wire @bytelyst/llm-router for multi-model routing
Added LLM routing module to cowork-service:
- lib/llm-router.ts — singleton LlmRouter with cloud + local Ollama support
- modules/llm/types.ts — Zod request schemas
- modules/llm/routes.ts — POST /api/llm/chat, GET /api/llm/providers, GET /api/llm/health
- All endpoints gated by llm_multi_model_enabled feature flag
- Best-effort init: service works without API keys (router stays uninitialized)
- 8 new tests (routes), server test updated for 3 route modules
- 57 total tests passing, typecheck clean
2026-04-02 23:10:07 -07:00
saravanakumardb1
53c3565874 fix(cowork-service): audit flush field name mismatch + server test mock gap
BUG: flush-scheduler.ts flushAudit() read 'events' from IPC response but
Rust handle_flush_audit() returns { count, entries }. Audit events were
silently lost (always empty array). Fixed to read 'entries'.

Also fixed:
- server.test.ts: added missing flush-scheduler.js mock (new import in server.ts)
- feature-flags.ts: doc comment '12 flags' → '13 flags'
- flush-scheduler.test.ts: mock data aligned to Rust response shape

49 tests passing, 8 test files, typecheck clean.
2026-04-02 23:02:38 -07:00
saravanakumardb1
9fc5af5b2b test(cowork-service): platform-client + flush-scheduler tests (17 new tests)
New test files:
- lib/platform-client.test.ts (8 tests): audit posting, telemetry batch,
  usage records, flag polling, error handling, query params
- lib/flush-scheduler.test.ts (9 tests): lifecycle start/stop, flushAll with
  IPC drain → platform-service posting, IPC error handling, empty responses,
  pollAndSyncFlags with local registry + IPC update, failure graceful handling

49 tests passing (was 32), 8 test files, typecheck clean.
2026-04-02 22:53:04 -07:00
saravanakumardb1
f8f3cdc242 feat(cowork-service): platform-client + flush scheduler (H.4-H.8 TS wiring)
New files:
- lib/platform-client.ts — REST client for platform-service endpoints:
  POST /audit, POST /telemetry/events, POST /usage, GET /flags/poll

- lib/flush-scheduler.ts — periodic drain of IPC buffers → platform-service:
  - flushAll(): drains audit, telemetry, budget from Rust IPC → REST
  - pollAndSyncFlags(): GET /flags/poll → update TS registry + push to IPC
  - Singleton pattern with start/stop/finalFlush lifecycle
  - All operations best-effort (logged, never crash service)

Updated server.ts:
- Starts flush scheduler after IPC bridge connects
- finalFlush() before shutdown (drain remaining events)

32 tests passing, typecheck clean.
2026-04-02 22:50:00 -07:00
saravanakumardb1
ff433e172d test(cowork-service): add product-config, feature-flags, health routes tests
New test files (3):
- lib/product-config.test.ts — 3 tests: PRODUCT_ID, productConfig fields, consistency
- lib/feature-flags.test.ts — 3 tests: 13 flag names match seed.ts, defaults correct, setFlag override
- modules/health/routes.test.ts — 2 tests: degraded when IPC disconnected, connected status reporting

32 tests passing (was 24), 6 test files, typecheck clean.
2026-04-02 22:22:59 -07:00
saravanakumardb1
2191427605 fix(cowork-service): 3 bugs — flag names, IPC call guard, health status
BUG 1: feature-flags.ts had 3 wrong flag names + missing 3 from seed.ts
  - Removed: browser_extension_enabled, institutional_knowledge_enabled
  - Renamed: connectors_enabled → mcp_connectors_enabled
  - Added: llm_multi_model_enabled, telemetry_enabled, platform_auth_required
  - Fixed defaults to match seed.ts (marketplace_enabled=true, dispatch_api_enabled=true)
  - Now 13 flags exactly matching platform-service/src/modules/flags/seed.ts

BUG 2: ipc-bridge.ts call() had 'initialize' exemption that allowed null deref
  - If call('initialize') was invoked externally without start(), the guard
    passed but this.child!.stdin!.write() would crash with null dereference
  - During normal start(), child.stdin.writable is true so no exemption needed
  - Removed the method !== 'initialize' exemption

BUG 3: health routes didn't factor IPC bridge into overall health status
  - allOk only checked platform-service reachability
  - Now allOk = depsOk && ipcConnected — service reports 503 when bridge is down
  - IPC bridge disconnection makes health 'degraded' (correct — fallback mode works)

24 tests passing, typecheck clean.
2026-04-02 22:20:52 -07:00
saravanakumardb1
19674c7ef7 feat(cowork-service): ecosystem alignment + IPC bridge to Rust runtime
ECOSYSTEM GAPS CLOSED — cowork-service now matches the pattern used by
all other product backends (FlowMonk, ActionTrail, NoteLett, etc.):

New lib files (6):
- lib/product-config.ts — canonical product identity (PRODUCT_ID, productConfig)
- lib/auth.ts — @bytelyst/fastify-auth createAuthMiddleware
- lib/request-context.ts — getUserId(), getRequestProductId()
- lib/telemetry.ts — @bytelyst/backend-telemetry buffer
- lib/feature-flags.ts — @bytelyst/backend-flags with 12 cowork flags
- lib/ipc-bridge.ts — IpcBridge class: spawn Rust child, JSON-RPC, 13 methods

Updated files:
- lib/config.ts — extends @bytelyst/backend-config baseBackendConfigSchema
- server.ts — JWT context, bootstrap endpoint, IPC startup, graceful shutdown
- modules/tasks/routes.ts — IPC bridge forwarding with in-memory fallback
- modules/health/routes.ts — productId from product-config, IPC status
- package.json — 7 new @bytelyst/* workspace deps

IPC bridge features:
- Spawns cowork-orchestrator --ipc-bridge as child process
- JSON-RPC 2.0 over stdin/stdout (line-delimited)
- 13 convenience methods matching Rust IpcHandler
- Timeout + pending request tracking
- Graceful shutdown with SIGTERM
- Singleton pattern with setIpcBridge() for testing

24 tests passing (was 8), typecheck clean.
2026-04-02 22:14:24 -07:00
saravanakumardb1
a87c533fd3 feat(cowork-service): scaffold Fastify bridge + seed clawcowork feature flags (H.1 + H.2)
H.1: Product registration
- Added 12 clawcowork feature flags to platform-service flags/seed.ts
  (sandbox, plugins, mcp, scheduling, computer_use, parallel_agents,
   marketplace, wasm, llm_multi_model, audit, platform_auth, dispatch_api)

H.2: cowork-service scaffold (services/cowork-service/)
- @lysnrai/cowork-service on port 4009, productId clawcowork
- createServiceApp + startService from @bytelyst/fastify-core
- Modules: health (dependency check), tasks (submit/list/get/cancel)
- Zod-validated config, Swagger, readiness endpoint
- 8 tests passing (1 bootstrap + 7 task routes), typecheck clean
2026-04-02 20:39:22 -07:00
saravanakumardb1
af605a6e7d fix(mcp): align ChronoMind type enums with actual backend schema
Fix 3 type mismatches in MCP chronomind-client.ts + chronomind-tools.ts:

1. TimerDoc.type: Remove non-existent 'event', keep 'countdown'|'alarm'|'pomodoro'
2. TimerDoc.state: Remove non-existent 'idle', add missing 'snoozed'|'dismissed'
3. RoutineDoc.status: Replace wrong 'idle'|'running' with actual
   'template'|'ready'|'active'|'cancelled'

Fix cascading usages:
- chronomind.timers.create tool: default state 'idle' → 'active'
- chronomind.timers.list tool: update type/state enum in Zod schema + description
- calendar-import-pipeline.ts: 'idle' → 'active' for imported timers

MCP server typecheck passes. No runtime behavior change for existing tools.
2026-03-31 23:57:24 -07:00
saravanakumardb1
efde14ba6e feat(mcp): Phase A.3 — 5 new ChronoMind MCP tools + client functions
Extend centralized MCP server with 5 new ChronoMind tools:

- chronomind.timers.reschedule — shift timer by delta or set new target time
- chronomind.timers.availability — find free time slots in a window
- chronomind.routines.start — start a routine from ready/template status
- chronomind.agentActions.list — list agent action audit trail
- chronomind.agentActions.approve — approve a proposed agent action

Client functions added to chronomind-client.ts:
- chronomindTimerReschedule, chronomindTimerAvailability, chronomindRoutineStart
- chronomindAgentActionCreate, chronomindAgentActionsList, chronomindAgentActionApprove

Write tools (reschedule, routines.start) record agent actions for audit trail.
Audit recording is fail-open — failures don't block the actual operation.

MCP server typecheck passes. No breaking changes to existing tools.
2026-03-31 23:41:43 -07:00
root
81951b173a feat(extraction): back product rate limits with valkey 2026-03-31 08:08:53 +00:00
root
b8661392c6 feat(observability): add phase 2 monitoring and valkey services 2026-03-31 06:57:12 +00:00
58c47a751a fix(mcp-server): pass workspaceId to notelett note get/update/delete/summarize tools
Backend requires workspaceId query param on single-note endpoints.
Updated client functions and tool schemas accordingly.

Made-with: Cursor
2026-03-29 22:10:03 -07:00
6997dff8d9 feat(mcp-server): register NoteLett tools (notes, workspaces, tasks, artifacts, summarize)
Adds notelett-client.ts HTTP wrapper, notelett-tools.ts with 10 MCP tool registrations,
and NOTELETT_BACKEND_URL config entry.

Made-with: Cursor
2026-03-29 20:57:16 -07:00
root
eba6c7a641 chore(platform): align docker and package outputs 2026-03-29 23:41:08 +00:00
saravanakumardb1
46ee14371c fix(ci): add --pool forks to all vitest test scripts to fix kill EPERM on Node v25
Root cause: tinypool worker teardown calls kill() which returns EPERM
in the act_runner host environment on Node.js v25.2.1. Tests pass but
the vitest process crashes during cleanup, causing CI failure.

Fix: --pool forks CLI flag on every package/service test script, plus
pool: 'forks' in all vitest.config.ts files. This uses child_process.fork()
worker management which handles termination cleanly.

60 package.json files updated, 10 vitest.config.ts files updated.
2026-03-27 23:23:38 -07:00
saravanakumardb1
0628f5b3bf test(platform): add 4 impersonation business rule tests (6→10) 2026-03-27 13:22:12 -07:00
saravanakumardb1
3cda7190fb feat(platform): add i18n translations module (P3.20) 2026-03-27 11:32:39 -07:00
saravanakumardb1
85aca5534b fix(docker): sync all 3 service Dockerfiles with complete workspace package.json list
platform-service had 16/60, extraction-service had 14/60, mcp-server had 34/60.
All three now list all 57 packages + 4 services + 2 dashboards + scripts.
Required for pnpm install --frozen-lockfile to resolve the full workspace.
2026-03-24 11:55:47 -07:00
saravanakumardb1
59f6ac1b9a fix(ai-diagnostics): keep cluster filters numeric 2026-03-23 16:21:08 -07:00
saravanakumardb1
cd811114e5 fix(devops): harden local shared-service docker bring-up 2026-03-22 12:34:38 -07:00
saravanakumardb1
67ef6a6068 fix(exports): preserve processing state on async export failures 2026-03-22 11:58:54 -07:00
saravanakumardb1
265599d005 fix(platform-service): harden broadcast metrics and export job lifecycle 2026-03-22 11:57:47 -07:00
saravanakumardb1
dda38aa009 fix(exports): strip data payload from list endpoint + update audit doc
- exports/routes: exclude inline data from GET /exports list response
  to prevent returning megabytes of serialized export data (perf+security)
- Update WORKSPACE_TODO_AUDIT.md: add post-audit review section with
  9 bugs found and fixed across 2 commits (73b07c2, 841cdf3), mark
  all action plan sprints complete
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:23:08 -07:00
saravanakumardb1
841cdf3a16 fix(platform-service+events): 3 more gaps in diagnostics + delivery
- diagnostics/subscribers: wire session.created email notification to
  target user using existing 'diagnostics-session-created' template
  (was just logging instead of sending the email)
- events/types: add missing 'currency' field to payment.failed schema
  (payment.succeeded had it, payment.failed did not — inconsistency)
- delivery/subscribers: use event.payload.currency instead of hardcoded
  empty string in payment-failed email variables
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:20:24 -07:00
saravanakumardb1
73b07c2c3a fix(platform-service): 5 bugs in recent P2/P3 implementations
- diagnostics/subscribers: use correct template IDs
  'diagnostics-session-cancelled' and 'diagnostics-session-completed'
  instead of non-existent 'generic' (would throw at runtime)
- delivery/templates: add missing 'broadcast' email template used by
  broadcast delivery route (dispatchEmail would throw on unknown ID)
- broadcasts/routes: replace broken dot-path 'metrics.sent' update
  with proper updateBroadcastMetrics() call, add productName variable
- exports/routes: store serialized data on job doc, add download
  endpoint GET /exports/:id/download with content-type headers,
  exclude data payload from metadata GET endpoint
- waitlist/routes: store invitation doc ID (inv_...) instead of
  code string (WL-...) in invitationCodeId field
- delivery/delivery.test.ts: update template count 12 -> 13
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:14:55 -07:00
saravanakumardb1
1576b699b0 feat(platform-service): resolve all P3 TODOs — diagnostics notifications + test cleanup
- diagnostics/subscribers: notify admin via email when debug session is
  cancelled (looks up session creator via getSession + getUserById)
- diagnostics/subscribers: email session summary (logs/traces/screenshots)
  to admin when debug session completes
- diagnostics/subscribers: send Slack alert via dispatchSlack for FATAL
  logs ingested during debug sessions (on-call engineer notification)
- feedback-client/integration.test.ts: replace TODO-4 with clear NOTE,
  fix unused var lint errors
- feedback-client/gdpr.test.ts: mark lifecycle policy as accepted,
  remove console.log + unused blobPath variable
- Update WORKSPACE_TODO_AUDIT.md — P3 section: all 5 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 01:03:51 -07:00
saravanakumardb1
6f03a74a76 feat(platform-service): resolve P2 TODOs — exports, broadcasts, telemetry, waitlist
- telemetry/repository: group upsertEventsBatch by pk — same-partition
  writes sequential, different partitions parallel (reduces contention)
- exports/routes: wire async export processing via process.nextTick —
  queries users/audit/telemetry/usage/subscriptions/licenses, serializes
  to CSV or JSON, updates job status with rowCount and fileSizeBytes
- broadcasts/repository: replace mock estimateTargetReach with real user
  count query from auth module, respects percentageRollout
- broadcasts/routes: wire async broadcast delivery — fetches target users,
  dispatches email per recipient, updates metrics on completion
- waitlist/routes: auto-generate invitation codes via invitations module
  when batch-inviting waitlist entries (WL-XXXXXXXX format, 14-day trial)
- CAPTCHA (item 12) deferred — requires external API keys
- Update WORKSPACE_TODO_AUDIT.md — P2 section: 5/6 resolved
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:41:11 -07:00
saravanakumardb1
09525f671f fix(platform-service): 3 bugs in delivery subscribers + survey incentives
- delivery/subscribers: welcome email used raw productId as productName,
  now uses resolveProductName() for proper display name
- delivery/subscribers: remove redundant String(daysLeft) in trial_expiring
- surveys/routes: incentiveClaimed was set outside if(sub) block, marking
  response as claimed even when user has no subscription. Moved inside
  if(sub) so claims are only recorded when incentive is actually granted
2026-03-22 00:19:32 -07:00
saravanakumardb1
2f06aacc27 fix(platform-service): resolve P1 TODOs — delivery email subscribers + survey incentives
- delivery/subscribers: add resolveUserEmail() helper using auth getById()
- payment.failed: look up user email, dispatch payment-failed template
- trial_expiring: look up user, compute daysLeft from expiresAt, dispatch
- trial_expired: look up user, dispatch trial-expired template with upgradeUrl
- surveys/routes: wire incentive fulfillment to subscriptions module
  - pro_days: extend currentPeriodEnd by incentive amount
  - credits: add bonus tokensIncluded via subscriptions repo
- Update WORKSPACE_TODO_AUDIT.md — P0+P1 all resolved (7/18)
- Typecheck clean, 1483/1483 tests pass
2026-03-22 00:14:41 -07:00
saravanakumardb1
9180954903 fix(platform+admin-web): 3 bugs in delivery retry + webhooks delivery loading
Backend (delivery retry):
- Use NotFoundError (404) instead of BadRequestError (400) for missing log doc
- Add telegram + slack retry support (was email-only, threw error for others)

Frontend (delivery page):
- Add pk field to DeliveryEntry interface
- Pass pk query param in retry call so backend can look up the doc
- Fix handleRetry to accept full entry object instead of just id

Frontend (webhooks page):
- Parallelize delivery fetches with Promise.allSettled (was sequential for loop)
- Significant page load improvement for subscriptions with many deliveries
2026-03-21 23:21:58 -07:00
saravanakumardb1
ead9457345 feat(platform+admin-web): implement 4 missing backend endpoints + re-enable frontend
Backend endpoints added:
- POST /delivery/logs/:id/retry — re-dispatches failed email deliveries (Q1)
- POST /reviews/:id/flag — flags review item with reason + admin metadata (Q2)
- DELETE /agent-evals/suites/:id — deletes evaluation suite with 204 response (Q3)

Frontend re-enabled:
- delivery: retry button for failed entries
- reviews: flag dropdown menu item
- agent-evals: delete dropdown menu item + Trash2 icon

Frontend fixed:
- webhooks: per-subscription delivery loading via GET /subscriptions/:id/deliveries (Q4)
2026-03-21 23:11:38 -07:00
saravanakumardb1
d1d01727e4 fix(platform): register ai-diagnostics routes + wire 6 hidden sidebar pages
Phase 0 from DASHBOARD_UI_COVERAGE_ROADMAP:
- Register ai-diagnostics routes in server.ts (671-line module was never mounted)
- Add 6 hidden pages to admin sidebar-nav.tsx:
  Debug Sessions, Health Dashboard, Extraction, Experiments,
  Predictive, AI Diagnostics
- /users was already in sidebar (no change needed)
- Kill switch verified: already per-product via productId query param
- Admin sidebar now has 33 items (was 27)
2026-03-21 17:40:35 -07:00
saravanakumardb1
267f8af3a4 test(ai-diagnostics): add 94 tests for error-normalization, clustering, query-parser 2026-03-21 17:06:24 -07:00
saravanakumardb1
2c6397272f fix(test): add env defaults to platform-service vitest config
Aligns service-local vitest.config.ts with root config so tests pass
both via 'pnpm test' (uses service config) and 'npx vitest run' (uses root).
Fixes telemetry.test.ts which fails because its import chain eagerly
loads config.ts → envSchema.parse() requiring COSMOS_ENDPOINT/KEY/JWT_SECRET.

Added: RATE_LIMIT_STORE_MODE=memory, COSMOS_ENDPOINT, COSMOS_KEY, JWT_SECRET
(all test-safe placeholders, never used at runtime with DB_PROVIDER=memory)
2026-03-21 15:32:14 -07:00
saravanakumardb1
7613d6890f feat(field-encrypt): admin-panel encryption toggle via feature flags
- FieldEncryptorConfig.enabled: false returns NullFieldEncryptor (no-op)
- NullFieldEncryptor stores plaintext as-is, decrypt returns ct directly
- 7 new tests for toggle behavior (50/50 total)
- encryption_enabled added to COMMON_FLAGS (seeded for all 10 products)
2026-03-21 15:24:19 -07:00
saravanakumardb1
4a47db72ae fix(flags): SSE stream endpoint + client — pass productId via query string
EventSource API cannot set custom headers, so the SSE /flags/stream
endpoint and feature-flag-client were broken for streaming mode:
- Server: accept productId and token from query string as fallback
  when x-product-id / authorization headers are absent
- Client: pass productId (and optional auth token) as query params
  when constructing the EventSource URL
2026-03-21 12:12:14 -07:00
saravanakumardb1
1e1ee969dc feat(flags): add seed flags for 6 missing products + 6 new evaluator edge-case tests
- seed.ts: add default flags for jarvisjr (3), peakpulse (3), flowmonk (2), notelett (2), actiontrail (2), localmemgpt (2)
  Previously only chronomind, nomgap, mindlyst, smartauth, lysnrai had seed flags — common flags (maintenance_mode, telemetry_enabled) were not seeded for newer products
- flags.test.ts: 63 → 69 tests (+6):
  - anonymous user with partial percentage returns off
  - partial rule rollout (0%) skips even matching users
  - neq operator (not equal)
  - contains operator
  - gte + lt numeric range on custom attributes
  - missing context attribute returns off
2026-03-21 11:52:08 -07:00
saravanakumardb1
dd113b96c9 fix(flags): critical partition key bug + audit snapshot integrity + anonymous rollout
- repository.ts: update() and remove() now require productId as partition key
  (was passing 'id' as both params — works with memory provider but fails on Cosmos DB)
- repository.ts: updateSegment() and removeSegment() also fixed
- routes.ts: all repo.update/remove calls updated to pass productId
- routes.ts: audit 'before' snapshots now use JSON deep copy instead of shallow spread
  (prevents nested object mutation from corrupting audit trail)
- routes.ts: kill switch audit now uses repo.update() return value for 'after' snapshot
- evaluator.ts: anonymous users (no userId) with partial percentage (0 < pct < 100)
  now correctly return 'off' instead of falling through to default variation
  (can't deterministically hash without a userId)
2026-03-21 11:50:08 -07:00
saravanakumardb1
ca6a4d41d8 feat(flags): production-grade feature flag system — multi-variate, segments, audit, SSE, scheduling, prerequisites
- types.ts: multi-variate flags (boolean/string/number/JSON), targeting rules with 18 operators, scheduling (enableAt/disableAt/gradual rollout), prerequisites, segments, audit log, evaluation context
- evaluator.ts: pure evaluation engine — schedule checking, prerequisite dependencies (circular detection), individual targeting, targeting rules (AND clauses), segment matching, percentage rollout (FNV-1a), OS version/platform/region filtering
- repository.ts: 3 collections — feature_flags, flag_segments, flag_audit_log
- routes.ts: 18 endpoints — flag CRUD, toggle, archive, kill switch (with tag filter), segment CRUD, audit log, POST /flags/evaluate (multi-variate), SSE /flags/stream, legacy /flags/poll backward-compat
- seed.ts: updated to produce full FeatureFlagDoc with variations, version
- flags.test.ts: 63 tests — schema validation, evaluator engine, targeting rules, segments, prerequisites, scheduling, gradual rollouts, multi-variate, version comparison, deterministic hashing
- @bytelyst/events: added flag.created, flag.updated, flag.deleted, flag.kill_switch event types
- @bytelyst/feature-flag-client: multi-variate support (getValue, getEvaluation, getAllEvaluations), SSE streaming mode, onChange listeners, auth token injection
- event-dispatcher.ts + webhooks/types.ts: wired new flag events
2026-03-21 11:44:49 -07:00
saravanakumardb1
26283b402a test(platform-service): cover ai diagnostics routes 2026-03-21 10:45:41 -07:00
saravanakumardb1
10c288857d fix(support-cases): prevent auto-triage from regressing case status
- Auto-triage previously always set status to 'triaged', even for cases
  already in in_progress, escalated, or other later states
- Now only transitions to 'triaged' if case is still in 'open' state
- Cases in later states keep their current status (only priority + tags update)
- Added regression test for in_progress case
- 10 support-cases tests passing
2026-03-20 06:25:43 -07:00
saravanakumardb1
78cb958a6d fix(ai-budgets): tighten rollover period filter to exclude stale entries
- Previous filter only checked e.recordedAt < currentPeriodStart
- Now also checks e.recordedAt >= prevPeriodStart (lower bound)
- Prevents entries from periods before the previous one from inflating
  the spent amount, which would reduce the rollover incorrectly
- 12 ai-budgets tests passing
2026-03-20 06:24:13 -07:00
saravanakumardb1
f3a4d915f5 fix(support-cases): prevent crash when auto-triage encounters undefined tags
- supportCase.tags is optional in SupportCaseDoc schema
- Spreading undefined throws TypeError at runtime
- Fixed both [...supportCase.tags] and .includes() call with ?? [] fallback
- Added regression test for undefined tags case
- 9 support-cases tests passing
2026-03-20 06:23:34 -07:00
saravanakumardb1
0bbae1f14e feat(platform): Phase 6 — Support Case Management
- Case timeline: GET /support/cases/:id/timeline
  - Merges case creation, notes, escalations chronologically
- SLA engine: GET /support/cases/:id/sla
  - Priority-based SLA targets (critical=1h/4h, high=4h/24h, etc.)
  - First-response and resolution breach detection
- Auto-triage: POST /support/cases/:id/auto-triage
  - Keyword-based priority heuristics (outage→critical, error→high, etc.)
  - Category detection (auth, billing, api)
  - Deduped tag application
- Case metrics: GET /support/metrics
  - Aggregates by status/priority/source
  - Avg resolution hours, SLA breach count, compliance rate
- New repo function: listAllCases
- 1,336 tests passing (5 new)
2026-03-20 03:38:08 -07:00
saravanakumardb1
a060ee4496 feat(platform): Phase 5 — Human Review Queue
- Batch decisions: POST /reviews/batch-decision (up to 50 items)
  - Parallel execution with allSettled, reports succeeded/failed counts
- Delegation: POST /reviews/:id/delegate
  - Reassigns review with delegation metadata tracking
  - Triggers notification to new assignee
- Auto-expiry: POST /reviews/expire
  - Scans pending/assigned reviews past dueAt, marks expired
- Review stats: GET /reviews/stats
  - Aggregates by status/priority/category, avg resolution time
  - Computes pendingCount, overdueCount, avgResolutionHours
- New repo functions: listExpired, listAll
- 1,331 tests passing (7 new)
2026-03-20 03:33:55 -07:00
saravanakumardb1
9758192377 feat(platform): Phase 4 — AI Governance & Evals
- Run history: GET /agent-evals/suites/:id/runs with limit param
- Regression comparison: GET /agent-evals/suites/:id/regression
  - Detects 5%+ score drop between consecutive runs
  - Returns latest vs previous comparison + trend data
- Release gate check: GET /agent-evals/suites/:id/gate
  - Checks if latest release-gate run passed threshold
- Agent compliance report: GET /agent-evals/agents/:agentId/report
  - Aggregates pass rate, avg score, suite counts, recent runs
- Eval scheduling: POST /agent-evals/suites/:id/schedule
  - Wires eval suite to job runner with cron expression
- New repo functions: listRunsBySuite, listRunsByAgent
- 1,324 tests passing (8 new)
2026-03-20 03:30:03 -07:00
saravanakumardb1
05acacd400 feat(platform): Phase 3 — AI Budget & Cost Governance
- Scope expansion: BudgetScopeTypeSchema now includes 'org' + 'workspace'
- Cost dashboard: GET /ai-budgets/costs with groupBy (model/agent/day/scope)
  - Aggregates totalCostUsd, totalTokens, entryCount, breakdown
- Budget rollover: POST /ai-budgets/policies/:id/rollover
  - Computes previous period remaining, creates rollover doc, adjusts budget
  - GET /ai-budgets/policies/:id/rollovers for history
- Enforcement check: POST /ai-budgets/check (pre-flight, no spend recorded)
  - Model allowlist + threshold evaluation, returns verdict + reasons
- New types: CostDashboardQuerySchema, BudgetRolloverSchema
- New repo functions: listAllSpendEntries, createRollover, listRollovers
- New Cosmos container: ai_budget_rollovers
- 1,316 tests passing (9 new)
2026-03-20 03:26:23 -07:00
saravanakumardb1
84dc348687 feat(platform): Phase 2 — Agent Runtime Orchestration
- New: agents/executor.ts — full agent execution engine
  - Multi-step pipeline: prompt_assembly → tool_execution → finalize
  - AbortController-based cancellation with in-memory tracking
  - Token usage aggregation across tool calls
  - Review-required tool gating (pauses run, returns review_required)
  - Step event streaming for SSE consumers
- New: agents/tool-registry.ts — global tool registry
  - Register/list/validate tools with risk levels + review flags
- New: agents/executor-routes.ts — 11 endpoints, 14 tests
  - POST /agents/execute, POST /runs/:id/cancel
  - GET /agents/active-runs, GET /runs/:id/children, GET /runs/:id/tree
  - GET /agents/:id/metrics, GET /runs/:id/stream (SSE)
  - GET /tools, POST /tools/validate, POST /agents/:id/schedule
- Enhanced: runs/repository.ts — added listChildRuns() for DAG query
- 1,307 tests passing (14 new)
2026-03-20 03:20:31 -07:00