saravanakumardb1 68bfa3dbd8 feat(fleet): stale-factory lease reclaim + bounded GC sweep
Two recovery/cleanup gaps left the coordinator's containers growing without
bound and jobs stuck longer than necessary:

- reclaimStaleFactoryLeases: a crashed/partitioned factory stops heartbeating
  ~90s before its 900s lease TTL expires; the reaper now reclaims held leases of
  stale (or vanished) holders within one stale window, via the same fence +
  checkpoint-preserving path as the expiry reaper (refactored into reclaimLeaseJob).

- sweepFleetGarbage: deletes ephemeral coordination state and is on by default:
  finished (expired/released) leases past a 24h TTL, and factory docs with no
  heartbeat for 7d (a live host just re-registers). Terminal-job retention (jobs +
  their runs/events/artifacts+blobs) is OPT-IN only via FLEET_GC_RETENTION_DAYS
  (default 0 = never delete history). Every delete is best-effort, so one failure
  can't stall the sweep.

Both are wired into the existing reaper loop: recovery scans run every 30s, the
deletion sweep is throttled to hourly. New repo helpers (listHeldLeases,
listFinishedLeasesOlderThan, deleteLease, listAllFactories, deleteFactory,
listTerminalJobsOlderThan, deleteRun, deleteEvent) back the new coordinator
functions. Covered by cleanup.test.ts + expanded reaper.test.ts.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-06-01 11:34:14 -07:00
__LOCAL_LLMs chore(local-llms): refresh chat-history snapshot (mirrors verified diff-free) 2026-05-30 23:32:42 -07:00
.changeset chore(release): add changesets 2026-02-14 19:49:08 -08:00
.gitea/workflows feat(fleet-web): harden budget bar, surface SSE polling, allow checkpoint in patchJob 2026-05-30 20:35:05 -07:00
.github feat(agent-docs): single-source-of-truth pattern for agent instructions 2026-05-23 11:55:19 -07:00
.husky docs+chore: Wave 9.A + 13.E + 13.F + 13.G + CC.3 + CC.8 — 18 boxes flipped 2026-05-27 17:48:13 -07:00
.windsurf/workflows docs: consolidate learning_ai_smart_auth references into learning_ai_auth_app 2026-05-24 14:31:38 -07:00
AI.dev docs(cheatsheets): document longrun helper in long-running-jobs guide 2026-05-30 19:26:06 -07:00
dashboards fix(tracker-web): exclude stale factories from the engine picker 2026-06-01 11:02:56 -07:00
docs docs(gigafactory): fix stale/incorrect fleet docs 2026-06-01 00:03:05 -07:00
e2e feat(diagnostics): Phase 4 - automated triggers, crash sessions, session replay, profiling [4.1][4.2][4.3][4.4] 2026-03-03 12:18:58 -08:00
packages fix(command-palette): guard keydown with undefined key (no crash) 2026-05-31 04:47:40 -07:00
products chore(nomgap): finalize product deployment config 2026-05-04 16:29:20 -07:00
reports docs(compliance): final roadmap update — 100% ecosystem compliance reached 2026-05-23 19:34:49 -07:00
scripts chore(deps): bump @types/node 22 -> 25 (dev types) 2026-05-31 04:02:56 -07:00
services feat(fleet): stale-factory lease reclaim + bounded GC sweep 2026-06-01 11:34:14 -07:00
.aider.conf.yml feat(agent-docs): single-source-of-truth pattern for agent instructions 2026-05-23 11:55:19 -07:00
.dockerignore fix(docker): INFRA-gap-02 unblock full-stack docker compose up 2026-04-16 15:48:32 -07:00
.editorconfig chore(scaffold): initialize pnpm workspace with build tooling 2026-02-12 11:19:29 -08:00
.env.ecosystem.example docs(docker): update README, prompt.md, .env.ecosystem.example with audit fixes 2026-03-28 00:45:38 -07:00
.env.example docs(config): add GITEA_NPM_TOKEN to .env.example 2026-05-29 23:15:43 -07:00
.gitattributes chore: enforce LF line endings via .gitattributes 2026-05-31 20:23:18 -07:00
.gitignore feat(scripts): ecosystem-wide rule violation scanner + baseline report 2026-05-23 14:02:14 -07:00
.npmrc Improve shared UI primitives 2026-05-08 20:56:05 -07:00
.nvmrc chore(scaffold): initialize pnpm workspace with build tooling 2026-02-12 11:19:29 -08:00
.prettierignore chore: stop prettier churn on __LOCAL_LLMs mirrors + refresh snapshot 2026-05-30 23:25:34 -07:00
.prettierrc feat: add quick wins - prettier, bundle limits, coverage 2026-02-12 15:54:06 -08:00
.size-limit.cjs feat(packages): Wave 4 motion + Wave 5b data-viz + Wave 7 notifications-ui 2026-05-27 13:08:30 -07:00
AGENTS.md fix(config): remove :3300 port from Gitea npm registry URLs 2026-05-29 23:29:07 -07:00
docker-compose.ecosystem.yml fix(infra): bind caddy to public eth0 IP only 2026-05-30 16:37:09 +00:00
docker-compose.yml fix(infra,cowork): remove broken Cosmos emulator; harden IPC bridge 2026-05-30 10:27:12 +00:00
eslint.config.js fix(eslint): also ignore .pnpmfile.cjs / .cjs (CommonJS by design) 2026-05-23 16:58:45 -07:00
MANUAL_CI.md docs: fix stale references to consolidated services and migrated dashboards 2026-02-28 03:06:44 -08:00
package.json chore(deps): bump @types/node 22 -> 25 (dev types) 2026-05-31 04:02:56 -07:00
pnpm-lock.yaml fix(tracker-web): stable test DOM env + working localStorage (Node 25) 2026-05-31 04:12:34 -07:00
pnpm-workspace.yaml feat(devops): encryption migration CLI with embedded product configs 2026-03-21 13:19:55 -07:00
quick-check.sh ci: disable GitHub Actions and add manual quality checks 2026-02-12 23:13:07 -08:00
README.md chore(docker): add interactive cleanup menu 2026-05-05 18:28:55 -07:00
REPO_CONTEXT.md docs(workspace): index gitea runner rollout 2026-05-25 02:32:19 +00:00
tsconfig.base.json chore(scaffold): initialize pnpm workspace with build tooling 2026-02-12 11:19:29 -08:00
vitest.config.ts fix(ci): use forks pool in vitest to avoid tinypool kill EPERM on Node v25 2026-03-27 23:15:16 -07:00

ByteLyst Common Platform

Shared packages and microservices for ByteLyst ecosystem products.

⚠️ GitHub Actions Temporarily Disabled

CI/CD is currently disabled due to billing issues. Please run manual quality checks before merging:

  • See MANUAL_CI.md for instructions
  • Use .windsurf/workflows/production-readiness.md for comprehensive checks

Quick Start

# Install dependencies
pnpm install

# Build all packages + services
pnpm build

# Quick quality check (5 min) - run before pushing!
./quick-check.sh

# Run all tests
pnpm test

# Type-check all packages
pnpm typecheck

# Run a specific service in dev mode
pnpm --filter @lysnrai/platform-service dev

Prototype Deployment

For a single-host prototype, use Docker Compose with the repo root docker-compose.yml.

cp .env.example .env
./scripts/prototype-up.sh
pnpm prototype:self-test
pnpm docker:clean

See docs/PROTOTYPE_DEPLOYMENT.md for the required environment variables and day-to-day commands.

The prototype stack now includes a local Cosmos DB Emulator container, so the default .env.example values are wired for single-VM Docker use. Blob uploads are backed by local Azurite, prototype email delivery is backed by Mailpit, and the platform exposes prototype diagnostics at /api/health/dependencies, /api/self-test, and /api/self-test.json.

Current Capability Surface

  • Shared packages — 36 @bytelyst/* packages covering auth, config, API clients, storage, sync, telemetry, diagnostics, design tokens, SDK support, and testing.
  • Services — platform-service, extraction-service, mcp-server, and monitoring.
  • Dashboards — admin-web, tracker-web, and ux-lab.
  • MCP/A2A — services/mcp-server/ exposes tool routing, platform operator tools, extraction helpers, dev tools, product namespaces, and A2A orchestration pipelines.

Ecosystem Docs

Cross-product strategy and shared-contract documentation now lives under docs/ecosystem/.

Repository Structure

learning_ai_common_plat/
├── packages/                    # 36 shared libraries (@bytelyst/*)
│   ├── api-client/
│   ├── auth/
│   ├── auth-client/
│   ├── blob/ + blob-client/
│   ├── broadcast-client/ + survey-client/
│   ├── config/ + cosmos/
│   ├── dashboard-components/
│   ├── design-tokens/
│   ├── diagnostics-client/ + swift-diagnostics/
│   ├── events/ + fastify-core/
│   ├── extraction/ + llm/ + speech/
│   ├── feature-flag-client/ + feedback-client/ + kill-switch-client/
│   ├── kotlin-platform-sdk/ + swift-platform-sdk/ + react-native-platform-sdk/
│   ├── logger/ + monitoring/ + testing/
│   ├── offline-queue/ + platform-client/ + sync/ + telemetry-client/
│   └── datastore/ + storage/ + push/
├── dashboards/                  # Product-agnostic and internal web workspaces
│   ├── admin-web/               # Platform admin console (port 3001)
│   ├── tracker-web/             # Issue tracker + public roadmap (port 3003)
│   └── ux-lab/                  # Internal UX lab / MCP-assisted ops prototypes
├── services/                    # Platform services + tooling servers
│   ├── platform-service/        # Product-agnostic platform API (port 4003)
│   ├── extraction-service/      # LangExtract text extraction + Python sidecar (port 4005)
│   ├── mcp-server/              # MCP tool server + A2A orchestration
│   └── monitoring/              # Loki + Grafana config, health-check
└── docs/                        # Architecture docs, roadmap, analysis

Package Families

| Family | Representative Packages | Purpose |
|---|---|---|
| Core platform | @bytelyst/config, @bytelyst/cosmos, @bytelyst/errors, @bytelyst/logger, @bytelyst/testing | Shared infrastructure for all services and dashboards |
| Auth & app clients | @bytelyst/auth, @bytelyst/auth-client, @bytelyst/api-client, @bytelyst/platform-client, @bytelyst/react-auth | Identity, auth flows, typed service clients |
| Diagnostics & telemetry | @bytelyst/diagnostics-client, @bytelyst/telemetry-client, @bytelyst/swift-diagnostics | Client diagnostics, event batching, crash/error capture |
| Storage & sync | @bytelyst/blob, @bytelyst/blob-client, @bytelyst/datastore, @bytelyst/storage, @bytelyst/sync, @bytelyst/offline-queue | Blob, local persistence, sync orchestration |
| Product experience | @bytelyst/feature-flag-client, @bytelyst/feedback-client, @bytelyst/broadcast-client, @bytelyst/survey-client, @bytelyst/kill-switch-client | Runtime platform features for product apps |
| AI & extraction | @bytelyst/extraction, @bytelyst/llm, @bytelyst/speech | Extraction tasks, LLM utilities, speech integration |
| UI & design | @bytelyst/design-tokens, @bytelyst/dashboard-components | Shared tokens and dashboard UI building blocks |
| Native SDKs | @bytelyst/swift-platform-sdk, @bytelyst/kotlin-platform-sdk, @bytelyst/react-native-platform-sdk | Cross-platform mobile/native platform access |

Services and Dashboards

| Surface | Port | Description |
|---|---|---|
| platform-service | 4003 | Product-agnostic Fastify platform API: auth, flags, telemetry, diagnostics, jobs, analytics, A/B testing, changelog, webhooks, marketplace, predictive analytics, and more |
| extraction-service | 4005 | LangExtract-based extraction service with task library and Python sidecar |
| mcp-server | configurable | MCP server exposing tool execution, platform tools, dev tools, extraction helpers, and A2A orchestration pipelines |
| monitoring | | Loki, Grafana, and health-check tooling |
| admin-web | 3001 | Platform admin console |
| tracker-web | 3003 | Tracker / public roadmap dashboard |
| ux-lab | internal | Internal UX lab and ops prototyping workspace |

Note: billing-service (4002), growth-service (4001), and tracker-service (4004) were consolidated into platform-service (Feb 2026).

All services are product-agnostic — every Cosmos document includes a productId field, so a single deployment serves LysnrAI, MindLyst, or any future product.
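
As a rough illustration of the productId convention, the sketch below scopes a list of documents to one product. Only the `productId` field comes from this README; the `PlatformDocument` and `FeatureFlagDoc` shapes, the `scopeToProduct` helper, and the product id strings are hypothetical and are not taken from the platform code.

```typescript
// Hypothetical base shape: only `productId` is documented in this README.
interface PlatformDocument {
  id: string;
  productId: string;
}

// Hypothetical example document type, for illustration only.
interface FeatureFlagDoc extends PlatformDocument {
  key: string;
  enabled: boolean;
}

// Return only the documents that belong to the given product.
function scopeToProduct<T extends PlatformDocument>(
  docs: readonly T[],
  productId: string,
): T[] {
  return docs.filter((doc) => doc.productId === productId);
}

// The same flag key can exist once per product in a shared container.
const flags: FeatureFlagDoc[] = [
  { id: "1", productId: "lysnrai", key: "new-onboarding", enabled: true },
  { id: "2", productId: "mindlyst", key: "new-onboarding", enabled: false },
];

const lysnrFlags = scopeToProduct(flags, "lysnrai");
console.log(lysnrFlags.map((flag) => flag.id));
```

In a real service the filter would more likely be part of the database query than an in-memory pass; the in-memory version is used here only to keep the sketch self-contained.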

Consuming Libraries from Product Repos

During development, use file: references in consumer package.json:

// From a Next.js dashboard at repo root (2 levels up):
"@bytelyst/errors": "file:../../learning_ai_common_plat/packages/errors"

// From MindLyst web (3 levels up — inside mindlyst-native/web/):
"@bytelyst/design-tokens": "file:../../../learning_ai_common_plat/packages/design-tokens"

All repos must be cloned side-by-side under the same parent directory.

Portable Builds (Docker / CI)

file: refs break in Docker and CI because the sibling repo isn't available at the expected relative path. Use the tarball prep workflow to make builds self-contained:

# 1. Build all @bytelyst/* packages
pnpm build

# 2. Pack tarballs + rewrite a consumer's package.json
#    (run from the consumer repo)
../learning_ai_common_plat/scripts/prep-consumer.sh <target-dir>

# 3. Build (Docker, EAS, CI — no sibling repo access needed)
docker build <target-dir>

# 4. Restore original package.json
../learning_ai_common_plat/scripts/prep-consumer.sh <target-dir> --restore

Each consumer repo has a convenience wrapper: scripts/docker-prep.sh (or scripts/docker-prep-dashboards.sh in LysnrAI).

| Consumer Repo | Wrapper Script | Targets |
|---|---|---|
| learning_voice_ai_agent | scripts/docker-prep-dashboards.sh | user-dashboard-web |
| learning_ai_clock | scripts/docker-prep.sh | web |
| learning_ai_fastgap | scripts/docker-prep.sh | root package.json |
| learning_multimodal_memory_agents | scripts/docker-prep.sh | mindlyst-native/web |

Dashboards inside this repo (dashboards/admin-web, dashboards/tracker-web) use workspace:* refs and do NOT need this workflow — pnpm resolves them automatically.
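
To make the "rewrite a consumer's package.json" step of the tarball workflow more concrete, here is a minimal sketch of swapping file: references for packed tarballs. It is not the implementation of scripts/prep-consumer.sh, which this README does not show. The helper names, the `./.tarballs` directory, the `1.0.0` version, and the `react` entry are all assumptions for illustration. The tarball file name follows the pattern `npm pack` uses for scoped packages (`@scope/name` becomes `scope-name-<version>.tgz`).

```typescript
type DependencyMap = Record<string, string>;

// "@bytelyst/errors" + "1.0.0" -> "bytelyst-errors-1.0.0.tgz" (npm pack naming).
function tarballName(pkgName: string, version: string): string {
  const flat = pkgName.replace(/^@/, "").replace(/\//g, "-");
  return `${flat}-${version}.tgz`;
}

// Hypothetical rewrite: point each file: dependency with a known version at
// its tarball and leave every other dependency spec untouched.
function rewriteFileRefs(
  deps: DependencyMap,
  versions: Record<string, string>,
  tarballDir: string,
): DependencyMap {
  const out: DependencyMap = {};
  for (const [name, spec] of Object.entries(deps)) {
    const version = versions[name];
    if (spec.startsWith("file:") && version !== undefined) {
      out[name] = `file:${tarballDir}/${tarballName(name, version)}`;
    } else {
      out[name] = spec;
    }
  }
  return out;
}

const rewritten = rewriteFileRefs(
  {
    "@bytelyst/errors": "file:../../learning_ai_common_plat/packages/errors",
    react: "^18.0.0", // illustrative registry dependency, left as-is
  },
  { "@bytelyst/errors": "1.0.0" }, // assumed version, for illustration
  "./.tarballs", // assumed output directory
);
console.log(rewritten);
```

Returning a new map rather than mutating the input keeps the original dependency specs available, which is the property a `--restore` step relies on.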

Infrastructure Lint

Validates all 25 Dockerfiles across the 11 ByteLyst repos using hadolint, and any Helm charts using helm lint + helm template.

# Prerequisites
brew install hadolint helm

# Lint everything
./scripts/lint-infra.sh

# Dockerfiles only / Helm charts only
./scripts/lint-infra.sh --docker
./scripts/lint-infra.sh --helm

# Explicit paths
./scripts/lint-infra.sh path/to/Dockerfile path/to/chart-dir

Suppressed rules (false positives for this codebase): DL3045, DL3018, DL3008, DL3059, SC2155.

Design Tokens

Generate platform-specific token files from the canonical JSON:

cd packages/design-tokens
pnpm generate

Outputs in packages/design-tokens/generated/:

  • tokens.css — CSS custom properties (--ml-*)
  • tokens.ts — TypeScript constants
  • MindLystTokens.kt — Kotlin object for KMP
  • MindLystTheme.swift — Swift structs for SwiftUI
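
A simplified sketch of how a nested token tree could be flattened into `--ml-*` CSS custom properties is shown below. It illustrates the general technique only and is not the actual `pnpm generate` script; the `TokenTree` type, the function names, and the sample token values are assumptions.

```typescript
// Hypothetical token shape: nested groups with string or number leaves.
type TokenTree = { [key: string]: string | number | TokenTree };

// Walk the tree and join the path segments into a custom property name.
function flattenTokens(tree: TokenTree, prefix = "--ml"): Array<[string, string]> {
  const out: Array<[string, string]> = [];
  for (const [key, value] of Object.entries(tree)) {
    const name = `${prefix}-${key}`;
    if (typeof value === "object") {
      out.push(...flattenTokens(value, name));
    } else {
      out.push([name, String(value)]);
    }
  }
  return out;
}

// Render the flattened tokens as a :root block.
function toCss(tree: TokenTree): string {
  const lines = flattenTokens(tree).map(([name, value]) => `  ${name}: ${value};`);
  return `:root {\n${lines.join("\n")}\n}`;
}

// Sample values are made up for the example.
const css = toCss({ color: { primary: "#3366ff" }, spacing: { sm: "4px" } });
console.log(css);
// :root {
//   --ml-color-primary: #3366ff;
//   --ml-spacing-sm: 4px;
// }
```

The same flattened list could feed the TypeScript, Kotlin, and Swift outputs by swapping only the rendering function.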

Roadmap

See docs/ROADMAP.md for the full phased extraction plan.