From bfecc9f95d521a34cb862a0bcfe3b3f8453f82c2 Mon Sep 17 00:00:00 2001
From: root
Date: Sat, 14 Mar 2026 06:26:08 +0000
Subject: [PATCH] docs(agents): sync local llm routing guidance

---
 AGENTS.md | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 52 insertions(+), 3 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 82e36c4f..e3fabb11 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -18,6 +18,7 @@
 | **Runtime** | Node.js (ESM), TypeScript 5.7+ |
 | **Package manager** | pnpm (workspace) |
 | **Test runner** | Vitest |
+| **Prototype stack** | Docker Compose with Cosmos emulator, Azurite, Mailpit, Traefik, Loki, and Grafana |
 
 ## 2. Monorepo Layout
 
@@ -36,6 +37,7 @@ learning_ai_common_plat/
 │   ├── blob/                 # Azure Blob Storage client + SAS token helpers
 │   ├── extraction/           # createExtractionClient(), shared types for extraction consumers
 │   ├── monitoring/           # Health-check utilities, Loki/Grafana helpers
+│   ├── llm-router/           # Deterministic LLM router: provider/model selection, fallback, health
 │   └── design-tokens/        # Cross-platform tokens (JSON → CSS/TS/Kotlin/Swift)
 │       ├── tokens/bytelyst.tokens.json  # ← CANONICAL SOURCE
 │       ├── scripts/generate.ts          # Token generator
@@ -51,15 +53,35 @@ learning_ai_common_plat/
 │   ├── extraction-service/   # LangExtract text extraction + Python sidecar (port 4005)
 │   └── monitoring/           # Loki + Grafana config, health-check script
 ├── docs/                     # Architecture docs, roadmap, analysis
-├── package.json              # Root scripts: build, test, typecheck, clean
+├── package.json              # Root scripts: build, test, typecheck, clean, prototype:self-test
 ├── pnpm-workspace.yaml       # Workspace: packages/* + services/* + dashboards/*
 ├── tsconfig.base.json        # Shared TS config (ES2022, NodeNext, strict)
 ├── vitest.config.ts          # Root vitest config (passWithNoTests)
-├── docker-compose.yml        # Full service stack + Loki + Grafana + Traefik
+├── docker-compose.yml        # Prototype stack: services + Cosmos emulator + Azurite + Mailpit + monitoring
 ├── .env.example              # Required env vars template
 └── .editorconfig             # Editor settings
 ```
 
+## 2A. Prototype Runtime Conventions
+
+- The current single-host prototype is defined by [`docker-compose.yml`](docker-compose.yml).
+- Prototype state currently lives in:
+  - Cosmos DB Emulator
+  - Azurite blob storage
+  - Mailpit SMTP sandbox
+- Reuse the existing prototype diagnostics instead of adding parallel health endpoints:
+  - `GET /health`
+  - `GET /api/health/dependencies`
+  - `GET /api/self-test`
+  - `GET /api/self-test.json`
+- The canonical host-side smoke test command is `pnpm prototype:self-test`.
+- The underlying implementation is [`scripts/prototype-self-test.sh`](scripts/prototype-self-test.sh). Extend it instead of creating duplicate one-off prototype scripts.
+- If you change prototype infra, also update:
+  - [`README.md`](README.md)
+  - [`docs/PROTOTYPE_DEPLOYMENT.md`](docs/PROTOTYPE_DEPLOYMENT.md)
+  - [`docker-compose.yml`](docker-compose.yml)
+  - [`.env.example`](.env.example) when tracked defaults change
+
 ## 3. Tech Stack Rules
 
 ### TypeScript (all packages + services)
@@ -87,6 +109,7 @@ learning_ai_common_plat/
 - Use `exports` field in `package.json` (not just `main`)
 - Peer dependencies for heavy/shared deps (`@azure/cosmos`, `jose`, `bcryptjs`, `react`, `zod`)
 - Workspace deps: `"@bytelyst/errors": "workspace:*"`
+- If a dashboard or app consumes a local package outside the pnpm workspace, build the package first and import from `dist/`, not raw `src/`
 
 ## 4. Coding Conventions
 
@@ -95,9 +118,13 @@ learning_ai_common_plat/
 - Every Cosmos document MUST include a `productId` field
 - Every REST endpoint MUST validate input with Zod schemas
 - Every service MUST propagate `x-request-id` headers
+- Prototype infra changes MUST preserve `pnpm prototype:self-test`
+- Prototype dependency checks belong in the shared status/self-test surfaces, not ad hoc endpoints
 - Use `PRODUCT_ID` from `@bytelyst/config` (`loadProductIdentity()`) — never hardcode
 - Services use self-contained Zod config schemas in `src/lib/config.ts` (avoids zod version mismatch with shared packages)
 - Services re-export from `@bytelyst/*` in their `src/lib/` files (`errors.ts`, `cosmos.ts`, `product-config.ts`)
+- For LLM routing, prefer `@bytelyst/llm-router` as the source of truth; do not introduce parallel routing logic unless explicitly required
+- `__LOCAL_LLMs/dashboard` uses server-side routing in `src/app/api/ollama/chat/route.ts`; preserve that pattern and keep the UI as a thin client
 - Commit messages: `type(scope): description` — types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`
 
 ### MUST NOT do
@@ -106,6 +133,7 @@ learning_ai_common_plat/
 - Never use `any` type — use Zod inference or explicit types
 - Never hardcode secrets or API keys
 - Secret guardrails: Husky runs `scripts/secret-scan-staged.sh` (pre-commit) and `scripts/secret-scan-repo.sh` (pre-push). See `docs/WINDSURF/CODEX_SESSION_SUMMARY_AND_PLAYBOOK.md`.
+- Never commit real emulator keys or blob account keys in tracked files; keep placeholders in `.env.example`
 - Never modify tests to make them pass — fix the actual code
 - Never delete existing comments or documentation unless explicitly asked
 - Never add emojis to code unless explicitly asked
@@ -201,6 +229,7 @@ See full audit: [`docs/design-system/DESIGN_SYSTEM_AUDIT_2026-03-03.md`](docs/de
 | **JWT / Auth** | `packages/auth/` | `src/index.ts` — `signJwt()`, `verifyJwt()`, `hashPassword()`, `verifyPassword()`, auth middleware |
 | **API client** | `packages/api-client/` | `src/index.ts` — `createApiClient()` with token injection |
 | **React auth** | `packages/react-auth/` | `src/index.ts` — `createAuthContext()` factory (provider + hook) |
+| **LLM router** | `packages/llm-router/` | `src/router.ts` — route/plan/fallback, `src/registry.ts` — providers incl. local Ollama |
 | **Design tokens** | `packages/design-tokens/` | `tokens/bytelyst.tokens.json` (source), `scripts/generate.ts` (generator), `generated/` (output) |
 | **Auth / JWT issue** | `services/platform-service/` | `src/modules/auth/` |
 | **Feature flags** | `services/platform-service/` | `src/modules/flags/` — FNV-1a hash for deterministic rollout |
@@ -225,6 +254,7 @@ See full audit: [`docs/design-system/DESIGN_SYSTEM_AUDIT_2026-03-03.md`](docs/de
 | **Extraction Python** | `services/extraction-service/` | `python/src/` — LangExtract sidecar (FastAPI :4006), extractor, task registry, language detection |
 | **Extraction package** | `packages/extraction/` | `src/index.ts` — `createExtractionClient()`, shared types for consumers |
 | **Monitoring** | `services/monitoring/` | `health-check.ts`, `loki/`, `grafana/` |
+| **Local LLM dashboard** | `__LOCAL_LLMs/dashboard/` | `src/app/api/ollama/chat/route.ts` — shared router + Ollama bridge, `src/app/lib/llm-router.ts` — built package re-export |
 
 ### Dashboard Consumers (via `file:` refs)
 
@@ -238,6 +268,17 @@ The following dashboards consume `@bytelyst/*` packages:
 
 **Prerequisite:** Run `pnpm build` in this repo before running `npm install` in any dashboard.
 
+### Local Mission Control Dashboard
+
+`__LOCAL_LLMs/dashboard/` is a standalone Next.js app for local Ollama usage. It is not part of the pnpm workspace, but it consumes `@bytelyst/llm-router`.
+
+- Build source of truth: `packages/llm-router/`
+- Dashboard bridge: `__LOCAL_LLMs/dashboard/src/app/lib/llm-router.ts`
+- Chat routing path: `__LOCAL_LLMs/dashboard/src/app/api/ollama/chat/route.ts`
+- The dashboard must consume the built package output from `packages/llm-router/dist/`
+- `__LOCAL_LLMs/dashboard/package.json` runs `predev` and `prebuild` hooks to build `@bytelyst/llm-router` automatically
+- Auto model selection should happen on the server route, not in the React client
+
 ## 6. How to Run Things
 
 ```bash
@@ -263,10 +304,15 @@ pnpm --filter @lysnrai/extraction-service dev # port 4005
 # ── Run tests for one workspace ────────────────────
 pnpm --filter @lysnrai/platform-service test
 pnpm --filter @bytelyst/errors test
+pnpm --filter @bytelyst/llm-router test
 
 # ── Generate design tokens ─────────────────────────
 pnpm --filter @bytelyst/design-tokens generate
 
+# ── Local LLM dashboard ────────────────────────────
+cd __LOCAL_LLMs/dashboard && npm run dev    # predev builds @bytelyst/llm-router first
+cd __LOCAL_LLMs/dashboard && npm run build  # prebuild builds @bytelyst/llm-router first
+
 # ── Docker Compose (all services + monitoring) ─────
 docker compose up -d
 docker compose down
@@ -363,6 +409,7 @@ DEFAULT_PRODUCT_ID=lysnrai
 @bytelyst/blob         ← peers: @azure/storage-blob
 @bytelyst/extraction   ← no deps (types + client factory)
 @bytelyst/monitoring   ← no deps (health-check utilities)
+@bytelyst/llm-router   ← no deps (deterministic provider/model router + local Ollama plan support)
 @bytelyst/fastify-core ← deps: fastify; createServiceApp() + startService()
 @bytelyst/logger       ← deps: pino
 @bytelyst/testing      ← deps: vitest; shared mocks + Fastify inject helpers
@@ -418,5 +465,7 @@ Each product backend uses `@bytelyst/*` packages via `file:` refs and follows th
 3. **Don't use `dependencies` for heavy libs in packages** — use `peerDependencies` so consumers control the version.
 4. **Don't mix up package scopes** — libraries are `@bytelyst/*`, services are `@lysnrai/*`.
 5. **Don't run `npm` commands** — this is a pnpm workspace. Always use `pnpm`.
+   Exception: `__LOCAL_LLMs/dashboard/` is intentionally npm-managed and may use `npm run dev/build/start`.
 6. **Don't modify generated files directly** — edit `bytelyst.tokens.json` and re-run the generator.
-7. **Build packages before testing services** — service tests may import from `@bytelyst/*` dist. Run `pnpm build` first if you get import errors.
+7. **Build packages before testing services or non-workspace dashboards** — service tests and `__LOCAL_LLMs/dashboard` may import from `@bytelyst/*` `dist/`. Run `pnpm build` first if you get import errors.
+8. **Don't duplicate LLM routing logic** — update `@bytelyst/llm-router` first, then have consumers call it.
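
Item 8 asks consumers to call one shared router instead of re-implementing selection. As a rough illustration of what "deterministic provider/model selection with fallback" could look like, here is a minimal TypeScript sketch. Every name in it (`ProviderEntry`, `RoutePlan`, `planRoute`, the provider ids and model names) is hypothetical; the patch does not show the actual API of `@bytelyst/llm-router`, so treat this as an assumption-laden example rather than the package's implementation.

```typescript
// Hypothetical shapes for illustration only; not the real @bytelyst/llm-router API.
interface ProviderEntry {
  id: string;
  model: string;
  priority: number; // lower value is tried first
  healthy: boolean; // result of some earlier health probe (assumed)
}

interface RoutePlan {
  primary: ProviderEntry;
  fallbacks: ProviderEntry[];
}

// Deterministic: the same registry snapshot always yields the same plan.
// Unhealthy providers are skipped; ties on priority are broken by id.
function planRoute(registry: readonly ProviderEntry[]): RoutePlan {
  const candidates = registry
    .filter((p) => p.healthy)
    .sort((a, b) => a.priority - b.priority || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));

  const primary = candidates[0];
  if (primary === undefined) {
    throw new Error("no healthy provider available");
  }
  return { primary, fallbacks: candidates.slice(1) };
}

// Example registry snapshot (made-up providers and models).
const registry: ProviderEntry[] = [
  { id: "ollama-local", model: "model-local", priority: 1, healthy: true },
  { id: "remote-a", model: "model-a", priority: 2, healthy: true },
  { id: "remote-b", model: "model-b", priority: 0, healthy: false },
];

const plan = planRoute(registry);
console.log(plan.primary.id, plan.fallbacks.map((p) => p.id));
```

Under this sketch, a consumer such as a server route would call `planRoute` once per request and walk `fallbacks` in order if the primary fails, which keeps the selection rules in one place and the UI a thin client.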