docs(agents): sync local llm routing guidance

root 2026-03-14 06:26:08 +00:00
parent 2b4fccb744
commit bfecc9f95d


@ -18,6 +18,7 @@
| **Runtime** | Node.js (ESM), TypeScript 5.7+ |
| **Package manager** | pnpm (workspace) |
| **Test runner** | Vitest |
| **Prototype stack** | Docker Compose with Cosmos emulator, Azurite, Mailpit, Traefik, Loki, and Grafana |
## 2. Monorepo Layout
@ -36,6 +37,7 @@ learning_ai_common_plat/
│ ├── blob/ # Azure Blob Storage client + SAS token helpers
│ ├── extraction/ # createExtractionClient(), shared types for extraction consumers
│ ├── monitoring/ # Health-check utilities, Loki/Grafana helpers
│ ├── llm-router/ # Deterministic LLM router: provider/model selection, fallback, health
│ └── design-tokens/ # Cross-platform tokens (JSON → CSS/TS/Kotlin/Swift)
│ ├── tokens/bytelyst.tokens.json # ← CANONICAL SOURCE
│ ├── scripts/generate.ts # Token generator
@ -51,15 +53,35 @@ learning_ai_common_plat/
│ ├── extraction-service/ # LangExtract text extraction + Python sidecar (port 4005)
│ └── monitoring/ # Loki + Grafana config, health-check script
├── docs/ # Architecture docs, roadmap, analysis
├── package.json # Root scripts: build, test, typecheck, clean
├── package.json # Root scripts: build, test, typecheck, clean, prototype:self-test
├── pnpm-workspace.yaml # Workspace: packages/* + services/* + dashboards/*
├── tsconfig.base.json # Shared TS config (ES2022, NodeNext, strict)
├── vitest.config.ts # Root vitest config (passWithNoTests)
├── docker-compose.yml # Full service stack + Loki + Grafana + Traefik
├── docker-compose.yml # Prototype stack: services + Cosmos emulator + Azurite + Mailpit + monitoring
├── .env.example # Required env vars template
└── .editorconfig # Editor settings
```
## 2A. Prototype Runtime Conventions
- The current single-host prototype is defined by [`docker-compose.yml`](docker-compose.yml).
- Prototype state currently lives in:
- Cosmos DB Emulator
- Azurite blob storage
- Mailpit SMTP sandbox
- Reuse the existing prototype diagnostics instead of adding parallel health endpoints:
- `GET /health`
- `GET /api/health/dependencies`
- `GET /api/self-test`
- `GET /api/self-test.json`
- The canonical host-side smoke test command is `pnpm prototype:self-test`.
- The underlying implementation is [`scripts/prototype-self-test.sh`](scripts/prototype-self-test.sh). Extend it instead of creating duplicate one-off prototype scripts.
- If you change prototype infra, also update:
- [`README.md`](README.md)
- [`docs/PROTOTYPE_DEPLOYMENT.md`](docs/PROTOTYPE_DEPLOYMENT.md)
- [`docker-compose.yml`](docker-compose.yml)
- [`.env.example`](.env.example) when tracked defaults change
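
A minimal sketch of how a caller might consume the existing diagnostics surfaces listed above rather than adding new ones. The four paths come from this section; the base URL, the `Fetcher` type, and the function names are hypothetical and only illustrate the idea. The canonical smoke test remains `pnpm prototype:self-test`.

```typescript
// Paths are taken from the diagnostics list in this section.
const DIAGNOSTIC_PATHS = [
  "/health",
  "/api/health/dependencies",
  "/api/self-test",
  "/api/self-test.json",
] as const;

// Hypothetical result shape for this sketch.
type ProbeResult = { url: string; ok: boolean; status: number | null };

// Minimal fetch-like contract so the sketch can be exercised without a network.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

// Build absolute URLs for every shared diagnostics endpoint.
function diagnosticUrls(baseUrl: string): string[] {
  const trimmed = baseUrl.replace(/\/+$/, "");
  return DIAGNOSTIC_PATHS.map((path) => `${trimmed}${path}`);
}

// Probe each endpoint in order; a thrown error is recorded as a failed probe.
async function probeDiagnostics(
  baseUrl: string,
  fetcher: Fetcher,
): Promise<ProbeResult[]> {
  const results: ProbeResult[] = [];
  for (const url of diagnosticUrls(baseUrl)) {
    try {
      const res = await fetcher(url);
      results.push({ url, ok: res.ok, status: res.status });
    } catch {
      results.push({ url, ok: false, status: null });
    }
  }
  return results;
}
```

In a real run the `fetcher` argument could be the global `fetch`, and the base URL would be whatever host and port the prototype stack exposes; neither is specified here.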
## 3. Tech Stack Rules
### TypeScript (all packages + services)
@ -87,6 +109,7 @@ learning_ai_common_plat/
- Use `exports` field in `package.json` (not just `main`)
- Peer dependencies for heavy/shared deps (`@azure/cosmos`, `jose`, `bcryptjs`, `react`, `zod`)
- Workspace deps: `"@bytelyst/errors": "workspace:*"`
- If a dashboard or app consumes a local package outside the pnpm workspace, build the package first and import from `dist/`, not raw `src/`
## 4. Coding Conventions
@ -95,9 +118,13 @@ learning_ai_common_plat/
- Every Cosmos document MUST include a `productId` field
- Every REST endpoint MUST validate input with Zod schemas
- Every service MUST propagate `x-request-id` headers
- Prototype infra changes MUST preserve `pnpm prototype:self-test`
- Prototype dependency checks belong in the shared status/self-test surfaces, not ad hoc endpoints
- Use `PRODUCT_ID` from `@bytelyst/config` (`loadProductIdentity()`) — never hardcode
- Services use self-contained Zod config schemas in `src/lib/config.ts` (avoids zod version mismatch with shared packages)
- Services re-export from `@bytelyst/*` in their `src/lib/` files (`errors.ts`, `cosmos.ts`, `product-config.ts`)
- For LLM routing, prefer `@bytelyst/llm-router` as the source of truth; do not introduce parallel routing logic unless explicitly required
- `__LOCAL_LLMs/dashboard` uses server-side routing in `src/app/api/ollama/chat/route.ts`; preserve that pattern and keep the UI as a thin client
- Commit messages: `type(scope): description` — types: `feat`, `fix`, `docs`, `refactor`, `test`, `chore`
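
The commit message convention above can be checked mechanically. The following is a sketch only: the allowed types are taken from the rule, while the parser name, the requirement that the scope contain no whitespace or parentheses, and the single-space separator after the colon are assumptions of this example.

```typescript
// Allowed types, copied from the convention above.
const COMMIT_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore"] as const;
type CommitType = (typeof COMMIT_TYPES)[number];

interface ParsedCommit {
  type: CommitType;
  scope: string;
  description: string;
}

// type(scope): description
// Assumption: scope is mandatory and has no whitespace or parentheses.
const COMMIT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})\\(([^()\\s]+)\\): (\\S.*)$`,
);

// Returns the parsed parts, or null when the subject does not match.
function parseCommitSubject(subject: string): ParsedCommit | null {
  const match = COMMIT_PATTERN.exec(subject);
  if (!match) return null;
  return {
    type: match[1] as CommitType,
    scope: match[2] ?? "",
    description: match[3] ?? "",
  };
}
```

For example, `parseCommitSubject("docs(agents): sync local llm routing guidance")` yields type `docs` and scope `agents`, while a subject such as `update stuff` yields `null`.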
### MUST NOT do
@ -106,6 +133,7 @@ learning_ai_common_plat/
- Never use `any` type — use Zod inference or explicit types
- Never hardcode secrets or API keys
- Secret guardrails: Husky runs `scripts/secret-scan-staged.sh` (pre-commit) and `scripts/secret-scan-repo.sh` (pre-push). See `docs/WINDSURF/CODEX_SESSION_SUMMARY_AND_PLAYBOOK.md`.
- Never commit real emulator keys or blob account keys in tracked files; keep placeholders in `.env.example`
- Never modify tests to make them pass — fix the actual code
- Never delete existing comments or documentation unless explicitly asked
- Never add emojis to code unless explicitly asked
@ -201,6 +229,7 @@ See full audit: [`docs/design-system/DESIGN_SYSTEM_AUDIT_2026-03-03.md`](docs/de
| **JWT / Auth** | `packages/auth/` | `src/index.ts`: `signJwt()`, `verifyJwt()`, `hashPassword()`, `verifyPassword()`, auth middleware |
| **API client** | `packages/api-client/` | `src/index.ts`: `createApiClient()` with token injection |
| **React auth** | `packages/react-auth/` | `src/index.ts`: `createAuthContext()` factory (provider + hook) |
| **LLM router** | `packages/llm-router/` | `src/router.ts` — route/plan/fallback, `src/registry.ts` — providers incl. local Ollama |
| **Design tokens** | `packages/design-tokens/` | `tokens/bytelyst.tokens.json` (source), `scripts/generate.ts` (generator), `generated/` (output) |
| **Auth / JWT issue** | `services/platform-service/` | `src/modules/auth/` |
| **Feature flags** | `services/platform-service/` | `src/modules/flags/` — FNV-1a hash for deterministic rollout |
@ -225,6 +254,7 @@ See full audit: [`docs/design-system/DESIGN_SYSTEM_AUDIT_2026-03-03.md`](docs/de
| **Extraction Python** | `services/extraction-service/` | `python/src/` — LangExtract sidecar (FastAPI :4006), extractor, task registry, language detection |
| **Extraction package** | `packages/extraction/` | `src/index.ts`: `createExtractionClient()`, shared types for consumers |
| **Monitoring** | `services/monitoring/` | `health-check.ts`, `loki/`, `grafana/` |
| **Local LLM dashboard** | `__LOCAL_LLMs/dashboard/` | `src/app/api/ollama/chat/route.ts` — shared router + Ollama bridge, `src/app/lib/llm-router.ts` — built package re-export |
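
The feature flags row above mentions an FNV-1a hash for deterministic rollout. As a rough illustration of that technique (not the code in `src/modules/flags/`), the sketch below implements standard 32-bit FNV-1a and maps a key to a rollout bucket. The `flagKey:userId` key format, the 100-bucket scheme, and the function names are assumptions of this example.

```typescript
// Standard 32-bit FNV-1a over the UTF-8 bytes of the input.
function fnv1a32(input: string): number {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i] ?? 0;
    hash = Math.imul(hash, 0x01000193); // FNV prime (32-bit), wraps mod 2^32
  }
  return hash >>> 0; // normalize to an unsigned 32-bit integer
}

// Hypothetical rollout check: the same flag and user always land in the
// same bucket, so the decision is stable across requests and processes.
function isInRollout(
  flagKey: string,
  userId: string,
  rolloutPercent: number,
): boolean {
  const bucket = fnv1a32(`${flagKey}:${userId}`) % 100; // 0..99
  return bucket < rolloutPercent;
}
```

Because the bucket depends only on the hashed key, raising `rolloutPercent` from 10 to 20 keeps every user who was already enabled and adds new ones, which is the usual reason to prefer hashing over random sampling.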
### Dashboard Consumers (via `file:` refs)
@ -238,6 +268,17 @@ The following dashboards consume `@bytelyst/*` packages:
**Prerequisite:** Run `pnpm build` in this repo before running `npm install` in any dashboard.
### Local Mission Control Dashboard
`__LOCAL_LLMs/dashboard/` is a standalone Next.js app for local Ollama usage. It is not part of the pnpm workspace, but it consumes `@bytelyst/llm-router`.
- Build source of truth: `packages/llm-router/`
- Dashboard bridge: `__LOCAL_LLMs/dashboard/src/app/lib/llm-router.ts`
- Chat routing path: `__LOCAL_LLMs/dashboard/src/app/api/ollama/chat/route.ts`
- The dashboard must consume the built package output from `packages/llm-router/dist/`
- `__LOCAL_LLMs/dashboard/package.json` runs `predev` and `prebuild` hooks to build `@bytelyst/llm-router` automatically
- Auto model selection should happen on the server route, not in the React client
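
To illustrate the last point, here is a sketch of model resolution kept entirely on the server side, so the React client only sends a requested model (or `"auto"`) and renders the reply. This is a hypothetical stand-in, not the `@bytelyst/llm-router` API: `ModelCandidate`, `resolveModel`, and the `"auto"` sentinel are assumed names, and in the real route this decision should be delegated to the shared package.

```typescript
// Hypothetical candidate shape; the real registry lives in @bytelyst/llm-router.
interface ModelCandidate {
  provider: string;
  model: string;
  healthy: boolean;
}

// Deterministic resolution: honor an explicit, healthy request; otherwise
// fall back to the first healthy candidate in priority order.
function resolveModel(
  requested: string,
  candidates: ModelCandidate[],
): ModelCandidate | null {
  if (requested !== "auto") {
    for (const candidate of candidates) {
      if (candidate.model === requested && candidate.healthy) {
        return candidate;
      }
    }
  }
  for (const candidate of candidates) {
    if (candidate.healthy) return candidate;
  }
  return null; // nothing healthy: the route should return an error response
}
```

Keeping this logic in `src/app/api/ollama/chat/route.ts` means the client never needs the candidate list or health data, and the selection rules can change in one place.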
## 6. How to Run Things
```bash
@ -263,10 +304,15 @@ pnpm --filter @lysnrai/extraction-service dev # port 4005
# ── Run tests for one workspace ────────────────────
pnpm --filter @lysnrai/platform-service test
pnpm --filter @bytelyst/errors test
pnpm --filter @bytelyst/llm-router test
# ── Generate design tokens ─────────────────────────
pnpm --filter @bytelyst/design-tokens generate
# ── Local LLM dashboard ────────────────────────────
cd __LOCAL_LLMs/dashboard && npm run dev # predev builds @bytelyst/llm-router first
cd __LOCAL_LLMs/dashboard && npm run build # prebuild builds @bytelyst/llm-router first
# ── Docker Compose (all services + monitoring) ─────
docker compose up -d
docker compose down
@ -363,6 +409,7 @@ DEFAULT_PRODUCT_ID=lysnrai
@bytelyst/blob ← peers: @azure/storage-blob
@bytelyst/extraction ← no deps (types + client factory)
@bytelyst/monitoring ← no deps (health-check utilities)
@bytelyst/llm-router ← no deps (deterministic provider/model router + local Ollama plan support)
@bytelyst/fastify-core ← deps: fastify; createServiceApp() + startService()
@bytelyst/logger ← deps: pino
@bytelyst/testing ← deps: vitest; shared mocks + Fastify inject helpers
@ -418,5 +465,7 @@ Each product backend uses `@bytelyst/*` packages via `file:` refs and follows th
3. **Don't use `dependencies` for heavy libs in packages** — use `peerDependencies` so consumers control the version.
4. **Don't mix up package scopes** — libraries are `@bytelyst/*`, services are `@lysnrai/*`.
5. **Don't run `npm` commands** — this is a pnpm workspace. Always use `pnpm`.
Exception: `__LOCAL_LLMs/dashboard/` is intentionally npm-managed and may use `npm run dev/build/start`.
6. **Don't modify generated files directly** — edit `bytelyst.tokens.json` and re-run the generator.
7. **Build packages before testing services** — service tests may import from `@bytelyst/*` dist. Run `pnpm build` first if you get import errors.
7. **Build packages before testing services or non-workspace dashboards** — service tests and `__LOCAL_LLMs/dashboard` may import from `@bytelyst/*` `dist/`. Run `pnpm build` first if you get import errors.
8. **Don't duplicate LLM routing logic** — update `@bytelyst/llm-router` first, then have consumers call it.