# Engineering Review & Scorecard

> Evidence-based, read-only review of the entire `~/code/mygh` workspace (~38 git
> repos) per `docs/prompts/engineering-review-scorecard.md`. Generated 2026-05-30.
>
> **Method:** static inspection only: file reads, `grep`, and read-only `git`.
> No builds, installs, or test runs were executed (that would mutate the trees),
> so dynamic results (pass/fail, coverage %) are inferred from config + test
> counts, not measured. See §9 for limits. Per-repo evidence was gathered by
> parallel read-only agents and spot-verified.

---

## 1. Executive Summary

**What this is:** a single developer running a surprisingly coherent *product ecosystem*. A dozen product apps (clock, notes, fastgap, peakpulse, flowmonk, efforise, jarvis_jr, trails, talk2obsidian, local-memory-gpt, voice-ai-agent, multimodal/mindlyst) share one platform monorepo (`learning_ai_common_plat`, 36 `@bytelyst/*` packages, auth/Cosmos/design-tokens), are orchestrated by a single `docker-compose.ecosystem.yml` (~20 services), and are driven heavily by AI agents through a homegrown `agent-queue`. This is far more disciplined than a typical "learning" folder.

**Overall maturity:** **Beta-quality ecosystem.** A core of genuinely production-grade repos (`learning_ai_notes`, `learning_ai_trails`, `oss/claw-code`/`claw-cowork`, `learning_ai_clock`, `learning_ai_fastgap`) is surrounded by a long tail of MVP/prototype repos with thin or zero tests and no CI.

**Biggest strengths (top 3)**

1. **Strong platform discipline.** Shared `@bytelyst/*` packages, a repeated `types.ts → repository.ts → routes.ts` backend pattern, Cosmos partition-key conventions (`/userId`, `productId` on every doc), per-repo `AGENTS.md`, conventional commits, and field-level encryption (`field-encrypt.ts`) recur across the best repos.
2. **Clean security posture for a personal workspace.** Secret scans across all repos surfaced **no real committed production secrets**, only `.env.example` placeholders, the public Azure Cosmos emulator key, dev `JWT_SECRET=dev-...` values, and Azure Key Vault *references*. `.gitignore` is present nearly everywhere.
3. **Top repos are legitimately good.** `notes`, `trails`, and the two Rust `claw-*` repos show modular architecture, real test suites (28–80+ files), CI, multi-stage Docker, and strict typing (`0` `as any` in several backends).

**Biggest risks (top 3)**

1. **CI is the weak link.** GitHub Actions is **disabled (billing)** on the platform monorepo `learning_ai_common_plat` and on `voice_ai_agent` (`*.disabled` workflows); ~15 repos have **no CI at all**. The shared platform that everything depends on has no automated gate.
2. **Process churn dirties the repos.** A live `agent-queue` daemon + `devin` agents in `--permission-mode dangerous` were actively writing to repos; ~14 repos were found dirty with uncommitted work, several behind `origin`. Work is at risk of being lost or silently diverging.
3. **Testing is bimodal.** Excellent in the flagship repos, **zero** in many others (`productivity_web`, `webui_copilot`, `pytorch_todo_predictor`, `server-survival`, `sidecar_setup`, `mac_tooling`). No portfolio-wide coverage signal.

**Is the dev style helping or hurting velocity?** **Net helping, but fraying at the edges.** The platform/agent approach clearly lets one person ship a dozen apps; that's the upside. The drag is operational: disabled CI, constantly-dirty working trees, abandoned worktrees, and "AI-generated scaffolding smell" in a few repos (e.g. `magic_clipboard_mgr`'s 50+ service files + phase-named test buckets). Tightening the commit/CI loop would convert a lot of that churn back into velocity.

---

## 2. Overall Score Sheet

Scores are 1–10 (1 = critical/broken, 10 = production-grade), aggregated across the ~30 code repos (pure docs/usage repos excluded from category math).

| Category | Score | Justification (evidence) |
|---|---|---|
| A. Repository organization | **8** | Consistent `@bytelyst/*` + `types/repository/routes` pattern, per-repo `AGENTS.md`, clear monorepos; minus for ~14 dirty trees, stray worktrees, a few unstructured repos. |
| B. Code quality | **7** | Flagships: strict TS, `0` `as any`, no `console.log`, Zod validation. Tail: `print()`-heavy (`2nd_brain` 60+, `mac_tooling` 200+), `any` leaks, AI-scaffold smell (`magic_clipboard_mgr`). |
| C. Architecture | **8** | Genuinely strong: shared platform, datastore abstraction, deterministic engines (`flowmonk` scheduler), risk-scoring (`trails`), MCP integrations, clean native/web boundaries. |
| D. DevOps & deployment | **6** | Ecosystem compose orchestrates ~20 services, multi-stage Dockerfiles common, but **CI disabled on the platform repo**, ~15 repos with no CI, and **0 healthchecks** in `docker-compose.ecosystem.yml`. |
| E. Testing | **6** | Bimodal: `notes`/`fastgap`/`clock`/`trails`/`claw-*` have 28–600+ tests; many repos have 0. E2E frequently `continue-on-error: true`. No measured coverage. |
| F. Security | **8** | No real committed secrets anywhere; field encryption + Key Vault refs in the mature repos; `.gitignore`/`.env.example` discipline. Minus for `NODE_TLS_REJECT_UNAUTHORIZED=0` in some Docker, thin input-validation in prototypes. |
| G. Product readiness | **7** | Several apps runnable end-to-end (web+backend); mobile/native surfaces often partial; CI-disabled + flaky E2E hold back true "launchable". |
| H. AI-agent practices | **6** | Impressive tooling (`agent-queue`, profiles, job briefs, `AGENTS.md`), but guardrails are weak: `--permission-mode dangerous`, agents dirtying live repos, duplicate work landing upstream, no enforced test-before-commit. |
| I. Personal workflow | **6** | Good: conventional commits, auto `backup-main-*` branches, `AGENTS.md`. Bad: ~14 dirty repos, branches behind `origin`, abandoned worktrees, no unified release/issue discipline. |
| **Weighted overall** | **≈ 7.0** | Beta-quality. See weighting below. |

**Weighting & rationale:** Security (F) and Product readiness (G) weighted ~1.5×, Testing (E) and DevOps (D) ~1.25× (these gate real-world reliability); A/B/C/H/I at 1.0×. Taking those approximate multipliers at face value, the weighted mean is (35 + 15 + 22.5) / 10.5 ≈ 6.9, reported here as ≈ 7.0. The strong architecture/security pull the number up; the weak CI/testing pull it back to a solid-but-not-shippable **~7.0**.

---

## 3. Per-Product / Per-Repo Breakdown

Maturity legend: **PROD** = production-grade, **BETA**, **MVP**, **PROTO** = prototype/learning, **REF** = docs/reference (not code).

### Flagship products (platform-integrated)

| Repo | Stack | Tests | CI | Docker | Maturity |
|---|---|---|---|---|---|
| `learning_ai_notes` | Fastify5 + Next16 + Expo, Cosmos | 80+ files | ✓ gitea | ✓ | **BETA→PROD** |
| `learning_ai_trails` | Fastify5 + Next16 + SDK, Cosmos | 28 files | ✓ gitea | ✓ | **PROD** |
| `learning_ai_clock` | Next16 PWA + iOS/Android, Fastify | 662 total | ✓ gitea | ✓ | **BETA** |
| `learning_ai_fastgap` | Expo + Next16 + Fastify | 700+ total | ✓ gitea (7 jobs) | ✓ | **BETA** |
| `learning_ai_peakpulse` | SwiftUI + Fastify | 26 files | ✓ (backend) | ✓ | **BETA→PROD** |
| `learning_ai_flowmonk` | Next16 + Fastify + Expo | 102 backend | ✓ gitea | ✓ | **BETA** |
| `learning_ai_efforise` | React/Vite + Fastify + RN | ~9 backend | ✓ gitea | ✓ | **MVP** |
| `learning_ai_dev_intelli` | Fastify + Next16, GitHub API | 52 backend | ✓ gitea | ✓ | **MVP** |
| `learning_ai_local_memory_gpt` | Fastify + Next16, SQLite/Ollama | 122 | ✓ gitea | ✓ | **MVP** |
| `learning_ai_talk2obsidian` | Fastify + Vite, SQLite/Ollama | 8 | ✗ | ✓ | **BETA** |
| `learning_voice_ai_agent` | Python + Fastify + Next + KMP | 463+ | ⚠ disabled | ✓ | **BETA** |
| `learning_multimodal_memory_agents` (MindLyst) | KMP + Next + Fastify | 33 | ⚠ disabled | ✓ | **MVP** |
| `learning_ai_jarvis_jr` | SwiftUI + Next + Android | ~13 web | ✓ gitea | ✓ | **ALPHA/BETA** |
| `learning_ai_auth_app` | iOS/watchOS/Android (spec+UI) | 0 (here) | ✗ | ✗ | **MVP (spec)** |

### Platform & infra

| Repo | Stack | Notes | Maturity |
|---|---|---|---|
| `learning_ai_common_plat` | pnpm monorepo, 36 `@bytelyst/*`, Fastify, Cosmos | ~466k LOC; full auth (OAuth/MFA/passkeys/SAML); **GH Actions disabled (billing)**, gitea CI active | **PROD** |
| `learning_ai_devops_tools` | Bash + Python + Node (this repo) | GitHub admin scripts, `agent-queue`, Hermes dashboard; thin tests | **PROD (scripts) / MVP (dash)** |
| `learning_ai_k8s_streaming` | Python FastAPI + Helm | Use-case registry, HPA/probes, load tools | **BETA→PROD** |
| `learning_ai_local_llms` | Next16 dashboard + Python TTS | Ollama mission-control; 57 tests | **BETA** |

### Tools / OSS / native

| Repo | Stack | Notes | Maturity |
|---|---|---|---|
| `oss/learning_ai_claw-code-oss` | Rust workspace (10+ crates) | `unsafe_code = "forbid"`, clippy pedantic, 40+ test files | **PROD** |
| `oss/learning_ai_claw-cowork` | Rust + Tauri + Python | 65+ test files, E2E, Docker | **PROD** |
| `learning_magic_terminal` | **Rust** | README+CI+many tests; command-blocks v2; dirty(5) | **BETA** |
| `learning_notif_scanr` | **Swift** (Package.swift) | tests present, **no CI**, no Docker | **MVP** |
| `ios/learning_swift_hourglass` | Swift/SwiftUI macOS | MVVM, 2 test files, no CI | **MVP** |
| `learning_ai_magic_clipboard_mgr` | Swift/macOS, GRDB | 24 tests but 50+ services + phase-named tests (AI-scaffold smell) | **MVP** |
| `learning_ai_mac_tooling` | Python FastAPI + React | forensics toolkit; **0 tests**, 200+ `print()`, 3k-line files | **PROTO** |
| `copilot/learning_ai_uxui_web` | Next16 + MSW + Playwright | component showcase, Lighthouse CI | **MVP** |
| `learning_ai_productivity_web` | Next15, client-only | clean registry pattern, **0 tests** | **MVP** |
| `learning_ai_webui_copilot` | Python FastAPI + LangChain | rules/policy engines, **0 tests, no Docker/CI** | **MVP** |
| `learning_agent_monitoring_fx` | npm monorepo + KMP | agent/ingest/web work, native WIP, 54 `console.log`, TODOs | **BETA** |
| `learning_agentic_tools_portal` | Python Flask + uv | minimal (1 endpoint, 1 test), has CI | **PROTO** |
| `learning_server-survival-devops-web` | Vanilla JS + Three.js | playable game, **0 tests** | **MVP** |
| `learning_pytorch_todo_predictor` | Python + PyTorch | educational, **0 tests**, **no upstream** | **PROTO** |
| `learning_sidecar_setup` | Next16 scaffold + py stub | scaffolding only, **no upstream**, dirty(8) | **PROTO** |
| `learning_claude_code_setup` | Bash + markdown | setup notes/scripts; dirty(1) | **REF** |
| `learning_github_copilot` | Markdown (CLI/SDK docs) | reference only | **REF** |
| `learning_python_sandbox` | Python | LeetCode/learning; dirty(1) | **PROTO** |
| `learning_ai_materials` | Docs | NBA handover package | **REF** |
| `learning_windsurf_setup` | Usage logs | not a codebase | **N/A** |

---

## 4. Findings by Dimension

### A. Repository organization

- **Fact:** Strong, repeated conventions: `AGENTS.md`/`CLAUDE.md` per repo, pnpm workspaces, `types→repository→routes` backend modules, `docs/` with PRD/ROADMAP.
- **Fact:** ~14 repos dirty at audit time; abandoned `worktrees/` (now cleaned); some repos behind `origin`. Two repos (`pytorch_todo_predictor`, `sidecar_setup`) have **no git upstream**.
- **Reco:** Adopt a "clean tree or it doesn't exist" rule (see §8). Add upstreams for the two orphan repos or mark them clearly local.

### B. Code quality

- **Fact:** Best repos enforce strict TS (`0` `as any` in `notes`, `trails`, `local_memory_gpt` backends), no `console.log` (Fastify logger), Zod validation.
- **Fact:** `learning_ai_2nd_brain` has 60+ `print()`; `mac_tooling` has 200+ and 3k+-line files (`network_transfer_audit.py`, 3521 lines); `magic_clipboard_mgr` shows AI-scaffold smell (50+ service files, `Phase5–8`/`RemainingQATests`).
- **Reco:** Lint-gate `print()`/`console.log` in the Python/TS repos; split the 3k-line files; audit `magic_clipboard_mgr` for stubbed vs real services.

### C. Architecture

- **Fact:** Clear separation and reuse: shared auth/datastore/design-tokens, deterministic scheduler (`flowmonk`), risk engine (`trails`), use-case registry (`k8s_streaming`), MCP tool servers, Rust crate boundaries (`claw-*`).
- **Reco:** This is the strongest dimension; protect it by keeping product domains out of `common_plat` and vice-versa.

### D. DevOps & deployment

- **Fact:** `docker-compose.ecosystem.yml` wires ~20 services (10 backends + 10 webs) + infra (Cosmos emulator, Azurite, Traefik, Loki, Grafana, MCP); 30 `restart:` policies, 24 `build:` contexts, but **0 `healthcheck:` blocks**.
- **Fact:** GH Actions disabled on `common_plat` + `voice_ai_agent`; ~15 repos have no CI.
- **Reco (P1):** Add healthchecks + `depends_on: condition: service_healthy` to the ecosystem compose; re-enable or fully migrate CI to gitea self-hosted.

### E. Testing

- **Fact:** `fastgap` (~700), `clock` (662), `notes` (80+ files), `voice_ai_agent` (463+), `claw-cowork` (65+ files) are excellent; ~8 repos have 0 tests.
- **Fact:** E2E is often `continue-on-error: true` (`fastgap`, `flowmonk`, `jarvis_jr`, `local_memory_gpt`), i.e. not actually gating.
- **Reco:** Set a per-repo minimum (smoke + happy-path) and stop masking E2E failures with `continue-on-error` once stabilized.

### F. Security

- **Fact:** No real committed secrets across all repos. Matches were `.env.example` placeholders, the public Cosmos emulator key (`C2y6yDjf5/R...`), `dev-*` JWT secrets, and Azure Key Vault references.
- **Fact:** Field encryption (AES-256-GCM) in `clock`/`notes`/`dev_intelli`; `unsafe_code = "forbid"` in the Rust repos.
- **Watch:** `NODE_TLS_REJECT_UNAUTHORIZED=0` seen in some Docker setups; thin input validation / no rate-limiting in the prototype Python apps.

### G. Product readiness

- **Fact:** Web+backend pairs generally run end-to-end; native/mobile surfaces (iOS/Android/KMP) are frequently partial or scaffolded.
- **Reco:** Pick 2–3 flagships (`notes`, `trails`, `clock`) and drive them to a true launch checklist; treat the rest explicitly as experiments.

### H. AI-agent practices

- **Fact:** Sophisticated `agent-queue` (profiles, job briefs, lifecycle dirs, Node dashboard), genuinely advanced for a solo setup.
- **Fact:** Guardrails are weak: agents run `--permission-mode dangerous`, write to live working trees (which caused the dirty-repo churn), and **landed duplicate work** (during this session a rebase auto-dropped 2 commits already pushed upstream).
- **Reco:** Standardize the agent task contract (§8): one task = one branch = clean tree → tests → commit → push; ignore runtime/queue state in git (already fixed in this repo this session).

### I. Personal engineering workflow

- **Fact:** Conventional commits, auto `backup-main-*` branches (nice safety net), `AGENTS.md` discipline.
- **Fact:** Too many long-lived dirty trees and behind-`origin` branches; no visible issue tracker or release cadence.
- **Reco:** A weekly "sync sweep" (rebase+push all clean repos, list dirty). You effectively did this manually this session; automate it.

---

## 5. Prioritized Action Plan

**P0: now (correctness / risk)**

1. **Re-establish a working CI gate on `learning_ai_common_plat`** (everything depends on it). Either fix GH Actions billing or make gitea CI the enforced gate. *(M, common_plat)*
2. **Resolve the ~14 dirty repos**: review + commit or discard intentionally; add upstreams for `pytorch_todo_predictor` & `sidecar_setup`. *(M, workspace)*
3. **Decide the agent-queue daemon policy** so it doesn't write to live trees uncontrolled (it was running in `dangerous` mode). *(S, devops_tools)*

**P1: this week**

4. Add **healthchecks** to `docker-compose.ecosystem.yml` (0 today) + ordered `depends_on`. *(M, common_plat/ecosystem)*
5. Stop masking E2E with `continue-on-error: true` once stabilized; make at least smoke E2E gating. *(M, fastgap/flowmonk/jarvis_jr)*
6. Replace `print()` with logging in `2nd_brain` (60+) and `mac_tooling` (200+). *(S–M)*

**P2: this month**

7. Add minimum test suites to the 0-test repos that matter (`productivity_web`, `webui_copilot`, `agent_monitoring_fx`). *(M)*
8. Audit `magic_clipboard_mgr` for dead/stubbed services (50+ files). *(M)*
9. Split 3k-line files in `mac_tooling`. *(M)*
10. Remove `NODE_TLS_REJECT_UNAUTHORIZED=0` from Docker; add rate-limiting to the Python prototypes. *(S–M)*

**P3: nice to have**

11. Portfolio-wide coverage reporting + dependency audit (`npm audit`/`pip-audit`) in CI. *(M)*
12. A lightweight issue/release cadence for the 2–3 flagships. *(S)*

---

## 6. Safe Auto-Fix Candidates

*(Low-risk; listed only, not applied. Each needs your approval.)*

- **Ecosystem compose healthchecks**: add `healthcheck:` to each backend/web service in `docker-compose.ecosystem.yml`. Safe: additive.
- **Add upstreams** for `learning_pytorch_todo_predictor` and `learning_sidecar_setup` (`git remote add origin … && git push -u`). Safe once remote exists.
- **Lint rule to ban `print()`** in `learning_ai_2nd_brain` (ruff `T20`): flags only; you fix incrementally.
- **Drop `NODE_TLS_REJECT_UNAUTHORIZED=0`** from Docker envs where a real CA/host override is available. (Verify per service first.)
- **`.gitignore` audit** for the few repos still tracking runtime artifacts (pattern already fixed in `devops_tools` this session).

## 7. Delegate-to-Agent Queue

Ready-to-paste briefs (each self-contained, one branch, clean-tree rule):

1. **"Add healthchecks to ecosystem compose"**: repo `common_plat`; read `docker-compose.ecosystem.yml`; add `healthcheck` + ordered `depends_on` to all `*-backend`/`*-web` services; `docker compose config` must pass; no app code changes.
2. **"De-`print()` 2nd_brain"**: repo `learning_ai_2nd_brain`; replace `print()` with `typer.echo`/logging in `src/brain/**`; keep behavior identical; run `pytest`.
3. **"Bootstrap tests for webui_copilot"**: repo `learning_ai_webui_copilot`; add `pytest` smoke tests for `site_backend` rules/policy engines + a copilot happy-path; wire a `.github`/gitea CI job.
4. **"Service audit: magic_clipboard_mgr"**: repo `learning_ai_magic_clipboard_mgr`; produce a report of which of the 50+ services are wired vs stubbed; no code changes.
5. **"Stabilize E2E"**: repos `fastgap`/`flowmonk`; make smoke E2E reliable, then remove `continue-on-error: true` for that job only.

## 8. Recommended Standard Operating Procedure (for every agent task)

1. **One task = one branch** off latest `origin/main`; never work on a dirty tree.
2. **Scope it** with a job brief (you already do this in `agent-queue/docs/jobs/`).
3. **Test before commit**: typecheck + lint + unit must pass locally.
4. **Commit small**, conventional messages; **push the branch**, open a PR. Don't let agents push straight to `main` of the shared platform.
5. **Never track runtime/queue state** (ignore `agent-queue/queue/*` lifecycle; fixed here this session).
6. **Prefer least-privilege** over `--permission-mode dangerous`; reserve dangerous mode for sandboxed/disposable checkouts.
7. **Weekly sync sweep**: rebase+push all clean repos, list dirty ones for review.

## 9. What I Could Not Inspect

- **No dynamic results.** I did not run `npm/pnpm install`, builds, `pytest`, `vitest`, Playwright, `cargo test`, or `docker compose up` (those mutate trees / need services). Test counts and CI configs are evidence of *intended* coverage, not measured pass/coverage.
- **No live `git` per-repo ahead/behind** inside the read-only agents (they lacked shell git); branch/dirty facts come from the orchestrator's own checks and may have shifted as the agent-queue daemon ran.
- **One agent batch misfired**: it reported 5 repos as "missing" (`claude_code_setup`, `github_copilot`, `magic_terminal`, `notif_scanr`, `python_sandbox`) due to a read-access issue; I re-scanned them directly and they exist (notably `magic_terminal` = Rust, `notif_scanr` = Swift).
- **Mobile/native depth** (iOS/Android/KMP/Tauri runtime behavior) and **secret *values*** were not executed/decrypted; only presence/format was checked.
- **`.env.ecosystem`** holds dev-only values; production secret management (Key Vault wiring) was inferred from references, not verified live.

---

### TL;DR

- Coherent **beta-grade product ecosystem** (~38 repos), far beyond "learning".
- **Architecture & security are strong; CI & testing are the weak links.**
- **P0:** restore a CI gate on `common_plat`, clean the ~14 dirty repos, and rein in the `dangerous`-mode agent-queue.
- A handful of flagships (`notes`, `trails`, `claw-*`, `clock`, `fastgap`) are genuinely production-grade; the long tail is MVP/prototype.
- Tighten the agent commit/CI loop (§8) and most of the operational churn converts back into velocity.
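
### Appendix: sync-sweep sketch

§4.I and §8 recommend automating the weekly "sync sweep". As a starting point, the read-only half of that sweep (listing which repos are dirty) could look roughly like the Python sketch below. It is a minimal sketch under stated assumptions, not the workspace's existing tooling: the function names (`find_repos`, `parse_porcelain`, `dirty_files`, `sweep`) and the two-level search depth are hypothetical, chosen only so that nested repos such as `oss/learning_ai_claw-code-oss` are found. It assumes `git` is on `PATH` and uses `--no-optional-locks` so that `git status` does not rewrite the index while agents may be working in the same tree.

```python
"""Read-only sync-sweep sketch: report which repos under a workspace are dirty."""
from __future__ import annotations

import subprocess
from pathlib import Path


def find_repos(root: Path, max_depth: int = 2) -> list[Path]:
    """Return directories at most `max_depth` levels below `root` that contain `.git`."""
    repos: list[Path] = []

    def walk(directory: Path, depth: int) -> None:
        if (directory / ".git").exists():
            repos.append(directory)
            return  # a repo's own subdirectories are not searched further
        if depth >= max_depth:
            return
        for child in sorted(directory.iterdir()):
            if child.is_dir() and not child.name.startswith("."):
                walk(child, depth + 1)

    walk(root, 0)
    return repos


def parse_porcelain(output: str) -> list[str]:
    """Extract paths from `git status --porcelain` output (two status chars, a space, the path)."""
    return [line[3:] for line in output.splitlines() if line.strip()]


def dirty_files(repo: Path) -> list[str]:
    """Ask git which files in `repo` are modified or untracked, without taking optional locks."""
    result = subprocess.run(
        ["git", "--no-optional-locks", "-C", str(repo), "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)


def sweep(root: Path) -> dict[str, int]:
    """Map each dirty repo (path relative to `root`) to its number of changed files."""
    report: dict[str, int] = {}
    for repo in find_repos(root):
        changed = dirty_files(repo)
        if changed:
            report[str(repo.relative_to(root))] = len(changed)
    return report
```

Calling `sweep(Path.home() / "code" / "mygh")` would return one entry per dirty repo with its changed-file count. The sketch deliberately stops at reporting; the rebase+push half of the sweep changes history and is better kept as a separate, reviewed step in line with the clean-tree rule in §8.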