Read-only, evidence-based review of the ~38-repo workspace produced from docs/prompts/engineering-review-scorecard.md: per-repo breakdown, 1-10 category scorecard (weighted overall ~7.0, beta-grade), prioritized P0-P3 action plan, safe auto-fix candidates, delegate-to-agent queue, and an agent SOP. No code changes. Generated with [Devin](https://cli.devin.ai/docs) Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Engineering Review & Scorecard
Evidence-based, read-only review of the entire `~/code/myghworkspace` (~38 git repos) per `docs/prompts/engineering-review-scorecard.md`. Generated 2026-05-30.

Method: static inspection only (file reads, `grep`, and read-only `git`). No builds, installs, or test runs were executed (that would mutate the trees), so dynamic results (pass/fail, coverage %) are inferred from config + test counts, not measured. See §9 for limits. Per-repo evidence was gathered by parallel read-only agents and spot-verified.
1. Executive Summary
What this is: a single developer running a surprisingly coherent product
ecosystem — ~10 product apps (clock, notes, fastgap, peakpulse, flowmonk,
efforise, jarvis_jr, trails, talk2obsidian, local-memory-gpt, voice-ai-agent,
multimodal/mindlyst) sharing one platform monorepo (learning_ai_common_plat,
36 @bytelyst/* packages, auth/Cosmos/design-tokens), orchestrated by a single
docker-compose.ecosystem.yml (~20 services) and driven heavily by AI agents
through a homegrown agent-queue. This is far more disciplined than a typical
"learning" folder.
Overall maturity: Beta-quality ecosystem. A core of genuinely
production-grade repos (learning_ai_notes, learning_ai_trails,
oss/claw-code/claw-cowork, learning_ai_clock, learning_ai_fastgap)
surrounded by a long tail of MVP/prototype repos with thin or zero tests and no
CI.
Biggest strengths (top 3)
- Strong platform discipline. Shared `@bytelyst/*` packages, a repeated `types.ts → repository.ts → routes.ts` backend pattern, Cosmos partition-key conventions (`/userId`, `productId` on every doc), per-repo `AGENTS.md`, conventional commits, and field-level encryption (`field-encrypt.ts`) recur across the best repos.
- Clean security posture for a personal workspace. Secret scans across all repos surfaced no real committed production secrets, only `.env.example` placeholders, the public Azure Cosmos emulator key, dev `JWT_SECRET=dev-...` values, and Azure Key Vault references. `.gitignore` is present nearly everywhere.
- Top repos are legitimately good. `notes`, `trails`, and the two Rust `claw-*` repos show modular architecture, real test suites (28–80+ files), CI, multi-stage Docker, and strict typing (0 `as any` in several backends).
Biggest risks (top 3)
- CI is the weak link. GitHub Actions is disabled (billing) on the platform monorepo `learning_ai_common_plat` and on `voice_ai_agent` (`*.disabled` workflows); ~15 repos have no CI at all. The shared platform that everything depends on has no enforced automated gate.
- Process churn dirties the repos. A live `agent-queue` daemon + `devin` agents in `--permission-mode dangerous` were actively writing to repos; ~14 repos were found dirty with uncommitted work, several behind `origin`. Work is at risk of being lost or silently diverging.
- Testing is bimodal. Excellent in the flagship repos, zero in many others (`productivity_web`, `webui_copilot`, `pytorch_todo_predictor`, `server-survival`, `sidecar_setup`, `mac_tooling`). No portfolio-wide coverage signal.
Is the dev style helping or hurting velocity? Net helping, but fraying at
the edges. The platform/agent approach clearly lets one person ship a dozen
apps — that's the upside. The drag is operational: disabled CI, constantly-dirty
working trees, abandoned worktrees, and "AI-generated scaffolding smell" in a
few repos (e.g. magic_clipboard_mgr's 50+ service files + phase-named test
buckets). Tightening the commit/CI loop would convert a lot of that churn back
into velocity.
2. Overall Score Sheet
Scores are 1–10 (1 = critical/broken, 10 = production-grade), aggregated across the ~30 code repos (pure docs/usage repos excluded from category math).
| Category | Score | Justification (evidence) |
|---|---|---|
| A. Repository organization | 8 | Consistent @bytelyst/* + types/repository/routes pattern, per-repo AGENTS.md, clear monorepos; minus for ~14 dirty trees, stray worktrees, a few unstructured repos. |
| B. Code quality | 7 | Flagships: strict TS, 0 as any, no console.log, Zod validation. Tail: print()-heavy (2nd_brain 60+, mac_tooling 200+), any leaks, AI-scaffold smell (magic_clipboard_mgr). |
| C. Architecture | 8 | Genuinely strong: shared platform, datastore abstraction, deterministic engines (flowmonk scheduler), risk-scoring (trails), MCP integrations, clean native/web boundaries. |
| D. DevOps & deployment | 6 | Ecosystem compose orchestrates ~20 services, multi-stage Dockerfiles common — but CI disabled on the platform repo, ~15 repos with no CI, and 0 healthchecks in docker-compose.ecosystem.yml. |
| E. Testing | 6 | Bimodal: notes/fastgap/clock/trails/claw-* have 28–600+ tests; many repos have 0. E2E frequently continue-on-error: true. No measured coverage. |
| F. Security | 8 | No real committed secrets anywhere; field encryption + Key Vault refs in the mature repos; .gitignore/.env.example discipline. Minus for NODE_TLS_REJECT_UNAUTHORIZED=0 in some Docker, thin input-validation in prototypes. |
| G. Product readiness | 7 | Several apps runnable end-to-end (web+backend); mobile/native surfaces often partial; CI-disabled + flaky E2E hold back true "launchable". |
| H. AI-agent practices | 6 | Impressive tooling (agent-queue, profiles, job briefs, AGENTS.md), but guardrails are weak: --permission-mode dangerous, agents dirtying live repos, duplicate work landing upstream, no enforced test-before-commit. |
| I. Personal workflow | 6 | Good: conventional commits, auto backup-main-* branches, AGENTS.md. Bad: ~14 dirty repos, branches behind origin, abandoned worktrees, no unified release/issue discipline. |
| Weighted overall | ≈ 7.0 | Beta-quality. See weighting below. |
Weighting & rationale: Security (F) and Product readiness (G) weighted ~1.5×, Testing (E) and DevOps (D) ~1.25× (these gate real-world reliability); A/B/C/H/I at 1.0×. The strong architecture/security pull the number up; the weak CI/testing pull it back to a solid-but-not-shippable ~7.0.
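The weighting can be checked with a few lines of arithmetic. The sketch below is an illustration, not the method actually used to produce the score sheet: it assumes the approximate multipliers ("~1.5×", "~1.25×") are exact and that the overall figure is a plain weighted mean. Under those assumptions the result is about 6.9, which is consistent with the reported ≈ 7.0.

```python
# Category scores copied from the score sheet above.
SCORES = {"A": 8, "B": 7, "C": 8, "D": 6, "E": 6, "F": 8, "G": 7, "H": 6, "I": 6}

# Assumption: the approximate weights ("~1.5x", "~1.25x") are treated as exact.
WEIGHTS = {
    "A": 1.0, "B": 1.0, "C": 1.0,
    "D": 1.25, "E": 1.25,
    "F": 1.5, "G": 1.5,
    "H": 1.0, "I": 1.0,
}


def weighted_overall(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weighted mean of the category scores."""
    total_weight = sum(weights[key] for key in scores)
    return sum(scores[key] * weights[key] for key in scores) / total_weight


if __name__ == "__main__":
    # Weighted sum 72.5 over total weight 10.5.
    print(f"{weighted_overall(SCORES, WEIGHTS):.2f}")  # prints 6.90
```

Because F and G sit at or above the mean while D and E sit below it, the extra weights roughly offset each other, which is why the weighted figure lands close to the unweighted average.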
3. Per-Product / Per-Repo Breakdown
Maturity legend: PROD = production-grade, BETA, MVP, PROTO = prototype/learning, REF = docs/reference (not code).
Flagship products (platform-integrated)
| Repo | Stack | Tests | CI | Docker | Maturity |
|---|---|---|---|---|---|
| `learning_ai_notes` | Fastify5 + Next16 + Expo, Cosmos | 80+ files | ✓ gitea | ✓ | BETA→PROD |
| `learning_ai_trails` | Fastify5 + Next16 + SDK, Cosmos | 28 files | ✓ gitea | ✓ | PROD |
| `learning_ai_clock` | Next16 PWA + iOS/Android, Fastify | 662 total | ✓ gitea | ✓ | BETA |
| `learning_ai_fastgap` | Expo + Next16 + Fastify | 700+ total | ✓ gitea (7 jobs) | ✓ | BETA |
| `learning_ai_peakpulse` | SwiftUI + Fastify | 26 files | ✓ (backend) | ✓ | BETA→PROD |
| `learning_ai_flowmonk` | Next16 + Fastify + Expo | 102 backend | ✓ gitea | ✓ | BETA |
| `learning_ai_efforise` | React/Vite + Fastify + RN | ~9 backend | ✓ gitea | ✓ | MVP |
| `learning_ai_dev_intelli` | Fastify + Next16, GitHub API | 52 backend | ✓ gitea | ✓ | MVP |
| `learning_ai_local_memory_gpt` | Fastify + Next16, SQLite/Ollama | 122 | ✓ gitea | ✓ | MVP |
| `learning_ai_talk2obsidian` | Fastify + Vite, SQLite/Ollama | 8 | ✗ | ✓ | BETA |
| `learning_voice_ai_agent` | Python + Fastify + Next + KMP | 463+ | ⚠ disabled | ✓ | BETA |
| `learning_multimodal_memory_agents` (MindLyst) | KMP + Next + Fastify | 33 | ⚠ disabled | ✓ | MVP |
| `learning_ai_jarvis_jr` | SwiftUI + Next + Android | ~13 web | ✓ gitea | ✓ | ALPHA/BETA |
| `learning_ai_auth_app` | iOS/watchOS/Android (spec+UI) | 0 (here) | ✗ | ✗ | MVP (spec) |
Platform & infra
| Repo | Stack | Notes | Maturity |
|---|---|---|---|
| `learning_ai_common_plat` | pnpm monorepo, 36 `@bytelyst/*`, Fastify, Cosmos | ~466k LOC; full auth (OAuth/MFA/passkeys/SAML); GH Actions disabled (billing), gitea CI active | PROD |
| `learning_ai_devops_tools` | Bash + Python + Node (this repo) | GitHub admin scripts, `agent-queue`, Hermes dashboard; thin tests | PROD (scripts) / MVP (dash) |
| `learning_ai_k8s_streaming` | Python FastAPI + Helm | Use-case registry, HPA/probes, load tools | BETA→PROD |
| `learning_ai_local_llms` | Next16 dashboard + Python TTS | Ollama mission-control; 57 tests | BETA |
Tools / OSS / native
| Repo | Stack | Notes | Maturity |
|---|---|---|---|
| `oss/learning_ai_claw-code-oss` | Rust workspace (10+ crates) | `unsafe` forbid, clippy pedantic, 40+ test files | PROD |
| `oss/learning_ai_claw-cowork` | Rust + Tauri + Python | 65+ test files, E2E, Docker | PROD |
| `learning_magic_terminal` | Rust | README+CI+many tests; command-blocks v2; dirty(5) | BETA |
| `learning_notif_scanr` | Swift (Package.swift) | tests present, no CI, no Docker | MVP |
| `ios/learning_swift_hourglass` | Swift/SwiftUI macOS | MVVM, 2 test files, no CI | MVP |
| `learning_ai_magic_clipboard_mgr` | Swift/macOS, GRDB | 24 tests but 50+ services + phase-named tests (AI-scaffold smell) | MVP |
| `learning_ai_mac_tooling` | Python FastAPI + React | forensics toolkit; 0 tests, 200+ `print()`, 3k-line files | PROTO |
| `copilot/learning_ai_uxui_web` | Next16 + MSW + Playwright | component showcase, Lighthouse CI | MVP |
| `learning_ai_productivity_web` | Next15, client-only | clean registry pattern, 0 tests | MVP |
| `learning_ai_webui_copilot` | Python FastAPI + LangChain | rules/policy engines, 0 tests, no Docker/CI | MVP |
| `learning_agent_monitoring_fx` | npm monorepo + KMP | agent/ingest/web work, native WIP, 54 `console.log`, TODOs | BETA |
| `learning_agentic_tools_portal` | Python Flask + uv | minimal (1 endpoint, 1 test), has CI | PROTO |
| `learning_server-survival-devops-web` | Vanilla JS + Three.js | playable game, 0 tests | MVP |
| `learning_pytorch_todo_predictor` | Python + PyTorch | educational, 0 tests, no upstream | PROTO |
| `learning_sidecar_setup` | Next16 scaffold + py stub | scaffolding only, no upstream, dirty(8) | PROTO |
| `learning_claude_code_setup` | Bash + markdown | setup notes/scripts; dirty(1) | REF |
| `learning_github_copilot` | Markdown (CLI/SDK docs) | reference only | REF |
| `learning_python_sandbox` | Python | LeetCode/learning; dirty(1) | PROTO |
| `learning_ai_materials` | Docs | NBA handover package | REF |
| `learning_windsurf_setup` | Usage logs | not a codebase | N/A |
4. Findings by Dimension
A. Repository organization
- Fact: Strong, repeated conventions: `AGENTS.md`/`CLAUDE.md` per repo, pnpm workspaces, `types→repository→routes` backend modules, `docs/` with PRD/ROADMAP.
- Fact: ~14 repos dirty at audit time; abandoned `worktrees/` (now cleaned); some repos behind `origin`. Two repos (`pytorch_todo_predictor`, `sidecar_setup`) have no git upstream.
- Reco: Adopt a "clean tree or it doesn't exist" rule (see §8). Add upstreams for the two orphan repos or mark them clearly local.
B. Code quality
- Fact: Best repos enforce strict TS (0 `as any` in `notes`, `trails`, `local_memory_gpt` backends), no `console.log` (Fastify logger), Zod validation.
- Fact: `learning_ai_2nd_brain` has 60+ `print()`; `mac_tooling` 200+ and 3k+-line files (`network_transfer_audit.py` 3521 lines); `magic_clipboard_mgr` shows AI-scaffold smell (50+ service files, `Phase5–8/RemainingQATests`).
- Reco: Lint-gate `print()`/`console.log` in the Python/TS repos; split the 3k-line files; audit `magic_clipboard_mgr` for stubbed vs real services.
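A lint gate on `print()` amounts to flagging every direct call to the built-in. The stdlib-only sketch below shows the idea on an inline sample; it is not how the counts in this review were produced (those came from `grep`), and the `SAMPLE` module and the function name `count_print_calls` are hypothetical. In practice a linter rule would replace a hand-rolled check like this.

```python
import ast


def count_print_calls(source: str) -> int:
    """Count direct calls to the name `print` in a piece of Python source."""
    tree = ast.parse(source)
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )


# Hypothetical module body, used only to exercise the counter.
SAMPLE = '''
import logging
log = logging.getLogger(__name__)

def sync(items):
    print("starting sync")            # would be flagged
    for item in items:
        log.info("synced %s", item)   # fine: goes through logging
    print(f"done: {len(items)}")      # would be flagged
'''

if __name__ == "__main__":
    print(count_print_calls(SAMPLE))  # prints 2
```

Parsing with `ast` rather than matching text avoids false positives from comments and string literals that merely contain the word `print`.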
C. Architecture
- Fact: Clear separation and reuse: shared auth/datastore/design-tokens, deterministic scheduler (`flowmonk`), risk engine (`trails`), use-case registry (`k8s_streaming`), MCP tool servers, Rust crate boundaries (`claw-*`).
- Reco: This is the strongest dimension; protect it by keeping product domains out of `common_plat` and vice-versa.
D. DevOps & deployment
- Fact: `docker-compose.ecosystem.yml` wires ~20 services (10 backends + 10 webs) + infra (Cosmos emulator, Azurite, Traefik, Loki, Grafana, MCP); 30 `restart:` policies, 24 `build:` contexts, but 0 `healthcheck:` blocks.
- Fact: GH Actions disabled on `common_plat` + `voice_ai_agent`; ~15 repos no CI.
- Reco (P1): Add healthchecks + `depends_on: condition: service_healthy` to the ecosystem compose; re-enable or fully migrate CI to gitea self-hosted.
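The healthcheck recommendation can be verified mechanically once the compose file is parsed. The sketch below works on an already-loaded mapping so it stays stdlib-only; loading the YAML (for example with a YAML parser's safe loader) is left out. Everything in `EXAMPLE` is hypothetical: the service names, build paths, port, and `/health` probe are illustrative and were not read from `docker-compose.ecosystem.yml`.

```python
def missing_healthchecks(compose: dict) -> list[str]:
    """Return the sorted names of services that define no `healthcheck` block."""
    services = compose.get("services", {})
    return sorted(name for name, spec in services.items() if "healthcheck" not in spec)


def ungated_dependencies(compose: dict) -> list[tuple[str, str]]:
    """Return sorted (service, dependency) pairs not gated on `service_healthy`."""
    pairs = []
    for name, spec in compose.get("services", {}).items():
        deps = spec.get("depends_on", {})
        if isinstance(deps, list):
            # Short form: a plain list only orders startup, it never waits for health.
            pairs.extend((name, dep) for dep in deps)
        else:
            pairs.extend(
                (name, dep)
                for dep, opts in deps.items()
                if (opts or {}).get("condition") != "service_healthy"
            )
    return sorted(pairs)


# Hypothetical services; names, paths, and the probe command are illustrative only.
EXAMPLE = {
    "services": {
        "notes-backend": {
            "build": "./notes/backend",
            "healthcheck": {
                "test": ["CMD", "curl", "-f", "http://localhost:3000/health"],
                "interval": "10s",
                "retries": 5,
            },
        },
        "notes-web": {
            "build": "./notes/web",
            "depends_on": {"notes-backend": {"condition": "service_healthy"}},
        },
        "clock-web": {"build": "./clock/web", "depends_on": ["notes-backend"]},
    }
}

if __name__ == "__main__":
    print(missing_healthchecks(EXAMPLE))  # ['clock-web', 'notes-web']
    print(ungated_dependencies(EXAMPLE))  # [('clock-web', 'notes-backend')]
```

A check like this could run in CI next to `docker compose config`, so the count of services without a healthcheck is tracked rather than re-discovered in the next review.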
E. Testing
- Fact: `fastgap` (~700), `clock` (662), `notes` (80+ files), `voice_ai_agent` (463+), `claw-cowork` (65+ files) are excellent; ~8 repos have 0 tests.
- Fact: E2E often `continue-on-error: true` (`fastgap`, `flowmonk`, `jarvis_jr`, `local_memory_gpt`), i.e. not actually gating.
- Reco: Set a per-repo minimum (smoke + happy-path) and stop masking E2E failures with `continue-on-error` once stabilized.
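Masked jobs are easy to lose track of, so it can help to list them explicitly. The following is a minimal sketch, assuming workflow files are YAML and that a line-based match is good enough; it does not parse YAML and would miss unusual formatting. The function name `masked_lines` and the demo workflow are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Matches `continue-on-error: true` as a YAML key; commented-out lines do not match.
PATTERN = re.compile(r"^\s*(?:-\s*)?continue-on-error:\s*true\b", re.IGNORECASE)


def masked_lines(root: Path) -> list[tuple[str, int]]:
    """Return (relative path, line number) for each masking line under root."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if PATTERN.match(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits


def demo() -> list[tuple[str, int]]:
    """Build a throwaway workflow file and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        workflows = root / ".github" / "workflows"
        workflows.mkdir(parents=True)
        (workflows / "ci.yml").write_text(
            "jobs:\n"
            "  e2e:\n"
            "    continue-on-error: true\n"
            "    steps:\n"
            "      - run: npx playwright test\n"
            "  unit:\n"
            "    # continue-on-error: true\n"
            "    steps:\n"
            "      - run: pnpm test\n",
            encoding="utf-8",
        )
        return masked_lines(root)


if __name__ == "__main__":
    print(demo())  # [('.github/workflows/ci.yml', 3)]
```

The scan only reads files, so it fits the read-only constraints of a review like this one.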
F. Security
- Fact: No real committed secrets across all repos. Matches were `.env.example` placeholders, the public Cosmos emulator key (`C2y6yDjf5/R...`), `dev-*` JWT secrets, and Azure Key Vault references.
- Fact: Field encryption (AES-256-GCM) in `clock`/`notes`/`dev_intelli`; `unsafe_code = "forbid"` in the Rust repos.
- Watch: `NODE_TLS_REJECT_UNAUTHORIZED=0` seen in some Docker setups; thin input validation / no rate-limiting in the prototype Python apps.
G. Product readiness
- Fact: Web+backend pairs generally run end-to-end; native/mobile surfaces (iOS/Android/KMP) are frequently partial or scaffolded.
- Reco: Pick 2–3 flagships (`notes`, `trails`, `clock`) and drive them to a true launch checklist; treat the rest explicitly as experiments.
H. AI-agent practices
- Fact: Sophisticated `agent-queue` (profiles, job briefs, lifecycle dirs, Node dashboard), genuinely advanced for a solo setup.
- Fact: Guardrails weak: agents run `--permission-mode dangerous`, write to live working trees (caused the dirty-repo churn), and landed duplicate work (during this session a rebase auto-dropped 2 commits already pushed upstream).
- Reco: Standardize the agent task contract (§8): one task = one branch = clean tree → tests → commit → push; ignore runtime/queue state in git (already fixed in this repo this session).
I. Personal engineering workflow
- Fact: Conventional commits, auto `backup-main-*` branches (nice safety net), `AGENTS.md` discipline.
- Fact: Too many long-lived dirty trees and behind-`origin` branches; no visible issue tracker or release cadence.
- Reco: A weekly "sync sweep" (rebase+push all clean repos, list dirty); you effectively did this manually this session, so automate it.
5. Prioritized Action Plan
P0 — now (correctness / risk)
1. Re-establish a working CI gate on `learning_ai_common_plat` (everything depends on it). Either fix GH Actions billing or make gitea CI the enforced gate. (M, common_plat)
2. Resolve the ~14 dirty repos: review + commit or discard intentionally; add upstreams for `pytorch_todo_predictor` & `sidecar_setup`. (M, workspace)
3. Decide the agent-queue daemon policy so it doesn't write to live trees uncontrolled (it was running in `dangerous` mode). (S, devops_tools)
P1 — this week
4. Add healthchecks to docker-compose.ecosystem.yml (0 today) + ordered
depends_on. (M, common_plat/ecosystem)
5. Stop masking E2E with continue-on-error: true once stabilized; make at least
smoke E2E gating. (M, fastgap/flowmonk/jarvis_jr)
6. Replace print() with logging in 2nd_brain (60+) and mac_tooling (200+).
(S–M)
P2 — this month
7. Add minimum test suites to the 0-test repos that matter (productivity_web,
webui_copilot, agent_monitoring_fx). (M)
8. Audit magic_clipboard_mgr for dead/stubbed services (50+ files). (M)
9. Split 3k-line files in mac_tooling. (M)
10. Remove NODE_TLS_REJECT_UNAUTHORIZED=0 from Docker; add rate-limiting to the
Python prototypes. (S–M)
P3 — nice to have
11. Portfolio-wide coverage reporting + dependency audit (npm audit/pip-audit)
in CI. (M)
12. A lightweight issue/release cadence for the 2–3 flagships. (S)
6. Safe Auto-Fix Candidates
(Low-risk; listed only — not applied. Each needs your approval.)
- Ecosystem compose healthchecks: add `healthcheck:` to each backend/web service in `docker-compose.ecosystem.yml`. Safe: additive.
- Add upstreams for `learning_pytorch_todo_predictor` and `learning_sidecar_setup` (`git remote add origin … && git push -u`). Safe once remote exists.
- Lint rule to ban `print()` in `learning_ai_2nd_brain` (ruff `T20`); flags only, you fix incrementally.
- Drop `NODE_TLS_REJECT_UNAUTHORIZED=0` from Docker envs where a real CA/host override is available. (Verify per service first.)
- `.gitignore` audit for the few repos still tracking runtime artifacts (pattern already fixed in `devops_tools` this session).
7. Delegate-to-Agent Queue
Ready-to-paste briefs (each self-contained, one branch, clean-tree rule):
- "Add healthchecks to ecosystem compose": repo `common_plat`; read `docker-compose.ecosystem.yml`; add `healthcheck` + ordered `depends_on` to all `*-backend`/`*-web` services; `docker compose config` must pass; no app code changes.
- "De-`print()` 2nd_brain": repo `learning_ai_2nd_brain`; replace `print()` with `typer.echo`/logging in `src/brain/**`; keep behavior identical; run `pytest`.
- "Bootstrap tests for webui_copilot": repo `learning_ai_webui_copilot`; add `pytest` smoke tests for `site_backend` rules/policy engines + a copilot happy-path; wire a `.github`/gitea CI job.
- "Service audit: magic_clipboard_mgr": repo `learning_ai_magic_clipboard_mgr`; produce a report of which of the 50+ services are wired vs stubbed; no code changes.
- "Stabilize E2E": repos `fastgap`/`flowmonk`; make smoke E2E reliable, then remove `continue-on-error: true` for that job only.
8. Recommended Standard Operating Procedure (for every agent task)
- One task = one branch off latest `origin/main`; never work on a dirty tree.
- Scope it with a job brief (you already do this in `agent-queue/docs/jobs/`).
- Test before commit: typecheck + lint + unit must pass locally.
- Commit small, conventional messages; push the branch, open a PR. Don't let agents push straight to `main` of the shared platform.
- Never track runtime/queue state (ignore `agent-queue/queue/*` lifecycle; fixed here this session).
- Prefer least-privilege over `--permission-mode dangerous`; reserve dangerous mode for sandboxed/disposable checkouts.
- Weekly sync sweep: rebase+push all clean repos, list dirty ones for review.
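The "never work on a dirty tree" rule and the listing half of the weekly sync sweep both reduce to one question per repo: does `git status --porcelain` print anything? The sketch below covers only that read-only half and deliberately leaves out rebase and push. It assumes repos are direct children of a single workspace root, which may not hold for nested paths such as `oss/` or `ios/`; the function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Callable


def git_porcelain(repo: Path) -> str:
    """Return `git status --porcelain` output for repo (empty when the tree is clean)."""
    result = subprocess.run(
        # --no-optional-locks keeps the status check from refreshing the index on disk.
        ["git", "--no-optional-locks", "-C", str(repo), "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_repos(root: Path) -> list[Path]:
    """Return direct children of root that contain a .git entry."""
    return sorted(p for p in root.iterdir() if (p / ".git").exists())


def split_clean_dirty(
    repos: list[Path], status: Callable[[Path], str] = git_porcelain
) -> tuple[list[Path], dict[Path, int]]:
    """Partition repos into clean ones and a {dirty repo: changed-entry count} map."""
    clean: list[Path] = []
    dirty: dict[Path, int] = {}
    for repo in repos:
        entries = [line for line in status(repo).splitlines() if line.strip()]
        if entries:
            dirty[repo] = len(entries)
        else:
            clean.append(repo)
    return clean, dirty
```

A sweep would call `split_clean_dirty(find_repos(Path.home() / "code" / "myghworkspace"))` and report the dirty map for review. The `status` parameter is injectable so the partitioning logic can be exercised without touching real repositories.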
9. What I Could Not Inspect
- No dynamic results. I did not run `npm/pnpm install`, builds, `pytest`, `vitest`, Playwright, `cargo test`, or `docker compose up` (those mutate trees / need services). Test counts and CI configs are evidence of intended coverage, not measured pass/coverage.
- No live `git` per-repo ahead/behind inside the read-only agents (they lacked shell git); branch/dirty facts come from the orchestrator's own checks and may have shifted as the agent-queue daemon ran.
- One agent batch misfired: it reported 5 repos as "missing" (`claude_code_setup`, `github_copilot`, `magic_terminal`, `notif_scanr`, `python_sandbox`) due to a read-access issue; I re-scanned them directly and they exist (notably `magic_terminal` = Rust, `notif_scanr` = Swift).
- Mobile/native depth (iOS/Android/KMP/Tauri runtime behavior) and secret values were not executed/decrypted; only presence/format was checked. `.env.ecosystem` holds dev-only values; production secret management (Key Vault wiring) was inferred from references, not verified live.
TL;DR
- Coherent beta-grade product ecosystem (~38 repos) — far beyond "learning".
- Architecture & security are strong; CI & testing are the weak links.
- P0: restore a CI gate on `common_plat`, clean the ~14 dirty repos, and rein in the `dangerous`-mode agent-queue.
- A handful of flagships (`notes`, `trails`, `claw-*`, `clock`, `fastgap`) are genuinely production-grade; the long tail is MVP/prototype.
- Tighten the agent commit/CI loop (§8) and most of the operational churn converts back into velocity.