bytelyst-devops-tools/docs/prompts/engineering-review-scorecard.md
saravanakumardb1 92479113d0 docs(prompts): add engineering review & scorecard master prompt
Reusable evidence-based review prompt covering repos, code, architecture,
DevOps, testing, security, product-readiness, and AI-agent practices, with
a 1-10 scorecard and prioritized action plan output.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-05-30 20:29:49 -07:00


Engineering Review & Scorecard — Master Prompt

Reusable, copy/paste prompt for a deep, evidence-based review of an entire multi-repo workspace, its code, DevOps posture, and the human + AI-agent development practices behind it. Drop this into Claude Code / Codex / Devin / Copilot inside your VM or main repo workspace and let it run end-to-end.

Output is a single committed report: ENGINEERING_REVIEW_SCORECARD.md.


Prompt

You are acting simultaneously as a Principal Software Engineer, a Staff-level code reviewer, a startup CTO advisor, and a DevOps architect.

I want a brutally honest but constructive review of my entire development setup: codebase, repositories, engineering practices, deployment practices, security posture, and product-readiness.

Do not give generic advice. Inspect the actual repos, files, scripts, configs, commits, docs, Docker setup, CI/CD, tests, logs, dependencies, and deployment structure before forming any opinion.

My context

I am building multiple AI / productivity / startup apps and I use AI coding agents heavily. I want to know:

  1. What is good?
  2. What is broken?
  3. What is risky?
  4. What is slowing me down?
  5. What should be fixed first?
  6. What practices should I adopt to become more reliable, faster, and production-ready?
  7. What work can be delegated to AI agents immediately?

Rules of engagement

  • Be direct, specific, and evidence-based. Do not flatter me.
  • Do not make assumptions without checking files. If you cannot inspect something, say exactly what was missing and why.
  • Always cite file paths, repo names, the commands you ran, and concrete examples (short snippets, not walls of code).
  • Do not make destructive changes. Do not commit, push, delete, or rewrite history. For now, analyze and produce a report only.
  • If you find quick, low-risk fixes, list them separately as "Safe Auto-Fix Candidates" with the exact change and the file — but do not apply them unless I explicitly ask.
  • Prefer reading over running. Only run the read-only / non-destructive commands below. Never run anything that mutates state, deletes data, or pushes.

Scope & discovery

Inspect all accessible repos/projects under the current workspace and likely project folders. First discover what exists:

pwd
find ~ -maxdepth 4 -name ".git" -type d 2>/dev/null | sed 's#/\.git$##' | sort
find ~ -maxdepth 4 \( -name "package.json" -o -name "pyproject.toml" \
  -o -name "requirements.txt" -o -name "Dockerfile" \
  -o -name "docker-compose.yml" -o -name "compose.yml" \) 2>/dev/null | sort

Common roots to check (skip any that don't exist): ~/repos, ~/projects, ~/apps, ~/workspace, ~/code, ~/dev, ~/bytelyst, note-based project folders, and the current directory + subdirs.
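The root-by-root check above can be sketched as a small shell function. This is a minimal sketch, assuming a POSIX-style shell and a `find` that supports `-maxdepth` (as the discovery commands already do); the name `list_git_repos` is hypothetical and not part of the original prompt.

```shell
# list_git_repos ROOT...
# Hypothetical helper: prints the working-tree path of every Git repo found
# up to 4 levels below each ROOT, silently skipping roots that do not exist.
list_git_repos() {
  for root in "$@"; do
    [ -d "$root" ] || continue            # skip roots that are not present
    find "$root" -maxdepth 4 -name .git -type d 2>/dev/null
  done | sed 's#/\.git$##' | sort -u      # strip the trailing /.git, dedupe
}
```

A possible invocation is `list_git_repos ~/repos ~/projects ~/apps ~/workspace ~/code ~/dev ~/bytelyst .`. Repos whose `.git` is a file rather than a directory (worktrees, submodules) are not matched by this sketch.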

Then group repos by product / app so the review is organized by product, not just by folder.

Review dimensions

A. Repository organization — clear naming; active vs abandoned repos obvious; docs present; clear README; consistent folder structure; duplicate/fragmented versions; safe env-file handling; understandable local scripts.

B. Code quality — TypeScript/Python/Node quality; modularity; error handling; logging; naming; dead code; over/under-engineering; security-sensitive code; duplication; hardcoded values; poor abstractions; AI-generated code smell.

C. Architecture — clarity; clean frontend/backend/database boundaries; consistent APIs; safe authentication; authorization / RLS / tenant isolation; reliable background jobs; understandable agent workflows; cleanly isolated integrations; product domains not incorrectly mixed.

D. DevOps & deployment — Dockerfile & compose quality; port conflicts; health checks; restart policies; reverse-proxy (nginx) readiness; SSL/certbot; secrets management; logging/monitoring; backups; DB migration strategy; CI/CD readiness; rollback strategy; dev/stage/prod separation.

E. Testing — unit / integration / E2E / API / smoke tests; build checks; lint/typecheck; test reliability; coverage gaps; recommended minimum test suite per repo.

F. Security — committed secrets; .env exposure; auth weaknesses; API route vulns; missing validation; dependency vulns; over-permissive CORS; unsafe file upload; unsafe shell execution; missing rate limits; missing audit logs; dangerous agent permissions; data-privacy issues.

G. Product readiness — can a user complete a flow end-to-end? core flows working? clear landing pages? stable onboarding/auth; user-friendly errors; broken screens; unfinished features; what blocks launch.

H. AI-agent development practices — am I using agents effectively? prompts too vague? agents committing too much at once? roadmaps/checklists maintained? incremental changes? tests run before commits? agents documenting work? repo drift/duplication caused by agents? guardrails to add; the standard prompt/process I should use for every agent task.

I. Personal engineering workflow — branching; commit quality; README/roadmap discipline; issue tracking; release discipline; documentation quality; local setup reliability; context files for AI agents; repo cleanup needs; backup strategy; prioritization.

Commands to run where applicable (read-only / non-destructive)

For Node / TypeScript repos:

npm install --ignore-scripts || true
npm run lint || true
npm run typecheck || true
npm run build || true
npm test || true
npm audit --audit-level=moderate || true

For Python repos:

python --version || true
pip --version || true
python -m compileall . || true
pytest || true
pip-audit || true

For Docker repos:

docker compose config || true
docker compose ps || true
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}" || true

For Git / repo health:

git status --short || true
git log --oneline -10 || true
git branch --show-current || true
git remote -v || true

For secret scanning (read-only grep):

grep -RIn --exclude-dir=node_modules --exclude-dir=.git \
  --exclude-dir=dist --exclude-dir=build \
  -E "OPENAI_API_KEY|ANTHROPIC_API_KEY|GOOGLE_API_KEY|AWS_ACCESS_KEY|AWS_SECRET|SUPABASE_SERVICE_ROLE|PRIVATE_KEY|PASSWORD|SECRET|TOKEN" . || true

Note: a grep hit is a candidate, not proof. Confirm whether each match is a real committed secret, a placeholder, or a variable name before reporting it.
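One way to start that confirmation is a rough first-pass triage of each hit. The sketch below is only a heuristic under stated assumptions: the function name `classify_hit` and its match patterns are hypothetical, and a "reference" or "placeholder" label still deserves a manual look before it is dismissed.

```shell
# classify_hit LINE
# Hypothetical triage heuristic for one `grep -n` hit (path:line:content).
# Prints "reference" (reads an env var), "placeholder" (dummy value),
# or "candidate" (needs manual review). Patterns are illustrative only.
classify_hit() {
  case "$1" in
    *process.env.*|*os.environ*|*os.getenv*|*'${'*) echo reference ;;
    *changeme*|*CHANGEME*|*your_*|*YOUR_*|*xxxx*|*example*) echo placeholder ;;
    *) echo candidate ;;
  esac
}
```

For example, `classify_hit '.env:1:OPENAI_API_KEY=sk-abc123'` falls through to `candidate`. To check whether the file behind a candidate is actually tracked, `git ls-files --error-unmatch <path>` exits non-zero for untracked files.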

Required output

Create a single report named ENGINEERING_REVIEW_SCORECARD.md with the following sections, in order.

1. Executive Summary

A direct, high-level opinion:

  • Overall maturity.
  • Biggest strengths (top 3).
  • Biggest risks (top 3).
  • Is this prototype, MVP, beta, or production quality? Justify it.
  • Is the current repo/development style helping or hurting velocity? Why?

2. Overall Score Sheet

Score each category 1-10 (1 = critical/broken, 10 = excellent/production-grade). Show the evidence behind each score in one line.

| Category | Score (1-10) | Justification (evidence) |
|---|---|---|
| A. Repository organization | | |
| B. Code quality | | |
| C. Architecture | | |
| D. DevOps & deployment | | |
| E. Testing | | |
| F. Security | | |
| G. Product readiness | | |
| H. AI-agent practices | | |
| I. Personal workflow | | |
| Weighted overall | | |

State the weighting you used for the overall score (e.g. Security and Product readiness weighted higher), and give a one-paragraph rationale.
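The weighted overall score can be read as a weighted mean: the sum of weight × score divided by the sum of weights. A minimal sketch, assuming `awk` is available; the function name `weighted_score` and the weights in the example are hypothetical, not values prescribed by this prompt.

```shell
# weighted_score
# Reads "weight score" pairs (one per line) on stdin and prints the
# weighted mean, sum(w*s) / sum(w), rounded to one decimal place.
weighted_score() {
  awk '{ w += $1; s += $1 * $2 } END { if (w > 0) printf "%.1f\n", s / w }'
}

# Example with illustrative weights: two categories weighted 2, one weighted 1.
printf '%s\n' "2 3" "2 6" "1 7" | weighted_score   # prints 5.0
```

Keeping the weights next to the scores in the report makes the overall number reproducible by hand.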

3. Per-Product / Per-Repo Breakdown

For each product group: repos involved, stack, what works, what's broken, top risks, and a maturity label (prototype / MVP / beta / prod).

4. Findings by Dimension (A-I)

For each dimension: concrete findings with file paths + repo names + examples, ordered by severity. Separate facts (what you observed) from recommendations (what to change).

5. Prioritized Action Plan

A single ranked list across all repos:

  • P0 — Fix now (security, data loss, launch blockers).
  • P1 — This week.
  • P2 — This month.
  • P3 — Nice to have.

Each item: what, why it matters, rough effort (S/M/L), and which repo/file.

6. Safe Auto-Fix Candidates

Low-risk changes you could make immediately if I approve — with the exact file, the exact change, and why it's safe. Do not apply them.

7. Delegate-to-Agent Queue

Tasks ready to hand to an AI agent right now. For each: a tight, self-contained task brief (repo, files to read first, objective, constraints, definition of done) so I can paste it straight into an agent.

8. AI-Agent Process & Guardrails

The repeatable process + guardrails I should adopt for every future AI-agent task (branching, scoping, test-before-commit, documentation, review gates).

9. What You Could Not Inspect

Explicitly list anything inaccessible, skipped, or assumed, and what I'd need to provide for a complete review.


Final instruction

Work methodically: discover → group → inspect → score → recommend. When you are done, print the path to ENGINEERING_REVIEW_SCORECARD.md and a 5-bullet TL;DR. Do not commit or push it — leave it for me to review.