feat(scripts): ecosystem-wide rule violation scanner + baseline report

Adds scripts/check-rule-violations.sh: a marker-based, repo-agnostic scanner
that audits every repo in repos.txt for violations of the canonical rules in
AI.dev/SKILLS/agent-behavior-guidelines.md plus common per-repo MUST NOT rules.

Rules currently scanned (7):
- b4-console-log: console.log in non-test, non-script TS/JS
- b4-swift-print: print() in non-test Swift
- b4-python-print: print() in src/tools/backend-python (CLIs excluded)
- ts-any-type: any type in non-test TS source
- web-hardcoded-hex: #rgb / #rrggbb literals outside design-tokens
- b5-hardcoded-product-id: literal product ID strings outside config
- b7-emoji-in-code: decorative emojis (faces/food/etc.) in source

Precision filters baked in:
- Cross-product UI in common_plat dashboards exempted from product-id rule
- TS literal type definitions exempted from product-id rule
- JSDoc/docstring comment lines exempted from product-id rule
- scripts/ directories exempted from console.log/print rules (CLIs print)
- CLI entrypoint files (cli.py, __main__.py, main.py) exempted from python-print
- Sandbox dirs (__LOCAL_LLMs, chat-history, __experiments) excluded
- Unicode 'Miscellaneous Symbols' and 'Dingbats' blocks (✓✗⚠★☐) NOT flagged
  as emoji (universally used as UI status indicators, not decorative)

Bash 3.2 compatible (no associative arrays). Runs in ~13 seconds across 19 repos.

Output:
- reports/rule-violations-YYYY-MM-DD.md (human-readable, dated, gitignored)
- reports/rule-violations-YYYY-MM-DD.json (machine-readable, dated, gitignored)
- reports/rule-violations-baseline.md (this commit's snapshot, committed)

Baseline (2026-05-23) totals:
  Total findings: 2548 across 19 repos
  - critical: 13 (real hardcoded product IDs in non-canonical locations)
  - major: 1821 (mostly hardcoded hex colors + Python print())
  - minor: 714 (any type, decorative emojis)

By rule:
  web-hardcoded-hex        1370
  b7-emoji-in-code          465
  b4-python-print           351
  ts-any-type               249
  b4-console-log             93
  b5-hardcoded-product-id    13
  b4-swift-print              7

Repos clean (0 findings):
- learning_ai_smart_auth (docs-only)
- learning_ai_auth_app (small native scaffolding only)

Repos with highest finding counts:
- learning_ai_mac_tooling: 585 (Python backend + React dashboard)
- learning_ai_common_plat: 521 (large shared platform)
- learning_ai_fastgap: 409
- learning_ai_multimodal: 312

Next phase: per-repo triage and fix, processing repos in order of ascending
complexity per the roadmap (see prior planning conversation). The scanner is
the gating tool for that work.
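
The script's own summary prints ecosystem totals by rule only, so the severity totals quoted above have to be recomputed from a dated JSON report. A minimal sketch, assuming the report holds one JSON object per line with a `severity` field (the shape emit_finding writes); the helper name `severity_totals` and the sample findings are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: per-severity totals from a rule-violations JSONL report.
# Assumes one JSON object per line containing "severity":"<level>".
set -uo pipefail

# severity_totals FILE: prints "<count> <severity>" lines, highest count first.
severity_totals() {
  sed -nE 's/.*"severity":"([^"]+)".*/\1/p' "$1" | sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'   # normalize the padding that uniq -c adds
}

# Demo on a throwaway file (hypothetical sample findings, not real data).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
{"rule":"web-hardcoded-hex","severity":"major","repo":"demo","file":"a.css","line":3,"evidence":"Hardcoded hex color: #fff"}
{"rule":"ts-any-type","severity":"minor","repo":"demo","file":"b.ts","line":9,"evidence":"any type: x: any"}
{"rule":"b4-console-log","severity":"major","repo":"demo","file":"c.ts","line":1,"evidence":"console.log: console.log(1)"}
EOF
severity_totals "$sample"
# prints:
#   2 major
#   1 minor
rm -f "$sample"
```

This reuses the same sed | sort | uniq -c pipeline the scanner applies to the `rule` field, so it stays Bash 3.2 compatible.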
2026-05-23 14:02:14 -07:00 · 4967b125fd
commit 4967b125fd
parent a72abbcda1
3 changed files with 3120 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -28,3 +28,9 @@ packages/kotlin-platform-sdk/build/
 packages/kotlin-platform-sdk/.gradle/
 packages/kotlin-platform-sdk/gradle/
 packages/kotlin-platform-sdk/gradlew
 # Rule violation scanner outputs (regenerated by scripts/check-rule-violations.sh)
 # Baseline reports are committed as rule-violations-baseline.* for diffing.
 reports/rule-violations-2*.json
 reports/rule-violations-2*.md
 !reports/rule-violations-baseline.md
--- a/reports/rule-violations-baseline.md
+++ b/reports/rule-violations-baseline.md
--- a/scripts/check-rule-violations.sh
+++ b/scripts/check-rule-violations.sh
@@ -0,0 +1,453 @@
 #!/usr/bin/env bash
 # check-rule-violations.sh — Ecosystem-wide AGENTS.md rule compliance scanner.
 #
 # Scans every repo in repos.txt for violations of the canonical rules defined
 # in AI.dev/SKILLS/agent-behavior-guidelines.md (Parts B4, B5, B7) plus a few
 # stack-specific per-repo MUST NOT rules (no `any`, no hardcoded hex colors).
 #
 # Output:
 #   reports/rule-violations-YYYY-MM-DD.md   — human-readable, grouped by repo
 #   reports/rule-violations-YYYY-MM-DD.json — machine-readable, one finding per line
 #
 # Usage:
 #   bash scripts/check-rule-violations.sh                  # scan all repos
 #   bash scripts/check-rule-violations.sh <repo-name>      # scan single repo (directory name under BASE_DIR)
 #   bash scripts/check-rule-violations.sh --quiet          # only print summary
 #
 # Exit code: 0 after any scan (scanner is informational and never fails on
 # findings); 1 only for an unknown flag.
 #
 # Each rule is implemented as a function:
 #   scan_<rule_id>() { repo="$1"; repo_dir="$2"; ...emit findings via emit_finding... }
 # Adding a new rule:
 #   1. Add a function below
 #   2. Append to RULES=() array
 #   3. Document severity + rationale in the function header comment
 #
 # Findings are emitted via emit_finding which appends to two global arrays
 # (FINDINGS_JSON and FINDINGS_MD) that the scan loop flushes to the report
 # files after each repo.
 set -uo pipefail
 # NOT -e: grep returning 1 (no matches) is normal and must not abort the scan.
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 REPOS_TXT="${SCRIPT_DIR}/../.windsurf/workflows/repos.txt"
 BASE_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)"
 REPORTS_DIR="${SCRIPT_DIR}/../reports"
 TODAY="$(date +%Y-%m-%d)"
 JSON_OUT="${REPORTS_DIR}/rule-violations-${TODAY}.json"
 MD_OUT="${REPORTS_DIR}/rule-violations-${TODAY}.md"
 RED='\033[0;31m'; YELLOW='\033[1;33m'; GREEN='\033[0;32m'; BLUE='\033[0;34m'; NC='\033[0m'
 mkdir -p "$REPORTS_DIR"
 # ─── Mode parsing ──────────────────────────────────────────────────────────────
 QUIET=0
 SINGLE_REPO=""
 for arg in "$@"; do
  case "$arg" in
    --quiet|-q) QUIET=1 ;;
    --help|-h) sed -n '2,30p' "$0"; exit 0 ;;
    -*) echo "Unknown flag: $arg" >&2; exit 1 ;;
    *) SINGLE_REPO="$arg" ;;
  esac
 done
 # ─── Repo list ─────────────────────────────────────────────────────────────────
 REPOS=()
 if [[ -n "$SINGLE_REPO" ]]; then
  REPOS+=("$SINGLE_REPO")
 else
  while IFS= read -r line; do
    [[ "$line" =~ ^[[:space:]]*# ]] && continue
    [[ -z "${line// }" ]] && continue
    REPOS+=("$line")
  done < "$REPOS_TXT"
 fi
 # ─── Finding accumulator ───────────────────────────────────────────────────────
 # Globals updated by emit_finding(); flushed at end of each repo.
 declare -a FINDINGS_JSON
 declare -a FINDINGS_MD
 # Per-repo counters (reset before each repo). Bash 3.2 compatible (no
 # associative arrays). Ecosystem totals are computed at the end by running
 # sed over the JSONL output file (each finding is one line).
 REPO_CRITICAL=0
 REPO_MAJOR=0
 REPO_MINOR=0
 # emit_finding RULE_ID SEVERITY REPO FILE LINE EVIDENCE
 # SEVERITY: critical|major|minor
 emit_finding() {
  local rule_id="$1" severity="$2" repo="$3" file="$4" line="$5" evidence="$6"
  # Strip leading repo path for shorter reporting
  local rel_file="${file#${BASE_DIR}/${repo}/}"
  # JSON escape (basic — content is grep output, no nested quotes from us)
  local json_evidence="${evidence//\\/\\\\}"
  json_evidence="${json_evidence//\"/\\\"}"
  json_evidence="${json_evidence//	/\\t}"
  FINDINGS_JSON+=("{\"rule\":\"$rule_id\",\"severity\":\"$severity\",\"repo\":\"$repo\",\"file\":\"$rel_file\",\"line\":$line,\"evidence\":\"$json_evidence\"}")
  FINDINGS_MD+=("- **[$severity]** \`$rel_file:$line\` — $evidence")
  case "$severity" in
    critical) REPO_CRITICAL=$(( REPO_CRITICAL + 1 )) ;;
    major)    REPO_MAJOR=$(( REPO_MAJOR + 1 )) ;;
    minor)    REPO_MINOR=$(( REPO_MINOR + 1 )) ;;
  esac
 }
 # ─── Shared exclusion globs (grep --exclude-dir / --exclude) ───────────────────
 #
 # These directories are always excluded from scans. They contain generated,
 # vendored, sandbox, or build/test output. Test sources are handled separately
 # (TEST_EXCLUDES below) because each rule decides whether tests are in scope.
 EXCLUDE_DIRS=(
  --exclude-dir=node_modules
  --exclude-dir=.git
  --exclude-dir=.next
  --exclude-dir=.turbo
  --exclude-dir=dist
  --exclude-dir=build
  --exclude-dir=coverage
  --exclude-dir=__pycache__
  --exclude-dir=.venv
  --exclude-dir=venv
  --exclude-dir=target          # Rust
  --exclude-dir=Pods            # CocoaPods
  --exclude-dir=DerivedData     # Xcode
  --exclude-dir=.gradle
  --exclude-dir=reports
  --exclude-dir=test-results
  --exclude-dir=playwright-report
  --exclude-dir=.pytest_cache
  --exclude-dir=.ruff_cache
  --exclude-dir=generated       # design-tokens output
  --exclude-dir=__LOCAL_LLMs    # local LLM sandbox (not production)
  --exclude-dir=chat-history    # IDE chat archives
  --exclude-dir=__experiments   # experimental code
  --exclude-dir=experiments     # experimental code
  --exclude-dir=_archive_helper # archived code
  --exclude-dir=.docker-deps    # bundled npm tarballs for Docker builds
 )
 # Test-file exclusions (used per-rule; some rules apply to tests, others don't)
 TEST_EXCLUDES=(
  --exclude-dir=__tests__
  --exclude-dir=__mocks__
  --exclude-dir=e2e
  --exclude-dir=tests
  --exclude=*.test.ts
  --exclude=*.test.tsx
  --exclude=*.spec.ts
  --exclude=*.spec.tsx
  --exclude=*.test.js
  --exclude=*.spec.js
 )
 # ─── RULES ─────────────────────────────────────────────────────────────────────
 #
 # Each scan_* function:
 #   • Receives $1 = repo name, $2 = absolute repo path
 #   • Emits findings via emit_finding
 #   • Must not error out (use || true after grep when needed)
 # B4 — no `console.log` in production TS/JS code.
 # Severity: major. Tests excluded. console.warn/console.error are not flagged
 # (they are legitimate for error reporting); only console.log is reported.
 scan_b4_console_log() {
  local repo="$1" repo_dir="$2"
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    # Skip commented lines (line starts with //, #, or *)
    [[ "$evidence" =~ ^[[:space:]]*(//|#|\*) ]] && continue
    # Skip CLI/admin scripts — their job is to print to terminal.
    # Pattern: any scripts/ path segment, in either common_plat or product repos.
    [[ "$file" =~ (^|/)scripts/ ]] && continue
    emit_finding "b4-console-log" "major" "$repo" "$file" "$line" "console.log: ${evidence:0:80}"
  done < <(grep -rnE 'console\.log\(' "$repo_dir" \
    --include='*.ts' --include='*.tsx' --include='*.js' --include='*.jsx' --include='*.mjs' \
    "${EXCLUDE_DIRS[@]}" "${TEST_EXCLUDES[@]}" 2>/dev/null || true)
 }
 # B4 — no `print(` in Swift production code.
 # Severity: major. XCTest test files excluded.
 scan_b4_swift_print() {
  local repo="$1" repo_dir="$2"
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    [[ "$evidence" =~ ^[[:space:]]*// ]] && continue
    emit_finding "b4-swift-print" "major" "$repo" "$file" "$line" "Swift print(): ${evidence:0:80}"
  done < <(grep -rnE '\bprint\(' "$repo_dir" --include='*.swift' \
    --exclude-dir=Tests --exclude='*Tests.swift' --exclude='*Test.swift' \
    "${EXCLUDE_DIRS[@]}" 2>/dev/null || true)
 }
 # B4 — no `print(` in Python production code under src/ tools/ backend-python/.
 # Severity: major. Scripts dir and tests excluded (scripts and CLIs may print).
 scan_b4_python_print() {
  local repo="$1" repo_dir="$2"
  # Note: scripts/ explicitly excluded (CLIs print to terminal). Only scan
  # core src/library/backend trees where structlog/logging should be used.
  for src in src tools backend-python backend/src; do
    [[ -d "${repo_dir}/${src}" ]] || continue
    while IFS=: read -r file line evidence; do
      [[ -z "$file" ]] && continue
      [[ "$evidence" =~ ^[[:space:]]*# ]] && continue
      # Skip CLI entrypoint files (cli.py, __main__.py, main.py).
      [[ "$file" =~ /(cli|__main__|main)\.py$ ]] && continue
      emit_finding "b4-python-print" "major" "$repo" "$file" "$line" "Python print(): ${evidence:0:80}"
    done < <(grep -rnE '^\s*print\(' "${repo_dir}/${src}" --include='*.py' \
      --exclude-dir=tests --exclude='test_*.py' --exclude='*_test.py' \
      "${EXCLUDE_DIRS[@]}" 2>/dev/null || true)
  done
 }
 # Stack-specific: TypeScript `any` type usage in source code.
 # Severity: minor (per AGENTS.md it's a MUST NOT for most TS repos but is
 # pervasive in some legacy code; flagging as minor avoids drowning the report).
 # Excludes: tests, .d.ts files, generated code, and lines that already carry
 # an `eslint-disable` annotation.
 scan_ts_any_type() {
  local repo="$1" repo_dir="$2"
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    [[ "$evidence" =~ eslint-disable ]] && continue
    # `as any` casts are reported here too, even for legitimate JSON parsing;
    # they can be split into a follow-up rule if needed.
    emit_finding "ts-any-type" "minor" "$repo" "$file" "$line" "any type: ${evidence:0:80}"
  done < <(grep -rnE ':\s*any\b|\bas\s+any\b' "$repo_dir" \
    --include='*.ts' --include='*.tsx' \
    --exclude='*.d.ts' \
    "${EXCLUDE_DIRS[@]}" "${TEST_EXCLUDES[@]}" 2>/dev/null || true)
 }
 # Web-specific: hardcoded hex colors in TS/TSX/CSS source code.
 # Per-repo AGENTS.md MUST NOT — colors must come from design-tokens
 # (--XX-* CSS custom properties).
 # Severity: major. Excludes: design-tokens output and tests. Comment lines are
 # not filtered, because grep -o returns only the matched color.
 # Evidence: the actual matched hex code (e.g., "#fff"), so triage doesn't need
 # to open the file for trivial cases.
 scan_web_hardcoded_hex() {
  local repo="$1" repo_dir="$2"
  # Use grep -o (--only-matching) with -n so each output line is
  # file:line:match and the evidence shows the real match.
  # Pattern: '#' followed by either 6 or 3 hex chars at a word boundary.
  while IFS=: read -r file line match; do
    [[ -z "$file" ]] && continue
    # Allow hex colors in tokens / generated / design-system files
    [[ "$file" =~ /(tokens\.css|.*\.tokens\..*|generated/|design-tokens/) ]] && continue
    emit_finding "web-hardcoded-hex" "major" "$repo" "$file" "$line" "Hardcoded hex color: $match"
  done < <(grep -rnoE '#[0-9a-fA-F]{6}\b|#[0-9a-fA-F]{3}\b' "$repo_dir" \
    --include='*.ts' --include='*.tsx' --include='*.css' --include='*.scss' \
    "${EXCLUDE_DIRS[@]}" "${TEST_EXCLUDES[@]}" 2>/dev/null || true)
 }
 # B5 — hardcoded product ID string literals outside shared/product.json
 # and product-config files.
 # Severity: critical. The canonical pattern is PRODUCT_ID from product-config.ts
 # or @bytelyst/config.
 scan_b5_hardcoded_product_id() {
  local repo="$1" repo_dir="$2"
  local product_ids='"(lysnrai|mindlyst|chronomind|jarvisjr|nomgap|peakpulse|flowmonk|notelett|actiontrail|localmemgpt|efforise|localllmlab|smartauth|productivity-web|talk2obs)"'
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    # Allow in canonical locations
    [[ "$file" =~ (shared/product\.json|product-config\.(ts|js|swift|kt)|product\.manifest\.json) ]] && continue
    # Allow in test fixtures (they need literal IDs)
    [[ "$file" =~ (__tests__|tests/|\.test\.|\.spec\.) ]] && continue
    # Allow in docs
    [[ "$file" =~ \.(md|mdx)$ ]] && continue
    # Allow cross-product UI in common_plat dashboards (they legitimately
    # enumerate all products for admin operations).
    [[ "$repo" == "learning_ai_common_plat" && "$file" =~ dashboards/(admin-web|tracker-web|ux-lab)/ ]] && continue
    # Allow obvious enumeration patterns: SelectItem value=, option value=,
    # product list arrays, etc. These are intentional cross-product references,
    # not hardcoded product identity.
    [[ "$evidence" =~ (SelectItem|option|productId:|product:)[[:space:]]*[=:][[:space:]]*\" ]] && continue
    # Skip JSDoc / docstring / inline comment lines containing example product IDs.
    # Pattern: line begins with whitespace then '*' (JSDoc continuation),
    # '//' (line comment), or '#' (Python comment).
    [[ "$evidence" =~ ^[[:space:]]*(\*|//|#) ]] && continue
    emit_finding "b5-hardcoded-product-id" "critical" "$repo" "$file" "$line" "Hardcoded product ID: ${evidence:0:80}"
  done < <(grep -rnE "$product_ids" "$repo_dir" \
    --include='*.ts' --include='*.tsx' --include='*.js' \
    "${EXCLUDE_DIRS[@]}" 2>/dev/null || true)
 }
 # B7 — emojis in source code (per-repo AGENTS.md "Never add emojis to code").
 # Severity: minor. Excludes: markdown, tests, generated files.
 # Implementation: writes a small Python helper to a temp file and runs it,
 # avoiding the bash heredoc-in-process-substitution pattern which produced
 # 'ambiguous redirect' errors under set -u.
 scan_b7_emojis() {
  local repo="$1" repo_dir="$2"
  command -v python3 >/dev/null 2>&1 || return 0
  local py_helper out
  py_helper="$(mktemp -t emoji-scan.XXXXXX.py)"
  out="$(mktemp -t emoji-out.XXXXXX)"
  cat > "$py_helper" <<'PYEOF'
 import os, re, sys
 root = sys.argv[1]
 # Only flag DECORATIVE emojis (faces, food, animals, transport, hearts).
 # Explicitly EXCLUDE U+2600-U+27BF (Miscellaneous Symbols and Dingbats), which
 # contain ✓ ✗ ⚠ ★ ☐ ☑ (universally used as UI status indicators, not decorative).
 EMOJI_RE = re.compile(
    r"[\U0001F600-\U0001F64F]"     # emoticons (faces)
    r"|[\U0001F300-\U0001F5FF]"     # misc symbols + pictographs (decorative)
    r"|[\U0001F680-\U0001F6FF]"     # transport + map
    r"|[\U0001F700-\U0001F77F]"     # alchemical symbols
    r"|[\U0001F900-\U0001F9FF]"     # supplemental symbols + pictographs
    r"|[\U0001FA70-\U0001FAFF]"     # symbols + pictographs extended-A
 )
 EXTS = {".ts", ".tsx", ".js", ".jsx", ".py", ".swift", ".kt", ".rs"}
 SKIP_DIRS = {"node_modules", ".git", ".next", "dist", "build", "coverage",
             "__pycache__", "target", "Pods", "DerivedData", "reports",
             "test-results", ".pytest_cache", ".venv", "venv",
             "__tests__", "__mocks__", "tests", "e2e", "generated",
             "__LOCAL_LLMs", "chat-history", "__experiments",
             "experiments", "_archive_helper", ".docker-deps",
             ".turbo", ".ruff_cache", ".gradle", "playwright-report"}
 for dp, dirs, files in os.walk(root):
    dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
    for f in files:
        ext = os.path.splitext(f)[1]
        if ext not in EXTS:
            continue
        if f.endswith((".test.ts", ".test.tsx", ".spec.ts", ".spec.tsx",
                       ".test.js", ".spec.js")):
            continue
        fp = os.path.join(dp, f)
        try:
            with open(fp, encoding="utf-8", errors="replace") as fh:
                for i, l in enumerate(fh, 1):
                    m = EMOJI_RE.search(l)
                    if m:
                        # Emit: filepath:line:emoji_char
                        print(f"{fp}:{i}:{m.group(0)}")
        except (OSError, UnicodeDecodeError):
            continue
 PYEOF
  python3 "$py_helper" "$repo_dir" > "$out" 2>/dev/null || true
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    emit_finding "b7-emoji-in-code" "minor" "$repo" "$file" "$line" "Emoji in code: $evidence"
  done < "$out"
  rm -f "$py_helper" "$out"
 }
 # ─── Rule registry ─────────────────────────────────────────────────────────────
 RULES=(
  scan_b4_console_log
  scan_b4_swift_print
  scan_b4_python_print
  scan_ts_any_type
  scan_web_hardcoded_hex
  scan_b5_hardcoded_product_id
  scan_b7_emojis
 )
 # ─── Scan loop ─────────────────────────────────────────────────────────────────
 : > "$JSON_OUT"  # start from an empty file so every line is one JSON finding
 {
  echo "# Rule Violations Report — ${TODAY}"
  echo ""
  echo "> Generated by \`scripts/check-rule-violations.sh\` against canonical rules in"
  echo "> [\`AI.dev/SKILLS/agent-behavior-guidelines.md\`](../AI.dev/SKILLS/agent-behavior-guidelines.md)."
  echo ""
  echo "Severity legend: **critical** = data/security risk · **major** = rule violation · **minor** = style"
  echo ""
 } > "$MD_OUT"
 [[ "$QUIET" -eq 0 ]] && echo -e "${BLUE}Scanning ${#REPOS[@]} repo(s) against ${#RULES[@]} rules...${NC}"
 total_findings=0
 for repo in "${REPOS[@]}"; do
  repo_dir="${BASE_DIR}/${repo}"
  if [[ ! -d "$repo_dir" ]]; then
    [[ "$QUIET" -eq 0 ]] && echo -e "${YELLOW}  skip: $repo (directory missing)${NC}"
    continue
  fi
  # Reset per-repo finding arrays + counters
  FINDINGS_MD=()
  FINDINGS_JSON=()
  REPO_CRITICAL=0
  REPO_MAJOR=0
  REPO_MINOR=0
  for rule_fn in "${RULES[@]}"; do
    "$rule_fn" "$repo" "$repo_dir"
  done
  # Append per-repo section to markdown report
  local_count=${#FINDINGS_MD[@]}
  total_findings=$(( total_findings + local_count ))
  c=$REPO_CRITICAL
  M=$REPO_MAJOR
  m=$REPO_MINOR
  {
    echo "## \`$repo\`"
    echo ""
    if [[ "$local_count" -eq 0 ]]; then
      echo "✅ No violations found."
    else
      echo "**Counts:** critical=$c · major=$M · minor=$m · total=$local_count"
      echo ""
      printf "%s\n" "${FINDINGS_MD[@]}"
    fi
    echo ""
  } >> "$MD_OUT"
  # Append per-repo JSON lines
  if [[ "$local_count" -gt 0 ]]; then
    printf '%s\n' "${FINDINGS_JSON[@]}" >> "$JSON_OUT"
  fi
  if [[ "$QUIET" -eq 0 ]]; then
    if [[ "$local_count" -eq 0 ]]; then
      echo -e "  ${GREEN}✓ $repo${NC}  (0 findings)"
    else
      echo -e "  ${YELLOW}⚠ $repo${NC}  critical=$c major=$M minor=$m total=$local_count"
    fi
  fi
 done
 # ─── Summary ───────────────────────────────────────────────────────────────────
 # Compute per-rule totals from the JSONL output (bash 3.2-compatible).
 # Extract "rule":"<id>" via sed, then sort | uniq -c.
 RULE_COUNTS_FILE="$(mktemp)"
 sed -nE 's/.*"rule":"([^"]+)".*/\1/p' "$JSON_OUT" | sort | uniq -c | sort -rn > "$RULE_COUNTS_FILE"
 {
  echo "## Ecosystem totals by rule"
  echo ""
  echo "| Rule | Total findings |"
  echo "|------|----------------|"
  while read -r count rule; do
    [[ -z "$rule" ]] && continue
    echo "| \`$rule\` | $count |"
  done < "$RULE_COUNTS_FILE"
  echo ""
  echo "**Grand total: $total_findings findings across ${#REPOS[@]} repos.**"
 } >> "$MD_OUT"
 echo ""
 echo -e "${BLUE}═══ Summary ═══${NC}"
 echo "  Total findings:  $total_findings"
 echo "  Markdown report: $MD_OUT"
 echo "  JSON report:     $JSON_OUT"
 echo ""
 echo "By rule (highest first):"
 while read -r count rule; do
  [[ -z "$rule" ]] && continue
  printf "  %-32s %d\n" "$rule" "$count"
 done < "$RULE_COUNTS_FILE"
 rm -f "$RULE_COUNTS_FILE"
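
The three steps for adding a rule listed in the script header can be sketched as follows. This is a standalone illustration, not part of the committed scanner: the rule id `js-debugger-stmt`, its severity, and the stand-in `emit_finding` (which prints instead of accumulating) are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: shape of a new scan_* rule, runnable on its own.
# HYPOTHETICAL rule id and severity; not part of the committed scanner.
set -uo pipefail

# Stand-in for the scanner's emit_finding: prints one pipe-separated line
# instead of appending to FINDINGS_JSON / FINDINGS_MD.
emit_finding() {
  local rule_id="$1" severity="$2" repo="$3" file="$4" line="$5" evidence="$6"
  printf '%s|%s|%s|%s|%s\n' "$rule_id" "$severity" "$repo" "${file##*/}:$line" "$evidence"
}

# Same calling convention as the existing rules:
#   $1 = repo name, $2 = absolute repo path.
# Severity: minor (illustrative). Flags lines that start with `debugger;`.
scan_js_debugger_stmt() {
  local repo="$1" repo_dir="$2"
  while IFS=: read -r file line evidence; do
    [[ -z "$file" ]] && continue
    emit_finding "js-debugger-stmt" "minor" "$repo" "$file" "$line" "debugger: ${evidence:0:80}"
  done < <(grep -rnE --include='*.ts' --include='*.js' --exclude-dir=node_modules \
    '^[[:space:]]*debugger;' "$repo_dir" 2>/dev/null || true)
}

# Demo against a throwaway directory.
demo="$(mktemp -d)"
printf 'const x = 1;\ndebugger;\n' > "$demo/app.ts"
scan_js_debugger_stmt "demo-repo" "$demo"
# prints: js-debugger-stmt|minor|demo-repo|app.ts:2|debugger: debugger;
rm -rf "$demo"
```

In the real script the function would sit beside the other scan_* functions, use the shared EXCLUDE_DIRS and TEST_EXCLUDES arrays, and be appended to the RULES=() array so the scan loop picks it up.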