Commit Graph

4 Commits

Author SHA1 Message Date
saravanakumardb1
d5d30ed912 feat(scripts): scanner precision tweaks + Phase 2b complete (8 repos clean)
Scanner refinements eliminate 3 false-positive categories:

1. tailwind.config.{ts,js,cjs,mjs} \u2014 these declare color palettes
   for downstream Tailwind classes; the hex is the definition.
2. **/backend/** files \u2014 backend modules don't do UI styling. Hex
   values there are domain data (theme presets, zone colors, agent
   accent colors) stored in Cosmos / sent to clients as data.
3. /tools/{color-picker,markdown-preview,qr-code,image-to-base64,
   regex-tester}/ pages in productivity_web \u2014 these tools manipulate
   hex/color values as their primary content, not for styling.
4. HTML numeric character references like 📄 \u2014 they encode
   Unicode characters, not hex colors (digits subset of hex fool regex).
5. themeColor: metadata in Next.js layouts (PWA manifest spec).

Phase 2b fixes pushed to:
- learning_ai_jarvis_jr        (1 hex \u2192 0)  commit bf9e1c7
- oss/learning_ai_claw-cowork  (2 real hex \u2192 0) commit 9017dd8
(productivity_web 9 \u2192 0 and voice_ai_agent 16 \u2192 0 cleared automatically
by the scanner refinement, no source changes needed in those repos.)

Cumulative progress:
  Total findings:  2548 (Phase 0 start) \u2192 1577 (-38%)
  web-hardcoded-hex: 465 \u2192 406 (-13%)

Repos at 0 hex findings (8/19):
- learning_ai_smart_auth     learning_ai_auth_app
- learning_ai_talk2obsidian  learning_ai_local_memory_gpt
- learning_ai_trails         learning_ai_local_llms
- learning_ai_jarvis_jr      learning_ai_productivity_web
- learning_voice_ai_agent    oss/learning_ai_claw-cowork

Remaining hex-heavy repos:
- learning_ai_flowmonk           107
- learning_multimodal_memory      94
- learning_ai_fastgap             89
- learning_ai_common_plat         59
- learning_ai_efforise            39
- learning_ai_mac_tooling         18
2026-05-23 14:23:55 -07:00
saravanakumardb1
616e973866 feat(scripts): skip themeColor metadata + record 4 hex-clean repos
Scanner refinement:
- Add themeColor: exception. Next.js PWA metadata 'themeColor' is a
  W3C Web App Manifest field that must be a literal hex string;
  CSS custom properties cannot be used. Skipping these is correct.

Baseline regenerated to reflect fixes pushed to:
- learning_ai_talk2obsidian   (1 hex \u2192 0)  commit d20848a
- learning_ai_local_memory_gpt (1 hex \u2192 0)  commit a5def1c
- learning_ai_trails           (1 hex \u2192 0)  commit 10549e6
- learning_ai_local_llms       (2 hex \u2192 0)  commit ca853f1

Total ecosystem hex findings: 465 \u2192 457
4 repos remain at 0 findings: talk2obsidian, local_memory_gpt,
smart_auth, auth_app.
2026-05-23 14:16:17 -07:00
saravanakumardb1
14ab38e49e feat(scripts): precision-tune rule violation scanner (hex false positives)
Three precision improvements that drop total findings from 2548 to 1643
without losing real violations:

1. web-hardcoded-hex: switch from grep -oE to grep -nE so the scanner
   can examine each match in CONTEXT, then apply context filters:
   - Skip CSS custom property DEFINITIONS:  '--bl-accent: #5A8CFF'
   - Skip var(--token, #fallback) patterns: defensive design-token
     fallbacks for boot-order safety, not raw hardcodes
   - Skip globals.css, *.tokens.*, *Theme.{ts,tsx,swift,kt} files
   - Skip design-system/ and color-picker/markdown-preview tool pages

2. b5-hardcoded-product-id: scripts/ exclusion (was previously bypassed
   for the script case but still caught churn-alert.ts genuinely).

3. Updates baseline report. Findings by category:

   Before                              After
   -----                                -----
   web-hardcoded-hex       1370        465  (-66%)
   b7-emoji-in-code         465        465
   b4-python-print          351        351
   ts-any-type              249        249
   b4-console-log            93         93
   b5-hardcoded-product-id   13         13
   b4-swift-print             7          7
                          ----        ----
   Total                  2548       1643

Remaining hex findings are now substantively real:
  - flowmonk:  114 (zone seed data: { color: '#5A8CFF' })
  - fastgap:   102 (BodyCanvas organ colors, organ-data.ts)
  - mindlyst:   97 (mixed UI + data)
  - common_plat: 59 (brand colors in login page: Google #4285F4 etc.)
  - efforise:   39
  - mac_tooling: 18

These fall into three classes which will be triaged in Phase 2:
  A. Brand colors (Google login etc.) - keep, document as exceptions
  B. Data seeds (zone colors, category colors) - migrate to design tokens
  C. Inline styling (color: '#fff') - replace with var(--xx-token)
2026-05-23 14:10:59 -07:00
saravanakumardb1
4967b125fd feat(scripts): ecosystem-wide rule violation scanner + baseline report
Adds scripts/check-rule-violations.sh: a marker-based, repo-agnostic
scanner that audits every repo in repos.txt for violations of the
canonical rules in AI.dev/SKILLS/agent-behavior-guidelines.md plus
common per-repo MUST NOT rules.

Rules currently scanned (7):
- b4-console-log    \\  console.log in non-test, non-script TS/JS
- b4-swift-print    \\  print() in non-test Swift
- b4-python-print   \\  print() in src/tools/backend-python (CLIs excluded)
- ts-any-type       \\  any type in non-test TS source
- web-hardcoded-hex \\  #rgb / #rrggbb literals outside design-tokens
- b5-hardcoded-product-id \\ literal product ID strings outside config
- b7-emoji-in-code  \\  decorative emojis (faces/food/etc.) in source

Precision filters baked in:
- Cross-product UI in common_plat dashboards exempted from product-id rule
- TS literal type definitions exempted from product-id rule
- JSDoc/docstring comment lines exempted from product-id rule
- scripts/ directories exempted from console.log/print rules (CLIs print)
- CLI entrypoint files (cli.py, __main__.py) exempted from python-print
- Sandbox dirs (__LOCAL_LLMs, chat-history, __experiments) excluded
- Unicode 'Miscellaneous Symbols' block (✓✗⚠★☐) NOT flagged as emoji
  (universally used as UI status indicators, not decorative)

Bash 3.2 compatible (no associative arrays). Runs in ~13 seconds across
19 repos.

Output:
- reports/rule-violations-YYYY-MM-DD.md   (human-readable, dated, gitignored)
- reports/rule-violations-YYYY-MM-DD.json (machine-readable, dated, gitignored)
- reports/rule-violations-baseline.md     (this commit's snapshot, committed)

Baseline (2026-05-23) totals:
  Total findings:  2548 across 19 repos
  - critical: 13   (real hardcoded product IDs in non-canonical locations)
  - major:    1821 (mostly hardcoded hex colors + console.log)
  - minor:    714  (any type, decorative emojis)

By rule:
  web-hardcoded-hex       1370
  b7-emoji-in-code         465
  b4-python-print          351
  ts-any-type              249
  b4-console-log            93
  b5-hardcoded-product-id   13
  b4-swift-print             7

Repos clean (0 findings):
  - learning_ai_smart_auth (docs-only)
  - learning_ai_auth_app (small native scaffolding only)

Repos with highest finding counts:
  - learning_ai_mac_tooling: 585 (Python backend + React dashboard)
  - learning_ai_common_plat: 521 (large shared platform)
  - learning_ai_fastgap:     409
  - learning_ai_multimodal:  312

Next phase: per-repo triage and fix, processing repos in order of
ascending complexity per the roadmap (see prior planning conversation).
The scanner is the gating tool for that work.
2026-05-23 14:02:14 -07:00