learning_ai_common_plat/__LOCAL_LLMs/docs/DASHBOARD_REVIEW.md
saravanakumardb1 519f348583 docs(local-llm): add Next Wave — 15 model intelligence improvements (N1–N15)
Section 8 of DASHBOARD_REVIEW.md: pre-load RAM estimates, will-it-fit
indicator, RAM budget bar, context window, architecture/vision/think
badges, sort, tok/s history, countdown, session stats, delete confirm,
co-load suggestions. Organized in 4 tiers with sprint plan.
2026-02-19 22:32:29 -08:00


Mission Control Dashboard — Bug & Improvement Review

Systematic code review of __LOCAL_LLMs/dashboard/ (6 source files, 1,395 lines)

Last updated: Feb 19, 2026


File Inventory

| File | Lines | Purpose |
| --- | --- | --- |
| src/app/page.tsx | 1,079 | Main dashboard UI (single component) |
| src/app/globals.css | 91 | Design tokens, animations, base styles |
| src/app/layout.tsx | 20 | Root layout (metadata, dark mode) |
| src/app/api/ollama/route.ts | 117 | Ollama REST proxy (list, load, unload, pull, delete, show, generate) |
| src/app/api/ollama/stream/route.ts | 38 | Ollama streaming generate proxy (NDJSON) |
| src/app/api/whisper/route.ts | 66 | Whisper binary + GGML model discovery |
| src/app/api/system/route.ts | 162 | System info (chip, memory via vm_stat, disk, brew) |

Stack: Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript


1. Bugs

  • B1. Hardcoded machine specs in header (page.tsx:317). The subtitle reads "Apple M4 Pro · 48 GB · {system?.platform}". It should use system?.chip and formatBytes(system?.memory.total) so it works on any machine.

  • B2. Pull model blocks UI, no progress feedback (api/ollama/route.ts:84-92). handlePull calls Ollama with stream: false. Large models (20+ GB) block for 30+ minutes, and the Next.js API route will likely time out. Must use stream: true and pipe progress events to the client. (Combined with F1.)

  • B3. Dead code: non-streaming generate action (api/ollama/route.ts:69-82). The action === 'generate' handler is unused because the UI only calls /api/ollama/stream. Remove it, or keep it as a fallback with a comment.

  • B4. Escape key closes modal during active streaming (page.tsx:188-197). The global keydown handler calls setPromptModel(null) unconditionally. Backdrop click correctly checks !promptLoading. Escape should also respect promptLoading to avoid discarding an in-flight response.

  • B5. Auto-refresh (15s) fires during streaming/pull (page.tsx:182-185). setInterval(fetchAll, 15000) runs unconditionally. During streaming this causes background churn and potential UI flicker. It should pause while promptLoading or pullLoading is true.

  • B6. Toast ID collision on HMR remount (page.tsx:156-159). toastId.current resets to 0 when the component remounts during dev. Use Date.now() or crypto.randomUUID() for robust uniqueness.

  • B7. vm_stat page size hardcoded (api/system/route.ts:103). The value 16384 is hardcoded. Parse it from the first line of vm_stat output, "(page size of NNNNN bytes)", for portability.

  • B8. Whisper models dir not configurable (api/whisper/route.ts:24). Hardcoded to ~/whisper-models. Should scan multiple known paths (/opt/homebrew/share/whisper-cpp/models/, ~/whisper-models, ~/.cache/whisper/) or accept a WHISPER_MODELS_DIR env var.

  • B9. No AbortController for streaming fetch (page.tsx:250-289). Closing the prompt modal doesn't cancel the underlying fetch. The reader.read() loop continues in the background, wasting CPU and bandwidth until the model finishes generating.

  • B10. Brew shows "Loading..." when array is empty (page.tsx:936-940). When system.brewPackages is [] (all uninstalled), the panel displays "Loading..." instead of "No packages found". It needs to distinguish "still fetching" from "fetched but empty".

  • B11. Prompt text not cleared on close without send (page.tsx:951-957). Backdrop click clears promptText, but the Escape handler (see the B4 fix) does not. It should clear it too; otherwise stale text persists when the modal is re-opened.
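
As an illustration of B7, the page size can be read from the vm_stat header instead of being hardcoded. This is a minimal sketch, not the code in api/system/route.ts: the function names, the fallback constant, and the sample output in the usage note are assumptions made for the example.

```typescript
// Hypothetical fallback, used only when the header line cannot be parsed.
const DEFAULT_PAGE_SIZE = 16384;

/** Extract the page size in bytes from raw `vm_stat` output. */
function parsePageSize(vmStatOutput: string): number {
  const match = vmStatOutput.match(/page size of (\d+) bytes/);
  if (!match) return DEFAULT_PAGE_SIZE;
  const size = Number.parseInt(match[1], 10);
  return Number.isFinite(size) && size > 0 ? size : DEFAULT_PAGE_SIZE;
}

/** Convert a counter line such as `Pages free:   12345.` into bytes. */
function pagesToBytes(vmStatOutput: string, label: string): number | null {
  // Escape regex metacharacters so the label is matched literally.
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const match = vmStatOutput.match(new RegExp(`^${escaped}:\\s+(\\d+)\\.`, 'm'));
  if (!match) return null;
  return Number.parseInt(match[1], 10) * parsePageSize(vmStatOutput);
}
```

With output whose header reads "(page size of 4096 bytes)" and a line "Pages free: 100.", pagesToBytes(output, 'Pages free') yields 409600 rather than the 1638400 a hardcoded 16384 would give.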


2. Code Quality

  • CQ1. Monolithic 1,079-line single component (page.tsx). All interfaces, utilities, sub-components, and 900+ lines of JSX live in one file. Extract to:

    • components/ — StatusDot, ProgressBar, ToastContainer, PromptModal, OllamaModelsPanel, SystemPanel, WhisperPanel, BrewPanel
    • lib/types.ts — interfaces (OllamaModel, SystemData, etc.)
    • lib/format.ts — formatBytes, formatUptime
    • lib/hooks.ts — useAutoRefresh, useToasts, useOllamaActions

  • CQ2. Pervasive inline styles instead of CSS/Tailwind classes (page.tsx, 100+ occurrences). Every style={{ color: 'var(--text-tertiary)' }} should be a utility class. Options: custom Tailwind theme mapping, or CSS utility classes in globals.css (e.g., .text-muted).

  • CQ3. OLLAMA_URL duplicated (api/ollama/route.ts:3 + api/ollama/stream/route.ts:3). The same process.env.OLLAMA_URL || 'http://localhost:11434' appears in two files. Extract to lib/ollama-config.ts.

  • CQ4. No React Error Boundary (page.tsx). An unexpected API response shape crashes the entire dashboard. Add an error.tsx (Next.js App Router convention) for graceful recovery.

  • CQ5. No loading skeleton / shimmer UI. Initial load shows "..." placeholders. Skeleton cards would be more polished.

  • CQ6. No TypeScript strict null checks in API responses. API route handlers catch errors but return loosely typed JSON. Add Zod validation on the Ollama/system responses to prevent runtime surprises.
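
To make the intent of CQ6 concrete, the sketch below validates an Ollama model-list response at runtime. It uses a hand-written type guard as a dependency-free stand-in for the Zod schema the review proposes, so it is not the recommended implementation. The names OllamaModelSummary and parseTagsResponse are hypothetical, and only the name and size fields are checked.

```typescript
// Hypothetical minimal shape; the dashboard's real OllamaModel interface has more fields.
interface OllamaModelSummary {
  name: string;
  size: number;
}

/** Runtime check that an unknown value has the fields the UI relies on. */
function isModelSummary(value: unknown): value is OllamaModelSummary {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.name === 'string' && typeof record.size === 'number';
}

/** Validate a `{ models: [...] }` payload and fail loudly on an unexpected shape. */
function parseTagsResponse(json: unknown): OllamaModelSummary[] {
  if (typeof json !== 'object' || json === null) {
    throw new Error('Expected an object response');
  }
  const models = (json as Record<string, unknown>).models;
  if (!Array.isArray(models) || !models.every(isModelSummary)) {
    throw new Error('Unexpected models payload');
  }
  return models;
}
```

Throwing at the API boundary lets the route return a clear error instead of passing a malformed payload to the UI, which is the failure mode CQ4 describes.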


3. Features

  • F1. Streaming pull with progress bar (fixes B2). Use Ollama stream: true for /api/pull. Create /api/ollama/pull/route.ts that pipes NDJSON progress. UI shows progress bar with completed/total bytes, speed, and ETA.

  • F2. Model search/filter. Search input above models list. Filter by name, family, quantization. Useful when 10+ models are installed.

  • F3. Prompt history (localStorage). Store last 20 prompts with model name + timestamp. Dropdown in prompt modal to re-run previous prompts.

  • F4. Chat mode (multi-turn conversation). Use Ollama /api/chat instead of /api/generate. Chat bubble layout with message history. System prompt input field.

  • F5. Model comparison (side-by-side). Send same prompt to 2 models simultaneously. Display responses side-by-side with latency/quality comparison.

  • F6. Token/s metrics after generation. Parse eval_count and eval_duration from the final NDJSON chunk. Display tokens/second, total tokens, and latency in the response footer.

  • F7. System resource sparklines (time-series). Ring buffer of memory/CPU snapshots (localStorage). Render mini sparkline charts in the System panel. Spot trends over time.

  • F8. Ollama server logs viewer. Read ~/.ollama/logs/ and display in a collapsible terminal-style panel. Filter by level. Auto-scroll.

  • F9. Modelfile / template viewer. The show action already fetches Modelfile, template, and system prompt. Display in a collapsible code block in expanded model details.

  • F10. Dark/light theme toggle. Add :root.light CSS variable overrides. Theme toggle with localStorage persistence. Current architecture supports this natively.

  • F11. Keyboard shortcuts panel (? key). Show all shortcuts in a modal: ⌘+Enter (send), Esc (close), R (refresh), / (search models), ? (help).

  • F12. Whisper transcription test. Upload/record a short audio clip, transcribe locally via whisper-cli, display result with latency. Tests the full local STT pipeline.

  • F13. Responsive mobile layout. Better breakpoints for the 4-column stats row and 3-column main grid. Collapsible sidebar on mobile.

  • F14. Model tags/labels (localStorage). User-defined tags (coding, fast, vision) with colored badges. Persisted in localStorage.

  • F15. Extraction service integration panel. Show extraction-service (port 4005) health status. Run test extractions against loaded Ollama models. Bridges dashboard to LysnrAI pipeline.

  • F16. Auto-load preferred model. Mark a model as "auto-load" (stored in localStorage). When Ollama is online but no models are loaded, auto-load the preferred model.
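
The metric in F6 reduces to one division once the final chunk is found. The sketch below assumes eval_duration is reported in nanoseconds, as in Ollama's generate responses, and that the final chunk carries done: true. The type and function names are illustrative, not taken from the dashboard.

```typescript
// Fields of interest on a streamed generate chunk; other fields are ignored.
interface GenerateChunk {
  done?: boolean;
  eval_count?: number;
  eval_duration?: number; // assumed to be nanoseconds
}

interface GenerationStats {
  totalTokens: number;
  tokensPerSec: number;
}

/** Parse one NDJSON line, returning null for malformed or non-object input. */
function safeParseChunk(line: string): GenerateChunk | null {
  try {
    const parsed: unknown = JSON.parse(line);
    return typeof parsed === 'object' && parsed !== null ? (parsed as GenerateChunk) : null;
  } catch {
    return null;
  }
}

/** Scan NDJSON from the end and compute stats from the final `done` chunk. */
function statsFromNdjson(ndjson: string): GenerationStats | null {
  const lines = ndjson.split('\n').filter((line) => line.trim() !== '');
  for (let i = lines.length - 1; i >= 0; i--) {
    const chunk = safeParseChunk(lines[i]);
    if (!chunk || !chunk.done) continue;
    const { eval_count: count, eval_duration: duration } = chunk;
    if (typeof count !== 'number' || typeof duration !== 'number' || duration <= 0) return null;
    return { totalTokens: count, tokensPerSec: count / (duration / 1e9) };
  }
  return null;
}
```

For example, a final chunk with eval_count 90 and eval_duration 2000000000 corresponds to 45 tokens per second.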


4. Performance & Reliability

  • P1. No request deduplication on Refresh (page.tsx:164-176). Rapid clicks on Refresh fire duplicate fetchAll() calls. Add a fetchingRef guard or disable the button during fetch (partially done for actionLoading but not for fetchAll).

  • P2. Static cache never expires (api/system/route.ts:81-90). staticCache (chip, GPU, brew) lives forever in the server process, so brew package upgrades won't be reflected. Add a 5-minute TTL.

  • P3. du -sk ~/.ollama/models on every refresh (api/system/route.ts:41). Traverses the entire models directory every 15 seconds. Cache with a 60-second TTL.

  • P4. No fetch timeout on Ollama calls (api/ollama/route.ts:5-12). fetchOllama has no AbortSignal or timeout. If Ollama hangs, the dashboard hangs. Add a 5-second timeout.

  • P5. system_profiler slow on first load (api/system/route.ts:52-53). Takes ~2-3 seconds. It is cached after the first call, but the first dashboard load waits. Consider an eager background fetch on server start, or return a placeholder.
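
The TTLs proposed in P2 and P3 could share one small helper. The class below is a sketch under assumptions: TtlCache is a hypothetical name, and the injectable clock exists only to make the behaviour easy to test.

```typescript
/** Caches a single value and reloads it once the TTL has elapsed. */
class TtlCache<T> {
  private value: T | undefined = undefined;
  private hasValue = false;
  private expiresAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  /** Return the cached value, calling `load` only when it is missing or stale. */
  get(load: () => T): T {
    if (this.hasValue && this.now() < this.expiresAt) {
      return this.value as T;
    }
    const fresh = load();
    this.value = fresh;
    this.hasValue = true;
    this.expiresAt = this.now() + this.ttlMs;
    return fresh;
  }

  /** Drop the cached value so the next `get` reloads it. */
  invalidate(): void {
    this.hasValue = false;
    this.value = undefined;
  }
}
```

A route could keep, say, new TtlCache<Promise<number>>(60000) for the du result. Caching the promise also collapses concurrent requests into one call, but a rejected promise would then stay cached until the TTL expires, so a real implementation would probably call invalidate() on failure.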


5. Security & Hardening

  • S1. No input validation on model names (api/ollama/route.ts:50-51). model from the request body is passed directly to Ollama. Add regex validation: ^[a-zA-Z0-9._:/-]{1,256}$.

  • S2. Shell command interpolation pattern (api/system/route.ts:67). execAsync(`brew list --versions ${pkg}`) is safe today (hardcoded targets) but fragile. Use execFile('brew', ['list', '--versions', pkg]) for safety.

  • S3. No CORS or auth (acceptable for local-only, documented). Any local process can call the API routes. Fine for a dev tool; document the assumption.
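
The S1 check is small enough to show in full. The pattern is the one given above; the function name and its use as a type guard are assumptions for the example.

```typescript
// Pattern from S1: letters, digits, dot, underscore, colon, slash, hyphen; 1 to 256 chars.
const MODEL_NAME_PATTERN = new RegExp('^[a-zA-Z0-9._:/-]{1,256}$');

/** True only for strings that match the allowed model-name pattern. */
function isValidModelName(value: unknown): value is string {
  return typeof value === 'string' && MODEL_NAME_PATTERN.test(value);
}
```

A route handler could reject the request with a 400 response when isValidModelName(body.model) is false, before anything is forwarded to Ollama.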


6. Implementation Tracker

Sprint 1 — Critical Bug Fixes (est. 1-2 hrs)

| # | ID | Task | Effort | Commit |
| --- | --- | --- | --- | --- |
| 1 | [x] B4 | Guard Escape key during streaming | 5 min | 2da67c2 |
| 2 | [x] B5 | Pause auto-refresh during prompt/pull | 10 min | 2da67c2 |
| 3 | [x] B9 | Add AbortController to streaming fetch | 15 min | 2da67c2 |
| 4 | [x] B1 | Dynamic chip/RAM in header | 5 min | 2da67c2 |
| 5 | [x] B11 | Clear prompt text on Escape close | 5 min | 2da67c2 |
| 6 | [x] P4 | Add timeout to Ollama fetch calls | 10 min | 2da67c2 |
| 7 | [x] B3 | Remove dead generate action (or document) | 5 min | 2da67c2 |
| 8 | [x] B6 | Use Date.now() for toast IDs | 2 min | 2da67c2 |
| 9 | [x] B10 | Fix brew "Loading..." vs "empty" state | 5 min | 2da67c2 |

Sprint 2 — Pull Progress + Metrics (est. 2-3 hrs)

| # | ID | Task | Effort | Commit |
| --- | --- | --- | --- | --- |
| 10 | [x] B2+F1 | Streaming pull with progress bar | 60 min | 2d9475b |
| 11 | [x] F6 | Display tokens/s after generation | 30 min | 2d9475b |
| 12 | [x] B7 | Parse vm_stat page size dynamically | 10 min | 2d9475b |
| 13 | [x] B8 | Multi-path whisper model discovery | 15 min | 2d9475b |

Sprint 3 — Component Refactor (est. 2-3 hrs)

| # | ID | Task | Effort | Commit |
| --- | --- | --- | --- | --- |
| 14 | [x] CQ1 | Extract components into separate files | 90 min | 75a3cd0 |
| 15 | [x] CQ4 | Add error.tsx Error Boundary | 15 min | 75a3cd0 |
| 16 | [x] CQ3 | Shared ollama-config.ts | 10 min | 75a3cd0 |
| 17 | [x] CQ2 | Consolidate inline styles → CSS classes | 45 min | ed93a6f |
| 18 | [x] S1 | Add model name input validation | 10 min | 75a3cd0 |
| 19 | [x] S2 | Replace exec → execFile for brew | 10 min | 75a3cd0 |

Sprint 4 — UX Enhancements (est. 3-4 hrs)

| # | ID | Task | Effort | Commit |
| --- | --- | --- | --- | --- |
| 20 | [x] F3 | Prompt history (localStorage) | 45 min | 9c2f5f3 |
| 21 | [x] F9 | Modelfile viewer in expanded details | 30 min | 9c2f5f3 |
| 22 | [x] F4 | Chat mode (multi-turn via /api/chat) | 90 min | ed93a6f |
| 23 | [x] F2 | Model search/filter | 30 min | 9c2f5f3 |
| 24 | [x] F11 | Keyboard shortcuts panel | 20 min | 9c2f5f3 |

Sprint 5 — Integration & Polish (est. 2-3 hrs)

| # | ID | Task | Effort | Commit |
| --- | --- | --- | --- | --- |
| 25 | [x] F15 | Extraction service panel | 60 min | 8bdd5ee |
| 26 | [x] F12 | Whisper transcription test | 45 min | 8bdd5ee |
| 27 | [x] F7 | System resource sparklines | 45 min | 8bdd5ee |
| 28 | [x] CQ5 | Loading skeleton UI | 20 min | 8bdd5ee |
| 29 | [x] P1-P3 | Request dedup + cache TTLs | 30 min | b1fda3a |
| 30 | [x] F16 | Auto-load preferred model | 20 min | ed93a6f |

Deferred (nice-to-have)

| ID | Task | Notes |
| --- | --- | --- |
| [x] F5 | Model comparison (side-by-side) | 8bdd5ee |
| [x] F10 | Dark/light theme toggle | ed93a6f |
| [x] F13 | Responsive mobile layout | 8bdd5ee |
| [x] F14 | Model tags/labels | ed93a6f |
| [x] CQ6 | Zod validation on API responses | ed93a6f |
| [x] F8 | Ollama server logs viewer | 8bdd5ee |
| [x] S3 | CORS / auth (documented) | 8bdd5ee |

7. Commit Log

Commits will be added here as work progresses.

| # | Date | Commit | Sprint | Items Completed |
| --- | --- | --- | --- | --- |
| 1 | Feb 19 | 2da67c2 | Sprint 1 | B1, B3, B4, B5, B6, B9, B10, B11, P4 |
| 2 | Feb 19 | 2d9475b | Sprint 2 | B2, B7, B8, F1, F6 |
| 3 | Feb 19 | 75a3cd0 | Sprint 3 | CQ1, CQ3, CQ4, S1, S2 |
| 4 | Feb 19 | 9c2f5f3 | Sprint 4 | F2, F3, F9, F11 |
| 5 | Feb 19 | b1fda3a | Sprint 5 | P1, P2, P3 |
| 6 | Feb 19 | ed93a6f | Sprint 6 | CQ2, CQ6, P5, F4, F10, F14, F16 |
| 7 | Feb 19 | 8bdd5ee | Sprint 7 | F5, F7, F8, F12, F13, F15, CQ5, S3 |

41 items total: 11 bugs, 6 code quality, 16 features, 5 performance, 3 security.
All 41 items completed across 7 sprints (9 code commits + doc updates).
Actual total effort: ~8 hours across 7 sprints.


8. Next Wave — Model Intelligence & Pre-Load Metrics

Proposed improvements focused on helping users make informed decisions before loading a model.

Tier A — Pre-Load Decision Metrics (est. 45 min)

| ID | Feature | Description |
| --- | --- | --- |
| N1 | Estimated RAM per model | Approximate from disk size: Q4_K_M ≈ 1.2×disk in RAM. Show on every model card (e.g., ~22 GB RAM), not just running models. |
| N2 | "Will it fit?" indicator | Compare estimated RAM vs system.memory.free + cached. Color-code: 🟢 Fits, 🟡 Tight (80-100%), 🔴 Won't fit. Show on Load button or as badge. |
| N3 | Aggregate loaded model RAM | Sum VRAM of all running models. Display at top of models panel: "3 models loaded · 28.5 GB VRAM". |
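
N1 and N2 together amount to one multiplication and one ratio check. The sketch below assumes "Tight (80-100%)" means the estimate uses 80-100% of available memory (free + cached). The 1.2 factor is the Q4_K_M approximation from the table, and all names are illustrative.

```typescript
type FitStatus = 'fits' | 'tight' | 'wont-fit';

// Approximation from N1: RAM use is roughly 1.2x the on-disk size for Q4_K_M.
const RAM_PER_DISK_BYTE = 1.2;

/** Estimate resident memory for a model from its on-disk size. */
function estimateRamBytes(diskBytes: number): number {
  return Math.round(diskBytes * RAM_PER_DISK_BYTE);
}

/** Classify the estimate against available memory (free + cached). */
function fitStatus(estimatedRamBytes: number, availableBytes: number): FitStatus {
  if (availableBytes <= 0) return 'wont-fit';
  const ratio = estimatedRamBytes / availableBytes;
  if (ratio > 1) return 'wont-fit';
  return ratio >= 0.8 ? 'tight' : 'fits';
}
```

Under these assumptions a 15 GB model with 20 GB available is estimated at 18 GB, which is 90% of available memory and therefore classified as tight.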

Tier B — Rich Model Metadata (est. 60 min)

| ID | Feature | Description |
| --- | --- | --- |
| N4 | RAM budget bar | Horizontal stacked bar: `[OS+Apps Model A (loaded) Model B (loaded) Free]`. Instant visual of memory headroom. |
| N5 | Context window size | Fetch context_length from model_info in the Ollama /api/show response. Display on card (e.g., 128k ctx). Critical for knowing max prompt length. |
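
For N5, the sketch below pulls the context length out of a model_info object. It assumes the key is prefixed with the architecture name (for example llama.context_length), so it matches on the suffix rather than a fixed key. That key layout and both function names are assumptions to verify against a real /api/show response.

```typescript
/** Find the first numeric `<arch>.context_length` entry in model_info. */
function extractContextLength(modelInfo: Record<string, unknown>): number | null {
  for (const [key, value] of Object.entries(modelInfo)) {
    if (key.endsWith('.context_length') && typeof value === 'number') {
      return value;
    }
  }
  return null;
}

/** Render a token count as a compact card label, e.g. 131072 -> "128k ctx". */
function formatContext(tokens: number): string {
  return tokens >= 1024 ? `${Math.round(tokens / 1024)}k ctx` : `${tokens} ctx`;
}
```

Returning null when the key is absent lets the card omit the label instead of showing a misleading default.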

Tier C — Model Intelligence Badges (est. 45 min)

| ID | Feature | Description |
| --- | --- | --- |
| N6 | `<think>` warning badge | If model is DeepSeek R1 family, show ⚠️ badge: "Emits `<think>` traces, strip before JSON.parse". Prevents silent JSON failures. |
| N7 | Vision model indicator | If model is multimodal (llava, qwen2.5vl), show 👁 badge. These need image input; text-only prompts are suboptimal. |
| N8 | Architecture badge | Show model arch (llama, qwen2, phi3, deepseek2) as subtle pill on the card. Currently buried in expanded details. |
| N9 | Sort/order models | Dropdown to sort by: name, size, parameters, running status, last modified. Currently uses Ollama's default order. |
| N10 | Ollama version display | Call /api/version. Show in Ollama status card. Useful for debugging model compatibility. |
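
The failure that N6 warns about, and the stripping it recommends, can be shown in a few lines. This is a sketch that assumes traces arrive as well-formed `<think>...</think>` blocks. The function names are hypothetical.

```typescript
/** Remove every <think>...</think> block, including multi-line ones. */
function stripThinkTraces(text: string): string {
  return text.replace(/<think>[\s\S]*?<\/think>/g, '').trim();
}

/** Parse model output as JSON after removing reasoning traces. */
function parseModelJson(text: string): unknown {
  return JSON.parse(stripThinkTraces(text));
}
```

Calling JSON.parse directly on output that starts with a `<think>` block throws, which is why the badge suggests stripping first. An unterminated trace, for example from a truncated response, is not handled here and would still fail to parse.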

Tier D — Runtime Metrics & UX (est. 30 min)

| ID | Feature | Description |
| --- | --- | --- |
| N11 | Last known tok/s per model | Persist StreamMetrics.tokensPerSec in localStorage keyed by model. Show on card (e.g., ~45 tok/s). Compare speeds without re-benchmarking. |
| N12 | Auto-unload countdown | Replace static "Expires: 3:45 PM" with live countdown: "Unloads in 4m 32s". More actionable. |
| N13 | Session stats per model | Track prompts sent + tokens generated per model in session. Show in expanded details. |
| N14 | Delete confirmation + reclaim | Show "Delete qwen2.5-coder:32b? Reclaim 18.5 GB disk." before deleting. Currently no confirmation. |
| N15 | Simultaneous load suggestions | Based on available RAM, suggest which models can be co-loaded. E.g., "Can co-load llama3.1:8b + qwen2.5-coder:32b (28 GB, 20 GB free)". |
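
The N12 label is a pure function of two timestamps, which keeps it easy to re-render on a one-second timer. A minimal sketch, assuming both arguments are epoch milliseconds. The function name is illustrative.

```typescript
/** Format time until unload as "Unloads in 4m 32s", clamping at zero. */
function formatCountdown(expiresAtMs: number, nowMs: number): string {
  const remaining = Math.max(0, Math.floor((expiresAtMs - nowMs) / 1000));
  const minutes = Math.floor(remaining / 60);
  const seconds = remaining % 60;
  return minutes > 0 ? `Unloads in ${minutes}m ${seconds}s` : `Unloads in ${seconds}s`;
}
```

A component could call formatCountdown(Date.parse(expiresAt), Date.now()) on each tick, where expiresAt stands for whatever expiry timestamp the running-model data exposes.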

Implementation Plan

| Sprint | Items | Focus | Effort |
| --- | --- | --- | --- |
| 8 | N1, N2, N3 | Pre-load RAM estimates | ~45 min |
| 9 | N4, N5 | RAM bar + context window | ~60 min |
| 10 | N6, N7, N8, N9, N10 | Badges + sort + version | ~45 min |
| 11 | N11, N12, N13, N14, N15 | Runtime metrics + UX | ~30 min |