docs(local-llm): consolidate dashboard docs into dashboard/docs/

- Created DASHBOARD_PRD.md — full updated PRD with current 19-file architecture, all 10 API routes, UI layout, data flow, localStorage keys, security model, and v1-v3 changelog. - Created DASHBOARD_ROADMAP.md — phased implementation plan for N1-N15 improvements across 4 phases: pre-load intelligence, rich metadata, model intelligence badges, runtime metrics. Includes acceptance criteria and implementation details per item. - Updated DASHBOARD_REVIEW.md — refreshed file inventory to 19 files (~2,930 lines), fixed broken Tier B markdown table, added cross-links. - Replaced __LOCAL_LLMs/docs/05-mission-control-dashboard.md with redirect pointer to new dashboard/docs/ location. Dashboard docs are now co-located at __LOCAL_LLMs/dashboard/docs/: - DASHBOARD_PRD.md (product requirements) - DASHBOARD_REVIEW.md (audit + 39 completed items + N1-N15 proposals) - DASHBOARD_ROADMAP.md (phased implementation plan)
2026-02-19 22:54:18 -08:00 · 2026-02-19 22:54:18 -08:00 · cd6e561f1b
commit cd6e561f1b
parent 519f348583
4 changed files with 798 additions and 264 deletions
--- a/__LOCAL_LLMs/dashboard/docs/DASHBOARD_PRD.md
+++ b/__LOCAL_LLMs/dashboard/docs/DASHBOARD_PRD.md
@ -0,0 +1,307 @@
+# Mission Control Dashboard — Product Requirements Document
+
+> Local LLM management dashboard for macOS development machines.
+> Last updated: Feb 19, 2026
+>
+> See also: [DASHBOARD_REVIEW.md](DASHBOARD_REVIEW.md) · [DASHBOARD_ROADMAP.md](DASHBOARD_ROADMAP.md)
+
+---
+
+## 1. Overview
+
+A dark-themed, real-time dashboard built with Next.js 16 for managing a local LLM stack (Ollama + Whisper.cpp) on macOS Apple Silicon. Designed as a developer tool — no authentication, no cloud dependencies, runs entirely on localhost.
+
+### Key Capabilities
+
+- **Ollama management** — model listing, load/unload, pull/delete, VRAM monitoring
+- **Streaming prompt interface** — send prompts with real-time NDJSON streaming, token metrics
+- **Chat mode** — multi-turn conversations with system prompt, chat bubble UI
+- **Model comparison** — side-by-side prompt responses from two models
+- **System monitoring** — chip, RAM, disk, uptime, memory sparklines, Ollama disk footprint
+- **Whisper.cpp integration** — binary/model discovery, audio transcription test
+- **Extraction service health** — server-side proxy to check extraction-service (port 4005)
+- **Ollama logs viewer** — collapsible terminal-style panel with color-coded log levels
+- **Model management UX** — search/filter, tags, auto-load preferred model, prompt history
+- **Theme support** — dark/light toggle with localStorage persistence
+
+---
+
+## 2. Tech Stack
+
+| Component  | Technology                            |
+| ---------- | ------------------------------------- |
+| Framework  | Next.js 16 (App Router)               |
+| Language   | TypeScript (strict)                   |
+| React      | 19                                    |
+| Styling    | TailwindCSS v4                        |
+| Icons      | Lucide React                          |
+| Validation | Zod                                   |
+| Theme      | Dark default — ByteLyst design tokens |
+| Port       | 3100 (dev)                            |
+
+---
+
+## 3. Architecture
+
+```
+dashboard/
+├── docs/
+│   ├── DASHBOARD_PRD.md           ← This file
+│   ├── DASHBOARD_REVIEW.md        ← Bug/improvement audit (39 items, all complete)
+│   └── DASHBOARD_ROADMAP.md       ← Next-wave implementation plan (N1–N15)
+├── src/
+│   ├── app/
+│   │   ├── page.tsx               ← Main dashboard (1,885 lines, client component)
+│   │   ├── layout.tsx             ← Root layout (dark theme, system fonts)
+│   │   ├── error.tsx              ← React Error Boundary
+│   │   ├── globals.css            ← Design tokens, CSS utilities, skeleton, light theme
+│   │   ├── components/
+│   │   │   ├── StatusDot.tsx      ← Pulsing online/offline indicator
+│   │   │   ├── ProgressBar.tsx    ← Colored progress bar with percentage
+│   │   │   └── Sparkline.tsx      ← SVG sparkline chart for time-series data
+│   │   ├── lib/
+│   │   │   ├── types.ts           ← TypeScript interfaces (OllamaData, SystemData, etc.)
+│   │   │   ├── format.ts          ← formatBytes, formatUptime utilities
+│   │   │   └── ollama-config.ts   ← Shared OLLAMA_URL constant
+│   │   └── api/
+│   │       ├── ollama/
+│   │       │   ├── route.ts       ← REST proxy (list, load, unload, pull, delete, show)
+│   │       │   ├── stream/route.ts ← Streaming generate proxy (NDJSON)
+│   │       │   ├── chat/route.ts  ← Streaming chat proxy (multi-turn)
+│   │       │   ├── pull/route.ts  ← Streaming pull with progress events
+│   │       │   └── logs/route.ts  ← Ollama server log reader
+│   │       ├── whisper/
+│   │       │   ├── route.ts       ← Whisper binary + GGML model discovery
+│   │       │   └── transcribe/route.ts ← Audio transcription via whisper-cli
+│   │       ├── system/route.ts    ← System info (chip, RAM, disk, brew) with TTL cache
+│   │       └── extraction/
+│   │           └── health/route.ts ← Extraction service health proxy
+├── package.json
+├── next.config.ts
+├── tsconfig.json
+└── postcss.config.mjs
+```
+
+**19 source files, ~2,930 lines total.**
+
+---
+
+## 4. Running
+
+```bash
+cd __LOCAL_LLMs/dashboard
+npm install          # first time only
+npm run dev -- -p 3100
+```
+
+Open: **http://localhost:3100**
+
+Prerequisites: Ollama running (`ollama serve` or `brew services start ollama`).
+
+---
+
+## 5. API Routes
+
+### Ollama
+
+| Method | Route                | Purpose                                        |
+| ------ | -------------------- | ---------------------------------------------- |
+| GET    | `/api/ollama`        | List models + running models (status, VRAM)    |
+| POST   | `/api/ollama`        | Load, unload, pull, delete, show model actions |
+| POST   | `/api/ollama/stream` | Streaming generate (NDJSON proxy)              |
+| POST   | `/api/ollama/chat`   | Streaming chat (multi-turn, NDJSON)            |
+| POST   | `/api/ollama/pull`   | Streaming pull with progress events            |
+| GET    | `/api/ollama/logs`   | Read last 100 lines of Ollama log files        |
+
+### System & Services
+
+| Method | Route                     | Purpose                                  |
+| ------ | ------------------------- | ---------------------------------------- |
+| GET    | `/api/system`             | Chip, RAM (vm_stat), disk, brew, uptime  |
+| GET    | `/api/whisper`            | Whisper binary + model discovery         |
+| POST   | `/api/whisper/transcribe` | Upload audio → whisper-cli → text        |
+| GET    | `/api/extraction/health`  | Proxy to extraction-service :4005/health |
+
+### Key Implementation Details
+
+- **Zod validation** on all POST bodies (model names, actions)
+- **AbortSignal timeout** (5s) on all Ollama fetch calls
+- **TTL caching**: static system info (5 min), Ollama disk usage (60s)
+- **Eager cache warming** on module load for system_profiler data
+- **execFile** (not exec) for all shell commands — prevents injection
+- **Server-side proxy** for extraction health — avoids browser CORS errors
+
+---
+
+## 6. Dashboard UI Layout
+
+### Header
+
+- App icon + title ("Local LLM Mission Control")
+- Machine info subtitle (chip, RAM, OS — dynamic)
+- Last refresh timestamp + manual refresh button
+- Dark/light theme toggle (localStorage persistence)
+
+### Top Stats Row (4 cards)
+
+1. **Ollama** — online/offline status with pulsing dot, server URL
+2. **Models** — count + total disk size, number loaded in RAM
+3. **Whisper** — installed/not found, model count
+4. **Memory** — used/total with color-coded progress bar + sparkline history
+
+Skeleton shimmer cards shown during initial load.
+
+### Main Grid (3 columns)
+
+**Left (2/3) — Ollama Models:**
+
+- Search/filter bar (by name, family, quantization)
+- Pull model input with streaming progress bar
+- Per-model cards showing: name, disk size, parameters, quantization
+- Running models: green highlight, LOADED badge, VRAM usage, expiry time
+- Actions: Load, Unload, Prompt, Delete
+- Expanded details: digest, modified date, family, modelfile viewer
+- Model tags (coding, chat, fast, vision, reasoning) — localStorage
+- Auto-load preferred model (star toggle) — localStorage
+
+**Right (1/3):**
+
+- **System** — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
+- **Extraction Service** — health status from extraction-service (port 4005)
+- **Whisper.cpp** — status, binary list, models, transcription test (file upload → text)
+- **Brew Packages** — name + version for ollama, whisper-cpp, ffmpeg
+
+### Prompt Modal (overlay)
+
+- Textarea with Cmd+Enter to send
+- Chat mode toggle — multi-turn conversation with system prompt
+- Prompt history dropdown (last 20, localStorage)
+- Streaming response display with token metrics (tok/s, total, latency)
+- Model comparison — quick-compare buttons to run same prompt on another model
+- Side-by-side response display when comparing
+- Copy response button
+
+### Ollama Logs Panel (below grid)
+
+- Collapsible terminal-style panel
+- Color-coded log levels (error=red, warn=yellow, info=default)
+- Fetches via server-side `/api/ollama/logs`
+
+### Keyboard Shortcuts
+
+| Shortcut    | Action             |
+| ----------- | ------------------ |
+| `Cmd+Enter` | Send prompt        |
+| `Escape`    | Close modal        |
+| `R`         | Refresh dashboard  |
+| `/`         | Focus model search |
+| `?`         | Show shortcuts     |
+
+---
+
+## 7. Design Tokens
+
+ByteLyst design system colors:
+
+| Token                | Hex       | Use               |
+| -------------------- | --------- | ----------------- |
+| `--bg-canvas`        | `#06070A` | Page background   |
+| `--bg-elevated`      | `#0E1118` | Modal background  |
+| `--surface-card`     | `#121725` | Card backgrounds  |
+| `--surface-muted`    | `#1A2335` | Muted areas       |
+| `--text-primary`     | `#EFF4FF` | Main text         |
+| `--text-secondary`   | `#A5B1C7` | Descriptions      |
+| `--text-tertiary`    | `#6C7C98` | Hints, timestamps |
+| `--accent-primary`   | `#5A8CFF` | Primary actions   |
+| `--accent-secondary` | `#2EE6D6` | Secondary accent  |
+| `--success`          | `#34D399` | Online, loaded    |
+| `--warning`          | `#F59E0B` | Warning state     |
+| `--danger`           | `#FF6E6E` | Offline, errors   |
+| `--purple`           | `#A78BFA` | Whisper, disk     |
+
+Light theme overrides defined in `globals.css` under `.light` class.
+
+---
+
+## 8. Data Flow
+
+```
+Browser (page.tsx)
+  ├── GET /api/ollama          → Ollama :11434/api/tags + /api/ps
+  ├── POST /api/ollama/stream  → Ollama :11434/api/generate (NDJSON)
+  ├── POST /api/ollama/chat    → Ollama :11434/api/chat (NDJSON)
+  ├── POST /api/ollama/pull    → Ollama :11434/api/pull (NDJSON progress)
+  ├── GET /api/ollama/logs     → ~/.ollama/logs/ or macOS unified logging
+  ├── GET /api/system          → system_profiler, vm_stat, df, du, brew
+  ├── GET /api/whisper         → which whisper-cli, ls models dirs
+  ├── POST /api/whisper/transcribe → whisper-cli -f <audio> -m <model>
+  └── GET /api/extraction/health   → extraction-service :4005/health
+```
+
+Auto-refresh: every 15 seconds (paused during streaming/pull).
+Request deduplication: `fetchingRef` guard prevents parallel fetchAll calls.
+
+---
+
+## 9. LocalStorage Persistence
+
+| Key              | Type        | Purpose                                |
+| ---------------- | ----------- | -------------------------------------- |
+| `promptHistory`  | JSON array  | Last 20 prompts with model + timestamp |
+| `modelTags`      | JSON object | User-defined tags per model name       |
+| `autoLoadModel`  | string      | Preferred model to auto-load           |
+| `dashboardTheme` | string      | `"dark"` or `"light"`                  |
+
+---
+
+## 10. Security Model
+
+- **Local-only**: Dashboard runs on localhost:3100, no auth required
+- **No CORS headers**: All API routes are same-origin
+- **Input validation**: Zod schemas on all POST bodies; model names validated via regex
+- **Shell safety**: All subprocess calls use `execFile` (never string interpolation)
+- **External calls proxied**: Extraction health check goes through server-side API route, not direct browser fetch
+
+This is documented in code at `page.tsx` (S3 comment).
+
+---
+
+## 11. Changelog
+
+### v3 (Feb 19, 2026) — Sprints 6–7
+
+- Chat mode with multi-turn conversations and system prompt (F4)
+- Model comparison side-by-side UI (F5)
+- Memory sparklines with ring buffer (F7)
+- Ollama logs viewer with color-coded levels (F8)
+- Dark/light theme toggle with localStorage (F10)
+- Whisper transcription test with file upload (F12)
+- Responsive mobile layout improvements (F13)
+- Model tags/labels with localStorage (F14)
+- Extraction service health panel (F15)
+- Auto-load preferred model (F16)
+- CSS utility classes replacing inline styles (CQ2)
+- Skeleton shimmer loading UI (CQ5)
+- Zod validation on API responses (CQ6)
+- Eager cache warming for system_profiler (P5)
+
+### v2 (Feb 19, 2026) — Sprints 1–5
+
+- Fixed 11 bugs (hardcoded specs, streaming escape, auto-refresh, toast IDs, etc.)
+- Streaming pull with progress bar (F1)
+- Token/s metrics after generation (F6)
+- Component extraction (CQ1) — StatusDot, ProgressBar, types, format, config
+- Error boundary (CQ4)
+- Shared ollama-config (CQ3)
+- Model search/filter (F2), prompt history (F3), keyboard shortcuts (F11)
+- Modelfile viewer (F9)
+- Request deduplication + cache TTLs (P1–P3)
+- Input validation (S1) + execFile (S2)
+
+### v1 (Feb 19, 2026) — Initial
+
+- Ollama status, model list, load/unload, basic prompt
+- System info (chip, RAM, disk, brew)
+- Whisper.cpp discovery
+- Auto-refresh every 15 seconds
+- Dark theme with ByteLyst design tokens
--- a/__LOCAL_LLMs/dashboard/docs/DASHBOARD_REVIEW.md
+++ b/__LOCAL_LLMs/dashboard/docs/DASHBOARD_REVIEW.md
@ -0,0 +1,329 @@
+# Mission Control Dashboard — Bug & Improvement Review
+
+> Systematic code review of `__LOCAL_LLMs/dashboard/` — 19 source files, ~2,930 lines
+> Last updated: Feb 19, 2026
+>
+> See also: [DASHBOARD_PRD.md](DASHBOARD_PRD.md) · [DASHBOARD_ROADMAP.md](DASHBOARD_ROADMAP.md)
+
+---
+
+## File Inventory
+
+| File                                      | Lines | Purpose                                              |
+| ----------------------------------------- | ----- | ---------------------------------------------------- |
+| **UI**                                    |       |                                                      |
+| `src/app/page.tsx`                        | 1,885 | Main dashboard (client component, all panels)        |
+| `src/app/layout.tsx`                      | 19    | Root layout (dark theme, system fonts)               |
+| `src/app/error.tsx`                       | 50    | React Error Boundary (CQ4)                           |
+| `src/app/globals.css`                     | 163   | Design tokens, utilities, skeleton, light theme      |
+| **Components**                            |       |                                                      |
+| `src/app/components/StatusDot.tsx`        | 10    | Pulsing online/offline indicator                     |
+| `src/app/components/ProgressBar.tsx`      | 18    | Colored progress bar with percentage                 |
+| `src/app/components/Sparkline.tsx`        | 42    | SVG sparkline chart (F7)                             |
+| **Lib**                                   |       |                                                      |
+| `src/app/lib/types.ts`                    | 75    | TypeScript interfaces (OllamaData, SystemData, etc.) |
+| `src/app/lib/format.ts`                   | 16    | formatBytes, formatUptime utilities                  |
+| `src/app/lib/ollama-config.ts`            | 1     | Shared OLLAMA_URL constant (CQ3)                     |
+| **API Routes — Ollama**                   |       |                                                      |
+| `src/app/api/ollama/route.ts`             | 124   | REST proxy (list, load, unload, pull, delete, show)  |
+| `src/app/api/ollama/stream/route.ts`      | 36    | Streaming generate proxy (NDJSON)                    |
+| `src/app/api/ollama/chat/route.ts`        | 42    | Streaming chat proxy (F4)                            |
+| `src/app/api/ollama/pull/route.ts`        | 43    | Streaming pull with progress (F1)                    |
+| `src/app/api/ollama/logs/route.ts`        | 36    | Ollama server log reader (F8)                        |
+| **API Routes — Other**                    |       |                                                      |
+| `src/app/api/system/route.ts`             | 179   | System info (chip, RAM, disk, brew) with TTL cache   |
+| `src/app/api/whisper/route.ts`            | 79    | Whisper binary + GGML model discovery                |
+| `src/app/api/whisper/transcribe/route.ts` | 93    | Whisper transcription test (F12)                     |
+| `src/app/api/extraction/health/route.ts`  | 18    | Extraction service health proxy (F15)                |
+
+**Stack:** Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript, Zod
+
+---
+
+## 1. Bugs
+
+- [x] **B1. Hardcoded machine specs in header** — `page.tsx:317`
+      Subtitle reads `Apple M4 Pro · 48 GB · {system?.platform}` — should use `system?.chip` and `formatBytes(system?.memory.total)` dynamically so it works on any machine.
+
+- [x] **B2. Pull model blocks UI — no progress feedback** — `api/ollama/route.ts:84-92`
+      `handlePull` calls Ollama with `stream: false`. Large models (20+ GB) block for 30+ minutes. The Next.js API route will likely timeout. Must use `stream: true` and pipe progress events to the client. _(Combined with F1.)_
+
+- [x] **B3. Dead code: non-streaming `generate` action** — `api/ollama/route.ts:69-82`
+      The `action === 'generate'` handler is unused — UI only uses `/api/ollama/stream`. Remove or keep as fallback with a comment.
+
+- [x] **B4. Escape key closes modal during active streaming** — `page.tsx:188-197`
+      Global `keydown` handler calls `setPromptModel(null)` unconditionally. Backdrop click correctly checks `!promptLoading`. Escape should also respect `promptLoading` to prevent discarding an in-flight response.
+
+- [x] **B5. Auto-refresh (15s) fires during streaming/pull** — `page.tsx:182-185`
+      `setInterval(fetchAll, 15000)` runs unconditionally. During streaming this causes background churn and potential UI flicker. Should pause while `promptLoading` or `pullLoading` is true.
+
+- [x] **B6. Toast ID collision on HMR remount** — `page.tsx:156-159`
+      `toastId.current` resets to 0 on component remount during dev. Use `Date.now()` or `crypto.randomUUID()` for robust uniqueness.
+
+- [x] **B7. vm_stat page size hardcoded** — `api/system/route.ts:103`
+      Hardcoded `16384`. Should parse from vm_stat's first line: `"(page size of NNNNN bytes)"` for portability.
+
+- [x] **B8. Whisper models dir not configurable** — `api/whisper/route.ts:24`
+      Hardcoded to `~/whisper-models`. Should scan multiple known paths (`/opt/homebrew/share/whisper-cpp/models/`, `~/whisper-models`, `~/.cache/whisper/`) or accept `WHISPER_MODELS_DIR` env var.
+
+- [x] **B9. No AbortController for streaming fetch** — `page.tsx:250-289`
+      Closing the prompt modal doesn't cancel the underlying fetch. The `reader.read()` loop continues in the background wasting CPU/bandwidth until the model finishes generating.
+
+- [x] **B10. Brew shows "Loading..." when array is empty** — `page.tsx:936-940`
+      When `system.brewPackages` is `[]` (all uninstalled), displays "Loading..." instead of "No packages found". Needs to distinguish "still fetching" vs "fetched but empty".
+
+- [x] **B11. Prompt text not cleared on close without send** — `page.tsx:951-957`
+      Backdrop click clears `promptText`, but Escape handler (B4 fix) should also clear it. Otherwise stale text persists when re-opening.
+
+---
+
+## 2. Code Quality
+
+- [x] **CQ1. Monolithic 1,079-line single component** — `page.tsx`
+      All interfaces, utilities, sub-components, and 900+ lines of JSX in one file. Extract to:
+  - `components/` — StatusDot, ProgressBar, ToastContainer, PromptModal, OllamaModelsPanel, SystemPanel, WhisperPanel, BrewPanel
+  - `lib/types.ts` — interfaces (OllamaModel, SystemData, etc.)
+  - `lib/format.ts` — formatBytes, formatUptime
+  - `lib/hooks.ts` — useAutoRefresh, useToasts, useOllamaActions
+
+- [x] **CQ2. Pervasive inline styles instead of CSS/Tailwind classes** — `page.tsx` (100+ occurrences)
+      Every `style={{ color: 'var(--text-tertiary)' }}` should be a utility class. Options: custom Tailwind theme mapping, or CSS utility classes in `globals.css` (e.g., `.text-muted`).
+
+- [x] **CQ3. OLLAMA_URL duplicated** — `api/ollama/route.ts:3` + `api/ollama/stream/route.ts:3`
+      Same `process.env.OLLAMA_URL || 'http://localhost:11434'` in two files. Extract to `lib/ollama-config.ts`.
+
+- [x] **CQ4. No React Error Boundary** — `page.tsx`
+      Unexpected API response shape crashes the entire dashboard. Add an `error.tsx` (Next.js App Router convention) for graceful recovery.
+
+- [x] **CQ5. No loading skeleton / shimmer UI**
+      Initial load shows "..." placeholders. Skeleton cards would be more polished.
+
+- [x] **CQ6. No TypeScript strict null checks in API responses**
+      API route handlers catch errors but return loosely typed JSON. Add Zod validation on the Ollama/system responses to prevent runtime surprises.
+
+---
+
+## 3. Features
+
+- [x] **F1. Streaming pull with progress bar** _(fixes B2)_
+      Use Ollama `stream: true` for `/api/pull`. Create `/api/ollama/pull/route.ts` that pipes NDJSON progress. UI shows progress bar with `completed/total` bytes, speed, and ETA.
+
+- [x] **F2. Model search/filter**
+      Search input above models list. Filter by name, family, quantization. Useful when 10+ models are installed.
+
+- [x] **F3. Prompt history (localStorage)**
+      Store last 20 prompts with model name + timestamp. Dropdown in prompt modal to re-run previous prompts.
+
+- [x] **F4. Chat mode (multi-turn conversation)**
+      Use Ollama `/api/chat` instead of `/api/generate`. Chat bubble layout with message history. System prompt input field.
+
+- [x] **F5. Model comparison (side-by-side)**
+      Send same prompt to 2 models simultaneously. Display responses side-by-side with latency/quality comparison.
+
+- [x] **F6. Token/s metrics after generation**
+      Parse `eval_count` and `eval_duration` from the final NDJSON chunk. Display tokens/second, total tokens, and latency in the response footer.
+
+- [x] **F7. System resource sparklines (time-series)**
+      Ring buffer of memory/CPU snapshots (localStorage). Render mini sparkline charts in the System panel. Spot trends over time.
+
+- [x] **F8. Ollama server logs viewer**
+      Read `~/.ollama/logs/` and display in a collapsible terminal-style panel. Filter by level. Auto-scroll.
+
+- [x] **F9. Modelfile / template viewer**
+      The `show` action already fetches Modelfile, template, and system prompt. Display in a collapsible code block in expanded model details.
+
+- [x] **F10. Dark/light theme toggle**
+      Add `:root.light` CSS variable overrides. Theme toggle with localStorage persistence. Current architecture supports this natively.
+
+- [x] **F11. Keyboard shortcuts panel (`?` key)**
+      Show all shortcuts in a modal: ⌘+Enter (send), Esc (close), R (refresh), / (search models), ? (help).
+
+- [x] **F12. Whisper transcription test**
+      Upload/record a short audio clip, transcribe locally via whisper-cli, display result with latency. Tests the full local STT pipeline.
+
+- [x] **F13. Responsive mobile layout**
+      Better breakpoints for the 4-column stats row and 3-column main grid. Collapsible sidebar on mobile.
+
+- [x] **F14. Model tags/labels (localStorage)**
+      User-defined tags (coding, fast, vision) with colored badges. Persisted in localStorage.
+
+- [x] **F15. Extraction service integration panel**
+      Show extraction-service (port 4005) health status. Run test extractions against loaded Ollama models. Bridges dashboard to LysnrAI pipeline.
+
+- [x] **F16. Auto-load preferred model**
+      Mark a model as "auto-load" (stored in localStorage). When Ollama is online but no models loaded, auto-load the preferred model.
+
+---
+
+## 4. Performance & Reliability
+
+- [x] **P1. No request deduplication on Refresh** — `page.tsx:164-176`
+      Rapid clicks on Refresh fire duplicate `fetchAll()` calls. Add a `fetchingRef` guard or disable the button during fetch (partially done for `actionLoading` but not for `fetchAll`).
+
+- [x] **P2. Static cache never expires** — `api/system/route.ts:81-90`
+      `staticCache` (chip, GPU, brew) lives forever in the server process. Brew package upgrades won't reflect. Add 5-minute TTL.
+
+- [x] **P3. `du -sk ~/.ollama/models` on every refresh** — `api/system/route.ts:41`
+      Traverses entire models directory every 15 seconds. Cache with 60-second TTL.
+
+- [x] **P4. No fetch timeout on Ollama calls** — `api/ollama/route.ts:5-12`
+      `fetchOllama` has no `AbortSignal` or timeout. If Ollama hangs, the dashboard hangs. Add 5-second timeout.
+
+- [x] **P5. `system_profiler` slow on first load** — `api/system/route.ts:52-53`
+      Takes ~2-3 seconds. Cached after first call, but first dashboard load waits. Consider eager background fetch on server start or return placeholder.
+
+---
+
+## 5. Security & Hardening
+
+- [x] **S1. No input validation on model names** — `api/ollama/route.ts:50-51`
+      `model` from request body passed directly to Ollama. Add regex validation: `^[a-zA-Z0-9._:/-]{1,256}$`.
+
+- [x] **S2. Shell command interpolation pattern** — `api/system/route.ts:67`
+      `execAsync(\`brew list --versions ${pkg}\`)`— safe today (hardcoded targets) but fragile. Use`execFile('brew', ['list', '--versions', pkg])` for safety.
+
+- [x] **S3. No CORS or auth** _(acceptable for local-only, documented)_
+      Any local process can call API routes. Fine for dev tool; document the assumption.
+
+---
+
+## 6. Implementation Tracker
+
+### Sprint 1 — Critical Bug Fixes _(est. 1–2 hrs)_
+
+| #   | ID        | Task                                      | Effort | Commit    |
+| --- | --------- | ----------------------------------------- | ------ | --------- |
+| 1   | - [x] B4  | Guard Escape key during streaming         | 5 min  | `2da67c2` |
+| 2   | - [x] B5  | Pause auto-refresh during prompt/pull     | 10 min | `2da67c2` |
+| 3   | - [x] B9  | Add AbortController to streaming fetch    | 15 min | `2da67c2` |
+| 4   | - [x] B1  | Dynamic chip/RAM in header                | 5 min  | `2da67c2` |
+| 5   | - [x] B11 | Clear prompt text on Escape close         | 5 min  | `2da67c2` |
+| 6   | - [x] P4  | Add timeout to Ollama fetch calls         | 10 min | `2da67c2` |
+| 7   | - [x] B3  | Remove dead generate action (or document) | 5 min  | `2da67c2` |
+| 8   | - [x] B6  | Use Date.now() for toast IDs              | 2 min  | `2da67c2` |
+| 9   | - [x] B10 | Fix brew "Loading..." vs "empty" state    | 5 min  | `2da67c2` |
+
+### Sprint 2 — Pull Progress + Metrics _(est. 2–3 hrs)_
+
+| #   | ID          | Task                                | Effort | Commit    |
+| --- | ----------- | ----------------------------------- | ------ | --------- |
+| 10  | - [x] B2+F1 | Streaming pull with progress bar    | 60 min | `2d9475b` |
+| 11  | - [x] F6    | Display tokens/s after generation   | 30 min | `2d9475b` |
+| 12  | - [x] B7    | Parse vm_stat page size dynamically | 10 min | `2d9475b` |
+| 13  | - [x] B8    | Multi-path whisper model discovery  | 15 min | `2d9475b` |
+
+### Sprint 3 — Component Refactor _(est. 2–3 hrs)_
+
+| #   | ID        | Task                                    | Effort | Commit    |
+| --- | --------- | --------------------------------------- | ------ | --------- |
+| 14  | - [x] CQ1 | Extract components into separate files  | 90 min | `75a3cd0` |
+| 15  | - [x] CQ4 | Add error.tsx Error Boundary            | 15 min | `75a3cd0` |
+| 16  | - [x] CQ3 | Shared ollama-config.ts                 | 10 min | `75a3cd0` |
+| 17  | - [x] CQ2 | Consolidate inline styles → CSS classes | 45 min | `ed93a6f` |
+| 18  | - [x] S1  | Add model name input validation         | 10 min | `75a3cd0` |
+| 19  | - [x] S2  | Replace exec → execFile for brew        | 10 min | `75a3cd0` |
+
+### Sprint 4 — UX Enhancements _(est. 3–4 hrs)_
+
+| #   | ID        | Task                                 | Effort | Commit    |
+| --- | --------- | ------------------------------------ | ------ | --------- |
+| 20  | - [x] F3  | Prompt history (localStorage)        | 45 min | `9c2f5f3` |
+| 21  | - [x] F9  | Modelfile viewer in expanded details | 30 min | `9c2f5f3` |
+| 22  | - [x] F4  | Chat mode (multi-turn via /api/chat) | 90 min | `ed93a6f` |
+| 23  | - [x] F2  | Model search/filter                  | 30 min | `9c2f5f3` |
+| 24  | - [x] F11 | Keyboard shortcuts panel             | 20 min | `9c2f5f3` |
+
+### Sprint 5 — Integration & Polish _(est. 2–3 hrs)_
+
+| #   | ID          | Task                       | Effort | Commit    |
+| --- | ----------- | -------------------------- | ------ | --------- |
+| 25  | - [x] F15   | Extraction service panel   | 60 min | `8bdd5ee` |
+| 26  | - [x] F12   | Whisper transcription test | 45 min | `8bdd5ee` |
+| 27  | - [x] F7    | System resource sparklines | 45 min | `8bdd5ee` |
+| 28  | - [x] CQ5   | Loading skeleton UI        | 20 min | `8bdd5ee` |
+| 29  | - [x] P1-P3 | Request dedup + cache TTLs | 30 min | `b1fda3a` |
+| 30  | - [x] F16   | Auto-load preferred model  | 20 min | `ed93a6f` |
+
+### Deferred (nice-to-have)
+
+| ID        | Task                            | Notes     |
+| --------- | ------------------------------- | --------- |
+| - [x] F5  | Model comparison (side-by-side) | `8bdd5ee` |
+| - [x] F10 | Dark/light theme toggle         | `ed93a6f` |
+| - [x] F13 | Responsive mobile layout        | `8bdd5ee` |
+| - [x] F14 | Model tags/labels               | `ed93a6f` |
+| - [x] CQ6 | Zod validation on API responses | `ed93a6f` |
+| - [x] F8  | Ollama server logs viewer       | `8bdd5ee` |
+| - [x] S3  | CORS / auth (documented)        | `8bdd5ee` |
+
+---
+
+## 7. Commit Log
+
+_Commits will be added here as work progresses._
+
+| #   | Date   | Commit    | Sprint   | Items Completed                      |
+| --- | ------ | --------- | -------- | ------------------------------------ |
+| 1   | Feb 19 | `2da67c2` | Sprint 1 | B1, B3, B4, B5, B6, B9, B10, B11, P4 |
+| 2   | Feb 19 | `2d9475b` | Sprint 2 | B2, B7, B8, F1, F6                   |
+| 3   | Feb 19 | `75a3cd0` | Sprint 3 | CQ1, CQ3, CQ4, S1, S2                |
+| 4   | Feb 19 | `9c2f5f3` | Sprint 4 | F2, F3, F9, F11                      |
+| 5   | Feb 19 | `b1fda3a` | Sprint 5 | P1, P2, P3                           |
+| 6   | Feb 19 | `ed93a6f` | Sprint 6 | CQ2, CQ6, P5, F4, F10, F14, F16      |
+| 7   | Feb 19 | `8bdd5ee` | Sprint 7 | F5, F7, F8, F12, F13, F15, CQ5, S3   |
+
+---
+
+> **39 items total:** 11 bugs, 6 code quality, 16 features, 5 performance, 3 security
+> **All 39 items completed** across 7 sprints (9 code commits + doc updates)
+> **Actual total effort:** ~8 hours across 7 sprints
+
+---
+
+## 8. Next Wave — Model Intelligence & Pre-Load Metrics
+
+> Proposed improvements focused on helping users make informed decisions **before** loading a model.
+
+### Tier A — Pre-Load Decision Metrics _(est. 45 min)_
+
+| ID  | Feature                        | Description                                                                                                                                     |
+| --- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |
+| N1  | **Estimated RAM per model**    | Approximate from disk size: Q4_K_M ≈ 1.2×disk in RAM. Show on every model card (e.g., `~22 GB RAM`), not just running models.                   |
+| N2  | **"Will it fit?" indicator**   | Compare estimated RAM vs `system.memory.free + cached`. Color-code: 🟢 Fits, 🟡 Tight (80–100%), 🔴 Won't fit. Show on Load button or as badge. |
+| N3  | **Aggregate loaded model RAM** | Sum VRAM of all running models. Display at top of models panel: "3 models loaded · 28.5 GB VRAM".                                               |
+
+### Tier B — Rich Model Metadata _(est. 60 min)_
+
+| ID  | Feature                 | Description                                                                                                                                |
+| --- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
+| N4  | **RAM budget bar**      | Horizontal stacked bar: `[OS+Apps ▏ Model A (loaded) ▏ Model B (loaded) ▏ Free]`. Instant visual of memory headroom.                       |
+| N5  | **Context window size** | Fetch `context_length` from Ollama `/api/show` → `model_info`. Display on card (e.g., `128k ctx`). Critical for knowing max prompt length. |
+
+### Tier C — Model Intelligence Badges _(est. 45 min)_
+
+| ID  | Feature                     | Description                                                                                                                       |
+| --- | --------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
+| N6  | **`<think>` warning badge** | If model is DeepSeek R1 family, show ⚠️ badge: "Emits `<think>` traces — strip before JSON.parse". Prevents silent JSON failures. |
+| N7  | **Vision model indicator**  | If model is multimodal (llava, qwen2.5vl), show 👁 badge. These need image input — text-only prompts are suboptimal.              |
+| N8  | **Architecture badge**      | Show model arch (llama, qwen2, phi3, deepseek2) as subtle pill on the card. Currently buried in expanded details.                 |
+| N9  | **Sort/order models**       | Dropdown to sort by: name, size, parameters, running status, last modified. Currently uses Ollama's default order.                |
+| N10 | **Ollama version display**  | Call `/api/version`. Show in Ollama status card. Useful for debugging model compatibility.                                        |
+
+### Tier D — Runtime Metrics & UX _(est. 30 min)_
+
+| ID  | Feature                           | Description                                                                                                                                    |
+| --- | --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| N11 | **Last known tok/s per model**    | Persist `StreamMetrics.tokensPerSec` in localStorage keyed by model. Show on card (e.g., `~45 tok/s`). Compare speeds without re-benchmarking. |
+| N12 | **Auto-unload countdown**         | Replace static `Expires: 3:45 PM` with live countdown: `Unloads in 4m 32s`. More actionable.                                                   |
+| N13 | **Session stats per model**       | Track prompts sent + tokens generated per model in session. Show in expanded details.                                                          |
+| N14 | **Delete confirmation + reclaim** | Show "Delete qwen2.5-coder:32b? Reclaim 18.5 GB disk." before deleting. Currently no confirmation.                                             |
+| N15 | **Simultaneous load suggestions** | Based on available RAM, suggest which models can be co-loaded. E.g., "Can co-load llama3.1:8b + qwen2.5-coder:32b (28 GB, 20 GB free)".        |
+
+### Implementation Plan
+
+| Sprint | Items                   | Focus                    | Effort  |
+| ------ | ----------------------- | ------------------------ | ------- |
+| 8      | N1, N2, N3              | Pre-load RAM estimates   | ~45 min |
+| 9      | N4, N5                  | RAM bar + context window | ~60 min |
+| 10     | N6, N7, N8, N9, N10     | Badges + sort + version  | ~45 min |
+| 11     | N11, N12, N13, N14, N15 | Runtime metrics + UX     | ~30 min |
--- a/__LOCAL_LLMs/dashboard/docs/DASHBOARD_ROADMAP.md
+++ b/__LOCAL_LLMs/dashboard/docs/DASHBOARD_ROADMAP.md
@ -0,0 +1,157 @@
+# Mission Control Dashboard — Implementation Roadmap
+
+> Phased plan for the next wave of dashboard improvements.
+> Last updated: Feb 19, 2026
+>
+> See also: [DASHBOARD_PRD.md](DASHBOARD_PRD.md) · [DASHBOARD_REVIEW.md](DASHBOARD_REVIEW.md)
+>
+> **Previous work:** 39 items (11 bugs, 6 code quality, 16 features, 5 performance, 3 security) completed across 7 sprints. See DASHBOARD_REVIEW.md for full audit and commit log.
+
+---
+
+## Vision
+
+Transform the dashboard from a model management tool into a **model intelligence platform** — helping developers make informed decisions about which models to load, how they perform, and how they interact with system resources.
+
+---
+
+## Phase 1 — Pre-Load Intelligence _(Sprint 8)_
+
+**Goal:** Give users the information they need **before** clicking "Load" on a model.
+
+**Estimated effort:** ~45 minutes
+
+| #   | ID  | Task                       | Status | Priority | Notes                                                       |
+| --- | --- | -------------------------- | ------ | -------- | ----------------------------------------------------------- |
+| 1   | N1  | Estimated RAM per model    | [ ]    | High     | Q4_K_M ≈ 1.2× disk size. Show `~22 GB RAM` on every card.   |
+| 2   | N2  | "Will it fit?" indicator   | [ ]    | High     | 🟢 Fits / 🟡 Tight / 🔴 Won't fit based on free+cached RAM. |
+| 3   | N3  | Aggregate loaded model RAM | [ ]    | High     | Sum VRAM at top of panel: "2 loaded · 28.5 GB VRAM".        |
+
+**Implementation details:**
+
+- **N1:** Add `estimateRam(diskSize: number)` to `lib/format.ts`. Returns `diskSize * 1.2`. Display below existing size/params/quant line on each model card.
+- **N2:** Compare `estimateRam(model.size)` against `system.memory.free + system.memory.cached`. Pass `system` into the model list rendering. Add colored dot or badge next to Load button.
+- **N3:** Compute `ollama.running.reduce((sum, r) => sum + r.size_vram, 0)` and display in the models panel header next to "X active".
+
+**Acceptance criteria:**
+
+- Every model card shows estimated RAM requirement
+- Load button has a color-coded fit indicator
+- Panel header shows total VRAM of loaded models
+- TypeScript compiles cleanly
+
+---
+
+## Phase 2 — Rich Metadata _(Sprint 9)_
+
+**Goal:** Surface critical model metadata that's currently hidden behind the Ollama API.
+
+**Estimated effort:** ~60 minutes
+
+| #   | ID  | Task                | Status | Priority | Notes                                                                        |
+| --- | --- | ------------------- | ------ | -------- | ---------------------------------------------------------------------------- |
+| 4   | N4  | RAM budget bar      | [ ]    | Medium   | Stacked horizontal bar: OS+Apps / Loaded models (by name) / Free.            |
+| 5   | N5  | Context window size | [ ]    | High     | Fetch `context_length` from `/api/show` model_info. Show `128k ctx` on card. |
+
+**Implementation details:**
+
+- **N4:** New `RamBudgetBar` component. Inputs: `totalRam`, `appMemory`, `runningModels[]` (each with name + size_vram), `freeRam`. Renders as a CSS flex bar with labeled segments. Place above the models list.
+- **N5:** Extend the `OllamaModel` interface to include optional `context_length?: number`. On expand (or eagerly for all models), call `/api/ollama` POST with `action: 'show'` and extract `model_info.*.context_length`. Cache in component state. Show as badge on card.
+
+**Acceptance criteria:**
+
+- RAM budget bar visually represents memory allocation
+- Each model shows context window length (once fetched)
+- Bar updates when models are loaded/unloaded
+
+---
+
+## Phase 3 — Model Intelligence Badges _(Sprint 10)_
+
+**Goal:** Auto-detect model capabilities and surface warnings so users don't hit surprises.
+
+**Estimated effort:** ~45 minutes
+
+| #   | ID  | Task                    | Status | Priority | Notes                                                                |
+| --- | --- | ----------------------- | ------ | -------- | -------------------------------------------------------------------- |
+| 6   | N6  | `<think>` warning badge | [ ]    | High     | DeepSeek R1 models emit reasoning traces — warn about JSON stripping |
+| 7   | N7  | Vision model indicator  | [ ]    | Medium   | Multimodal models (llava, qwen2.5vl) need image input                |
+| 8   | N8  | Architecture badge      | [ ]    | Low      | Show model arch as pill on card (currently in expanded only)         |
+| 9   | N9  | Sort/order models       | [ ]    | Medium   | Dropdown: name, size, parameters, running, modified                  |
+| 10  | N10 | Ollama version display  | [ ]    | Low      | Show Ollama server version in status card                            |
+
+**Implementation details:**
+
+- **N6:** Pattern match model name: `/deepseek-r1/i` or family containing `deepseek`. Show amber ⚠️ badge with tooltip: "Emits `<think>` traces before JSON output".
+- **N7:** Pattern match: `/llava|qwen.*vl|minicpm-v/i`. Show 👁 badge with tooltip: "Vision model — supports image input".
+- **N8:** Move `model.details.family` from expanded-only to always-visible as a subtle pill badge.
+- **N9:** Add `modelSort` state (`'name' | 'size' | 'params' | 'running' | 'modified'`). Sort the filtered model list before `.map()`. Add dropdown above the model list (next to search bar).
+- **N10:** New API call to Ollama `/api/version` in the GET handler. Return in OllamaData. Display in the Ollama stats card.
+
+**Acceptance criteria:**
+
+- DeepSeek R1 models show `<think>` warning badge
+- Vision models show eye indicator
+- Models can be sorted by any of the 5 criteria
+- Ollama version shown in status card
+
+---
+
+## Phase 4 — Runtime Metrics & Polish _(Sprint 11)_
+
+**Goal:** Improve the experience for users who are actively using models.
+
+**Estimated effort:** ~30 minutes
+
+| #   | ID  | Task                          | Status | Priority | Notes                                                    |
+| --- | --- | ----------------------------- | ------ | -------- | -------------------------------------------------------- |
+| 11  | N11 | Last known tok/s per model    | [ ]    | Medium   | Persist after prompt, show on card                       |
+| 12  | N12 | Auto-unload countdown         | [ ]    | Medium   | Live countdown instead of static expiry time             |
+| 13  | N13 | Session stats per model       | [ ]    | Low      | Prompts sent + tokens generated in this session          |
+| 14  | N14 | Delete confirmation + reclaim | [ ]    | High     | "Delete X? Reclaim Y GB" dialog before deleting          |
+| 15  | N15 | Simultaneous load suggestions | [ ]    | Low      | Suggest which models fit together based on available RAM |
+
+**Implementation details:**
+
+- **N11:** After prompt/chat completes, save `{ model: string, tokPerSec: number, timestamp: number }` to localStorage key `modelBenchmarks`. Show on card as faded text: `~45 tok/s`.
+- **N12:** Replace `Expires: {time}` with a `useEffect` interval that computes `expires_at - Date.now()` and formats as `Xm Ys`. Update every second.
+- **N13:** Track in component state: `Map<string, { prompts: number, tokens: number }>`. Increment on each prompt/chat completion. Display in expanded details.
+- **N14:** Add `confirmDelete` state. When delete is clicked, show inline confirmation with model name and `formatBytes(model.size)` reclaim amount. Second click executes.
+- **N15:** After computing estimated RAM for all unloaded models, filter those that fit in remaining free memory. Show as suggestions in the models panel footer.
+
+**Acceptance criteria:**
+
+- Previously benchmarked models show tok/s on card
+- Running models show live countdown to unload
+- Delete requires explicit confirmation showing disk reclaim
+- TypeScript compiles cleanly after all phases
+
+---
+
+## Phase 5 — Future Considerations _(Backlog)_
+
+Not planned for immediate implementation. Revisit after Phases 1–4.
+
+| ID  | Feature                        | Complexity | Notes                                                             |
+| --- | ------------------------------ | ---------- | ----------------------------------------------------------------- |
+| F17 | WebSocket real-time updates    | High       | Replace 15s polling with push-based updates from Ollama           |
+| F18 | GPU/Metal utilization chart    | Medium     | macOS `powermetrics` or IOKit for GPU load percentage             |
+| F19 | Model download queue           | Medium     | Queue multiple pulls, show progress for each                      |
+| F20 | Inference history log          | Medium     | Persist all prompts/responses to localStorage or file             |
+| F21 | Custom Modelfile editor        | Medium     | Edit and push custom Modelfiles to Ollama                         |
+| F22 | Benchmark suite                | High       | Run standard prompts across all models, generate comparison table |
+| F23 | Component decomposition (CQ1b) | High       | Break 1,885-line page.tsx into feature-based modules              |
+
+---
+
+## Summary
+
+| Phase     | Sprint | Items        | Focus                    | Effort     | Depends On |
+| --------- | ------ | ------------ | ------------------------ | ---------- | ---------- |
+| 1         | 8      | N1, N2, N3   | Pre-load RAM estimates   | ~45 min    | —          |
+| 2         | 9      | N4, N5       | RAM bar + context window | ~60 min    | Phase 1    |
+| 3         | 10     | N6–N10       | Badges + sort + version  | ~45 min    | —          |
+| 4         | 11     | N11–N15      | Runtime metrics + UX     | ~30 min    | —          |
+| **Total** |        | **15 items** |                          | **~3 hrs** |            |
+
+Phases 1 and 3 can run in parallel. Phase 2 depends on Phase 1 (needs RAM estimates for the budget bar). Phase 4 is independent.
--- a/__LOCAL_LLMs/docs/05-mission-control-dashboard.md
+++ b/__LOCAL_LLMs/docs/05-mission-control-dashboard.md
@ -1,27 +1,12 @@
 # 05 — Mission Control Dashboard

-> Next.js dashboard for visualizing and managing the local LLM stack.
+> **Documentation has moved.** All dashboard docs now live in the dashboard directory.

---
+- **PRD:** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_PRD.md`](../dashboard/docs/DASHBOARD_PRD.md)
+- **Review (39 items):** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_REVIEW.md`](../dashboard/docs/DASHBOARD_REVIEW.md)
+- **Roadmap (N1–N15):** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_ROADMAP.md`](../dashboard/docs/DASHBOARD_ROADMAP.md)

-## Overview
-
-A dark-themed, real-time dashboard built with Next.js 16 that provides:
-
- **Ollama status** — online/offline, model list, loaded models, load/unload actions
- **Streaming prompt interface** — send prompts to any loaded model with real-time streaming responses
- **Model management** — pull new models, delete models (with confirmation), view VRAM/expiry info
- **System resources** — chip info, RAM/disk usage with progress bars, uptime
- **Whisper.cpp status** — installed binaries, downloaded models
- **Brew packages** — version tracking for ollama, whisper-cpp, ffmpeg
- **Auto-refresh** — polls all endpoints every 15 seconds
- **Toast notifications** — success/error/info feedback for all actions
- **Keyboard shortcuts** — Cmd+Enter to send prompt, Escape to close modals
- **Copy response** — one-click copy of model responses to clipboard
-
---
-
-## Running
+## Quick Start

 ```bash
 cd __LOCAL_LLMs/dashboard
@ -30,247 +15,3 @@ npm run dev -- -p 3100
 ```

 Open: **http://localhost:3100**
-
---
-
-## Tech Stack
-
-| Component | Technology                          |
-| --------- | ----------------------------------- |
-| Framework | Next.js 16 (App Router)             |
-| Language  | TypeScript                          |
-| Styling   | TailwindCSS v4                      |
-| Icons     | Lucide React                        |
-| React     | 19                                  |
-| Theme     | Dark — ByteLyst design token colors |
-
---
-
-## Architecture
-
-```
-dashboard/
-├── src/
-│   ├── app/
-│   │   ├── layout.tsx              ← Root layout (dark theme, system fonts)
-│   │   ├── page.tsx                ← Main dashboard (~800 lines, client component)
-│   │   ├── globals.css             ← Dark theme CSS (ByteLyst tokens + animations)
-│   │   └── api/
-│   │       ├── ollama/
-│   │       │   ├── route.ts        ← Ollama proxy (GET: list+ps, POST: load/unload/generate/pull/delete/show)
-│   │       │   └── stream/route.ts ← Streaming generate (NDJSON proxy to Ollama)
-│   │       ├── whisper/route.ts    ← Whisper discovery (binaries, models, version)
-│   │       └── system/route.ts     ← System info (chip, RAM, disk, brew) — cached statics
-│   └── components/                 ← (empty, all inline in page.tsx for now)
-├── package.json
-├── next.config.ts
-├── tsconfig.json
-├── postcss.config.mjs
-└── tailwind.config.ts (auto via v4)
-```
-
---
-
-## API Routes
-
-### GET /api/ollama
-
-Fetches Ollama status, model list, and running models.
-
-**Response:**
-
-```json
-{
-  "status": "online",
-  "url": "http://localhost:11434",
-  "models": [
-    {
-      "name": "qwen2.5-coder:32b",
-      "size": 19000000000,
-      "digest": "b92d6a0bd47e...",
-      "modified_at": "2026-02-19T...",
-      "details": {
-        "family": "qwen2",
-        "parameter_size": "32B",
-        "quantization_level": "Q4_K_M"
-      }
-    }
-  ],
-  "running": [],
-  "totalModels": 2,
-  "totalSize": 23900000000,
-  "runningCount": 0
-}
-```
-
-### POST /api/ollama
-
-Model management actions.
-
-**Load a model into RAM:**
-
-```json
-{ "action": "load", "model": "qwen2.5-coder:32b" }
-```
-
-**Unload a model from RAM:**
-
-```json
-{ "action": "unload", "model": "qwen2.5-coder:32b" }
-```
-
-**Send a prompt:**
-
-```json
-{ "action": "generate", "model": "qwen2.5-coder:32b", "prompt": "Write hello world in Swift" }
-```
-
-### GET /api/whisper
-
-Discovers whisper-cpp installation.
-
-**Response:**
-
-```json
-{
-  "installed": true,
-  "version": "unknown",
-  "binaries": [
-    "whisper-bench",
-    "whisper-cli",
-    "whisper-command",
-    "whisper-lsp",
-    "whisper-quantize",
-    "whisper-server",
-    "whisper-stream",
-    "whisper-talk-llama",
-    "whisper-vad-speech-segments"
-  ],
-  "models": [],
-  "modelsDir": "/Users/sd9235/whisper-models"
-}
-```
-
-### GET /api/system
-
-System hardware and software info.
-
-**Response:**
-
-```json
-{
-  "chip": "Apple M4 Pro",
-  "gpu": "Apple Silicon (integrated)",
-  "memory": { "total": 51539607552, "used": 38000000000, "free": 13539607552 },
-  "disk": { "total": 994662584320, "used": 450000000000, "free": 544662584320 },
-  "ollamaDiskUsage": 24500000000,
-  "cpuCores": 14,
-  "uptime": 86400,
-  "platform": "Darwin 25.3.0",
-  "arch": "arm64",
-  "nodeVersion": "v25.2.1",
-  "brewPackages": [
-    { "name": "ollama", "version": "0.16.2" },
-    { "name": "whisper-cpp", "version": "1.8.3" },
-    { "name": "ffmpeg", "version": "8.0.1_4" }
-  ]
-}
-```
-
---
-
-## Dashboard UI Features
-
-### Header
-
- App icon + title ("Local LLM Mission Control")
- Machine info subtitle (chip, RAM, OS)
- Last refresh timestamp + manual refresh button
-
-### Top Stats Row (4 cards)
-
-1. **Ollama** — online/offline status with pulsing dot, server URL
-2. **Models** — count + total disk size, number loaded in RAM
-3. **Whisper** — installed/not found, model count
-4. **Memory** — used/total with color-coded progress bar (blue → yellow → red)
-
-### Main Grid (2 columns + 1 sidebar)
-
-**Left (2/3 width) — Ollama Models:**
-
- Each model shown as a card with name, size, parameter count, quantization
- Green highlight + "LOADED" badge for models currently in RAM
- Actions per model:
-  - **Load** — loads model into RAM (green button)
-  - **Unload** — evicts from RAM (red button)
-  - **Prompt** — opens prompt modal (blue button, only for loaded models)
- Expandable details (digest, modified date, family)
-
-**Right (1/3 width):**
-
- **System** — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
- **Whisper.cpp** — status, binary list as tags, model list with sizes
- **Brew Packages** — name + version for ollama, whisper-cpp, ffmpeg
-
-### Prompt Modal
-
- Slide-up overlay when clicking "Prompt" on a loaded model
- Textarea with Cmd+Enter shortcut to send
- Sends to Ollama `/api/generate` (non-streaming)
- Response displayed in monospace pre block with scroll
-
---
-
-## Design Tokens
-
-The dashboard uses ByteLyst design system colors:
-
-| Token                | Hex       | Use               |
-| -------------------- | --------- | ----------------- |
-| `--bg-canvas`        | `#06070A` | Page background   |
-| `--bg-elevated`      | `#0E1118` | Modal background  |
-| `--surface-card`     | `#121725` | Card backgrounds  |
-| `--surface-muted`    | `#1A2335` | Muted areas       |
-| `--text-primary`     | `#EFF4FF` | Main text         |
-| `--text-secondary`   | `#A5B1C7` | Descriptions      |
-| `--text-tertiary`    | `#6C7C98` | Hints, timestamps |
-| `--accent-primary`   | `#5A8CFF` | Primary actions   |
-| `--accent-secondary` | `#2EE6D6` | Secondary accent  |
-| `--success`          | `#34D399` | Online, loaded    |
-| `--warning`          | `#F59E0B` | Warning state     |
-| `--danger`           | `#FF6E6E` | Offline, errors   |
-| `--purple`           | `#A78BFA` | Whisper, disk     |
-
---
-
-## Changelog
-
-### v2 (2026-02-19)
-
- Fixed Google Fonts build error (corporate proxy blocks fonts.gstatic.com) — switched to system fonts
- Fixed system API slowness (7.6s → 50ms) by caching static info (chip, GPU, brew packages)
- Added streaming prompt responses via NDJSON proxy to Ollama `/api/generate`
- Added model pull UI — input field + pull button in Ollama Models section
- Added model delete with confirmation dialog in expanded details
- Added VRAM usage and expiry time display for running models
- Added toast notifications (success/error/info) with slide-in animation
- Added copy response button in prompt modal
- Added Escape key to close modals, backdrop click to dismiss
- Added streaming indicator with pulsing dot
-
-### v1 (2026-02-19)
-
- Initial dashboard with Ollama status, model list, system info, Whisper.cpp, brew packages
- Load/unload model actions
- Basic prompt interface (non-streaming)
- Auto-refresh every 15 seconds
- Dark theme with ByteLyst design tokens
-
-## Future Improvements
-
- [ ] Whisper transcription UI (upload audio, get text)
- [ ] GPU utilization chart (Metal usage over time)
- [ ] Model comparison benchmarks (side-by-side prompt same question)
- [ ] Chat mode (multi-turn conversation with history)
- [ ] WebSocket for real-time status updates (replace polling)
- [ ] Component extraction (break page.tsx into smaller components)