docs(local-llm): consolidate dashboard docs into dashboard/docs/

- Created DASHBOARD_PRD.md — full updated PRD with current 19-file architecture, all 10 API routes, UI layout, data flow, localStorage keys, security model, and v1-v3 changelog.
- Created DASHBOARD_ROADMAP.md — phased implementation plan for N1-N15 improvements across 4 phases: pre-load intelligence, rich metadata, model intelligence badges, runtime metrics. Includes acceptance criteria and implementation details per item.
- Updated DASHBOARD_REVIEW.md — refreshed file inventory to 19 files (~2,930 lines), fixed broken Tier B markdown table, added cross-links.
- Replaced __LOCAL_LLMs/docs/05-mission-control-dashboard.md with redirect pointer to new dashboard/docs/ location.

Dashboard docs are now co-located at __LOCAL_LLMs/dashboard/docs/:
- DASHBOARD_PRD.md (product requirements)
- DASHBOARD_REVIEW.md (audit + 39 completed items + N1-N15 proposals)
- DASHBOARD_ROADMAP.md (phased implementation plan)
parent 519f348583
commit cd6e561f1b
307
__LOCAL_LLMs/dashboard/docs/DASHBOARD_PRD.md
Normal file
@@ -0,0 +1,307 @@
# Mission Control Dashboard — Product Requirements Document

> Local LLM management dashboard for macOS development machines.
> Last updated: Feb 19, 2026
>
> See also: [DASHBOARD_REVIEW.md](DASHBOARD_REVIEW.md) · [DASHBOARD_ROADMAP.md](DASHBOARD_ROADMAP.md)

---

## 1. Overview

A dark-themed, real-time dashboard built with Next.js 16 for managing a local LLM stack (Ollama + Whisper.cpp) on macOS Apple Silicon. Designed as a developer tool — no authentication, no cloud dependencies, runs entirely on localhost.

### Key Capabilities

- **Ollama management** — model listing, load/unload, pull/delete, VRAM monitoring
- **Streaming prompt interface** — send prompts with real-time NDJSON streaming, token metrics
- **Chat mode** — multi-turn conversations with system prompt, chat bubble UI
- **Model comparison** — side-by-side prompt responses from two models
- **System monitoring** — chip, RAM, disk, uptime, memory sparklines, Ollama disk footprint
- **Whisper.cpp integration** — binary/model discovery, audio transcription test
- **Extraction service health** — server-side proxy to check extraction-service (port 4005)
- **Ollama logs viewer** — collapsible terminal-style panel with color-coded log levels
- **Model management UX** — search/filter, tags, auto-load preferred model, prompt history
- **Theme support** — dark/light toggle with localStorage persistence

---

## 2. Tech Stack

| Component  | Technology                            |
| ---------- | ------------------------------------- |
| Framework  | Next.js 16 (App Router)               |
| Language   | TypeScript (strict)                   |
| React      | 19                                    |
| Styling    | TailwindCSS v4                        |
| Icons      | Lucide React                          |
| Validation | Zod                                   |
| Theme      | Dark default — ByteLyst design tokens |
| Port       | 3100 (dev)                            |

---

## 3. Architecture

```
dashboard/
├── docs/
│   ├── DASHBOARD_PRD.md              ← This file
│   ├── DASHBOARD_REVIEW.md           ← Bug/improvement audit (39 items, all complete)
│   └── DASHBOARD_ROADMAP.md          ← Next-wave implementation plan (N1–N15)
├── src/
│   ├── app/
│   │   ├── page.tsx                  ← Main dashboard (1,885 lines, client component)
│   │   ├── layout.tsx                ← Root layout (dark theme, system fonts)
│   │   ├── error.tsx                 ← React Error Boundary
│   │   ├── globals.css               ← Design tokens, CSS utilities, skeleton, light theme
│   │   ├── components/
│   │   │   ├── StatusDot.tsx         ← Pulsing online/offline indicator
│   │   │   ├── ProgressBar.tsx       ← Colored progress bar with percentage
│   │   │   └── Sparkline.tsx         ← SVG sparkline chart for time-series data
│   │   ├── lib/
│   │   │   ├── types.ts              ← TypeScript interfaces (OllamaData, SystemData, etc.)
│   │   │   ├── format.ts             ← formatBytes, formatUptime utilities
│   │   │   └── ollama-config.ts      ← Shared OLLAMA_URL constant
│   │   └── api/
│   │       ├── ollama/
│   │       │   ├── route.ts          ← REST proxy (list, load, unload, pull, delete, show)
│   │       │   ├── stream/route.ts   ← Streaming generate proxy (NDJSON)
│   │       │   ├── chat/route.ts     ← Streaming chat proxy (multi-turn)
│   │       │   ├── pull/route.ts     ← Streaming pull with progress events
│   │       │   └── logs/route.ts     ← Ollama server log reader
│   │       ├── whisper/
│   │       │   ├── route.ts          ← Whisper binary + GGML model discovery
│   │       │   └── transcribe/route.ts ← Audio transcription via whisper-cli
│   │       ├── system/route.ts       ← System info (chip, RAM, disk, brew) with TTL cache
│   │       └── extraction/
│   │           └── health/route.ts   ← Extraction service health proxy
├── package.json
├── next.config.ts
├── tsconfig.json
└── postcss.config.mjs
```

**19 source files, ~2,930 lines total.**

---

## 4. Running

```bash
cd __LOCAL_LLMs/dashboard
npm install          # first time only
npm run dev -- -p 3100
```

Open: **http://localhost:3100**

Prerequisites: Ollama running (`ollama serve` or `brew services start ollama`).

---

## 5. API Routes

### Ollama

| Method | Route                | Purpose                                        |
| ------ | -------------------- | ---------------------------------------------- |
| GET    | `/api/ollama`        | List models + running models (status, VRAM)    |
| POST   | `/api/ollama`        | Load, unload, pull, delete, show model actions |
| POST   | `/api/ollama/stream` | Streaming generate (NDJSON proxy)              |
| POST   | `/api/ollama/chat`   | Streaming chat (multi-turn, NDJSON)            |
| POST   | `/api/ollama/pull`   | Streaming pull with progress events            |
| GET    | `/api/ollama/logs`   | Read last 100 lines of Ollama log files        |

### System & Services

| Method | Route                     | Purpose                                  |
| ------ | ------------------------- | ---------------------------------------- |
| GET    | `/api/system`             | Chip, RAM (vm_stat), disk, brew, uptime  |
| GET    | `/api/whisper`            | Whisper binary + model discovery         |
| POST   | `/api/whisper/transcribe` | Upload audio → whisper-cli → text        |
| GET    | `/api/extraction/health`  | Proxy to extraction-service :4005/health |

### Key Implementation Details

- **Zod validation** on all POST bodies (model names, actions)
- **AbortSignal timeout** (5s) on all Ollama fetch calls
- **TTL caching**: static system info (5 min), Ollama disk usage (60s)
- **Eager cache warming** on module load for system_profiler data
- **execFile** (not exec) for all shell commands — prevents injection
- **Server-side proxy** for extraction health — avoids browser CORS errors
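
The TTL caching item in the list above could be sketched roughly as follows. This is an illustration under assumptions, not the code in `system/route.ts`: `createTtlCache`, `readOllamaDiskUsage`, and the injectable `now` clock are hypothetical names introduced here.

```typescript
// Hypothetical helper: cache the result of `load` for `ttlMs` milliseconds.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = Date.now,
): () => T {
  let entry: { value: T; expiresAt: number } | undefined;
  return () => {
    const t = now();
    if (!entry || t >= entry.expiresAt) {
      // Missing or expired: reload and restart the TTL window.
      entry = { value: load(), expiresAt: t + ttlMs };
    }
    return entry.value;
  };
}

// Placeholder loader; a real route would shell out to `du` here.
const readOllamaDiskUsage = async (): Promise<number> => 0;

// TTLs taken from the list above (5 min for static info, 60 s for disk usage).
const STATIC_INFO_TTL_MS = 5 * 60 * 1000;
const OLLAMA_DISK_TTL_MS = 60 * 1000;
const getOllamaDiskUsage = createTtlCache(readOllamaDiskUsage, OLLAMA_DISK_TTL_MS);
```

Because the cached value may itself be a promise, requests that arrive while a slow command is still running share one result. A trade-off of this sketch is that a rejected promise stays cached until the TTL expires.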

---

## 6. Dashboard UI Layout

### Header

- App icon + title ("Local LLM Mission Control")
- Machine info subtitle (chip, RAM, OS — dynamic)
- Last refresh timestamp + manual refresh button
- Dark/light theme toggle (localStorage persistence)

### Top Stats Row (4 cards)

1. **Ollama** — online/offline status with pulsing dot, server URL
2. **Models** — count + total disk size, number loaded in RAM
3. **Whisper** — installed/not found, model count
4. **Memory** — used/total with color-coded progress bar + sparkline history

Skeleton shimmer cards shown during initial load.

### Main Grid (3 columns)

**Left (2/3) — Ollama Models:**

- Search/filter bar (by name, family, quantization)
- Pull model input with streaming progress bar
- Per-model cards showing: name, disk size, parameters, quantization
- Running models: green highlight, LOADED badge, VRAM usage, expiry time
- Actions: Load, Unload, Prompt, Delete
- Expanded details: digest, modified date, family, modelfile viewer
- Model tags (coding, chat, fast, vision, reasoning) — localStorage
- Auto-load preferred model (star toggle) — localStorage

**Right (1/3):**

- **System** — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
- **Extraction Service** — health status from extraction-service (port 4005)
- **Whisper.cpp** — status, binary list, models, transcription test (file upload → text)
- **Brew Packages** — name + version for ollama, whisper-cpp, ffmpeg

### Prompt Modal (overlay)

- Textarea with Cmd+Enter to send
- Chat mode toggle — multi-turn conversation with system prompt
- Prompt history dropdown (last 20, localStorage)
- Streaming response display with token metrics (tok/s, total, latency)
- Model comparison — quick-compare buttons to run same prompt on another model
- Side-by-side response display when comparing
- Copy response button

### Ollama Logs Panel (below grid)

- Collapsible terminal-style panel
- Color-coded log levels (error=red, warn=yellow, info=default)
- Fetches via server-side `/api/ollama/logs`

### Keyboard Shortcuts

| Shortcut    | Action             |
| ----------- | ------------------ |
| `Cmd+Enter` | Send prompt        |
| `Escape`    | Close modal        |
| `R`         | Refresh dashboard  |
| `/`         | Focus model search |
| `?`         | Show shortcuts     |

---

## 7. Design Tokens

ByteLyst design system colors:

| Token                | Hex       | Use               |
| -------------------- | --------- | ----------------- |
| `--bg-canvas`        | `#06070A` | Page background   |
| `--bg-elevated`      | `#0E1118` | Modal background  |
| `--surface-card`     | `#121725` | Card backgrounds  |
| `--surface-muted`    | `#1A2335` | Muted areas       |
| `--text-primary`     | `#EFF4FF` | Main text         |
| `--text-secondary`   | `#A5B1C7` | Descriptions      |
| `--text-tertiary`    | `#6C7C98` | Hints, timestamps |
| `--accent-primary`   | `#5A8CFF` | Primary actions   |
| `--accent-secondary` | `#2EE6D6` | Secondary accent  |
| `--success`          | `#34D399` | Online, loaded    |
| `--warning`          | `#F59E0B` | Warning state     |
| `--danger`           | `#FF6E6E` | Offline, errors   |
| `--purple`           | `#A78BFA` | Whisper, disk     |

Light theme overrides defined in `globals.css` under `.light` class.

---

## 8. Data Flow

```
Browser (page.tsx)
  ├── GET  /api/ollama              → Ollama :11434/api/tags + /api/ps
  ├── POST /api/ollama/stream       → Ollama :11434/api/generate (NDJSON)
  ├── POST /api/ollama/chat         → Ollama :11434/api/chat (NDJSON)
  ├── POST /api/ollama/pull         → Ollama :11434/api/pull (NDJSON progress)
  ├── GET  /api/ollama/logs         → ~/.ollama/logs/ or macOS unified logging
  ├── GET  /api/system              → system_profiler, vm_stat, df, du, brew
  ├── GET  /api/whisper             → which whisper-cli, ls models dirs
  ├── POST /api/whisper/transcribe  → whisper-cli -f <audio> -m <model>
  └── GET  /api/extraction/health   → extraction-service :4005/health
```

Auto-refresh: every 15 seconds (paused during streaming/pull).
Request deduplication: `fetchingRef` guard prevents parallel fetchAll calls.
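
The refresh guard described above might look something like the sketch below. It is framework-free on purpose: in the actual component the flag lives in `fetchingRef`, while `createGuardedRefresh` and `isBusy` are hypothetical names used only for illustration.

```typescript
// Wraps `fetchAll` so that overlapping calls are skipped, and so that nothing
// runs while `isBusy()` reports an active stream or pull.
function createGuardedRefresh(
  fetchAll: () => Promise<void>,
  isBusy: () => boolean, // e.g. () => promptLoading || pullLoading
): () => Promise<boolean> {
  let fetching = false; // plays the role of fetchingRef.current
  return async () => {
    if (fetching || isBusy()) return false; // skipped: in flight or paused
    fetching = true;
    try {
      await fetchAll();
      return true;
    } finally {
      fetching = false; // always release the guard, even on failure
    }
  };
}

// Possible wiring (assumed, not taken from page.tsx):
//   const refresh = createGuardedRefresh(fetchAll, () => promptLoading || pullLoading);
//   const timer = setInterval(refresh, 15000);
```

The returned boolean is only a convenience for callers that want to know whether a refresh actually ran; the interval itself can ignore it.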

---

## 9. LocalStorage Persistence

| Key              | Type        | Purpose                                |
| ---------------- | ----------- | -------------------------------------- |
| `promptHistory`  | JSON array  | Last 20 prompts with model + timestamp |
| `modelTags`      | JSON object | User-defined tags per model name       |
| `autoLoadModel`  | string      | Preferred model to auto-load           |
| `dashboardTheme` | string      | `"dark"` or `"light"`                  |
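
As a rough illustration of the `promptHistory` key, the capped list could be maintained as shown below. The entry field names (`prompt`, `model`, `timestamp`) and the `KeyValueStore` interface are assumptions made for this sketch; only the key name and the 20-item limit come from the table above.

```typescript
interface PromptHistoryEntry {
  prompt: string;
  model: string;
  timestamp: number; // assumed: epoch milliseconds
}

// Minimal subset of the Web Storage API, so the helper also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const HISTORY_KEY = "promptHistory";
const HISTORY_LIMIT = 20;

function loadHistory(store: KeyValueStore): PromptHistoryEntry[] {
  try {
    const parsed: unknown = JSON.parse(store.getItem(HISTORY_KEY) ?? "[]");
    return Array.isArray(parsed) ? (parsed as PromptHistoryEntry[]) : [];
  } catch {
    return []; // corrupted JSON: start with an empty history
  }
}

// Prepends the newest entry and drops anything beyond the limit.
function pushHistory(store: KeyValueStore, entry: PromptHistoryEntry): PromptHistoryEntry[] {
  const next = [entry, ...loadHistory(store)].slice(0, HISTORY_LIMIT);
  store.setItem(HISTORY_KEY, JSON.stringify(next));
  return next;
}
```

In the browser, `window.localStorage` satisfies `KeyValueStore` structurally, so it can be passed in directly.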

---

## 10. Security Model

- **Local-only**: Dashboard runs on localhost:3100, no auth required
- **No CORS headers**: All API routes are same-origin
- **Input validation**: Zod schemas on all POST bodies; model names validated via regex
- **Shell safety**: All subprocess calls use `execFile` (never string interpolation)
- **External calls proxied**: Extraction health check goes through server-side API route, not direct browser fetch

These assumptions are documented in code in `page.tsx` (S3 comment).

---

## 11. Changelog

### v3 (Feb 19, 2026) — Sprints 6–7

- Chat mode with multi-turn conversations and system prompt (F4)
- Model comparison side-by-side UI (F5)
- Memory sparklines with ring buffer (F7)
- Ollama logs viewer with color-coded levels (F8)
- Dark/light theme toggle with localStorage (F10)
- Whisper transcription test with file upload (F12)
- Responsive mobile layout improvements (F13)
- Model tags/labels with localStorage (F14)
- Extraction service health panel (F15)
- Auto-load preferred model (F16)
- CSS utility classes replacing inline styles (CQ2)
- Skeleton shimmer loading UI (CQ5)
- Zod validation on API responses (CQ6)
- Eager cache warming for system_profiler (P5)

### v2 (Feb 19, 2026) — Sprints 1–5

- Fixed 11 bugs (hardcoded specs, streaming escape, auto-refresh, toast IDs, etc.)
- Streaming pull with progress bar (F1)
- Token/s metrics after generation (F6)
- Component extraction (CQ1) — StatusDot, ProgressBar, types, format, config
- Error boundary (CQ4)
- Shared ollama-config (CQ3)
- Model search/filter (F2), prompt history (F3), keyboard shortcuts (F11)
- Modelfile viewer (F9)
- Request deduplication + cache TTLs (P1–P3)
- Input validation (S1) + execFile (S2)

### v1 (Feb 19, 2026) — Initial

- Ollama status, model list, load/unload, basic prompt
- System info (chip, RAM, disk, brew)
- Whisper.cpp discovery
- Auto-refresh every 15 seconds
- Dark theme with ByteLyst design tokens
329
__LOCAL_LLMs/dashboard/docs/DASHBOARD_REVIEW.md
Normal file
@@ -0,0 +1,329 @@
# Mission Control Dashboard — Bug & Improvement Review

> Systematic code review of `__LOCAL_LLMs/dashboard/` — 19 source files, ~2,930 lines
> Last updated: Feb 19, 2026
>
> See also: [DASHBOARD_PRD.md](DASHBOARD_PRD.md) · [DASHBOARD_ROADMAP.md](DASHBOARD_ROADMAP.md)

---

## File Inventory

| File                                      | Lines | Purpose                                              |
| ----------------------------------------- | ----- | ---------------------------------------------------- |
| **UI**                                    |       |                                                      |
| `src/app/page.tsx`                        | 1,885 | Main dashboard (client component, all panels)        |
| `src/app/layout.tsx`                      | 19    | Root layout (dark theme, system fonts)               |
| `src/app/error.tsx`                       | 50    | React Error Boundary (CQ4)                           |
| `src/app/globals.css`                     | 163   | Design tokens, utilities, skeleton, light theme      |
| **Components**                            |       |                                                      |
| `src/app/components/StatusDot.tsx`        | 10    | Pulsing online/offline indicator                     |
| `src/app/components/ProgressBar.tsx`      | 18    | Colored progress bar with percentage                 |
| `src/app/components/Sparkline.tsx`        | 42    | SVG sparkline chart (F7)                             |
| **Lib**                                   |       |                                                      |
| `src/app/lib/types.ts`                    | 75    | TypeScript interfaces (OllamaData, SystemData, etc.) |
| `src/app/lib/format.ts`                   | 16    | formatBytes, formatUptime utilities                  |
| `src/app/lib/ollama-config.ts`            | 1     | Shared OLLAMA_URL constant (CQ3)                     |
| **API Routes — Ollama**                   |       |                                                      |
| `src/app/api/ollama/route.ts`             | 124   | REST proxy (list, load, unload, pull, delete, show)  |
| `src/app/api/ollama/stream/route.ts`      | 36    | Streaming generate proxy (NDJSON)                    |
| `src/app/api/ollama/chat/route.ts`        | 42    | Streaming chat proxy (F4)                            |
| `src/app/api/ollama/pull/route.ts`        | 43    | Streaming pull with progress (F1)                    |
| `src/app/api/ollama/logs/route.ts`        | 36    | Ollama server log reader (F8)                        |
| **API Routes — Other**                    |       |                                                      |
| `src/app/api/system/route.ts`             | 179   | System info (chip, RAM, disk, brew) with TTL cache   |
| `src/app/api/whisper/route.ts`            | 79    | Whisper binary + GGML model discovery                |
| `src/app/api/whisper/transcribe/route.ts` | 93    | Whisper transcription test (F12)                     |
| `src/app/api/extraction/health/route.ts`  | 18    | Extraction service health proxy (F15)                |

**Stack:** Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript, Zod

---

## 1. Bugs

- [x] **B1. Hardcoded machine specs in header** — `page.tsx:317`
  Subtitle reads `Apple M4 Pro · 48 GB · {system?.platform}` — should use `system?.chip` and `formatBytes(system?.memory.total)` dynamically so it works on any machine.

- [x] **B2. Pull model blocks UI — no progress feedback** — `api/ollama/route.ts:84-92`
  `handlePull` calls Ollama with `stream: false`. Large models (20+ GB) block for 30+ minutes. The Next.js API route will likely timeout. Must use `stream: true` and pipe progress events to the client. _(Combined with F1.)_

- [x] **B3. Dead code: non-streaming `generate` action** — `api/ollama/route.ts:69-82`
  The `action === 'generate'` handler is unused — UI only uses `/api/ollama/stream`. Remove or keep as fallback with a comment.

- [x] **B4. Escape key closes modal during active streaming** — `page.tsx:188-197`
  Global `keydown` handler calls `setPromptModel(null)` unconditionally. Backdrop click correctly checks `!promptLoading`. Escape should also respect `promptLoading` to prevent discarding an in-flight response.

- [x] **B5. Auto-refresh (15s) fires during streaming/pull** — `page.tsx:182-185`
  `setInterval(fetchAll, 15000)` runs unconditionally. During streaming this causes background churn and potential UI flicker. Should pause while `promptLoading` or `pullLoading` is true.

- [x] **B6. Toast ID collision on HMR remount** — `page.tsx:156-159`
  `toastId.current` resets to 0 on component remount during dev. Use `Date.now()` or `crypto.randomUUID()` for robust uniqueness.

- [x] **B7. vm_stat page size hardcoded** — `api/system/route.ts:103`
  Hardcoded `16384`. Should parse from vm_stat's first line: `"(page size of NNNNN bytes)"` for portability.

- [x] **B8. Whisper models dir not configurable** — `api/whisper/route.ts:24`
  Hardcoded to `~/whisper-models`. Should scan multiple known paths (`/opt/homebrew/share/whisper-cpp/models/`, `~/whisper-models`, `~/.cache/whisper/`) or accept `WHISPER_MODELS_DIR` env var.

- [x] **B9. No AbortController for streaming fetch** — `page.tsx:250-289`
  Closing the prompt modal doesn't cancel the underlying fetch. The `reader.read()` loop continues in the background wasting CPU/bandwidth until the model finishes generating.

- [x] **B10. Brew shows "Loading..." when array is empty** — `page.tsx:936-940`
  When `system.brewPackages` is `[]` (all uninstalled), displays "Loading..." instead of "No packages found". Needs to distinguish "still fetching" vs "fetched but empty".

- [x] **B11. Prompt text not cleared on close without send** — `page.tsx:951-957`
  Backdrop click clears `promptText`, but Escape handler (B4 fix) should also clear it. Otherwise stale text persists when re-opening.
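
For B7, parsing the page size from the `vm_stat` header could be sketched like this. The function names and the sample output are illustrative assumptions, not the code in `api/system/route.ts`; the fallback reuses the previously hardcoded `16384`.

```typescript
const DEFAULT_PAGE_SIZE = 16384; // fallback: the value that used to be hardcoded

// Reads NNNNN from the "(page size of NNNNN bytes)" header line.
function parseVmStatPageSize(vmStatOutput: string): number {
  const match = /page size of (\d+) bytes/.exec(vmStatOutput);
  return match ? Number.parseInt(match[1], 10) : DEFAULT_PAGE_SIZE;
}

// Reads the page count from a line such as `Pages free:   12345.`
// (vm_stat ends each count with a dot). Returns 0 when the label is missing.
function parseVmStatPages(vmStatOutput: string, label: string): number {
  const line = vmStatOutput.split("\n").find((l) => l.startsWith(label + ":"));
  const match = line ? /(\d+)\.?\s*$/.exec(line) : null;
  return match ? Number.parseInt(match[1], 10) : 0;
}

// Illustrative sample, shaped like vm_stat output (the numbers are made up).
const sampleVmStat = [
  "Mach Virtual Memory Statistics: (page size of 16384 bytes)",
  "Pages free:                               12345.",
  "Pages active:                            678901.",
].join("\n");

const freeBytes =
  parseVmStatPages(sampleVmStat, "Pages free") * parseVmStatPageSize(sampleVmStat);
```

Falling back to the old constant keeps the route working if the header format ever changes, at the cost of silently reporting a possibly wrong size.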

---

## 2. Code Quality

- [x] **CQ1. Monolithic 1,079-line single component** — `page.tsx`
  All interfaces, utilities, sub-components, and 900+ lines of JSX in one file. Extract to:
  - `components/` — StatusDot, ProgressBar, ToastContainer, PromptModal, OllamaModelsPanel, SystemPanel, WhisperPanel, BrewPanel
  - `lib/types.ts` — interfaces (OllamaModel, SystemData, etc.)
  - `lib/format.ts` — formatBytes, formatUptime
  - `lib/hooks.ts` — useAutoRefresh, useToasts, useOllamaActions

- [x] **CQ2. Pervasive inline styles instead of CSS/Tailwind classes** — `page.tsx` (100+ occurrences)
  Every `style={{ color: 'var(--text-tertiary)' }}` should be a utility class. Options: custom Tailwind theme mapping, or CSS utility classes in `globals.css` (e.g., `.text-muted`).

- [x] **CQ3. OLLAMA_URL duplicated** — `api/ollama/route.ts:3` + `api/ollama/stream/route.ts:3`
  Same `process.env.OLLAMA_URL || 'http://localhost:11434'` in two files. Extract to `lib/ollama-config.ts`.

- [x] **CQ4. No React Error Boundary** — `page.tsx`
  Unexpected API response shape crashes the entire dashboard. Add an `error.tsx` (Next.js App Router convention) for graceful recovery.

- [x] **CQ5. No loading skeleton / shimmer UI**
  Initial load shows "..." placeholders. Skeleton cards would be more polished.

- [x] **CQ6. No TypeScript strict null checks in API responses**
  API route handlers catch errors but return loosely typed JSON. Add Zod validation on the Ollama/system responses to prevent runtime surprises.

---

## 3. Features

- [x] **F1. Streaming pull with progress bar** _(fixes B2)_
  Use Ollama `stream: true` for `/api/pull`. Create `/api/ollama/pull/route.ts` that pipes NDJSON progress. UI shows progress bar with `completed/total` bytes, speed, and ETA.

- [x] **F2. Model search/filter**
  Search input above models list. Filter by name, family, quantization. Useful when 10+ models are installed.

- [x] **F3. Prompt history (localStorage)**
  Store last 20 prompts with model name + timestamp. Dropdown in prompt modal to re-run previous prompts.

- [x] **F4. Chat mode (multi-turn conversation)**
  Use Ollama `/api/chat` instead of `/api/generate`. Chat bubble layout with message history. System prompt input field.

- [x] **F5. Model comparison (side-by-side)**
  Send same prompt to 2 models simultaneously. Display responses side-by-side with latency/quality comparison.

- [x] **F6. Token/s metrics after generation**
  Parse `eval_count` and `eval_duration` from the final NDJSON chunk. Display tokens/second, total tokens, and latency in the response footer.

- [x] **F7. System resource sparklines (time-series)**
  Ring buffer of memory/CPU snapshots (localStorage). Render mini sparkline charts in the System panel. Spot trends over time.

- [x] **F8. Ollama server logs viewer**
  Read `~/.ollama/logs/` and display in a collapsible terminal-style panel. Filter by level. Auto-scroll.

- [x] **F9. Modelfile / template viewer**
  The `show` action already fetches Modelfile, template, and system prompt. Display in a collapsible code block in expanded model details.

- [x] **F10. Dark/light theme toggle**
  Add `:root.light` CSS variable overrides. Theme toggle with localStorage persistence. Current architecture supports this natively.

- [x] **F11. Keyboard shortcuts panel (`?` key)**
  Show all shortcuts in a modal: ⌘+Enter (send), Esc (close), R (refresh), / (search models), ? (help).

- [x] **F12. Whisper transcription test**
  Upload/record a short audio clip, transcribe locally via whisper-cli, display result with latency. Tests the full local STT pipeline.

- [x] **F13. Responsive mobile layout**
  Better breakpoints for the 4-column stats row and 3-column main grid. Collapsible sidebar on mobile.

- [x] **F14. Model tags/labels (localStorage)**
  User-defined tags (coding, fast, vision) with colored badges. Persisted in localStorage.

- [x] **F15. Extraction service integration panel**
  Show extraction-service (port 4005) health status. Run test extractions against loaded Ollama models. Bridges dashboard to LysnrAI pipeline.

- [x] **F16. Auto-load preferred model**
  Mark a model as "auto-load" (stored in localStorage). When Ollama is online but no models loaded, auto-load the preferred model.
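
The metric calculation in F6 can be illustrated with a small sketch. It assumes, as Ollama's API reports them, that `eval_duration` and `total_duration` are in nanoseconds, and it treats `total_duration` as the "latency" figure, which is a guess at what the dashboard shows. `splitNdjson`, `metricsFromFinalChunk`, and `GenerateMetrics` are hypothetical names.

```typescript
interface GenerateMetrics {
  totalTokens: number;
  tokensPerSecond: number;
  latencyMs: number;
}

// Splits buffered NDJSON text into complete lines plus the unfinished tail,
// which should be kept and prepended to the next network chunk.
function splitNdjson(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((l) => l.trim() !== ""), rest };
}

// Returns metrics only for the final chunk (`done: true`); otherwise null.
function metricsFromFinalChunk(line: string): GenerateMetrics | null {
  const chunk = JSON.parse(line) as {
    done?: boolean;
    eval_count?: number;
    eval_duration?: number; // nanoseconds
    total_duration?: number; // nanoseconds
  };
  if (!chunk.done || !chunk.eval_count || !chunk.eval_duration) return null;
  return {
    totalTokens: chunk.eval_count,
    tokensPerSecond: chunk.eval_count / (chunk.eval_duration / 1e9),
    latencyMs: (chunk.total_duration ?? chunk.eval_duration) / 1e6,
  };
}
```

Keeping the unfinished tail matters because a network chunk can end in the middle of a JSON object; parsing it early would throw.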

---

## 4. Performance & Reliability

- [x] **P1. No request deduplication on Refresh** — `page.tsx:164-176`
  Rapid clicks on Refresh fire duplicate `fetchAll()` calls. Add a `fetchingRef` guard or disable the button during fetch (partially done for `actionLoading` but not for `fetchAll`).

- [x] **P2. Static cache never expires** — `api/system/route.ts:81-90`
  `staticCache` (chip, GPU, brew) lives forever in the server process. Brew package upgrades won't reflect. Add 5-minute TTL.

- [x] **P3. `du -sk ~/.ollama/models` on every refresh** — `api/system/route.ts:41`
  Traverses entire models directory every 15 seconds. Cache with 60-second TTL.

- [x] **P4. No fetch timeout on Ollama calls** — `api/ollama/route.ts:5-12`
  `fetchOllama` has no `AbortSignal` or timeout. If Ollama hangs, the dashboard hangs. Add 5-second timeout.

- [x] **P5. `system_profiler` slow on first load** — `api/system/route.ts:52-53`
  Takes ~2-3 seconds. Cached after first call, but first dashboard load waits. Consider eager background fetch on server start or return placeholder.

---

## 5. Security & Hardening

- [x] **S1. No input validation on model names** — `api/ollama/route.ts:50-51`
  `model` from request body passed directly to Ollama. Add regex validation: `^[a-zA-Z0-9._:/-]{1,256}$`.

- [x] **S2. Shell command interpolation pattern** — `api/system/route.ts:67`
  ``execAsync(`brew list --versions ${pkg}`)`` — safe today (hardcoded targets) but fragile. Use `execFile('brew', ['list', '--versions', pkg])` for safety.

- [x] **S3. No CORS or auth** _(acceptable for local-only, documented)_
  Any local process can call API routes. Fine for dev tool; document the assumption.
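
S1 and S2 together might be sketched as below. The regex and the `execFile` argument list are quoted from the items above; `isValidModelName`, `brewVersion`, and the error handling are assumptions for illustration, and the real routes reportedly use Zod schemas rather than a bare type guard.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Pattern quoted from S1.
const MODEL_NAME_PATTERN = new RegExp("^[a-zA-Z0-9._:/-]{1,256}$");

// Type guard: true only for strings that match the allowed character set.
function isValidModelName(model: unknown): model is string {
  return typeof model === "string" && MODEL_NAME_PATTERN.test(model);
}

// Arguments are passed as an array, so `pkg` is never interpreted by a shell.
async function brewVersion(pkg: string): Promise<string | null> {
  try {
    const { stdout } = await execFileAsync("brew", ["list", "--versions", pkg]);
    return stdout.trim() || null;
  } catch {
    return null; // package not installed, or brew not available
  }
}
```

A route handler would typically reject the request with a 400 response when `isValidModelName` returns false, before anything is forwarded to Ollama.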

---

## 6. Implementation Tracker

### Sprint 1 — Critical Bug Fixes _(est. 1–2 hrs)_

| #   | ID        | Task                                      | Effort | Commit    |
| --- | --------- | ----------------------------------------- | ------ | --------- |
| 1   | - [x] B4  | Guard Escape key during streaming         | 5 min  | `2da67c2` |
| 2   | - [x] B5  | Pause auto-refresh during prompt/pull     | 10 min | `2da67c2` |
| 3   | - [x] B9  | Add AbortController to streaming fetch    | 15 min | `2da67c2` |
| 4   | - [x] B1  | Dynamic chip/RAM in header                | 5 min  | `2da67c2` |
| 5   | - [x] B11 | Clear prompt text on Escape close         | 5 min  | `2da67c2` |
| 6   | - [x] P4  | Add timeout to Ollama fetch calls         | 10 min | `2da67c2` |
| 7   | - [x] B3  | Remove dead generate action (or document) | 5 min  | `2da67c2` |
| 8   | - [x] B6  | Use Date.now() for toast IDs              | 2 min  | `2da67c2` |
| 9   | - [x] B10 | Fix brew "Loading..." vs "empty" state    | 5 min  | `2da67c2` |

### Sprint 2 — Pull Progress + Metrics _(est. 2–3 hrs)_

| #   | ID          | Task                                | Effort | Commit    |
| --- | ----------- | ----------------------------------- | ------ | --------- |
| 10  | - [x] B2+F1 | Streaming pull with progress bar    | 60 min | `2d9475b` |
| 11  | - [x] F6    | Display tokens/s after generation   | 30 min | `2d9475b` |
| 12  | - [x] B7    | Parse vm_stat page size dynamically | 10 min | `2d9475b` |
| 13  | - [x] B8    | Multi-path whisper model discovery  | 15 min | `2d9475b` |

### Sprint 3 — Component Refactor _(est. 2–3 hrs)_

| #   | ID        | Task                                    | Effort | Commit    |
| --- | --------- | --------------------------------------- | ------ | --------- |
| 14  | - [x] CQ1 | Extract components into separate files  | 90 min | `75a3cd0` |
| 15  | - [x] CQ4 | Add error.tsx Error Boundary            | 15 min | `75a3cd0` |
| 16  | - [x] CQ3 | Shared ollama-config.ts                 | 10 min | `75a3cd0` |
| 17  | - [x] CQ2 | Consolidate inline styles → CSS classes | 45 min | `ed93a6f` |
| 18  | - [x] S1  | Add model name input validation         | 10 min | `75a3cd0` |
| 19  | - [x] S2  | Replace exec → execFile for brew        | 10 min | `75a3cd0` |

### Sprint 4 — UX Enhancements _(est. 3–4 hrs)_

| #   | ID        | Task                                 | Effort | Commit    |
| --- | --------- | ------------------------------------ | ------ | --------- |
| 20  | - [x] F3  | Prompt history (localStorage)        | 45 min | `9c2f5f3` |
| 21  | - [x] F9  | Modelfile viewer in expanded details | 30 min | `9c2f5f3` |
| 22  | - [x] F4  | Chat mode (multi-turn via /api/chat) | 90 min | `ed93a6f` |
| 23  | - [x] F2  | Model search/filter                  | 30 min | `9c2f5f3` |
| 24  | - [x] F11 | Keyboard shortcuts panel             | 20 min | `9c2f5f3` |

### Sprint 5 — Integration & Polish _(est. 2–3 hrs)_

| #   | ID          | Task                       | Effort | Commit    |
| --- | ----------- | -------------------------- | ------ | --------- |
| 25  | - [x] F15   | Extraction service panel   | 60 min | `8bdd5ee` |
| 26  | - [x] F12   | Whisper transcription test | 45 min | `8bdd5ee` |
| 27  | - [x] F7    | System resource sparklines | 45 min | `8bdd5ee` |
| 28  | - [x] CQ5   | Loading skeleton UI        | 20 min | `8bdd5ee` |
| 29  | - [x] P1-P3 | Request dedup + cache TTLs | 30 min | `b1fda3a` |
| 30  | - [x] F16   | Auto-load preferred model  | 20 min | `ed93a6f` |

### Deferred (nice-to-have)

| ID        | Task                            | Notes     |
| --------- | ------------------------------- | --------- |
| - [x] F5  | Model comparison (side-by-side) | `8bdd5ee` |
| - [x] F10 | Dark/light theme toggle         | `ed93a6f` |
| - [x] F13 | Responsive mobile layout        | `8bdd5ee` |
| - [x] F14 | Model tags/labels               | `ed93a6f` |
| - [x] CQ6 | Zod validation on API responses | `ed93a6f` |
| - [x] F8  | Ollama server logs viewer       | `8bdd5ee` |
| - [x] S3  | CORS / auth (documented)        | `8bdd5ee` |

---

## 7. Commit Log

_Commits will be added here as work progresses._

| #   | Date   | Commit    | Sprint   | Items Completed                      |
| --- | ------ | --------- | -------- | ------------------------------------ |
| 1   | Feb 19 | `2da67c2` | Sprint 1 | B1, B3, B4, B5, B6, B9, B10, B11, P4 |
| 2   | Feb 19 | `2d9475b` | Sprint 2 | B2, B7, B8, F1, F6                   |
| 3   | Feb 19 | `75a3cd0` | Sprint 3 | CQ1, CQ3, CQ4, S1, S2                |
| 4   | Feb 19 | `9c2f5f3` | Sprint 4 | F2, F3, F9, F11                      |
| 5   | Feb 19 | `b1fda3a` | Sprint 5 | P1, P2, P3                           |
| 6   | Feb 19 | `ed93a6f` | Sprint 6 | CQ2, CQ6, P5, F4, F10, F14, F16      |
| 7   | Feb 19 | `8bdd5ee` | Sprint 7 | F5, F7, F8, F12, F13, F15, CQ5, S3   |

---

> **39 items total:** 11 bugs, 6 code quality, 16 features, 5 performance, 3 security
> **All 39 items completed** across 7 sprints (9 code commits + doc updates)
> **Actual total effort:** ~8 hours across 7 sprints

---

## 8. Next Wave — Model Intelligence & Pre-Load Metrics

> Proposed improvements focused on helping users make informed decisions **before** loading a model.

### Tier A — Pre-Load Decision Metrics _(est. 45 min)_

| ID  | Feature                        | Description                                                                                                                                     |
| --- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| N1  | **Estimated RAM per model**    | Approximate from disk size: Q4_K_M ≈ 1.2×disk in RAM. Show on every model card (e.g., `~22 GB RAM`), not just running models.                   |
| N2  | **"Will it fit?" indicator**   | Compare estimated RAM vs `system.memory.free + cached`. Color-code: 🟢 Fits, 🟡 Tight (80–100%), 🔴 Won't fit. Show on Load button or as badge. |
| N3  | **Aggregate loaded model RAM** | Sum VRAM of all running models. Display at top of models panel: "3 models loaded · 28.5 GB VRAM".                                               |
|
||||
|
||||
### Tier B — Rich Model Metadata _(est. 60 min)_

| ID  | Feature                 | Description                                                                                                                                |
| --- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| N4  | **RAM budget bar**      | Horizontal stacked bar: `[OS+Apps ▏ Model A (loaded) ▏ Model B (loaded) ▏ Free]`. Instant visual of memory headroom.                       |
| N5  | **Context window size** | Fetch `context_length` from Ollama `/api/show` → `model_info`. Display on card (e.g., `128k ctx`). Critical for knowing max prompt length. |

### Tier C — Model Intelligence Badges _(est. 45 min)_

| ID  | Feature                     | Description                                                                                                                       |
| --- | --------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| N6  | **`<think>` warning badge** | If model is DeepSeek R1 family, show ⚠️ badge: "Emits `<think>` traces — strip before JSON.parse". Prevents silent JSON failures. |
| N7  | **Vision model indicator**  | If model is multimodal (llava, qwen2.5vl), show 👁 badge. These need image input — text-only prompts are suboptimal.              |
| N8  | **Architecture badge**      | Show model arch (llama, qwen2, phi3, deepseek2) as subtle pill on the card. Currently buried in expanded details.                 |
| N9  | **Sort/order models**       | Dropdown to sort by: name, size, parameters, running status, last modified. Currently uses Ollama's default order.                |
| N10 | **Ollama version display**  | Call `/api/version`. Show in Ollama status card. Useful for debugging model compatibility.                                        |

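The N6 warning exists because a response that starts with a `<think>` trace cannot be passed to `JSON.parse` directly. A minimal sketch of the stripping step follows; `stripThink` and `parseModelJson` are hypothetical helper names, and the sketch assumes the trace is wrapped in a properly closed `<think>…</think>` pair.

```typescript
/** Remove every closed `<think>...</think>` block from a model response. */
function stripThink(output: string): string {
  // [\s\S] matches across newlines; the lazy quantifier stops at the first closing tag.
  return output.replace(/<think>[\s\S]*?<\/think>/gi, "").trim();
}

/** Strip reasoning traces first, then parse what remains as JSON. */
function parseModelJson(output: string): unknown {
  return JSON.parse(stripThink(output));
}
```

Calling `parseModelJson('<think>plan</think>\n{"ok": true}')` parses the trailing object, whereas `JSON.parse` on the raw string would throw.
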
### Tier D — Runtime Metrics & UX _(est. 30 min)_

| ID  | Feature                           | Description                                                                                                                                    |
| --- | --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| N11 | **Last known tok/s per model**    | Persist `StreamMetrics.tokensPerSec` in localStorage keyed by model. Show on card (e.g., `~45 tok/s`). Compare speeds without re-benchmarking. |
| N12 | **Auto-unload countdown**         | Replace static `Expires: 3:45 PM` with live countdown: `Unloads in 4m 32s`. More actionable.                                                   |
| N13 | **Session stats per model**       | Track prompts sent + tokens generated per model in session. Show in expanded details.                                                          |
| N14 | **Delete confirmation + reclaim** | Show "Delete qwen2.5-coder:32b? Reclaim 18.5 GB disk." before deleting. Currently no confirmation.                                             |
| N15 | **Simultaneous load suggestions** | Based on available RAM, suggest which models can be co-loaded. E.g., "Can co-load llama3.1:8b + qwen2.5-coder:32b (28 GB, 20 GB free)".        |

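One possible shape for the N15 suggestion logic is sketched below. It assumes the same 1.2× disk-size heuristic as N1 and only considers pairs of models; `Candidate`, `ramEstimate` and `coLoadPairs` are illustrative names, not existing dashboard code.

```typescript
/** A model that is installed but not loaded; `size` is bytes on disk. */
interface Candidate {
  name: string;
  size: number;
}

/** Same rough heuristic as N1: RAM is about 1.2x the on-disk size. */
function ramEstimate(diskSize: number): number {
  return diskSize * 1.2;
}

/** Return every pair of models whose combined estimate fits in `availableRam` bytes. */
function coLoadPairs(models: Candidate[], availableRam: number): [string, string][] {
  const pairs: [string, string][] = [];
  models.forEach((a, i) => {
    // Only look at later entries so each pair is reported once.
    for (const b of models.slice(i + 1)) {
      if (ramEstimate(a.size) + ramEstimate(b.size) <= availableRam) {
        pairs.push([a.name, b.name]);
      }
    }
  });
  return pairs;
}
```

Restricting the search to pairs keeps the work quadratic in the number of models; larger combinations would need a different approach.
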
### Implementation Plan

| Sprint | Items                   | Focus                    | Effort  |
| ------ | ----------------------- | ------------------------ | ------- |
| 8      | N1, N2, N3              | Pre-load RAM estimates   | ~45 min |
| 9      | N4, N5                  | RAM bar + context window | ~60 min |
| 10     | N6, N7, N8, N9, N10     | Badges + sort + version  | ~45 min |
| 11     | N11, N12, N13, N14, N15 | Runtime metrics + UX     | ~30 min |

157
__LOCAL_LLMs/dashboard/docs/DASHBOARD_ROADMAP.md
Normal file
@ -0,0 +1,157 @@
# Mission Control Dashboard — Implementation Roadmap

> Phased plan for the next wave of dashboard improvements.
> Last updated: Feb 19, 2026
>
> See also: [DASHBOARD_PRD.md](DASHBOARD_PRD.md) · [DASHBOARD_REVIEW.md](DASHBOARD_REVIEW.md)
>
> **Previous work:** 39 items (11 bugs, 6 code quality, 16 features, 5 performance, 3 security) completed across 7 sprints. See DASHBOARD_REVIEW.md for full audit and commit log.

---

## Vision

Transform the dashboard from a model management tool into a **model intelligence platform** — helping developers make informed decisions about which models to load, how they perform, and how they interact with system resources.

---

## Phase 1 — Pre-Load Intelligence _(Sprint 8)_

**Goal:** Give users the information they need **before** clicking "Load" on a model.

**Estimated effort:** ~45 minutes

| #   | ID  | Task                       | Status | Priority | Notes                                                       |
| --- | --- | -------------------------- | ------ | -------- | ----------------------------------------------------------- |
| 1   | N1  | Estimated RAM per model    | [ ]    | High     | Q4_K_M ≈ 1.2× disk size. Show `~22 GB RAM` on every card.   |
| 2   | N2  | "Will it fit?" indicator   | [ ]    | High     | 🟢 Fits / 🟡 Tight / 🔴 Won't fit based on free+cached RAM. |
| 3   | N3  | Aggregate loaded model RAM | [ ]    | High     | Sum VRAM at top of panel: "2 loaded · 28.5 GB VRAM".        |

**Implementation details:**

- **N1:** Add `estimateRam(diskSize: number)` to `lib/format.ts`. Returns `diskSize * 1.2`. Display below existing size/params/quant line on each model card.
- **N2:** Compare `estimateRam(model.size)` against `system.memory.free + system.memory.cached`. Pass `system` into the model list rendering. Add colored dot or badge next to Load button.
- **N3:** Compute `ollama.running.reduce((sum, r) => sum + r.size_vram, 0)` and display in the models panel header next to "X active".

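The three helpers described above could look roughly like the sketch below. `estimateRam` and the `reduce` over `size_vram` come straight from the notes; `fitStatus`, `totalVram`, the `SystemMemory` and `RunningModel` shapes, and the 80% "tight" threshold (borrowed from the "Tight (80–100%)" band in the N2 proposal) are assumptions made for illustration.

```typescript
/** Rough RAM estimate for a Q4_K_M model: about 1.2x its size on disk (bytes). */
function estimateRam(diskSize: number): number {
  return diskSize * 1.2;
}

type FitStatus = "fits" | "tight" | "wont-fit";

/** Assumed slice of the system payload: free and cached memory in bytes. */
interface SystemMemory {
  free: number;
  cached: number;
}

/** Classify a model against the memory that could be made available to it. */
function fitStatus(diskSize: number, memory: SystemMemory): FitStatus {
  const available = memory.free + memory.cached;
  if (available <= 0) return "wont-fit";
  const ratio = estimateRam(diskSize) / available;
  if (ratio > 1) return "wont-fit";
  // 80-100% of available memory counts as "tight".
  return ratio >= 0.8 ? "tight" : "fits";
}

/** Assumed slice of a running-model entry. */
interface RunningModel {
  name: string;
  size_vram: number;
}

/** Sum the VRAM of all running models for the panel header. */
function totalVram(running: RunningModel[]): number {
  return running.reduce((sum, r) => sum + r.size_vram, 0);
}
```

Keeping these as pure functions means they can sit in `lib/format.ts` next to the existing formatters and be checked without rendering the page.
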
**Acceptance criteria:**

- Every model card shows estimated RAM requirement
- Load button has a color-coded fit indicator
- Panel header shows total VRAM of loaded models
- TypeScript compiles cleanly

---

## Phase 2 — Rich Metadata _(Sprint 9)_

**Goal:** Surface critical model metadata that's currently hidden behind the Ollama API.

**Estimated effort:** ~60 minutes

| #   | ID  | Task                | Status | Priority | Notes                                                                        |
| --- | --- | ------------------- | ------ | -------- | ---------------------------------------------------------------------------- |
| 4   | N4  | RAM budget bar      | [ ]    | Medium   | Stacked horizontal bar: OS+Apps / Loaded models (by name) / Free.            |
| 5   | N5  | Context window size | [ ]    | High     | Fetch `context_length` from `/api/show` model_info. Show `128k ctx` on card. |

**Implementation details:**

- **N4:** New `RamBudgetBar` component. Inputs: `totalRam`, `appMemory`, `runningModels[]` (each with name + size_vram), `freeRam`. Renders as a CSS flex bar with labeled segments. Place above the models list.
- **N5:** Extend the `OllamaModel` interface to include optional `context_length?: number`. On expand (or eagerly for all models), call `/api/ollama` POST with `action: 'show'` and extract `model_info.*.context_length`. Cache in component state. Show as badge on card.

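The data side of both items can be sketched without any UI code. The example below assumes the `show` response exposes `model_info` as a flat object whose keys look like `<arch>.context_length`, and that the bar only needs a label and a percentage per segment; `extractContextLength`, `formatContext` and `budgetSegments` are hypothetical names.

```typescript
/** Assumed subset of the `show` response; only `model_info` is read. */
interface ShowResponse {
  model_info?: Record<string, unknown>;
}

/** Find the first `<arch>.context_length` entry, e.g. `llama.context_length`. */
function extractContextLength(show: ShowResponse): number | undefined {
  for (const [key, value] of Object.entries(show.model_info ?? {})) {
    if (key.endsWith(".context_length") && typeof value === "number") {
      return value;
    }
  }
  return undefined;
}

/** Compact badge label for a token count, e.g. 131072 becomes "128k ctx". */
function formatContext(tokens: number): string {
  return tokens >= 1024 ? `${Math.round(tokens / 1024)}k ctx` : `${tokens} ctx`;
}

interface LoadedModel {
  name: string;
  size_vram: number;
}

interface BudgetSegment {
  label: string;
  bytes: number;
  percent: number;
}

/** Build the bar segments in display order: OS+Apps, each loaded model, Free. */
function budgetSegments(
  totalRam: number,
  appMemory: number,
  runningModels: LoadedModel[],
  freeRam: number,
): BudgetSegment[] {
  const raw = [
    { label: "OS+Apps", bytes: appMemory },
    ...runningModels.map((m) => ({ label: m.name, bytes: m.size_vram })),
    { label: "Free", bytes: freeRam },
  ];
  return raw.map((s) => ({
    ...s,
    percent: totalRam > 0 ? (s.bytes * 100) / totalRam : 0,
  }));
}
```

Each segment's `percent` could then drive a flex-basis or width style inside `RamBudgetBar`.
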
**Acceptance criteria:**

- RAM budget bar visually represents memory allocation
- Each model shows context window length (once fetched)
- Bar updates when models are loaded/unloaded

---

## Phase 3 — Model Intelligence Badges _(Sprint 10)_

**Goal:** Auto-detect model capabilities and surface warnings so users don't hit surprises.

**Estimated effort:** ~45 minutes

| #   | ID  | Task                    | Status | Priority | Notes                                                                |
| --- | --- | ----------------------- | ------ | -------- | -------------------------------------------------------------------- |
| 6   | N6  | `<think>` warning badge | [ ]    | High     | DeepSeek R1 models emit reasoning traces — warn about JSON stripping |
| 7   | N7  | Vision model indicator  | [ ]    | Medium   | Multimodal models (llava, qwen2.5vl) need image input                |
| 8   | N8  | Architecture badge      | [ ]    | Low      | Show model arch as pill on card (currently in expanded only)         |
| 9   | N9  | Sort/order models       | [ ]    | Medium   | Dropdown: name, size, parameters, running, modified                  |
| 10  | N10 | Ollama version display  | [ ]    | Low      | Show Ollama server version in status card                            |

**Implementation details:**

- **N6:** Pattern match model name: `/deepseek-r1/i` or family containing `deepseek`. Show amber ⚠️ badge with tooltip: "Emits `<think>` traces before JSON output".
- **N7:** Pattern match: `/llava|qwen.*vl|minicpm-v/i`. Show 👁 badge with tooltip: "Vision model — supports image input".
- **N8:** Move `model.details.family` from expanded-only to always-visible as a subtle pill badge.
- **N9:** Add `modelSort` state (`'name' | 'size' | 'params' | 'running' | 'modified'`). Sort the filtered model list before `.map()`. Add dropdown above the model list (next to search bar).
- **N10:** New API call to Ollama `/api/version` in the GET handler. Return in OllamaData. Display in the Ollama stats card.

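A rough sketch of the detection and sorting logic for N6 to N9 is shown below. The two regular expressions and the five sort keys are taken from the notes above; the `ModelSummary` shape, the badge strings, and the `parseParams` helper are assumptions, and only the name pattern from N6 is checked (the family check is left out for brevity).

```typescript
/** Assumed subset of a model entry from the dashboard's model list. */
interface ModelSummary {
  name: string;
  size: number; // bytes on disk
  modified_at: string; // ISO timestamp
  details?: { family?: string; parameter_size?: string };
}

type ModelSort = "name" | "size" | "params" | "running" | "modified";

const THINK_PATTERN = /deepseek-r1/i;
const VISION_PATTERN = /llava|qwen.*vl|minicpm-v/i;

/** Derive badge identifiers for a model card (badge names are illustrative). */
function modelBadges(model: ModelSummary): string[] {
  const badges: string[] = [];
  if (THINK_PATTERN.test(model.name)) badges.push("think");
  if (VISION_PATTERN.test(model.name)) badges.push("vision");
  if (model.details?.family) badges.push(`arch:${model.details.family}`);
  return badges;
}

/** Parse labels such as "32B" or "568M" into a parameter count (0 if unknown). */
function parseParams(label: string | undefined): number {
  const match = /^([\d.]+)\s*([MB])$/i.exec(label ?? "");
  if (!match) return 0;
  const scale = match[2]?.toUpperCase() === "B" ? 1e9 : 1e6;
  return parseFloat(match[1] ?? "0") * scale;
}

/** Return a sorted copy; the input array is left untouched. */
function sortModels(
  models: ModelSummary[],
  by: ModelSort,
  runningNames: Set<string> = new Set(),
): ModelSummary[] {
  const copy = [...models];
  switch (by) {
    case "name":
      return copy.sort((a, b) => a.name.localeCompare(b.name));
    case "size":
      return copy.sort((a, b) => b.size - a.size);
    case "params":
      return copy.sort(
        (a, b) => parseParams(b.details?.parameter_size) - parseParams(a.details?.parameter_size),
      );
    case "running":
      // Running models first; ties keep their existing order (sort is stable).
      return copy.sort(
        (a, b) => Number(runningNames.has(b.name)) - Number(runningNames.has(a.name)),
      );
    case "modified":
      return copy.sort((a, b) => Date.parse(b.modified_at) - Date.parse(a.modified_at));
  }
}
```

Sorting a copy rather than the original array avoids mutating React state in place when `sortModels` is called during render.
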
**Acceptance criteria:**

- DeepSeek R1 models show `<think>` warning badge
- Vision models show eye indicator
- Models can be sorted by any of the 5 criteria
- Ollama version shown in status card

---

## Phase 4 — Runtime Metrics & Polish _(Sprint 11)_

**Goal:** Improve the experience for users who are actively using models.

**Estimated effort:** ~30 minutes

| #   | ID  | Task                          | Status | Priority | Notes                                                    |
| --- | --- | ----------------------------- | ------ | -------- | -------------------------------------------------------- |
| 11  | N11 | Last known tok/s per model    | [ ]    | Medium   | Persist after prompt, show on card                       |
| 12  | N12 | Auto-unload countdown         | [ ]    | Medium   | Live countdown instead of static expiry time             |
| 13  | N13 | Session stats per model       | [ ]    | Low      | Prompts sent + tokens generated in this session          |
| 14  | N14 | Delete confirmation + reclaim | [ ]    | High     | "Delete X? Reclaim Y GB" dialog before deleting          |
| 15  | N15 | Simultaneous load suggestions | [ ]    | Low      | Suggest which models fit together based on available RAM |

**Implementation details:**

- **N11:** After prompt/chat completes, save `{ model: string, tokPerSec: number, timestamp: number }` to localStorage key `modelBenchmarks`. Show on card as faded text: `~45 tok/s`.
- **N12:** Replace `Expires: {time}` with a `useEffect` interval that computes `new Date(expires_at).getTime() - Date.now()` (Ollama reports `expires_at` as an ISO timestamp string) and formats the result as `Xm Ys`. Update every second and clear the interval on unmount.
- **N13:** Track in component state: `Map<string, { prompts: number, tokens: number }>`. Increment on each prompt/chat completion. Display in expanded details.
- **N14:** Add `confirmDelete` state. When delete is clicked, show inline confirmation with model name and `formatBytes(model.size)` reclaim amount. Second click executes.
- **N15:** After computing estimated RAM for all unloaded models, filter those that fit in remaining free memory. Show as suggestions in the models panel footer.

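The N11 and N12 logic can be kept as plain functions and wired into React afterwards. The sketch below assumes `expires_at` arrives as an ISO timestamp string and that benchmarks are stored as one JSON object keyed by model name under `modelBenchmarks`; the `KeyValueStore` interface is a hypothetical stand-in that `window.localStorage` satisfies, which keeps the helpers usable outside the browser.

```typescript
/** Format the time left until `expiresAt` as "Xm Ys"; expired or invalid input gives "0m 0s". */
function formatCountdown(expiresAt: string, now: number = Date.now()): string {
  const remainingMs = new Date(expiresAt).getTime() - now;
  // `!(x > 0)` is also true for NaN, which is what an unparsable date produces.
  if (!(remainingMs > 0)) return "0m 0s";
  const totalSeconds = Math.floor(remainingMs / 1000);
  return `${Math.floor(totalSeconds / 60)}m ${totalSeconds % 60}s`;
}

interface Benchmark {
  model: string;
  tokPerSec: number;
  timestamp: number;
}

/** Minimal storage contract; `window.localStorage` matches it. */
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const BENCHMARK_KEY = "modelBenchmarks";

/** Read all stored benchmarks, falling back to an empty object on bad data. */
function loadBenchmarks(store: KeyValueStore): Record<string, Benchmark> {
  try {
    const parsed: unknown = JSON.parse(store.getItem(BENCHMARK_KEY) ?? "{}");
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as Record<string, Benchmark>;
    }
    return {};
  } catch {
    return {};
  }
}

/** Record the latest tok/s for a model, replacing any earlier entry. */
function saveBenchmark(
  store: KeyValueStore,
  model: string,
  tokPerSec: number,
  now: number = Date.now(),
): void {
  const all = loadBenchmarks(store);
  all[model] = { model, tokPerSec, timestamp: now };
  store.setItem(BENCHMARK_KEY, JSON.stringify(all));
}
```

In the component, `formatCountdown(model.expires_at)` would be recomputed from a one-second interval, and `saveBenchmark(window.localStorage, model, tokPerSec)` called when a stream finishes.
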
**Acceptance criteria:**

- Previously benchmarked models show tok/s on card
- Running models show live countdown to unload
- Delete requires explicit confirmation showing disk reclaim
- TypeScript compiles cleanly after all phases

---

## Phase 5 — Future Considerations _(Backlog)_

Not planned for immediate implementation. Revisit after Phases 1–4.

| ID  | Feature                        | Complexity | Notes                                                             |
| --- | ------------------------------ | ---------- | ----------------------------------------------------------------- |
| F17 | WebSocket real-time updates    | High       | Replace 15s polling with push-based updates from Ollama           |
| F18 | GPU/Metal utilization chart    | Medium     | macOS `powermetrics` or IOKit for GPU load percentage             |
| F19 | Model download queue           | Medium     | Queue multiple pulls, show progress for each                      |
| F20 | Inference history log          | Medium     | Persist all prompts/responses to localStorage or file             |
| F21 | Custom Modelfile editor        | Medium     | Edit and push custom Modelfiles to Ollama                         |
| F22 | Benchmark suite                | High       | Run standard prompts across all models, generate comparison table |
| F23 | Component decomposition (CQ1b) | High       | Break 1,885-line page.tsx into feature-based modules              |

---

## Summary

| Phase     | Sprint | Items        | Focus                    | Effort     | Depends On |
| --------- | ------ | ------------ | ------------------------ | ---------- | ---------- |
| 1         | 8      | N1, N2, N3   | Pre-load RAM estimates   | ~45 min    | —          |
| 2         | 9      | N4, N5       | RAM bar + context window | ~60 min    | Phase 1    |
| 3         | 10     | N6–N10       | Badges + sort + version  | ~45 min    | —          |
| 4         | 11     | N11–N15      | Runtime metrics + UX     | ~30 min    | —          |
| **Total** |        | **15 items** |                          | **~3 hrs** |            |

Phases 1 and 3 can run in parallel. Phase 2 depends on Phase 1 (needs RAM estimates for the budget bar). Phase 4 is independent, except that N15 reuses the N1 RAM estimate from Phase 1.

@ -1,27 +1,12 @@
# 05 — Mission Control Dashboard

> Next.js dashboard for visualizing and managing the local LLM stack.
> **Documentation has moved.** All dashboard docs now live in the dashboard directory.

---
- **PRD:** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_PRD.md`](../dashboard/docs/DASHBOARD_PRD.md)
- **Review (39 items):** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_REVIEW.md`](../dashboard/docs/DASHBOARD_REVIEW.md)
- **Roadmap (N1–N15):** [`__LOCAL_LLMs/dashboard/docs/DASHBOARD_ROADMAP.md`](../dashboard/docs/DASHBOARD_ROADMAP.md)

## Overview

A dark-themed, real-time dashboard built with Next.js 16 that provides:

- **Ollama status** — online/offline, model list, loaded models, load/unload actions
- **Streaming prompt interface** — send prompts to any loaded model with real-time streaming responses
- **Model management** — pull new models, delete models (with confirmation), view VRAM/expiry info
- **System resources** — chip info, RAM/disk usage with progress bars, uptime
- **Whisper.cpp status** — installed binaries, downloaded models
- **Brew packages** — version tracking for ollama, whisper-cpp, ffmpeg
- **Auto-refresh** — polls all endpoints every 15 seconds
- **Toast notifications** — success/error/info feedback for all actions
- **Keyboard shortcuts** — Cmd+Enter to send prompt, Escape to close modals
- **Copy response** — one-click copy of model responses to clipboard

---

## Running
## Quick Start

```bash
cd __LOCAL_LLMs/dashboard
@ -30,247 +15,3 @@ npm run dev -- -p 3100
```

Open: **http://localhost:3100**

---

## Tech Stack

| Component | Technology                          |
| --------- | ----------------------------------- |
| Framework | Next.js 16 (App Router)             |
| Language  | TypeScript                          |
| Styling   | TailwindCSS v4                      |
| Icons     | Lucide React                        |
| React     | 19                                  |
| Theme     | Dark — ByteLyst design token colors |

---

## Architecture

```
dashboard/
├── src/
│   ├── app/
│   │   ├── layout.tsx              ← Root layout (dark theme, system fonts)
│   │   ├── page.tsx                ← Main dashboard (~800 lines, client component)
│   │   ├── globals.css             ← Dark theme CSS (ByteLyst tokens + animations)
│   │   └── api/
│   │       ├── ollama/
│   │       │   ├── route.ts        ← Ollama proxy (GET: list+ps, POST: load/unload/generate/pull/delete/show)
│   │       │   └── stream/route.ts ← Streaming generate (NDJSON proxy to Ollama)
│   │       ├── whisper/route.ts    ← Whisper discovery (binaries, models, version)
│   │       └── system/route.ts     ← System info (chip, RAM, disk, brew) — cached statics
│   └── components/                 ← (empty, all inline in page.tsx for now)
├── package.json
├── next.config.ts
├── tsconfig.json
├── postcss.config.mjs
└── tailwind.config.ts              (auto via v4)
```

---

## API Routes

### GET /api/ollama

Fetches Ollama status, model list, and running models.

**Response:**

```json
{
  "status": "online",
  "url": "http://localhost:11434",
  "models": [
    {
      "name": "qwen2.5-coder:32b",
      "size": 19000000000,
      "digest": "b92d6a0bd47e...",
      "modified_at": "2026-02-19T...",
      "details": {
        "family": "qwen2",
        "parameter_size": "32B",
        "quantization_level": "Q4_K_M"
      }
    }
  ],
  "running": [],
  "totalModels": 2,
  "totalSize": 23900000000,
  "runningCount": 0
}
```

### POST /api/ollama

Model management actions.

**Load a model into RAM:**

```json
{ "action": "load", "model": "qwen2.5-coder:32b" }
```

**Unload a model from RAM:**

```json
{ "action": "unload", "model": "qwen2.5-coder:32b" }
```

**Send a prompt:**

```json
{ "action": "generate", "model": "qwen2.5-coder:32b", "prompt": "Write hello world in Swift" }
```

### GET /api/whisper

Discovers whisper-cpp installation.

**Response:**

```json
{
  "installed": true,
  "version": "unknown",
  "binaries": [
    "whisper-bench",
    "whisper-cli",
    "whisper-command",
    "whisper-lsp",
    "whisper-quantize",
    "whisper-server",
    "whisper-stream",
    "whisper-talk-llama",
    "whisper-vad-speech-segments"
  ],
  "models": [],
  "modelsDir": "/Users/sd9235/whisper-models"
}
```

### GET /api/system

System hardware and software info.

**Response:**

```json
{
  "chip": "Apple M4 Pro",
  "gpu": "Apple Silicon (integrated)",
  "memory": { "total": 51539607552, "used": 38000000000, "free": 13539607552 },
  "disk": { "total": 994662584320, "used": 450000000000, "free": 544662584320 },
  "ollamaDiskUsage": 24500000000,
  "cpuCores": 14,
  "uptime": 86400,
  "platform": "Darwin 25.3.0",
  "arch": "arm64",
  "nodeVersion": "v25.2.1",
  "brewPackages": [
    { "name": "ollama", "version": "0.16.2" },
    { "name": "whisper-cpp", "version": "1.8.3" },
    { "name": "ffmpeg", "version": "8.0.1_4" }
  ]
}
```

---

## Dashboard UI Features

### Header

- App icon + title ("Local LLM Mission Control")
- Machine info subtitle (chip, RAM, OS)
- Last refresh timestamp + manual refresh button

### Top Stats Row (4 cards)

1. **Ollama** — online/offline status with pulsing dot, server URL
2. **Models** — count + total disk size, number loaded in RAM
3. **Whisper** — installed/not found, model count
4. **Memory** — used/total with color-coded progress bar (blue → yellow → red)

### Main Grid (2 columns + 1 sidebar)

**Left (2/3 width) — Ollama Models:**

- Each model shown as a card with name, size, parameter count, quantization
- Green highlight + "LOADED" badge for models currently in RAM
- Actions per model:
  - **Load** — loads model into RAM (green button)
  - **Unload** — evicts from RAM (red button)
  - **Prompt** — opens prompt modal (blue button, only for loaded models)
- Expandable details (digest, modified date, family)

**Right (1/3 width):**

- **System** — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
- **Whisper.cpp** — status, binary list as tags, model list with sizes
- **Brew Packages** — name + version for ollama, whisper-cpp, ffmpeg

### Prompt Modal

- Slide-up overlay when clicking "Prompt" on a loaded model
- Textarea with Cmd+Enter shortcut to send
- Sends to Ollama `/api/generate` (non-streaming)
- Response displayed in monospace pre block with scroll

---

## Design Tokens

The dashboard uses ByteLyst design system colors:

| Token                | Hex       | Use               |
| -------------------- | --------- | ----------------- |
| `--bg-canvas`        | `#06070A` | Page background   |
| `--bg-elevated`      | `#0E1118` | Modal background  |
| `--surface-card`     | `#121725` | Card backgrounds  |
| `--surface-muted`    | `#1A2335` | Muted areas       |
| `--text-primary`     | `#EFF4FF` | Main text         |
| `--text-secondary`   | `#A5B1C7` | Descriptions      |
| `--text-tertiary`    | `#6C7C98` | Hints, timestamps |
| `--accent-primary`   | `#5A8CFF` | Primary actions   |
| `--accent-secondary` | `#2EE6D6` | Secondary accent  |
| `--success`          | `#34D399` | Online, loaded    |
| `--warning`          | `#F59E0B` | Warning state     |
| `--danger`           | `#FF6E6E` | Offline, errors   |
| `--purple`           | `#A78BFA` | Whisper, disk     |

---

## Changelog

### v2 (2026-02-19)

- Fixed Google Fonts build error (corporate proxy blocks fonts.gstatic.com) — switched to system fonts
- Fixed system API slowness (7.6s → 50ms) by caching static info (chip, GPU, brew packages)
- Added streaming prompt responses via NDJSON proxy to Ollama `/api/generate`
- Added model pull UI — input field + pull button in Ollama Models section
- Added model delete with confirmation dialog in expanded details
- Added VRAM usage and expiry time display for running models
- Added toast notifications (success/error/info) with slide-in animation
- Added copy response button in prompt modal
- Added Escape key to close modals, backdrop click to dismiss
- Added streaming indicator with pulsing dot

### v1 (2026-02-19)

- Initial dashboard with Ollama status, model list, system info, Whisper.cpp, brew packages
- Load/unload model actions
- Basic prompt interface (non-streaming)
- Auto-refresh every 15 seconds
- Dark theme with ByteLyst design tokens

## Future Improvements

- [ ] Whisper transcription UI (upload audio, get text)
- [ ] GPU utilization chart (Metal usage over time)
- [ ] Model comparison benchmarks (side-by-side prompt same question)
- [ ] Chat mode (multi-turn conversation with history)
- [ ] WebSocket for real-time status updates (replace polling)
- [ ] Component extraction (break page.tsx into smaller components)