- docs/05-mission-control-dashboard.md: complete dashboard reference with architecture diagram, API route docs (request/response examples), UI feature descriptions, design tokens table, v1/v2 changelog, and future improvements roadmap
05 — Mission Control Dashboard
Next.js dashboard for visualizing and managing the local LLM stack.
Overview
A dark-themed, real-time dashboard built with Next.js 16 that provides:
- Ollama status — online/offline, model list, loaded models, load/unload actions
- Streaming prompt interface — send prompts to any loaded model with real-time streaming responses
- Model management — pull new models, delete models (with confirmation), view VRAM/expiry info
- System resources — chip info, RAM/disk usage with progress bars, uptime
- Whisper.cpp status — installed binaries, downloaded models
- Brew packages — version tracking for ollama, whisper-cpp, ffmpeg
- Auto-refresh — polls all endpoints every 15 seconds
- Toast notifications — success/error/info feedback for all actions
- Keyboard shortcuts — Cmd+Enter to send prompt, Escape to close modals
- Copy response — one-click copy of model responses to clipboard
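The auto-refresh behaviour listed above can be sketched as a small timer helper. This is only an illustration: `startPolling` and `REFRESH_INTERVAL_MS` are hypothetical names, and the actual refresh logic in page.tsx may be structured differently.

```typescript
// Hypothetical helper; the real refresh logic lives in page.tsx and may differ.
const REFRESH_INTERVAL_MS = 15_000; // 15 seconds, per the feature list

// Runs `task` once immediately, then again every `intervalMs` milliseconds.
// Returns a function that cancels further runs.
function startPolling(
  task: () => void,
  intervalMs: number = REFRESH_INTERVAL_MS,
): () => void {
  task();
  const timer = setInterval(task, intervalMs);
  return () => clearInterval(timer);
}
```

In a React client component, a helper like this would typically be called inside `useEffect`, with the returned cancel function used as the effect's cleanup.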
Running
cd __LOCAL_LLMs/dashboard
npm install # first time only
npm run dev -- -p 3100
Open: http://localhost:3100
Tech Stack
| Component | Technology |
|---|---|
| Framework | Next.js 16 (App Router) |
| Language | TypeScript |
| Styling | TailwindCSS v4 |
| Icons | Lucide React |
| React | 19 |
| Theme | Dark — ByteLyst design token colors |
Architecture
dashboard/
├── src/
│ ├── app/
│ │ ├── layout.tsx ← Root layout (dark theme, system fonts)
│ │ ├── page.tsx ← Main dashboard (~800 lines, client component)
│ │ ├── globals.css ← Dark theme CSS (ByteLyst tokens + animations)
│ │ └── api/
│ │ ├── ollama/
│ │ │ ├── route.ts ← Ollama proxy (GET: list+ps, POST: load/unload/generate/pull/delete/show)
│ │ │ └── stream/route.ts ← Streaming generate (NDJSON proxy to Ollama)
│ │ ├── whisper/route.ts ← Whisper discovery (binaries, models, version)
│ │ └── system/route.ts ← System info (chip, RAM, disk, brew) — cached statics
│ └── components/ ← (empty, all inline in page.tsx for now)
├── package.json
├── next.config.ts
├── tsconfig.json
├── postcss.config.mjs
└── tailwind.config.ts (auto via v4)
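The stream route is described as an NDJSON proxy, meaning each line of the stream is one complete JSON object. A minimal sketch of how a consumer might split that stream follows. `createNdjsonParser` and `GenerateChunk` are hypothetical names, and only the `response` and `done` fields of Ollama's streaming output are modelled here.

```typescript
// One NDJSON line from a streaming generate call: a text fragment plus a done flag.
interface GenerateChunk {
  response?: string;
  done?: boolean;
}

// Hypothetical helper: buffers raw text and returns every complete JSON line
// received so far. A trailing partial line is kept until the next call.
function createNdjsonParser(): (text: string) => GenerateChunk[] {
  let buffer = "";
  return (text: string): GenerateChunk[] => {
    buffer += text;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // last element is an incomplete line (or "")
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as GenerateChunk);
  };
}
```

Buffering matters because network chunks do not align with line boundaries, so a JSON object can arrive split across two reads.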
API Routes
GET /api/ollama
Fetches Ollama status, model list, and running models.
Response:
{
"status": "online",
"url": "http://localhost:11434",
"models": [
{
"name": "qwen2.5-coder:32b",
"size": 19000000000,
"digest": "b92d6a0bd47e...",
"modified_at": "2026-02-19T...",
"details": {
"family": "qwen2",
"parameter_size": "32B",
"quantization_level": "Q4_K_M"
}
}
],
"running": [],
"totalModels": 2,
"totalSize": 23900000000,
"runningCount": 0
}
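For illustration, the summary fields in the response above could be derived and displayed roughly as follows. This sketch assumes `totalModels` and `totalSize` are plain aggregates over `models`, which the source does not confirm. `ModelEntry`, `summarizeModels`, and `formatGigabytes` are hypothetical names.

```typescript
// Only the fields used below; shapes follow the example response above.
interface ModelEntry {
  name: string;
  size: number; // bytes on disk
}

// Assumption: totalModels and totalSize are simple aggregates over `models`.
function summarizeModels(models: ModelEntry[]): { totalModels: number; totalSize: number } {
  const totalSize = models.reduce((sum, m) => sum + m.size, 0);
  return { totalModels: models.length, totalSize };
}

// Formats bytes as decimal gigabytes (1 GB = 1e9 bytes).
// The dashboard itself may use GiB or a different precision.
function formatGigabytes(bytes: number): string {
  return `${(bytes / 1e9).toFixed(1)} GB`;
}
```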
POST /api/ollama
Model management actions.
Load a model into RAM:
{ "action": "load", "model": "qwen2.5-coder:32b" }
Unload a model from RAM:
{ "action": "unload", "model": "qwen2.5-coder:32b" }
Send a prompt:
{ "action": "generate", "model": "qwen2.5-coder:32b", "prompt": "Write hello world in Swift" }
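The three request bodies above share an `action` discriminator, so they can be modelled as a TypeScript union. The sketch below is an assumption about how such validation might look, not the route's actual code. `OllamaAction` and `parseAction` are hypothetical names, and it covers only the three bodies documented here (the route also lists pull, delete, and show, whose bodies are not shown).

```typescript
// Request bodies documented above, keyed by the `action` field.
type OllamaAction =
  | { action: "load"; model: string }
  | { action: "unload"; model: string }
  | { action: "generate"; model: string; prompt: string };

// Hypothetical validator: returns a typed action, or null for anything
// that does not match one of the three documented shapes.
function parseAction(body: unknown): OllamaAction | null {
  if (typeof body !== "object" || body === null) return null;
  const { action, model, prompt } = body as Record<string, unknown>;
  if (typeof model !== "string" || model === "") return null;
  if (action === "load") return { action: "load", model };
  if (action === "unload") return { action: "unload", model };
  if (action === "generate" && typeof prompt === "string") {
    return { action: "generate", model, prompt };
  }
  return null;
}
```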
GET /api/whisper
Discovers whisper-cpp installation.
Response:
{
"installed": true,
"version": "unknown",
"binaries": [
"whisper-bench",
"whisper-cli",
"whisper-command",
"whisper-lsp",
"whisper-quantize",
"whisper-server",
"whisper-stream",
"whisper-talk-llama",
"whisper-vad-speech-segments"
],
"models": [],
"modelsDir": "/Users/sd9235/whisper-models"
}
GET /api/system
System hardware and software info.
Response:
{
"chip": "Apple M4 Pro",
"gpu": "Apple Silicon (integrated)",
"memory": { "total": 51539607552, "used": 38000000000, "free": 13539607552 },
"disk": { "total": 994662584320, "used": 450000000000, "free": 544662584320 },
"ollamaDiskUsage": 24500000000,
"cpuCores": 14,
"uptime": 86400,
"platform": "Darwin 25.3.0",
"arch": "arm64",
"nodeVersion": "v25.2.1",
"brewPackages": [
{ "name": "ollama", "version": "0.16.2" },
{ "name": "whisper-cpp", "version": "1.8.3" },
{ "name": "ffmpeg", "version": "8.0.1_4" }
]
}
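The architecture notes describe this route as caching its static values, such as chip and brew package versions, which do not change while the server is running. A minimal sketch of that idea follows. `cacheStatic` is a hypothetical helper, and the route's actual caching strategy is not documented here.

```typescript
// Hypothetical memoizer: runs `loader` on the first call and reuses the
// stored result afterwards. Suitable only for values that never change
// during the process lifetime (chip name, installed package versions).
function cacheStatic<T>(loader: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: loader() };
    }
    return cached.value;
  };
}
```

Dynamic values such as memory usage and uptime would still be read on every request; only the expensive, unchanging lookups go through the cache.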
Dashboard UI Features
Header
- App icon + title ("Local LLM Mission Control")
- Machine info subtitle (chip, RAM, OS)
- Last refresh timestamp + manual refresh button
Top Stats Row (4 cards)
- Ollama — online/offline status with pulsing dot, server URL
- Models — count + total disk size, number loaded in RAM
- Whisper — installed/not found, model count
- Memory — used/total with color-coded progress bar (blue → yellow → red)
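The color-coded memory bar can be sketched as a threshold function over the used/total ratio. The 70% and 90% cut-offs below are assumptions chosen for illustration, since the dashboard's actual thresholds are not documented. `memoryBarColor` and `BarColor` are hypothetical names.

```typescript
type BarColor = "blue" | "yellow" | "red";

// Thresholds are illustrative assumptions, not the dashboard's documented values.
function memoryBarColor(
  used: number,
  total: number,
  warnAt: number = 0.7,
  dangerAt: number = 0.9,
): BarColor {
  const ratio = total > 0 ? used / total : 0; // guard against division by zero
  if (ratio >= dangerAt) return "red";
  if (ratio >= warnAt) return "yellow";
  return "blue";
}
```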
Main Grid (2 columns + 1 sidebar)
Left (2/3 width) — Ollama Models:
- Each model shown as a card with name, size, parameter count, quantization
- Green highlight + "LOADED" badge for models currently in RAM
- Actions per model:
- Load — loads model into RAM (green button)
- Unload — evicts from RAM (red button)
- Prompt — opens prompt modal (blue button, only for loaded models)
- Expandable details (digest, modified date, family)
Right (1/3 width):
- System — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
- Whisper.cpp — status, binary list as tags, model list with sizes
- Brew Packages — name + version for ollama, whisper-cpp, ffmpeg
Prompt Modal
- Slide-up overlay when clicking "Prompt" on a loaded model
- Textarea with Cmd+Enter shortcut to send
- Sends to Ollama `/api/generate` (non-streaming)
- Response displayed in monospace pre block with scroll
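The modal's keyboard shortcuts (Cmd+Enter to send, Escape to close) can be expressed as a small pure function over the standard `key` and `metaKey` fields of a keyboard event. `resolveShortcut` and `ShortcutAction` are hypothetical names, and the dashboard's own handler may be written differently.

```typescript
type ShortcutAction = "send" | "close" | null;

// Maps a key press to a modal action. Accepts any object carrying the
// `key` and `metaKey` fields, so a DOM KeyboardEvent can be passed directly.
function resolveShortcut(e: { key: string; metaKey: boolean }): ShortcutAction {
  if (e.key === "Enter" && e.metaKey) return "send"; // Cmd+Enter
  if (e.key === "Escape") return "close";
  return null;
}
```

Keeping the mapping separate from the event listener makes it straightforward to test without a browser.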
Design Tokens
The dashboard uses ByteLyst design system colors:
| Token | Hex | Use |
|---|---|---|
| `--bg-canvas` | `#06070A` | Page background |
| `--bg-elevated` | `#0E1118` | Modal background |
| `--surface-card` | `#121725` | Card backgrounds |
| `--surface-muted` | `#1A2335` | Muted areas |
| `--text-primary` | `#EFF4FF` | Main text |
| `--text-secondary` | `#A5B1C7` | Descriptions |
| `--text-tertiary` | `#6C7C98` | Hints, timestamps |
| `--accent-primary` | `#5A8CFF` | Primary actions |
| `--accent-secondary` | `#2EE6D6` | Secondary accent |
| `--success` | `#34D399` | Online, loaded |
| `--warning` | `#F59E0B` | Warning state |
| `--danger` | `#FF6E6E` | Offline, errors |
| `--purple` | `#A78BFA` | Whisper, disk |
Changelog
v2 (2026-02-19)
- Fixed Google Fonts build error (corporate proxy blocks fonts.gstatic.com) — switched to system fonts
- Fixed system API slowness (7.6s → 50ms) by caching static info (chip, GPU, brew packages)
- Added streaming prompt responses via NDJSON proxy to Ollama `/api/generate`
- Added model pull UI (input field + pull button in Ollama Models section)
- Added model delete with confirmation dialog in expanded details
- Added VRAM usage and expiry time display for running models
- Added toast notifications (success/error/info) with slide-in animation
- Added copy response button in prompt modal
- Added Escape key to close modals, backdrop click to dismiss
- Added streaming indicator with pulsing dot
v1 (2026-02-19)
- Initial dashboard with Ollama status, model list, system info, Whisper.cpp, brew packages
- Load/unload model actions
- Basic prompt interface (non-streaming)
- Auto-refresh every 15 seconds
- Dark theme with ByteLyst design tokens
Future Improvements
- Whisper transcription UI (upload audio, get text)
- GPU utilization chart (Metal usage over time)
- Model comparison benchmarks (send the same prompt to several models and compare responses side by side)
- Chat mode (multi-turn conversation with history)
- WebSocket for real-time status updates (replace polling)
- Component extraction (break page.tsx into smaller components)