saravanakumardb1 b77afce9ae docs(local-llm): add Mission Control dashboard documentation

- docs/05-mission-control-dashboard.md: complete dashboard reference with
  architecture diagram, API route docs (request/response examples),
  UI feature descriptions, design tokens table, v1/v2 changelog,
  and future improvements roadmap

2026-02-19 13:03:30 -08:00

05 — Mission Control Dashboard

Next.js dashboard for visualizing and managing the local LLM stack.

Overview

A dark-themed, real-time dashboard built with Next.js 16 that provides:

Ollama status — online/offline, model list, loaded models, load/unload actions
Streaming prompt interface — send prompts to any loaded model with real-time streaming responses
Model management — pull new models, delete models (with confirmation), view VRAM/expiry info
System resources — chip info, RAM/disk usage with progress bars, uptime
Whisper.cpp status — installed binaries, downloaded models
Brew packages — version tracking for ollama, whisper-cpp, ffmpeg
Auto-refresh — polls all endpoints every 15 seconds
Toast notifications — success/error/info feedback for all actions
Keyboard shortcuts — Cmd+Enter to send prompt, Escape to close modals
Copy response — one-click copy of model responses to clipboard
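
The auto-refresh behavior listed above could be sketched roughly as follows: one function fetches the three GET routes in parallel, and a timer re-runs it every 15 seconds. The names `refreshAll`, `startPolling`, `Fetcher`, and `DashboardSnapshot` are hypothetical and the real page.tsx may be organized differently. Treating a failed endpoint as `null` is also an assumption.

```typescript
// Hypothetical sketch of the 15-second auto-refresh; not the actual page.tsx code.
type Fetcher = (url: string) => Promise<unknown>;

interface DashboardSnapshot {
  ollama: unknown;
  whisper: unknown;
  system: unknown;
  refreshedAt: number; // epoch milliseconds of this refresh
}

const ENDPOINTS = ["/api/ollama", "/api/whisper", "/api/system"] as const;
const REFRESH_MS = 15000;

// Fetch every endpoint in parallel; a failed endpoint yields null
// instead of failing the whole refresh (an assumption, see lead-in).
async function refreshAll(fetchJson: Fetcher): Promise<DashboardSnapshot> {
  const [ollama, whisper, system] = await Promise.all(
    ENDPOINTS.map((url) => fetchJson(url).catch(() => null))
  );
  return { ollama, whisper, system, refreshedAt: Date.now() };
}

// Start polling; returns a function that stops the timer.
function startPolling(
  fetchJson: Fetcher,
  onUpdate: (snapshot: DashboardSnapshot) => void,
  intervalMs: number = REFRESH_MS
): () => void {
  const tick = () => {
    void refreshAll(fetchJson).then(onUpdate);
  };
  tick(); // refresh immediately, then once per interval
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}
```

In a browser, the fetcher could be as small as `(url) => fetch(url).then((r) => r.json())`, and the stop function returned by `startPolling` would typically be called from a React effect cleanup.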

Running

cd __LOCAL_LLMs/dashboard
npm install          # first time only
npm run dev -- -p 3100

Open: http://localhost:3100

Tech Stack

Component	Technology
Framework	Next.js 16 (App Router)
Language	TypeScript
Styling	TailwindCSS v4
Icons	Lucide React
React	19
Theme	Dark — ByteLyst design token colors

Architecture

dashboard/
├── src/
│   ├── app/
│   │   ├── layout.tsx              ← Root layout (dark theme, system fonts)
│   │   ├── page.tsx                ← Main dashboard (~800 lines, client component)
│   │   ├── globals.css             ← Dark theme CSS (ByteLyst tokens + animations)
│   │   └── api/
│   │       ├── ollama/
│   │       │   ├── route.ts        ← Ollama proxy (GET: list+ps, POST: load/unload/generate/pull/delete/show)
│   │       │   └── stream/route.ts ← Streaming generate (NDJSON proxy to Ollama)
│   │       ├── whisper/route.ts    ← Whisper discovery (binaries, models, version)
│   │       └── system/route.ts     ← System info (chip, RAM, disk, brew) — cached statics
│   └── components/                 ← (empty, all inline in page.tsx for now)
├── package.json
├── next.config.ts
├── tsconfig.json
├── postcss.config.mjs
└── tailwind.config.ts (auto via v4)
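
The tree above describes stream/route.ts as an NDJSON proxy. A client reading such a stream has to cope with network chunks that end in the middle of a line. The following is a minimal sketch of a line buffer, assuming each line is a JSON object carrying a `response` text fragment and a `done` flag, which is how Ollama's /api/generate streams results. `NdjsonBuffer`, `GenerateChunk`, and `collectText` are illustrative names, not taken from the dashboard source.

```typescript
// Illustrative NDJSON line buffer; names are hypothetical.
interface GenerateChunk {
  response?: string; // text fragment carried by this line
  done?: boolean;    // true on the final line of the stream
}

class NdjsonBuffer {
  private pending = "";

  // Feed a decoded text chunk; returns every complete JSON line it contains.
  push(text: string): GenerateChunk[] {
    this.pending += text;
    const lines = this.pending.split("\n");
    // The last element is either "" or an incomplete line: keep it for later.
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as GenerateChunk);
  }
}

// Concatenate the text fragments of already-parsed chunks.
function collectText(chunks: GenerateChunk[]): string {
  return chunks.map((c) => c.response ?? "").join("");
}
```

In the browser, the text chunks would usually come from `response.body.getReader()` combined with a `TextDecoder` called with `{ stream: true }`.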

API Routes

GET /api/ollama

Fetches Ollama status, model list, and running models.

Response:

{
  "status": "online",
  "url": "http://localhost:11434",
  "models": [
    {
      "name": "qwen2.5-coder:32b",
      "size": 19000000000,
      "digest": "b92d6a0bd47e...",
      "modified_at": "2026-02-19T...",
      "details": {
        "family": "qwen2",
        "parameter_size": "32B",
        "quantization_level": "Q4_K_M"
      }
    }
  ],
  "running": [],
  "totalModels": 2,
  "totalSize": 23900000000,
  "runningCount": 0
}
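
For typed client code, the response above might be modeled as shown below. The field names come straight from the example. The `"offline"` status value is inferred from the feature list, the shape of `running` entries is not shown and is left untyped, and `summarize` is a hypothetical helper that assumes the aggregate fields are simple counts and sums.

```typescript
// Types inferred from the example response; anything not shown there is omitted.
interface OllamaModel {
  name: string;
  size: number; // bytes on disk
  digest: string;
  modified_at: string;
  details: {
    family: string;
    parameter_size: string;
    quantization_level: string;
  };
}

interface OllamaStatus {
  status: "online" | "offline"; // "offline" is an assumption
  url: string;
  models: OllamaModel[];
  running: unknown[]; // entry shape not documented in the example
  totalModels: number;
  totalSize: number;
  runningCount: number;
}

// Hypothetical helper: derive the aggregate fields from the two lists.
function summarize(
  models: OllamaModel[],
  running: unknown[]
): Pick<OllamaStatus, "totalModels" | "totalSize" | "runningCount"> {
  return {
    totalModels: models.length,
    totalSize: models.reduce((sum, m) => sum + m.size, 0),
    runningCount: running.length,
  };
}
```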

POST /api/ollama

Model management actions.

Load a model into RAM:

{ "action": "load", "model": "qwen2.5-coder:32b" }

Unload a model from RAM:

{ "action": "unload", "model": "qwen2.5-coder:32b" }

Send a prompt:

{ "action": "generate", "model": "qwen2.5-coder:32b", "prompt": "Write hello world in Swift" }
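
The three request bodies above fit a discriminated union, which lets the compiler reject a `generate` call that lacks a prompt. This is a sketch only: `OllamaAction` and `buildOllamaRequest` are hypothetical names, and sending a JSON `Content-Type` header is an assumption about what the route expects. The route table also lists pull, delete, and show, but their payloads are not documented here, so they are left out.

```typescript
// Request bodies for the three documented actions.
type OllamaAction =
  | { action: "load"; model: string }
  | { action: "unload"; model: string }
  | { action: "generate"; model: string; prompt: string };

interface JsonPostInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the options object for a fetch() POST to /api/ollama.
function buildOllamaRequest(body: OllamaAction): JsonPostInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}
```

A call would then look like `fetch("/api/ollama", buildOllamaRequest({ action: "load", model: "qwen2.5-coder:32b" }))`.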

GET /api/whisper

Discovers whisper-cpp installation.

Response:

{
  "installed": true,
  "version": "unknown",
  "binaries": [
    "whisper-bench",
    "whisper-cli",
    "whisper-command",
    "whisper-lsp",
    "whisper-quantize",
    "whisper-server",
    "whisper-stream",
    "whisper-talk-llama",
    "whisper-vad-speech-segments"
  ],
  "models": [],
  "modelsDir": "/Users/sd9235/whisper-models"
}

GET /api/system

System hardware and software info.

Response:

{
  "chip": "Apple M4 Pro",
  "gpu": "Apple Silicon (integrated)",
  "memory": { "total": 51539607552, "used": 38000000000, "free": 13539607552 },
  "disk": { "total": 994662584320, "used": 450000000000, "free": 544662584320 },
  "ollamaDiskUsage": 24500000000,
  "cpuCores": 14,
  "uptime": 86400,
  "platform": "Darwin 25.3.0",
  "arch": "arm64",
  "nodeVersion": "v25.2.1",
  "brewPackages": [
    { "name": "ollama", "version": "0.16.2" },
    { "name": "whisper-cpp", "version": "1.8.3" },
    { "name": "ffmpeg", "version": "8.0.1_4" }
  ]
}
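
The response reports raw bytes and seconds, so the UI has to format them before display. The helpers below are a hypothetical sketch of that step. Whether the dashboard shows binary units (GiB) or decimal units (GB), and how it rounds, is not stated in this document.

```typescript
// Hypothetical formatting helpers for the raw byte and second counts above.
function formatBytes(bytes: number): string {
  const units = ["B", "KiB", "MiB", "GiB", "TiB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  return `${value.toFixed(1)} ${units[unit]}`;
}

function formatUptime(seconds: number): string {
  const days = Math.floor(seconds / 86400);
  const hours = Math.floor((seconds % 86400) / 3600);
  const minutes = Math.floor((seconds % 3600) / 60);
  return `${days}d ${hours}h ${minutes}m`;
}

// Percentage of `total` that is in use, rounded to a whole number.
function usagePercent(used: number, total: number): number {
  return total > 0 ? Math.round((used / total) * 100) : 0;
}
```

With the example values, `formatBytes(51539607552)` gives `"48.0 GiB"`, `formatUptime(86400)` gives `"1d 0h 0m"`, and `usagePercent(38000000000, 51539607552)` gives `74`.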

Dashboard UI Features

Header

App icon + title ("Local LLM Mission Control")
Machine info subtitle (chip, RAM, OS)
Last refresh timestamp + manual refresh button

Top Stats Row (4 cards)

Ollama — online/offline status with pulsing dot, server URL
Models — count + total disk size, number loaded in RAM
Whisper — installed/not found, model count
Memory — used/total with color-coded progress bar (blue → yellow → red)
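
The color-coded memory bar can be illustrated with a small threshold function. The 70% and 90% cut-offs and the mapping onto the `--accent-primary`, `--warning`, and `--danger` tokens are assumptions. This document only states that the bar moves from blue to yellow to red.

```typescript
// Map memory usage to a design-token color.
// Thresholds are illustrative assumptions, not values from the dashboard source.
function memoryBarColor(used: number, total: number): string {
  const ratio = total > 0 ? used / total : 0;
  if (ratio >= 0.9) return "var(--danger)";   // red
  if (ratio >= 0.7) return "var(--warning)";  // yellow
  return "var(--accent-primary)";             // blue
}
```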

Main Grid (2 columns + 1 sidebar)

Left (2/3 width) — Ollama Models:

Each model shown as a card with name, size, parameter count, quantization
Green highlight + "LOADED" badge for models currently in RAM
Actions per model:
- Load — loads model into RAM (green button)
- Unload — evicts from RAM (red button)
- Prompt — opens prompt modal (blue button, only for loaded models)
Expandable details (digest, modified date, family)

Right (1/3 width):

System — chip, cores, RAM bar, disk bar, uptime, Ollama disk footprint
Whisper.cpp — status, binary list as tags, model list with sizes
Brew Packages — name + version for ollama, whisper-cpp, ffmpeg

Prompt Modal

Slide-up overlay when clicking "Prompt" on a loaded model
Textarea with Cmd+Enter shortcut to send
Sends to Ollama /api/generate; as of v2 the response streams in through the NDJSON proxy (v1 was non-streaming)
Response displayed in monospace pre block with scroll
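
The modal's two shortcuts (Cmd+Enter to send, Escape to close) can be reduced to one pure function, which keeps the key handling easy to test. This is a sketch under assumptions: `KeyLike`, `ModalAction`, and `modalKeyAction` are hypothetical names, and treating Ctrl+Enter like Cmd+Enter is an addition for non-Mac keyboards that this document does not mention.

```typescript
// Minimal shape of the keyboard event fields the check needs, so the sketch
// runs outside a browser; a real handler would receive a KeyboardEvent.
interface KeyLike {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
}

type ModalAction = "send" | "close" | null;

// Decide what a key press inside the prompt modal should do.
function modalKeyAction(e: KeyLike): ModalAction {
  if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) return "send";
  if (e.key === "Escape") return "close";
  return null; // plain Enter and every other key fall through to the textarea
}
```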

Design Tokens

The dashboard uses ByteLyst design system colors:

Token	Hex	Use
`--bg-canvas`	`#06070A`	Page background
`--bg-elevated`	`#0E1118`	Modal background
`--surface-card`	`#121725`	Card backgrounds
`--surface-muted`	`#1A2335`	Muted areas
`--text-primary`	`#EFF4FF`	Main text
`--text-secondary`	`#A5B1C7`	Descriptions
`--text-tertiary`	`#6C7C98`	Hints, timestamps
`--accent-primary`	`#5A8CFF`	Primary actions
`--accent-secondary`	`#2EE6D6`	Secondary accent
`--success`	`#34D399`	Online, loaded
`--warning`	`#F59E0B`	Warning state
`--danger`	`#FF6E6E`	Offline, errors
`--purple`	`#A78BFA`	Whisper, disk

Changelog

v2 (2026-02-19)

Fixed Google Fonts build error (corporate proxy blocks fonts.gstatic.com) — switched to system fonts
Fixed system API slowness (7.6s → 50ms) by caching static info (chip, GPU, brew packages)
Added streaming prompt responses via NDJSON proxy to Ollama /api/generate
Added model pull UI — input field + pull button in Ollama Models section
Added model delete with confirmation dialog in expanded details
Added VRAM usage and expiry time display for running models
Added toast notifications (success/error/info) with slide-in animation
Added copy response button in prompt modal
Added Escape key to close modals, backdrop click to dismiss
Added streaming indicator with pulsing dot
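
The system API speed-up in this list comes from caching values that do not change while the server runs. One way to do that is to wrap the slow loader so it executes once and later requests reuse the same promise. `cacheOnce` and `StaticInfo` are hypothetical names. The actual route may cache differently, for example in plain module-level variables.

```typescript
// Hypothetical sketch of caching static system info inside a route module.
interface StaticInfo {
  chip: string;
  gpu: string;
  brewPackages: { name: string; version: string }[];
}

// Wrap an expensive loader so it runs once; later calls reuse the same promise.
// A failed load clears the cache so the next call can retry.
function cacheOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => {
    if (cached === null) {
      cached = loader().catch((err) => {
        cached = null;
        throw err;
      });
    }
    return cached;
  };
}

// Usage sketch: the loader body stands in for the slow system calls.
const getStaticInfo = cacheOnce<StaticInfo>(() =>
  Promise.resolve({
    chip: "Apple M4 Pro",
    gpu: "Apple Silicon (integrated)",
    brewPackages: [{ name: "ollama", version: "0.16.2" }],
  })
);
```

Dynamic values such as memory use, disk use, and uptime would still be read on every request; only the static fields go through the cache.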

v1 (2026-02-19)

Initial dashboard with Ollama status, model list, system info, Whisper.cpp, brew packages
Load/unload model actions
Basic prompt interface (non-streaming)
Auto-refresh every 15 seconds
Dark theme with ByteLyst design tokens

Future Improvements

Whisper transcription UI (upload audio, get text)
GPU utilization chart (Metal usage over time)
Model comparison benchmarks (send the same prompt to several models and compare the responses side by side)
Chat mode (multi-turn conversation with history)
WebSocket for real-time status updates (replace polling)
Component extraction (break page.tsx into smaller components)
