
saravanakumardb1 093682eace	docs(local-llm): add systematic dashboard bug & improvement review	2026-02-19 14:36:51 -08:00

DASHBOARD_REVIEW.md: comprehensive code review of all 6 dashboard files (1,395 lines). Organized into 7 sections:
- 10 bugs (B1–B10): hardcoded header, blocking pull, escape during stream, auto-refresh during streaming, no abort controller, vm_stat page size, etc.
- 5 code quality issues (CQ1–CQ5): monolithic component, inline styles, duplicated constants, no error boundary, no loading skeleton
- 16 feature ideas (F1–F16): pull progress, chat mode, prompt history, token/s metrics, model search, whisper test, extraction integration, etc.
- 5 performance items (P1–P5): request deduplication, cache TTL, du latency
- 3 security notes (S1–S3): input validation, shell injection pattern, CORS
- Priority matrix and 5-sprint implementation roadmap
01-hardware-and-prerequisites.md	docs(local-llm): add docs index, hardware specs, and whisper-cpp setup	2026-02-19 13:00:48 -08:00
02-ollama-setup-and-models.md	docs(local-llm): add Ollama setup, extraction evals, and env vars reference	2026-02-19 13:01:05 -08:00
03-whisper-cpp-setup.md	docs(local-llm): add docs index, hardware specs, and whisper-cpp setup	2026-02-19 13:00:48 -08:00
04-multimodal-local-stack.md	docs(local-llm): add multimodal stack, model recommendations, and troubleshooting	2026-02-19 13:01:22 -08:00
05-mission-control-dashboard.md	docs(local-llm): add Mission Control dashboard documentation	2026-02-19 13:03:30 -08:00
06-extraction-service-evals.md	docs(local-llm): add Ollama setup, extraction evals, and env vars reference	2026-02-19 13:01:05 -08:00
07-model-recommendations.md	docs(local-llm): add multimodal stack, model recommendations, and troubleshooting	2026-02-19 13:01:22 -08:00
08-troubleshooting.md	docs(local-llm): add multimodal stack, model recommendations, and troubleshooting	2026-02-19 13:01:22 -08:00
09-environment-variables.md	docs(local-llm): add Ollama setup, extraction evals, and env vars reference	2026-02-19 13:01:05 -08:00
DASHBOARD_REVIEW.md	docs(local-llm): add systematic dashboard bug & improvement review	2026-02-19 14:36:51 -08:00
README.md	docs(local-llm): add docs index, hardware specs, and whisper-cpp setup	2026-02-19 13:00:48 -08:00

README.md

Local LLM Stack — Documentation Index

Complete guide for the local AI inference stack on the ByteLyst development machine.

Hardware: Apple M4 Pro · 48 GB LPDDR5 · macOS Tahoe

Last updated: 2026-02-19

Quick Start

# 1. Start Ollama
ollama serve                    # or: brew services start ollama

# 2. Load a model
ollama run qwen2.5-coder:32b   # best coding model for this hardware

# 3. Launch Mission Control dashboard
cd __LOCAL_LLMs/dashboard && npm run dev -- -p 3100
# Open http://localhost:3100
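
Before opening the dashboard, it can help to confirm that both services answer. The sketch below assumes Ollama listens on its default address `http://localhost:11434` and uses the `/api/tags` endpoint of the Ollama REST API, which lists local models. The `OLLAMA_BASE_URL` variable and all function names are hypothetical, introduced only for this example.

```shell
# Base URL of the Ollama server.
# OLLAMA_BASE_URL is a hypothetical override used only in this sketch.
ollama_base_url() {
  printf '%s\n' "${OLLAMA_BASE_URL:-http://localhost:11434}"
}

# URL of the dashboard; the port defaults to 3100 as in the Quick Start.
dashboard_url() {
  printf 'http://localhost:%s\n' "${1:-3100}"
}

# Return 0 if the URL answers with a successful HTTP status within 2 seconds.
is_up() {
  curl -fsS --max-time 2 -o /dev/null "$1" 2>/dev/null
}

# Example (not run automatically):
#   is_up "$(ollama_base_url)/api/tags" && echo "Ollama is up"
#   is_up "$(dashboard_url)"            && echo "Dashboard is up"
```

Keeping the URL builders separate from the network call makes it easy to point the same check at a different host or port.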

Documentation

#	Document	Description
01	Hardware & Prerequisites	Machine specs, installed toolchain, disk/RAM budget
02	Ollama Setup & Models	Installation, server config, model management, memory behavior
03	Whisper.cpp Setup	Speech-to-text: installation, models, CLI usage, real-time streaming
04	Multimodal Local Stack	Vision models, audio pipeline, video understanding status
05	Mission Control Dashboard	Next.js dashboard: architecture, API routes, features, running
06	Extraction Service Evals	promptfoo eval suite, Ollama vs Gemini comparison, Python sidecar
07	Model Recommendations	Tiered model guide by use case, size, and quality for M4 Pro 48 GB
08	Troubleshooting & Corporate Proxy	Common issues, Forcepoint proxy workarounds, MLX warnings
09	Environment Variables	All config vars for Ollama, Whisper, dashboard, evals

Directory Structure

__LOCAL_LLMs/
├── README.md                        ← top-level README (moved from LOCAL_LLMs_setup_mac_m4_48gb.md)
├── docs/
│   ├── README.md                    ← this index
│   ├── 01-hardware-and-prerequisites.md
│   ├── 02-ollama-setup-and-models.md
│   ├── 03-whisper-cpp-setup.md
│   ├── 04-multimodal-local-stack.md
│   ├── 05-mission-control-dashboard.md
│   ├── 06-extraction-service-evals.md
│   ├── 07-model-recommendations.md
│   ├── 08-troubleshooting.md
│   └── 09-environment-variables.md
├── dashboard/                       ← Next.js Mission Control app (port 3100)
│   ├── src/app/page.tsx             ← main dashboard UI
│   ├── src/app/api/ollama/route.ts  ← Ollama API proxy (list, load, unload, generate)
│   ├── src/app/api/whisper/route.ts ← Whisper binary/model discovery
│   └── src/app/api/system/route.ts  ← System info (chip, RAM, disk, brew)
└── LOCAL_LLMs_setup_mac_m4_48gb.md  ← original doc (preserved, see docs/ for latest)
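
The three `route.ts` files in the tree become HTTP endpoints under the Next.js App Router convention, where a `route.ts` file is served at the URL path of its directory below `app/`. A minimal sketch of that mapping follows. `route_to_endpoint` is a hypothetical helper, and the request methods and response shapes of these routes are not specified in this index.

```shell
# Map an App Router handler file to the URL path it is served at.
route_to_endpoint() {
  _path="${1#src/app}"        # drop the App Router root directory
  _path="${_path%/route.ts}"  # drop the handler file name
  printf '%s\n' "${_path:-/}"
}

# Print the dashboard endpoints, assuming the dev server runs on port 3100.
for f in src/app/api/ollama/route.ts \
         src/app/api/whisper/route.ts \
         src/app/api/system/route.ts; do
  printf 'http://localhost:3100%s\n' "$(route_to_endpoint "$f")"
done
# prints:
#   http://localhost:3100/api/ollama
#   http://localhost:3100/api/whisper
#   http://localhost:3100/api/system
```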

Current Installation Status (2026-02-19)

Component	Version	Status	Disk Usage
Ollama	0.16.2	✅ Installed via brew	—
qwen2.5-coder:32b	—	✅ Downloaded	19 GB
llama3.1:8b	—	✅ Downloaded	4.9 GB
whisper-cpp	1.8.3	✅ Installed via brew	9.6 MB
whisper model (ggml-large-v3-turbo)	—	❌ Blocked by corporate proxy	—
ffmpeg	8.0.1	✅ Installed via brew	53.3 MB
Mission Control Dashboard	Next.js 16	✅ Built, runs on :3100	—
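
The status table can be re-checked on the machine itself. The following sketch only tests whether each command is on `PATH`. `check_tool` is a hypothetical helper, and `whisper-cli` is assumed to be the binary name installed by the Homebrew `whisper-cpp` formula, which may differ by version.

```shell
# Report whether a command is on PATH.
# Returns 0 if it is found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%-12s found at %s\n' "$1" "$(command -v "$1")"
  else
    printf '%-12s MISSING\n' "$1"
    return 1
  fi
}

# Example (not run automatically):
#   for t in ollama ffmpeg whisper-cli; do check_tool "$t"; done
```

Once `ollama` is found, `ollama list` shows which models are downloaded and how much disk each one uses.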

Related Resources

Extraction service evals: services/extraction-service/evals/
Ollama REST API docs: https://github.com/ollama/ollama/blob/main/docs/api.md
Whisper.cpp: https://github.com/ggerganov/whisper.cpp
Hugging Face models: https://huggingface.co/ggerganov/whisper.cpp/tree/main
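
Since the `ggml-large-v3-turbo` model is still missing, the sketch below shows how its download URL can be derived from the Hugging Face repository linked above, using the `resolve/main` path that Hugging Face serves raw files from. `whisper_model_url` is a hypothetical helper and the destination directory in the example is an assumption. The proxy workaround itself is covered in the troubleshooting document and is not reproduced here.

```shell
# Build the Hugging Face download URL for a ggml Whisper model,
# e.g. "large-v3-turbo" -> .../resolve/main/ggml-large-v3-turbo.bin
whisper_model_url() {
  printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-%s.bin\n' "$1"
}

# Example (not run automatically); $HOME/models is a hypothetical destination:
#   curl -fL -C - -o "$HOME/models/ggml-large-v3-turbo.bin" \
#        "$(whisper_model_url large-v3-turbo)"
# -f fails on HTTP errors, -L follows redirects, -C - resumes a partial file.
```

`curl` reads the `HTTPS_PROXY` environment variable, so a proxy can be supplied without changing the command.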
