# Local LLM Stack — Documentation Index

> Complete guide for the local AI inference stack on the ByteLyst development machine.
> Hardware: **Apple M4 Pro · 48 GB LPDDR5 · macOS Tahoe**
> Last updated: 2026-02-19

---

## Quick Start

```bash
# 1. Start Ollama
ollama serve                    # or: brew services start ollama

# 2. Load a model
ollama run qwen2.5-coder:32b    # best coding model for this hardware

# 3. Launch Mission Control dashboard
cd __LOCAL_LLMs/dashboard && npm run dev -- -p 3100
# Open http://localhost:3100
```

---

## Documentation

| #   | Document                                                     | Description                                                          |
| --- | ------------------------------------------------------------ | -------------------------------------------------------------------- |
| 01  | [Hardware & Prerequisites](01-hardware-and-prerequisites.md) | Machine specs, installed toolchain, disk/RAM budget                  |
| 02  | [Ollama Setup & Models](02-ollama-setup-and-models.md)       | Installation, server config, model management, memory behavior       |
| 03  | [Whisper.cpp Setup](03-whisper-cpp-setup.md)                 | Speech-to-text: installation, models, CLI usage, real-time streaming |
| 04  | [Multimodal Local Stack](04-multimodal-local-stack.md)       | Vision models, audio pipeline, video understanding status            |
| 05  | [Mission Control Dashboard](05-mission-control-dashboard.md) | Next.js dashboard: architecture, API routes, features, running       |
| 06  | [Extraction Service Evals](06-extraction-service-evals.md)   | promptfoo eval suite, Ollama vs Gemini comparison, Python sidecar    |
| 07  | [Model Recommendations](07-model-recommendations.md)         | Tiered model guide by use case, size, and quality for M4 Pro 48 GB   |
| 08  | [Troubleshooting & Corporate Proxy](08-troubleshooting.md)   | Common issues, Forcepoint proxy workarounds, MLX warnings            |
| 09  | [Environment Variables](09-environment-variables.md)         | All config vars for Ollama, Whisper, dashboard, evals                |

---

## Directory Structure

```
__LOCAL_LLMs/
├── README.md                          ← you are here (moved from LOCAL_LLMs_setup_mac_m4_48gb.md)
├── docs/
│   ├── README.md                      ← this index
│   ├── 01-hardware-and-prerequisites.md
│   ├── 02-ollama-setup-and-models.md
│   ├── 03-whisper-cpp-setup.md
│   ├── 04-multimodal-local-stack.md
│   ├── 05-mission-control-dashboard.md
│   ├── 06-extraction-service-evals.md
│   ├── 07-model-recommendations.md
│   ├── 08-troubleshooting.md
│   └── 09-environment-variables.md
├── dashboard/                         ← Next.js Mission Control app (port 3100)
│   ├── src/app/page.tsx               ← main dashboard UI
│   ├── src/app/api/ollama/route.ts    ← Ollama API proxy (list, load, unload, generate)
│   ├── src/app/api/whisper/route.ts   ← Whisper binary/model discovery
│   └── src/app/api/system/route.ts    ← System info (chip, RAM, disk, brew)
└── LOCAL_LLMs_setup_mac_m4_48gb.md    ← original doc (preserved, see docs/ for latest)
```

---

## Current Installation Status (2026-02-19)

| Component                           | Version    | Status                        | Disk Usage |
| ----------------------------------- | ---------- | ----------------------------- | ---------- |
| Ollama                              | 0.16.2     | ✅ Installed via brew         | —          |
| qwen2.5-coder:32b                   | —          | ✅ Downloaded                 | 19 GB      |
| llama3.1:8b                         | —          | ✅ Downloaded                 | 4.9 GB     |
| whisper-cpp                         | 1.8.3      | ✅ Installed via brew         | 9.6 MB     |
| whisper model (ggml-large-v3-turbo) | —          | ❌ Blocked by corporate proxy | —          |
| ffmpeg                              | 8.0.1      | ✅ Installed via brew         | 53.3 MB    |
| Mission Control Dashboard           | Next.js 16 | ✅ Built, runs on :3100       | —          |

---

## Related Resources

- **Extraction service evals:** `services/extraction-service/evals/`
- **Ollama REST API docs:** https://github.com/ollama/ollama/blob/main/docs/api.md
- **Whisper.cpp:** https://github.com/ggerganov/whisper.cpp
- **Hugging Face models:** https://huggingface.co/ggerganov/whisper.cpp/tree/main
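
---

## Quick Health Check

After the Quick Start steps, it can help to confirm that both local HTTP services respond. The sketch below is one possible way to do that, not part of the stack itself. It assumes Ollama is listening on its default address `http://localhost:11434` (this index does not state the port) and that the dashboard was started on port 3100 as in the Quick Start. `check_url` is a hypothetical helper name introduced only for this example.

```bash
#!/usr/bin/env bash
# Sketch: report whether the local services answer over HTTP.
# Assumptions: Ollama on its default port 11434; dashboard on port 3100.

# check_url URL
# Prints "up" if the URL responds without an HTTP error (status < 400)
# within 3 seconds, otherwise prints "down".
check_url() {
  if curl --silent --fail --max-time 3 --output /dev/null "$1" 2>/dev/null; then
    echo "up"
  else
    echo "down"
  fi
}

echo "Ollama API (/api/tags): $(check_url http://localhost:11434/api/tags)"
echo "Mission Control (:3100): $(check_url http://localhost:3100)"
```

If Ollama reports `up`, running `curl http://localhost:11434/api/tags` directly should return a JSON list of the locally downloaded models, which can be compared against the installation status table.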