# Local LLM Stack — Documentation Index

> Complete guide for the local AI inference stack on the ByteLyst development machine.
> Hardware: **Apple M4 Pro · 48 GB LPDDR5X unified memory · macOS Tahoe**
> Last updated: 2026-02-19

---

## Quick Start

```bash
# 1. Start Ollama
ollama serve                    # or: brew services start ollama

# 2. Load a model
ollama run qwen2.5-coder:32b   # recommended coding model for this hardware (see doc 07)

# 3. Launch Mission Control dashboard
cd __LOCAL_LLMs/dashboard && npm run dev -- -p 3100
# Open http://localhost:3100
```
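
Before launching the dashboard, it can help to confirm that the Ollama server is answering. The sketch below assumes Ollama's default address (`http://localhost:11434`) and probes the `/api/tags` endpoint, which lists local models. `OLLAMA_URL` and `ollama_ready` are hypothetical names introduced here for illustration; they are not read by Ollama itself.

```bash
# Assumed base URL: Ollama's default listen address.
# OLLAMA_URL is a hypothetical variable used only by this sketch.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Exit status 0 when the Ollama HTTP API answers within 2 seconds.
ollama_ready() {
  curl --silent --fail --max-time 2 "${OLLAMA_URL}/api/tags" > /dev/null
}

# Report the result without aborting the shell.
if ollama_ready; then
  echo "Ollama is reachable at ${OLLAMA_URL}"
else
  echo "Ollama is not reachable at ${OLLAMA_URL}; run 'ollama serve' first"
fi
```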

---

## Documentation

| #   | Document                                                     | Description                                                          |
| --- | ------------------------------------------------------------ | -------------------------------------------------------------------- |
| 01  | [Hardware & Prerequisites](01-hardware-and-prerequisites.md) | Machine specs, installed toolchain, disk/RAM budget                  |
| 02  | [Ollama Setup & Models](02-ollama-setup-and-models.md)       | Installation, server config, model management, memory behavior       |
| 03  | [Whisper.cpp Setup](03-whisper-cpp-setup.md)                 | Speech-to-text: installation, models, CLI usage, real-time streaming |
| 04  | [Multimodal Local Stack](04-multimodal-local-stack.md)       | Vision models, audio pipeline, video understanding status            |
| 05  | [Mission Control Dashboard](05-mission-control-dashboard.md) | Next.js dashboard: architecture, API routes, features, running       |
| 06  | [Extraction Service Evals](06-extraction-service-evals.md)   | promptfoo eval suite, Ollama vs Gemini comparison, Python sidecar    |
| 07  | [Model Recommendations](07-model-recommendations.md)         | Tiered model guide by use case, size, and quality for M4 Pro 48 GB   |
| 08  | [Troubleshooting & Corporate Proxy](08-troubleshooting.md)   | Common issues, Forcepoint proxy workarounds, MLX warnings            |
| 09  | [Environment Variables](09-environment-variables.md)         | All config vars for Ollama, Whisper, dashboard, evals                |

---

## Directory Structure

```
__LOCAL_LLMs/
├── README.md                        ← top-level README (moved from LOCAL_LLMs_setup_mac_m4_48gb.md)
├── docs/
│   ├── README.md                    ← this index (you are here)
│   ├── 01-hardware-and-prerequisites.md
│   ├── 02-ollama-setup-and-models.md
│   ├── 03-whisper-cpp-setup.md
│   ├── 04-multimodal-local-stack.md
│   ├── 05-mission-control-dashboard.md
│   ├── 06-extraction-service-evals.md
│   ├── 07-model-recommendations.md
│   ├── 08-troubleshooting.md
│   └── 09-environment-variables.md
├── dashboard/                       ← Next.js Mission Control app (port 3100)
│   ├── src/app/page.tsx             ← main dashboard UI
│   ├── src/app/api/ollama/route.ts  ← Ollama API proxy (list, load, unload, generate)
│   ├── src/app/api/whisper/route.ts ← Whisper binary/model discovery
│   └── src/app/api/system/route.ts  ← System info (chip, RAM, disk, brew)
└── LOCAL_LLMs_setup_mac_m4_48gb.md  ← original doc (preserved, see docs/ for latest)
```
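
The comment on `src/app/api/ollama/route.ts` names four operations: list, load, unload, and generate. One way these could map onto the Ollama REST API is sketched below. This is an assumption based on the public API docs, not a description of what the route actually does. The helper names and `OLLAMA_URL` are hypothetical, and the JSON bodies are built with `printf`, so they are only safe for model names that contain no quotes or backslashes.

```bash
# Assumed base URL: Ollama's default listen address.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# JSON bodies for POST /api/generate (hypothetical helpers).
# A request with a model and no prompt loads the model into memory.
load_body()   { printf '{"model":"%s"}' "$1"; }
# keep_alive 0 asks Ollama to unload the model right away.
unload_body() { printf '{"model":"%s","keep_alive":0}' "$1"; }

# POST a JSON body to /api/generate and print the response.
post_generate() {
  curl --silent --fail "${OLLAMA_URL}/api/generate" \
    -H 'Content-Type: application/json' -d "$1"
}

# List local models (GET /api/tags).
list_models() {
  curl --silent --fail "${OLLAMA_URL}/api/tags"
}

# Example calls, left commented out so that sourcing this file has no side effects:
# list_models
# post_generate "$(load_body qwen2.5-coder:32b)"
# post_generate "$(unload_body qwen2.5-coder:32b)"
# post_generate '{"model":"llama3.1:8b","prompt":"Say hello","stream":false}'
```

Setting `"stream": false` in the last call returns a single JSON object instead of a stream of partial responses, which is easier to handle in a simple proxy.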

---

## Current Installation Status (2026-02-19)

| Component                           | Version    | Status                        | Disk Usage |
| ----------------------------------- | ---------- | ----------------------------- | ---------- |
| Ollama                              | 0.16.2     | ✅ Installed via brew         | —          |
| qwen2.5-coder:32b                   | —          | ✅ Downloaded                 | 19 GB      |
| llama3.1:8b                         | —          | ✅ Downloaded                 | 4.9 GB     |
| whisper-cpp                         | 1.8.3      | ✅ Installed via brew         | 9.6 MB     |
| whisper model (ggml-large-v3-turbo) | —          | ❌ Blocked by corporate proxy | —          |
| ffmpeg                              | 8.0.1      | ✅ Installed via brew         | 53.3 MB    |
| Mission Control Dashboard           | Next.js 16 | ✅ Built, runs on :3100       | —          |
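
The table above is a snapshot from the date in the heading. To re-check the toolchain rows on the machine, something like the following could be used. `check_tool` is a hypothetical helper, and the Homebrew formula names are assumed to match the component names in the table.

```bash
# Print "name: installed" or "name: missing" for a CLI tool on PATH.
check_tool() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "$1: installed"
  else
    echo "$1: missing"
  fi
}

for tool in ollama ffmpeg; do
  check_tool "$tool"
done

# With Homebrew available, print the installed formula versions.
# '|| true' keeps the script going if a formula is not installed.
if command -v brew > /dev/null 2>&1; then
  brew list --versions ollama whisper-cpp ffmpeg || true
fi
```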

---

## Related Resources

- **Extraction service evals:** `services/extraction-service/evals/`
- **Ollama REST API docs:** https://github.com/ollama/ollama/blob/main/docs/api.md
- **Whisper.cpp:** https://github.com/ggerganov/whisper.cpp
- **Whisper.cpp GGML models (Hugging Face):** https://huggingface.co/ggerganov/whisper.cpp/tree/main