Commit Graph

8 Commits

Author SHA1 Message Date
saravanakumardb1
b77afce9ae docs(local-llm): add Mission Control dashboard documentation
- docs/05-mission-control-dashboard.md: complete dashboard reference with
  architecture diagram, API route docs (request/response examples),
  UI feature descriptions, design tokens table, v1/v2 changelog,
  and future improvements roadmap
2026-02-19 13:03:30 -08:00
saravanakumardb1
970b565026 fix(local-llm): dashboard v2 — streaming prompts, model management, perf fixes
Bug fixes:
- Fix Google Fonts build error (corporate proxy blocks fonts.gstatic.com)
  by removing Geist font imports and switching to system font stack
- Fix system API 7.6s latency by caching static info (chip, GPU, brew)
  and adding timeouts to shell commands; it now responds in ~50ms
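
The latency fix above can be pictured with a small sketch. The helper names `cacheAsync` and `withTimeout` are hypothetical and not taken from the dashboard source. The sketch assumes the static values come from async loaders and that a fallback value is acceptable when a shell command hangs.

```typescript
// Hypothetical helper (not from the dashboard source): run `load` at most once
// per TTL window and hand every caller the same in-flight or settled promise.
function cacheAsync<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let entry: { value: Promise<T>; expiresAt: number } | undefined;
  return () => {
    const t = now();
    if (entry === undefined || t >= entry.expiresAt) {
      entry = { value: load(), expiresAt: t + ttlMs };
    }
    return entry.value;
  };
}

// Hypothetical helper: resolve with `fallback` if `work` rejects or takes
// longer than `ms`, so one slow shell command cannot stall the whole response.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      () => { clearTimeout(timer); resolve(fallback); },
    );
  });
}
```

A route handler could then wrap each slow lookup once, for example `cacheAsync(() => withTimeout(readChipInfo(), 2000, "unknown"), 60_000)`, where `readChipInfo` and both durations are placeholders rather than values from the commit.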

New features:
- Streaming prompt responses via NDJSON proxy (/api/ollama/stream)
  with typing cursor animation and auto-scroll
- Model pull UI: input field + button to download new models
- Model delete with two-step confirmation dialog
- VRAM usage and expiry time display for loaded models
- Toast notifications (success/error/info) with slide-in animation
- Copy response button in prompt modal
- Escape key and backdrop click both close modals
- Pull/delete/show actions added to Ollama API route
2026-02-19 13:03:11 -08:00
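
The NDJSON streaming mentioned in this commit depends on splitting a byte stream into complete JSON lines, because a network chunk can end in the middle of a line. The sketch below shows one way to do that. `NdjsonBuffer` and `appendChunks` are hypothetical names, and it is an assumption that the `/api/ollama/stream` proxy forwards Ollama's lines unchanged (objects with a `response` fragment and a `done` flag).

```typescript
// Shape of one streamed line. Ollama's /api/generate emits objects with a
// `response` text fragment and a `done` flag; other fields are ignored here.
interface StreamChunk {
  response?: string;
  done?: boolean;
}

// Hypothetical helper (not from the dashboard source): buffers raw text and
// returns only the complete NDJSON lines seen so far.
class NdjsonBuffer {
  private pending = "";

  push(text: string): StreamChunk[] {
    this.pending += text;
    const lines = this.pending.split("\n");
    // The last element is either "" or an incomplete line; keep it for later.
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as StreamChunk);
  }
}

// Append each fragment to the text currently shown in the prompt modal.
function appendChunks(current: string, chunks: StreamChunk[]): string {
  return chunks.reduce((text, chunk) => text + (chunk.response ?? ""), current);
}
```

In a browser client, each decoded chunk from the response body reader would be passed to `push`, and the UI would stop reading once a chunk with `done: true` arrives.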
saravanakumardb1
2565714c52 feat(local-llm): add Mission Control dashboard v1
Next.js 16 dashboard for monitoring and managing the local LLM stack.
Runs on port 3100 with dark theme using ByteLyst design tokens.

API routes:
- GET/POST /api/ollama — model list, running status, load/unload/generate
- GET /api/whisper — binary discovery, GGML model inventory
- GET /api/system — chip info, RAM/disk usage, brew package versions

Dashboard UI:
- Top stats row: Ollama status, model count, Whisper status, RAM usage
- Ollama models panel with load/unload actions, LOADED badge, details
- System panel with progress bars for RAM and disk
- Whisper.cpp panel with binary list and model inventory
- Brew packages panel with version tracking
- Basic prompt modal with Cmd+Enter shortcut
- Auto-refresh every 15 seconds

Also excludes __LOCAL_LLMs/ from root ESLint config (dashboard has its
own config and uses browser globals not available in Node.js context).

Tech: Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript
2026-02-19 13:02:48 -08:00
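
The LOADED badge in the models panel implies joining two lists: installed models and models currently in memory. A minimal sketch of that join follows. It assumes the route reads Ollama's `GET /api/tags` (installed) and `GET /api/ps` (running) endpoints; `buildModelRows` and the row shape are hypothetical and not taken from the dashboard source.

```typescript
// Subsets of the Ollama responses used here: /api/tags lists installed
// models, /api/ps lists models currently loaded in memory.
interface InstalledModel {
  name: string;
  size: number; // bytes on disk
}
interface RunningModel {
  name: string;
}

// Hypothetical row shape for the models panel.
interface ModelRow {
  name: string;
  sizeBytes: number;
  loaded: boolean; // drives the LOADED badge
}

function buildModelRows(
  installed: InstalledModel[],
  running: RunningModel[],
): ModelRow[] {
  // A Set gives constant-time membership checks per installed model.
  const runningNames = new Set(running.map((model) => model.name));
  return installed.map((model) => ({
    name: model.name,
    sizeBytes: model.size,
    loaded: runningNames.has(model.name),
  }));
}
```

Calling this on every 15-second refresh keeps the badge in step with Ollama's idle unloading without the dashboard tracking load state itself.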
saravanakumardb1
0c4210f5ff docs(local-llm): update original setup doc to redirect to docs/ structure
- LOCAL_LLMs_setup_mac_m4_48gb.md: replace 279-line monolith with quick start
  + documentation index linking to 9 topic-specific docs in docs/
- Add .gitignore for extraction-service eval logs (generated artifacts)
2026-02-19 13:01:35 -08:00
saravanakumardb1
3561deee52 docs(local-llm): add multimodal stack, model recommendations, and troubleshooting
- docs/04-multimodal-local-stack.md: vision models (llava, qwen2.5vl, moondream2),
  audio pipeline architecture, video understanding status, Kimi alternatives,
  complete local AI stack diagram
- docs/07-model-recommendations.md: 6-tier model guide (coding, fast, general,
  reasoning, vision, embeddings), recommended 10-model stack for M4 Pro 48GB,
  use-case quick reference, hardware scaling guide
- docs/08-troubleshooting.md: corporate Forcepoint proxy workarounds, MLX warning,
  JSON parse errors, slow inference, whisper-cli vs whisper-cpp naming, audio
  format conversion, proxy-corrupted downloads detection
2026-02-19 13:01:22 -08:00
saravanakumardb1
80f794dee7 docs(local-llm): add Ollama setup, extraction evals, and env vars reference
- docs/02-ollama-setup-and-models.md: installation, server config, memory management,
  idle timeout, manual load/unload, OpenAI-compatible API, native API reference,
  performance tuning flags (flash attention, KV cache)
- docs/06-extraction-service-evals.md: promptfoo eval suite against Ollama, 19 cases
  across 5 tasks, assertion patterns for JSON string output, Python sidecar config
- docs/09-environment-variables.md: comprehensive var reference for Ollama server,
  evals, Python sidecar, dashboard, whisper CLI flags, proxy/network settings
2026-02-19 13:01:05 -08:00
saravanakumardb1
464ffb92ec docs(local-llm): add docs index, hardware specs, and whisper-cpp setup
- docs/README.md: documentation index with quick start, file structure, status table
- docs/01-hardware-and-prerequisites.md: M4 Pro 48GB specs, toolchain inventory,
  disk budget, network environment (Forcepoint proxy details)
- docs/03-whisper-cpp-setup.md: whisper-cpp installation, GGML model guide,
  ffmpeg audio conversion, CLI usage, real-time streaming, LysnrAI integration
2026-02-19 13:00:48 -08:00
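
For the ffmpeg audio conversion step, the whisper.cpp README examples convert input to 16 kHz mono 16-bit PCM WAV. The sketch below only builds the argument list for that conversion. `ffmpegArgsForWhisper` is a hypothetical helper, and whether docs/03-whisper-cpp-setup.md uses exactly these flags is an assumption.

```typescript
// Hypothetical helper: arguments for converting any audio file to the
// 16 kHz, mono, 16-bit PCM WAV format used in the whisper.cpp README.
function ffmpegArgsForWhisper(inputPath: string, outputPath: string): string[] {
  return [
    "-i", inputPath,      // source audio in any format ffmpeg can decode
    "-ar", "16000",       // resample to 16 kHz
    "-ac", "1",           // downmix to mono
    "-c:a", "pcm_s16le",  // 16-bit little-endian PCM
    outputPath,
  ];
}
```

In Node.js the array can be passed to `execFile("ffmpeg", args)` from `node:child_process`, which avoids shell quoting problems with file names that contain spaces.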
saravanakumardb1
dd23f6cf96 docs: add local LLM setup guide for Apple Silicon Mac (48GB)
- Add __LOCAL_LLMs/LOCAL_LLMs_setup_mac_m4_48gb.md: comprehensive reference
  for running Ollama on the dev Mac covering installation (v0.16.2 via brew),
  corp proxy handling (AT&T Forcepoint), OpenAI-compat API usage examples
  (curl/Node/Python), extraction-service eval integration, Python sidecar
  wiring, model recommendations by use case, troubleshooting, and env var
  reference
- Models documented: llama3.1:8b (4.9GB, default evals), qwen2.5-coder:32b
  (19GB, code gen / Swift / TS)
2026-02-19 12:19:44 -08:00
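
The OpenAI-compatible usage this commit documents can be sketched as a request builder. It assumes Ollama's default address `http://localhost:11434` and its `/v1/chat/completions` endpoint; `buildChatRequest` is a hypothetical helper, and the default model is the `llama3.1:8b` entry listed above.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds the URL and fetch options for a single
// non-streaming chat completion against a local Ollama server.
function buildChatRequest(
  prompt: string,
  model: string = "llama3.1:8b",
  baseUrl: string = "http://localhost:11434", // Ollama's default listen address
) {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return {
    url: `${baseUrl}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}
```

The result can be passed to `fetch(url, init)`, and the reply text is then read from `choices[0].message.content` in the JSON response, following the OpenAI response format.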