saravanakumardb1 3561deee52 docs(local-llm): add multimodal stack, model recommendations, and troubleshooting

- docs/04-multimodal-local-stack.md: vision models (llava, qwen2.5vl, moondream2),
  audio pipeline architecture, video understanding status, Kimi alternatives,
  complete local AI stack diagram
- docs/07-model-recommendations.md: 6-tier model guide (coding, fast, general,
  reasoning, vision, embeddings), recommended 10-model stack for M4 Pro 48GB,
  use-case quick reference, hardware scaling guide
- docs/08-troubleshooting.md: corporate Forcepoint proxy workarounds, MLX warning,
  JSON parse errors, slow inference, whisper-cli vs whisper-cpp naming, audio
  format conversion, proxy-corrupted downloads detection

2026-02-19 13:01:22 -08:00


08 — Troubleshooting & Corporate Proxy

Common issues, Forcepoint proxy workarounds, MLX warnings, and fixes.

Corporate Proxy (Forcepoint CertChecker)

This machine is behind an AT&T Forcepoint proxy that performs SSL deep packet inspection.

Proxy Details

| Setting | Value |
|---|---|
| Proxy URL | `http://cso.proxy.att.com:8080/` |
| Agent | Forcepoint CertChecker |
| Impact | Intercepts HTTPS, replaces TLS certificates |
| Env vars | `HTTP_PROXY`, `HTTPS_PROXY` set automatically |
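To confirm which proxy variables a given shell actually inherited, a small helper can print them. This is a minimal sketch: the function name `show_proxy_env` is hypothetical, and the lowercase variants are included on the assumption that some tools read those instead of the uppercase names.

```shell
# Hypothetical helper: print every proxy-related variable that is set
# and non-empty, one NAME=value pair per line.
show_proxy_env() {
  local name value
  for name in HTTP_PROXY HTTPS_PROXY NO_PROXY http_proxy https_proxy no_proxy; do
    # Indirect lookup that works in sh, bash, and zsh
    eval "value=\${$name-}"
    if [ -n "$value" ]; then
      printf '%s=%s\n' "$name" "$value"
    fi
  done
}
```

Running `show_proxy_env` in a fresh terminal shows what Ollama, Homebrew, and curl will see when launched from that shell.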

What Works Through Proxy

| Tool | Status | Notes |
|---|---|---|
| `ollama pull` | ✅ Works | Ollama handles proxy natively |
| `brew install` | ✅ Works | Homebrew handles proxy |
| `npm install` | ✅ Works | With `NODE_TLS_REJECT_UNAUTHORIZED=0` |
| `curl` to Hugging Face | ❌ Blocked | Returns 19 KB HTML redirect page |
| `curl -k` to Hugging Face | ❌ Blocked | Still intercepted even with `-k` |
| `python requests` to HF | ❌ Blocked | Fails with `CERTIFICATE_VERIFY_FAILED` |
| `huggingface_hub` download | ❌ Blocked | Falls back to cached (broken) files |

Workaround: Download Off-Network

For Hugging Face model downloads (e.g., Whisper GGML files):

1. Disconnect from the corporate VPN/Wi-Fi.
2. Connect to a personal hotspot or home Wi-Fi.
3. Run the download:

curl -L -o ~/whisper-models/ggml-large-v3-turbo.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin

4. Reconnect to the corporate network. The model file stays on disk, so the download is a one-time step.

Detecting a Proxy-Corrupted Download

If a download completed but the file is suspiciously small:

# Check file size (should be ~1.6 GB for large-v3-turbo)
ls -lh ~/whisper-models/ggml-large-v3-turbo.bin

# Check file type (should NOT be HTML)
file ~/whisper-models/ggml-large-v3-turbo.bin

# If it says "HTML document text" — delete and re-download off-network
rm ~/whisper-models/ggml-large-v3-turbo.bin
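The two manual checks above can be combined into one function for use in scripts. This is a sketch under stated assumptions: the name `looks_proxy_corrupted` is hypothetical, the 1 MB default threshold is an arbitrary guess (real model files are far larger), and it inspects the first bytes with `head` and `grep` rather than calling `file`.

```shell
# Hypothetical helper: exit status 0 means "this file looks unusable".
# Usage: looks_proxy_corrupted FILE [MIN_BYTES]
looks_proxy_corrupted() {
  local path="$1"
  local min_bytes="${2:-1000000}"   # assumed threshold, adjust per model
  local size

  # A missing file is treated as unusable
  [ -f "$path" ] || return 0

  # Too small to be a real model file
  size=$(wc -c < "$path" | tr -d ' ')
  if [ "$size" -lt "$min_bytes" ]; then
    return 0
  fi

  # Proxy block pages begin with an HTML doctype or <html> tag
  if head -c 512 "$path" | LC_ALL=C grep -qiE '<!doctype html|<html'; then
    return 0
  fi
  return 1
}
```

For example, `looks_proxy_corrupted ~/whisper-models/ggml-large-v3-turbo.bin && echo "re-download off-network"`.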

Ollama Issues

`MLX dynamic library not available`

WARN MLX dynamic library not available error="failed to load MLX dynamic library"

Severity: Harmless
Cause: Ollama looks for the Apple MLX framework, which is not installed
Impact: None. Ollama falls back to the Metal backend, which is fully functional on M4 Pro
Fix: None needed. Ignore the warning.

Model Pull Fails (SSL / Proxy)

# Try bypassing proxy for Ollama registry
NO_PROXY="ollama.com,registry.ollama.ai" ollama pull llama3.1:8b

Ollama Not Responding

# Check if running
curl http://localhost:11434/api/tags

# Restart
brew services restart ollama
# or
pkill ollama && ollama serve
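The check and restart above can be wrapped so that a restart only happens when the API is really down. This is a minimal sketch: `ollama_is_up` is a hypothetical name, and `--noproxy '*'` is added on the assumption that the automatically set proxy variables could otherwise reroute a localhost request.

```shell
# Hypothetical helper: exit status 0 if the Ollama API answers within 3 seconds.
# Usage: ollama_is_up [BASE_URL]
ollama_is_up() {
  local base_url="${1:-http://localhost:11434}"
  # --fail turns HTTP errors into a non-zero exit status;
  # --noproxy '*' keeps the request off the corporate proxy.
  curl --silent --fail --max-time 3 --noproxy '*' \
    "$base_url/api/tags" > /dev/null
}
```

Usage: `ollama_is_up || brew services restart ollama`.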

JSON Parse Errors in Evals

The model returned JSON wrapped in a markdown code fence (```json ... ```), which a strict JSON parser rejects. Add this instruction to your prompt:

Return ONLY a valid JSON object — no markdown, no backticks, no explanation.
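If a model still wraps its output occasionally, the eval script can strip fence lines before parsing. This goes beyond the prompt fix described here and is only a sketch: `strip_code_fences` is a hypothetical name, and it assumes the fence markers sit on their own lines.

```shell
# Hypothetical helper: read a model response on stdin and drop any line
# that starts with a markdown fence marker, leaving the JSON body.
strip_code_fences() {
  sed -e '/^[[:space:]]*```/d'
}
```

The filter sits between the model call and the JSON parser, for example `some_model_call | strip_code_fences | python3 -m json.tool`, where `some_model_call` stands in for whatever produces the response.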

Slow Inference

Check Activity Monitor, or run `ollama ps` and read the PROCESSOR column. Ollama should be using the GPU (Metal). If it is running CPU-only:

# Restart with performance flags
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve

Model Won't Unload from RAM

# Force unload via API
curl http://localhost:11434/api/generate -d '{"model": "MODEL_NAME", "prompt": "", "keep_alive": "0"}'

# Or restart Ollama entirely
brew services restart ollama
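For repeated use, the unload request can be wrapped in a function that takes the model name as an argument. This is a sketch: `unload_payload` and `ollama_unload` are hypothetical names, the payload uses the numeric form `"keep_alive": 0`, and the model name is assumed to contain no characters that need JSON escaping.

```shell
# Hypothetical helper: build the JSON body that asks Ollama to unload a model.
unload_payload() {
  printf '{"model": "%s", "keep_alive": 0}' "$1"
}

# Hypothetical helper: send the unload request to a local Ollama server.
# Usage: ollama_unload MODEL_NAME
ollama_unload() {
  curl --silent --noproxy '*' http://localhost:11434/api/generate \
    -d "$(unload_payload "$1")"
}
```

Usage: `ollama_unload llama3.1:8b`.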

Disk Space Running Low

# Check Ollama disk usage
du -sh ~/.ollama/models/

# List models with sizes
ollama list

# Remove models you don't need
ollama rm <model-name>

Whisper.cpp Issues

`command not found: whisper-cpp`

The Homebrew `whisper-cpp` formula installs the binary as `whisper-cli`, NOT `whisper-cpp`:

# Wrong
whisper-cpp --model ...

# Correct
whisper-cli --model ...

To list every installed binary, run `ls /opt/homebrew/bin/whisper-*`.

Audio Format Not Supported

Whisper.cpp expects WAV input (16 kHz mono in the commands below). Convert other formats first:

# m4a → wav (16kHz mono)
ffmpeg -i input.m4a -ar 16000 -ac 1 output.wav

# mp3 → wav
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
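When several recordings need converting, the same ffmpeg call can be wrapped in a function that derives the output name from the input. This is a minimal sketch: `wav_path_for` and `convert_for_whisper` are hypothetical names, the input is assumed to have a file extension, and `-c:a pcm_s16le` is added to request 16-bit PCM explicitly.

```shell
# Hypothetical helper: swap the input's extension for .wav
# (assumes the file name has an extension).
wav_path_for() {
  printf '%s.wav\n' "${1%.*}"
}

# Hypothetical helper: convert one audio file to 16 kHz mono 16-bit WAV.
# Usage: convert_for_whisper recording.m4a
convert_for_whisper() {
  local src="$1"
  local dst
  dst=$(wav_path_for "$src")
  # -y overwrites an existing output; -nostdin keeps ffmpeg from
  # consuming input when called inside a loop
  ffmpeg -nostdin -loglevel error -y -i "$src" \
    -ar 16000 -ac 1 -c:a pcm_s16le "$dst"
}
```

Usage: `for f in ~/recordings/*.m4a; do convert_for_whisper "$f"; done`, where `~/recordings` is a placeholder directory.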

Model File Is HTML (Proxy-Corrupted)

file ~/whisper-models/ggml-large-v3-turbo.bin
# If output says "HTML document text" — it's corrupted
rm ~/whisper-models/ggml-large-v3-turbo.bin
# Re-download off corporate network

`ffmpeg: command not found`

brew install ffmpeg

Dashboard Issues

Port Conflict

# Default port 3100, change if needed
npm run dev -- -p 3101

Lockfile Warning

Warning: Next.js inferred your workspace root, but it may not be correct.
We detected multiple lockfiles...

This warning is harmless: the dashboard has its own `package-lock.json` inside the pnpm monorepo, so Next.js sees more than one lockfile. It can be silenced by setting `turbopack.root` in `next.config.ts`.

API Routes Return Empty Data

- Ollama offline: Start with `ollama serve` or `brew services start ollama`
- Whisper not installed: Run `brew install whisper-cpp`
- No models: Check `ollama list` and `ls ~/whisper-models/`
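The checklist above can start with a quick test for missing commands before looking at the dashboard itself. This is a sketch: `missing_tools` is a hypothetical name, and the list of tools to check is whatever the caller passes in.

```shell
# Hypothetical helper: print the name of each command that is not on PATH.
# Usage: missing_tools ollama whisper-cli ffmpeg
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Empty output from `missing_tools ollama whisper-cli ffmpeg` means all three are installed; any printed name points at the matching fix in the list.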

General macOS Issues

Accessibility / Permissions

Some tools (e.g., whisper-stream for mic access) need explicit macOS permissions:

System Settings → Privacy & Security → Microphone → enable Terminal / your IDE

Node.js TLS Warning

Warning: Setting NODE_TLS_REJECT_UNAUTHORIZED to '0' makes TLS connections insecure

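This variable is set in the corporate environment so that Node.js accepts the certificates Forcepoint substitutes. The warning is expected during local development, but the setting disables certificate verification for all Node.js TLS connections, so keep it out of production environments.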
