08 — Troubleshooting & Corporate Proxy
Common issues, Forcepoint proxy workarounds, MLX warnings, and fixes.
Corporate Proxy (Forcepoint CertChecker)
This machine is behind an AT&T Forcepoint proxy that performs SSL deep packet inspection.
Proxy Details
| Setting | Value |
|---|---|
| Proxy URL | http://cso.proxy.att.com:8080/ |
| Agent | Forcepoint CertChecker |
| Impact | Intercepts HTTPS, replaces TLS certificates |
| Env vars | HTTP_PROXY, HTTPS_PROXY set automatically |
What Works Through Proxy
| Tool | Status | Notes |
|---|---|---|
| `ollama pull` | ✅ Works | Ollama handles proxy natively |
| `brew install` | ✅ Works | Homebrew handles proxy |
| `npm install` | ✅ Works | With `NODE_TLS_REJECT_UNAUTHORIZED=0` |
| `curl` to Hugging Face | ❌ Blocked | Returns 19 KB HTML redirect page |
| `curl -k` to Hugging Face | ❌ Blocked | Still intercepted even with `-k` |
| `python requests` to HF | ❌ Blocked | `SSL: CERTIFICATE_VERIFY_FAILED` |
| `huggingface_hub` download | ❌ Blocked | Falls back to cached (broken) files |
Workaround: Download Off-Network
For Hugging Face model downloads (e.g., Whisper GGML files):
- Disconnect from corporate VPN/Wi-Fi
- Connect to personal hotspot or home Wi-Fi
- Run the download:
  curl -L -o ~/whisper-models/ggml-large-v3-turbo.bin \
    https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin
- Reconnect to the corporate network. The model file stays on local disk, so this is a one-time download.
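A small guard can stop the download from being attempted while still on the corporate network. The sketch below is illustrative only: `behind_proxy` and `fetch_off_network` are hypothetical helper names, and it assumes the proxy environment variables are set only while on the corporate network. If they stay set off-network too, unset them first.

```shell
# Returns 0 if any proxy variable is set in the current shell.
behind_proxy() {
  [ -n "${HTTPS_PROXY:-}${https_proxy:-}${HTTP_PROXY:-}${http_proxy:-}" ]
}

# Download URL ($1) to destination ($2) only when no proxy is configured.
# Hypothetical wrapper around curl; --fail makes HTTP errors return non-zero.
fetch_off_network() {
  if behind_proxy; then
    echo "proxy variables are set; switch networks before downloading" >&2
    return 1
  fi
  curl -L --fail -o "$2" "$1"
}
```

Example call: `fetch_off_network https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin ~/whisper-models/ggml-large-v3-turbo.bin`.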
Detecting a Proxy-Corrupted Download
If a download completed but the file is suspiciously small:
# Check file size (should be ~1.6 GB for large-v3-turbo)
ls -lh ~/whisper-models/ggml-large-v3-turbo.bin
# Check file type (should NOT be HTML)
file ~/whisper-models/ggml-large-v3-turbo.bin
# If it says "HTML document text" — delete and re-download off-network
rm ~/whisper-models/ggml-large-v3-turbo.bin
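The manual checks above can be folded into one helper. This is a minimal sketch, not part of whisper.cpp or Ollama: `looks_like_html` and `check_model_file` are hypothetical names, the 1 KiB sniff window is an arbitrary choice, and it assumes an intercepted download begins with HTML markup.

```shell
# Returns 0 if the file looks like an HTML page rather than model weights.
looks_like_html() {
  # Inspect only the first 1 KiB; LC_ALL=C keeps grep byte-oriented on binary data.
  head -c 1024 "$1" | LC_ALL=C grep -qiE '<!doctype html|<html'
}

# Exit codes: 0 = looks fine, 1 = HTML (file deleted), 2 = missing or empty.
check_model_file() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1"
    return 2
  fi
  if looks_like_html "$1"; then
    echo "corrupted (HTML): $1"
    rm -f "$1"
    return 1
  fi
  echo "ok: $1"
}
```

Example call: `check_model_file ~/whisper-models/ggml-large-v3-turbo.bin`. A passing result only rules out an HTML page; it does not verify the model's checksum.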
Ollama Issues
MLX dynamic library not available
WARN MLX dynamic library not available error="failed to load MLX dynamic library"
- Severity: Harmless
- Cause: Ollama looks for the Apple MLX framework, which is not installed
- Impact: None. Ollama falls back to the Metal backend, which is fully functional on M4 Pro
- Fix: None needed. Ignore the warning.
Model Pull Fails (SSL / Proxy)
# The Ollama server performs the download, so set NO_PROXY on the server, not on `ollama pull`
NO_PROXY="ollama.com,registry.ollama.ai" ollama serve
# Then, in another terminal:
ollama pull llama3.1:8b
Ollama Not Responding
# Check if running
curl http://localhost:11434/api/tags
# Restart
brew services restart ollama
# or
pkill ollama && ollama serve
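After a restart the server can take a few seconds to accept requests. The following sketch polls the same `/api/tags` endpoint until it answers. `ollama_up` and `wait_for_ollama` are hypothetical helper names, and the 3-second timeout and 1-second interval are arbitrary choices.

```shell
# Base URL of the local Ollama API; 11434 is the port used in this guide.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Returns 0 if the API answers within 3 seconds, non-zero otherwise.
ollama_up() {
  curl -sf --max-time 3 "$OLLAMA_URL/api/tags" >/dev/null
}

# Poll once per second until the server answers or the retry budget
# (first argument, default 10) is used up.
wait_for_ollama() {
  tries="${1:-10}"
  while [ "$tries" -gt 0 ]; do
    ollama_up && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

Example call: `brew services restart ollama && wait_for_ollama 15 && echo "Ollama is ready"`.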
JSON Parse Errors in Evals
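The model returned JSON wrapped in a markdown fence (```json ... ```). Setting "format": "json" in the Ollama API request constrains the output to valid JSON; otherwise, add this to your prompt: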
Return ONLY a valid JSON object — no markdown, no backticks, no explanation.
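If a model still wraps its answer despite the prompt, the fence lines can be stripped before parsing. This is a minimal sketch under the assumption that the fence markers sit on their own lines. `strip_json_fence` is a hypothetical helper name, and it will not repair other kinds of malformed output.

```shell
# Read model output on stdin and drop any line that starts with three
# backticks (optionally indented), leaving the JSON body untouched.
strip_json_fence() {
  sed -e '/^[[:space:]]*`\{3\}/d'
}
```

Example call: `strip_json_fence < model-output.txt > clean.json`, where `model-output.txt` is a placeholder for wherever the eval stores the raw response.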
Slow Inference
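Check Activity Monitor, or run ollama ps, whose PROCESSOR column shows the CPU/GPU split for each loaded model. Ollama should be using the GPU (Metal). If it is running CPU-only or is still slow: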
# Restart with performance flags
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve
Model Won't Unload from RAM
# Force unload via API (keep_alive 0 unloads the model immediately)
curl http://localhost:11434/api/generate -d '{"model": "MODEL_NAME", "keep_alive": 0}'
# Or restart Ollama entirely
brew services restart ollama
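When several models are loaded, unloading them one at a time is tedious. The sketch below loops over whatever `ollama ps` reports. It assumes an Ollama version recent enough to include the `ollama stop` subcommand, and `unload_all_models` is a hypothetical helper name.

```shell
# Unload every model that `ollama ps` currently lists.
unload_all_models() {
  # Skip the header row and blank lines; the first column is the model name.
  ollama ps | awk 'NR > 1 && NF { print $1 }' | while read -r model; do
    ollama stop "$model"
  done
}
```

Run `unload_all_models` and then `ollama ps` again to confirm nothing is still loaded.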
Disk Space Running Low
# Check Ollama disk usage
du -sh ~/.ollama/models/
# List models with sizes
ollama list
# Remove models you don't need
ollama rm <model-name>
Whisper.cpp Issues
command not found: whisper-cpp
The binary is named whisper-cli, NOT whisper-cpp:
# Wrong
whisper-cpp --model ...
# Correct
whisper-cli --model ...
Full list of binaries: ls /opt/homebrew/bin/whisper-*
Audio Format Not Supported
Whisper.cpp expects 16-bit WAV input sampled at 16 kHz. Convert first:
# m4a → wav (16kHz mono)
ffmpeg -i input.m4a -ar 16000 -ac 1 output.wav
# mp3 → wav
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
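The two conversions differ only in the input file, so a single helper can cover any format ffmpeg can read. This is a sketch: `to_wav16k` is a hypothetical name, it assumes the input filename has an extension, and it adds `-c:a pcm_s16le` to make the 16-bit sample format explicit.

```shell
# Convert an audio file to 16 kHz mono 16-bit PCM WAV next to the input,
# e.g. talk.m4a -> talk.wav, and print the output path on success.
to_wav16k() {
  src="$1"
  dst="${src%.*}.wav"
  # Avoid overwriting the input when it is already a .wav file.
  [ "$dst" = "$src" ] && dst="${src%.*}.16k.wav"
  # -y overwrites an existing output file without prompting.
  ffmpeg -loglevel error -y -i "$src" -ar 16000 -ac 1 -c:a pcm_s16le "$dst" \
    && echo "$dst"
}
```

Example call: `whisper-cli --model ~/whisper-models/ggml-large-v3-turbo.bin "$(to_wav16k input.m4a)"`.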
Model File Is HTML (Proxy-Corrupted)
file ~/whisper-models/ggml-large-v3-turbo.bin
# If output says "HTML document text" — it's corrupted
rm ~/whisper-models/ggml-large-v3-turbo.bin
# Re-download off corporate network
ffmpeg: command not found
brew install ffmpeg
Dashboard Issues
Port Conflict
# Default port 3100, change if needed
npm run dev -- -p 3101
Lockfile Warning
Warning: Next.js inferred your workspace root, but it may not be correct.
We detected multiple lockfiles...
This is harmless: the dashboard has its own package-lock.json inside the pnpm monorepo. To silence the warning, set turbopack.root in next.config.ts.
API Routes Return Empty Data
- Ollama offline: Start with `ollama serve` or `brew services start ollama`
- Whisper not installed: Run `brew install whisper-cpp`
- No models: Check `ollama list` and `ls ~/whisper-models/`
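The three causes above can be checked in one pass before opening the dashboard. The following is a rough sketch: `preflight` is a hypothetical helper, and it assumes Whisper models are `.bin` files in `~/whisper-models/` unless another directory is passed. It does not run `ollama list`, so an Ollama install with no pulled models still passes.

```shell
# Print one line per failed check; return 0 only if every check passes.
preflight() {
  models_dir="${1:-$HOME/whisper-models}"
  rc=0
  # 1. Ollama API reachable on the default port
  curl -sf --max-time 3 http://localhost:11434/api/tags >/dev/null \
    || { echo "Ollama offline"; rc=1; }
  # 2. whisper-cli binary on PATH
  command -v whisper-cli >/dev/null 2>&1 \
    || { echo "whisper-cli not installed"; rc=1; }
  # 3. At least one .bin model file in the models directory
  find "$models_dir" -maxdepth 1 -name '*.bin' 2>/dev/null | grep -q . \
    || { echo "no Whisper models in $models_dir"; rc=1; }
  return "$rc"
}
```

Example call: `preflight && npm run dev`.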
General macOS Issues
Accessibility / Permissions
Some tools (e.g., whisper-stream for mic access) need explicit macOS permissions:
System Settings → Privacy & Security → Microphone → enable Terminal / your IDE
Node.js TLS Warning
Warning: Setting NODE_TLS_REJECT_UNAUTHORIZED to '0' makes TLS connections insecure
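This variable is set in the corporate environment so that Node.js accepts the certificates re-signed by the Forcepoint proxy. The warning is harmless for local dev, but the setting disables certificate verification for every HTTPS request Node.js makes, so keep it out of production environments.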