New features:
- B2/F1: Streaming model pull with real-time progress bar. New
/api/ollama/pull/route.ts pipes the NDJSON stream from Ollama
(stream: true). UI shows status, completed/total bytes, and
percentage during download.
- F6: Token/s metrics after prompt generation. Parses eval_count and
eval_duration from the final NDJSON chunk. Displays tok/s, total
tokens, and duration in the prompt modal footer.
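
A minimal sketch of the parsing behind both features above, not the code from this commit. It assumes each NDJSON line arrives complete (a real stream reader must buffer partial lines), that pull chunks carry `total` and `completed` byte counts, and that `eval_duration` is in nanoseconds as documented for the Ollama API. The type and function names (`PullChunk`, `GenerateFinalChunk`, `parseNdjson`, `pullPercent`, `tokensPerSecond`) are hypothetical.

```typescript
// Only the fields this sketch reads; real chunks carry more.
interface PullChunk {
  status: string;
  total?: number;     // total bytes for the current layer
  completed?: number; // bytes downloaded so far
}

interface GenerateFinalChunk {
  done: boolean;
  eval_count?: number;    // tokens generated
  eval_duration?: number; // generation time, assumed nanoseconds
}

// Split an NDJSON buffer into parsed objects, skipping blank lines.
// Assumes every line is a complete JSON document.
function parseNdjson<T>(text: string): T[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T);
}

// Percentage for the progress bar; null when the chunk has no byte counts
// (for example a status-only chunk).
function pullPercent(chunk: PullChunk): number | null {
  if (!chunk.total || chunk.completed === undefined) return null;
  return Math.min(100, Math.round((chunk.completed / chunk.total) * 100));
}

// Tokens per second from the final generate chunk; null if metrics are absent.
function tokensPerSecond(chunk: GenerateFinalChunk): number | null {
  if (!chunk.eval_count || !chunk.eval_duration) return null;
  return chunk.eval_count / (chunk.eval_duration / 1e9);
}
```

Returning null rather than 0 lets the UI tell "no metrics yet" apart from a real zero.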
Bug fixes:
- B7: Parse vm_stat page size from output instead of hardcoding 16384.
Reads 'page size of N bytes' from the first line for portability.
- B8: Whisper model discovery now scans multiple directories:
WHISPER_MODELS_DIR env var, ~/whisper-models,
/opt/homebrew/share/whisper-cpp/models/, ~/.cache/whisper/.
Returns the first directory that contains .bin files.
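
A rough sketch of the two fixes above, under stated assumptions and not the code from this commit. `parsePageSize` and `firstDirWithBinFiles` are hypothetical names. The directory lister is injected so the sketch stays free of file-system calls; in practice it could wrap something like `fs.readdirSync`. Returning null when the vm_stat header is not found is also an assumption, and the caller decides on the fallback.

```typescript
// Extract the page size from the first line of vm_stat output, e.g.
// "Mach Virtual Memory Statistics: (page size of 16384 bytes)".
// Returns null if the header does not match.
function parsePageSize(vmStatOutput: string): number | null {
  const firstLine = vmStatOutput.split("\n")[0] ?? "";
  const match = firstLine.match(/page size of (\d+) bytes/);
  return match ? Number(match[1]) : null;
}

// Return the first candidate directory that contains at least one .bin file.
// listDir is injected (hypothetical); directories that cannot be read are skipped.
function firstDirWithBinFiles(
  dirs: string[],
  listDir: (dir: string) => string[],
): string | null {
  for (const dir of dirs) {
    let entries: string[];
    try {
      entries = listDir(dir);
    } catch {
      continue; // missing or unreadable directory
    }
    if (entries.some((name) => name.endsWith(".bin"))) return dir;
  }
  return null;
}
```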