saravanakumardb1
71a7623553
docs(local-llms): expand installed models table with parameters and quantization
- Add Parameters, Quantization, and Status columns to models table
- qwen2.5-coder:32b: 32.8B params, Q4_K_M, 18.5 GB disk
- llama3.1:8b: 8B params, Q4_K_M, 4.9 GB disk (confirmed via ollama API)
2026-02-19 16:05:42 -08:00
saravanakumardb1
80f794dee7
docs(local-llm): add Ollama setup, extraction evals, and env vars reference
- docs/02-ollama-setup-and-models.md: installation, server config, memory management,
  idle timeout, manual load/unload, OpenAI-compatible API, native API reference,
  performance tuning flags (flash attention, KV cache)
- docs/06-extraction-service-evals.md: promptfoo eval suite against Ollama, 19 cases
  across 5 tasks, assertion patterns for JSON string output, Python sidecar config
- docs/09-environment-variables.md: comprehensive var reference for Ollama server,
  evals, Python sidecar, dashboard, whisper CLI flags, proxy/network settings
2026-02-19 13:01:05 -08:00