09 — Environment Variables Reference
Environment variables, CLI flags, and paths for Ollama, Whisper.cpp, the Mission Control dashboard, and evals.
Ollama Server
| Variable | Default | Purpose |
|---|---|---|
| `OLLAMA_HOST` | `http://127.0.0.1:11434` | Server bind address (also read by the `ollama` CLI to locate the server) |
| `OLLAMA_MODELS` | `~/.ollama/models` | Model storage path |
| `OLLAMA_KEEP_ALIVE` | `5m` | How long to keep a model loaded after the last request |
| `OLLAMA_FLASH_ATTENTION` | `false` | Set to `1` to enable flash attention (faster, lower memory use) |
| `OLLAMA_KV_CACHE_TYPE` | `f16` | KV cache quantization (`q8_0` = smaller RAM footprint; requires flash attention) |
| `OLLAMA_NUM_PARALLEL` | `1` | Number of concurrent requests per model |
| `OLLAMA_MAX_LOADED_MODELS` | `1` | Max models loaded in RAM simultaneously |
| `OLLAMA_GPU_OVERHEAD` | (auto) | Reserved GPU memory (bytes) |
| `OLLAMA_ORIGINS` | (local origins only) | Allowed CORS origins, comma-separated (`*` allows any origin) |
| `OLLAMA_DEBUG` | `false` | Enable debug logging |
| `HTTP_PROXY` | (system) | HTTP proxy for model downloads |
| `HTTPS_PROXY` | (system) | HTTPS proxy for model downloads |
| `NO_PROXY` | (none) | Hosts that bypass the proxy |
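To see which values the current shell would hand to `ollama serve`, a small helper can print each variable with its fallback. This is a minimal sketch: the function name `show_ollama_env` is hypothetical, the fallbacks are copied from the table above rather than read from Ollama itself, and a server started by a desktop app or service manager may not inherit this shell's environment.

```shell
# Print the effective value of a few Ollama variables.
# Fallbacks mirror the table above; they are not queried from the server.
show_ollama_env() {
  echo "OLLAMA_HOST=${OLLAMA_HOST:-http://127.0.0.1:11434}"
  echo "OLLAMA_MODELS=${OLLAMA_MODELS:-$HOME/.ollama/models}"
  echo "OLLAMA_KEEP_ALIVE=${OLLAMA_KEEP_ALIVE:-5m}"
}

show_ollama_env
```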
Performance Tuning Combo
OLLAMA_FLASH_ATTENTION=1 \
OLLAMA_KV_CACHE_TYPE=q8_0 \
OLLAMA_NUM_PARALLEL=2 \
OLLAMA_KEEP_ALIVE=10m \
ollama serve
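The combo above sets `OLLAMA_FLASH_ATTENTION` and `OLLAMA_KV_CACHE_TYPE` together because, as far as the Ollama FAQ describes it, a quantized KV cache only takes effect when flash attention is enabled. A guard like the following could catch a mismatched pair before the server starts. It is a sketch under that assumption; `check_kv_cache_config` is a hypothetical helper, not part of Ollama.

```shell
# Return 0 if the KV cache setting is usable, 1 otherwise.
# Assumes the cache types f16 (unquantized), q8_0 and q4_0.
check_kv_cache_config() {
  case "${OLLAMA_KV_CACHE_TYPE:-f16}" in
    f16)
      return 0 ;;
    q8_0|q4_0)
      case "${OLLAMA_FLASH_ATTENTION:-0}" in
        1|true) return 0 ;;
        *)
          echo "OLLAMA_KV_CACHE_TYPE=$OLLAMA_KV_CACHE_TYPE needs OLLAMA_FLASH_ATTENTION=1" >&2
          return 1 ;;
      esac ;;
    *)
      echo "Unrecognized OLLAMA_KV_CACHE_TYPE: $OLLAMA_KV_CACHE_TYPE" >&2
      return 1 ;;
  esac
}
```

With the variables exported, `check_kv_cache_config && ollama serve` starts the server only when the pair is consistent.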
| Variable | Default | Purpose |
|---|---|---|
| `OLLAMA_MODEL` | `llama3.1:8b` | Model used by `pnpm eval:ollama` |
| `OLLAMA_BASE_URL` | `http://localhost:11434/v1` | OpenAI-compatible endpoint for promptfoo |
| `EXTRACTION_EVAL_TOKEN` | (none) | Auth token for extraction-service evals |
Usage
# Run evals with a different model
OLLAMA_MODEL=qwen2.5:7b pnpm eval:ollama
# Compare Gemini vs Ollama
EXTRACTION_EVAL_TOKEN=your-token pnpm eval:compare
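Because `OLLAMA_MODEL` is read per invocation, the same eval can be repeated across several models with a loop. The sketch below assumes each model has already been pulled; `run_evals` and the `RUNNER` override (handy for a dry run with `echo`) are hypothetical names introduced for illustration.

```shell
# Run the Ollama eval once per model name given as an argument.
# RUNNER defaults to pnpm; set RUNNER=echo to preview without running.
run_evals() {
  for model in "$@"; do
    echo "== $model =="
    # Stop at the first failing eval so the failure is easy to spot.
    OLLAMA_MODEL="$model" ${RUNNER:-pnpm} eval:ollama || return 1
  done
}
```

For example, `run_evals llama3.1:8b qwen2.5:7b` runs the eval twice, once per model.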
| Variable | Default | Purpose |
|---|---|---|
| `LANGEXTRACT_PROVIDER` | `gemini` | Switch to `openai_compat` for Ollama |
| `LANGEXTRACT_BASE_URL` | (Gemini) | Set to `http://localhost:11434/v1` for Ollama |
| `LANGEXTRACT_API_KEY` | (Gemini key) | Set to `ollama` for local use |
| `LANGEXTRACT_MODEL` | (Gemini model) | Set to `llama3.1:8b` or another preferred model |
Switch to Ollama
export LANGEXTRACT_PROVIDER=openai_compat
export LANGEXTRACT_BASE_URL=http://localhost:11434/v1
export LANGEXTRACT_API_KEY=ollama
export LANGEXTRACT_MODEL=llama3.1:8b
Mission Control Dashboard
| Variable | Default | Purpose |
|---|---|---|
| `OLLAMA_URL` | `http://localhost:11434` | Ollama server URL (used by API routes) |
| `PORT` | `3100` | Dashboard dev server port |
Start with Custom Ollama URL
OLLAMA_URL=http://192.168.1.100:11434 npm run dev -- -p 3100
Whisper.cpp
Whisper.cpp uses CLI flags rather than environment variables:
| Flag | Purpose | Example |
|---|---|---|
| `--model` | Path to GGML model file | `--model ~/whisper-models/ggml-large-v3-turbo.bin` |
| `--language` | Input language | `--language en` |
| `--file` | Audio file path | `--file recording.wav` |
| `--output-json` | Output in JSON format | `--output-json` |
| `--output-srt` | Output as SRT subtitles | `--output-srt` |
| `--output-vtt` | Output as VTT subtitles | `--output-vtt` |
| `--translate` | Translate to English | `--translate` |
| `--threads` | Number of CPU threads | `--threads 8` |
| `--processors` | Number of processors | `--processors 1` |
| `--print-colors` | Colorize output by confidence | `--print-colors` |
| `--no-timestamps` | Omit timestamps | `--no-timestamps` |
| `--port` | Server port (whisper-server) | `--port 8080` |
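The flags above are usually combined in a single call. The wrapper below is a minimal sketch that uses only flags from the table; `transcribe`, `WHISPER_BIN`, and `WHISPER_MODEL` are hypothetical names, and the model path is the example path from the table, which may not match your setup.

```shell
# Transcribe one audio file to SRT subtitles.
# Arguments: <audio-file> [language, default en] [threads, default 8]
# WHISPER_BIN and WHISPER_MODEL are illustrative overrides, not whisper.cpp settings.
transcribe() {
  "${WHISPER_BIN:-whisper-cli}" \
    --model "${WHISPER_MODEL:-$HOME/whisper-models/ggml-large-v3-turbo.bin}" \
    --language "${2:-en}" \
    --threads "${3:-8}" \
    --output-srt \
    --file "$1"
}
```

For example, `transcribe recording.wav en 8`. If your build only accepts 16-bit WAV input, convert first with `ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le recording.wav`.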
Proxy / Network (Corporate)
| Variable | Value on This Machine | Purpose |
|---|---|---|
| `HTTP_PROXY` | `http://cso.proxy.att.com:8080/` | Corporate HTTP proxy |
| `HTTPS_PROXY` | `http://cso.proxy.att.com:8080/` | Corporate HTTPS proxy |
| `NODE_TLS_REJECT_UNAUTHORIZED` | `0` | Disables TLS certificate verification in Node.js so requests pass through Forcepoint SSL interception (insecure) |
| `NO_PROXY` | (not set by default) | Add `ollama.com,registry.ollama.ai` if pulls fail |
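When pulls fail behind the proxy, the `NO_PROXY` hosts from the table can be appended without clobbering entries that are already set. The helper below is a sketch; `add_no_proxy` is a hypothetical name, and whether bypassing the proxy for these hosts actually works depends on your network allowing direct access.

```shell
# Append each host to NO_PROXY unless it is already listed.
add_no_proxy() {
  for host in "$@"; do
    case ",${NO_PROXY}," in
      *",${host},"*) ;;                              # already present, skip
      *) NO_PROXY="${NO_PROXY:+$NO_PROXY,}$host" ;;  # add, comma-separated
    esac
  done
  export NO_PROXY
}

add_no_proxy ollama.com registry.ollama.ai
echo "NO_PROXY=$NO_PROXY"
```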
All Paths
| Path | Content |
|---|---|
| `~/.ollama/models/` | Downloaded Ollama models |
| `~/whisper-models/` | Whisper GGML model files |
| `/opt/homebrew/bin/ollama` | Ollama binary |
| `/opt/homebrew/bin/whisper-cli` | Whisper CLI binary |
| `/opt/homebrew/bin/ffmpeg` | FFmpeg binary |
| `__LOCAL_LLMs/dashboard/` | Mission Control Next.js app |
| `__LOCAL_LLMs/docs/` | This documentation |
| `services/extraction-service/evals/` | Promptfoo eval configs |
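A quick way to confirm a machine matches this table is to test each path for existence. The sketch below checks only the absolute and home-relative entries, since the repository-relative ones depend on the working directory; `check_paths` is a hypothetical helper.

```shell
# Report whether each given path exists; return non-zero if any is missing.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ -e "$p" ]; then
      echo "ok       $p"
    else
      echo "missing  $p"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

check_paths \
  "$HOME/.ollama/models" \
  "$HOME/whisper-models" \
  /opt/homebrew/bin/ollama \
  /opt/homebrew/bin/whisper-cli \
  /opt/homebrew/bin/ffmpeg \
  || echo "Some paths are missing; see the list above."
```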