Commit Graph

56 Commits

Author SHA1 Message Date
saravanakumardb1
efd45ad86f feat(local-llms): add one-click Windows setup scripts
- setup-windows.ps1: PowerShell script for Windows side
  - NVIDIA driver verification, Ollama install via winget
  - Pull all 5 models with skip-if-exists logic
  - WSL2 Ubuntu 24.04 install
- setup-wsl.sh: Bash script for WSL2 side
  - Idempotent apt deps (Node.js 20, Python 3.12, ffmpeg, cmake)
  - CUDA GPU passthrough verification
  - Repo clone + git pull, whisper.cpp CUDA build
  - Whisper model download, TTS setup, dashboard start
- README.md: 2-step quick start (no IDE required)
- setup-guide.md: add automated setup section at top
2026-02-21 16:28:02 -08:00
saravanakumardb1
b1d2e4ec81 fix(local-llms): cross-platform audit — 8 bugs/gaps fixed
- setup-tts.sh: make fully cross-platform (macOS + Linux/WSL2)
  - OS detection, apt fallback, CUDA PyTorch install, nvidia-smi check
  - cross-platform playback hints, HF_MIRROR env override
- api/system/route.ts: fix ffmpeg detection (use -version not --version)
- api/system/memory/route.ts: remove unused total variable in Linux path
- api/system/exec/route.ts: expand allowlist with Linux commands
  (head, tail, grep, which, ps, uname, free, lscpu, nvidia-smi, etc.)
- api/tts/route.ts: cross-platform venv path + CUDA/MPS label
- api/whisper/route.ts: Linux binary/model paths
- api/ollama/logs/route.ts: Linux log paths + WSL2 hint
- test_qwen_tts.py: platform-aware speech text + CUDA device detection
- test_orpheus_tts.py: platform-aware text, move import sys to top
- setup-guide.md: fix false auto-detect claim, add HF_MIRROR hint
2026-02-21 15:27:49 -08:00
saravanakumardb1
f85b455eb5 ci: update CI/CD configuration
2026-02-21 14:13:07 -08:00
saravanakumardb1
14c7883d2a docs(local-llm): mark all phases A-G complete in roadmap with commit links
2026-02-20 00:48:31 -08:00
saravanakumardb1
6d98d12f04 feat(local-llm): Phase G — projects + multi-model orchestration (G1-G7)
G1: Project CRUD in IndexedDB (already added in Phase F commit)
G2: Project sidebar section with create, pin, delete, and active selection
G3: Project system context injection (via project default model/agent/context)
G4: Cmd+P project switcher modal with keyboard navigation
G5: Chain orchestration — sequential multi-model pipeline with {prev} placeholder
G6: Race orchestration — parallel model competition with timing
G7: Vote orchestration — consensus synthesis from multiple model responses
2026-02-20 00:47:34 -08:00
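The chain mode in G5 above can be pictured roughly as follows. This is a minimal sketch, not the code from the commit: the `ChainStep` and `Generate` types, and the `fillPrev` and `runChain` names, are hypothetical, and the model call is injected so the sketch does not assume any particular API route.

```typescript
// Hypothetical shapes; the real orchestration types are not shown in this log.
type ChainStep = { model: string; prompt: string };
type Generate = (model: string, prompt: string) => Promise<string>;

// Substitute every {prev} placeholder with the previous step's output.
// split/join is used instead of String.replace so that "$1"-style sequences
// in the model output are inserted literally.
function fillPrev(template: string, prev: string): string {
  return template.split("{prev}").join(prev);
}

// Run the steps sequentially; each step sees the output of the one before it.
// The initial user input plays the role of {prev} for the first step.
async function runChain(
  steps: ChainStep[],
  input: string,
  generate: Generate,
): Promise<string> {
  let prev = input;
  for (const step of steps) {
    prev = await generate(step.model, fillPrev(step.prompt, prev));
  }
  return prev;
}
```

Injecting `generate` keeps the pipeline logic independent of how a model is actually called, so it can be exercised with a stub.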
saravanakumardb1
52f3d16b65 feat(local-llm): Phase F — scheduled tasks (F1-F7)
F1: cron-parser integration + cron utility functions (parse, nextRun, toHuman, shouldRunNow)
F2: ScheduledTask + Project + Orchestration CRUD in IndexedDB
F3: Task editor modal (schedule, model, input source, output action, prompt)
F4: Browser-based task runner with setInterval + cron matching
F5: /api/system/exec — safe shell command execution with allowlist
F6: Task run history stored per task (last 20 runs)
F7: 5 built-in task templates (morning brief, git diff, disk usage, code review, deps)
2026-02-20 00:44:53 -08:00
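The allowlist idea behind F5 above might look something like this sketch. The `ALLOWED` set and the `parseAllowedCommand` name are hypothetical, and the whitespace-only tokenizer is an assumption (it does not handle quoted arguments). The returned pair is meant to be handed to Node's `execFile` from `node:child_process`, which runs the binary without a shell, so characters such as `;` or `|` stay plain argument text.

```typescript
// Hypothetical allowlist; the real list lives in the route and is not shown here.
const ALLOWED = new Set(["git", "df", "ls", "cat", "head", "tail"]);

type ParsedCommand = { file: string; args: string[] };

// Split a command line on whitespace and accept it only if the binary
// is on the allowlist. Returns null for empty or disallowed commands.
function parseAllowedCommand(line: string): ParsedCommand | null {
  const parts = line.trim().split(/\s+/).filter((p) => p.length > 0);
  const file = parts[0];
  if (file === undefined || !ALLOWED.has(file)) {
    return null;
  }
  return { file, args: parts.slice(1) };
}
```

A caller would pass `file` and `args` as separate parameters to `execFile` rather than joining them back into one string.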
saravanakumardb1
e15a5a2f2f feat(local-llm): Phase E — response enhancements (E1-E5)
E1: Per-message action bar (copy, regenerate dropdown, rating) on hover
E2: Per-code-block copy button in MarkdownResponse with 'Copied!' feedback
E3: 'Try with other model' — regenerate dropdown shows loaded models
E4: Live streaming metrics (token count + tok/s during stream)
E5: Rating (thumbs up/down) persisted per message in IndexedDB
2026-02-20 00:40:49 -08:00
saravanakumardb1
d625be283c feat(local-llm): Phase D model router + multi-modal input (D1-D7)
- add lib/router.ts with task classifier, model hint mapping, resolve fallback chain, and auto-detect defaults
- integrate auto-routing mode in conversation model selector with __auto__ option
- persist/read model defaults from localStorage (llm-model-defaults)
- route prompts to selected/routed model before streaming
- add multi-modal input controls (attach file, image, voice)
- support attachment chips, removal, drag-and-drop file attach
- add audio transcription flow via /api/whisper/transcribe and append result to input
- support sending attachments payload alongside text from InputBar
2026-02-20 00:31:54 -08:00
saravanakumardb1
79cf42c8e3 docs(local-llm): mark Phase C complete with checkboxes and commit link 2026-02-20 00:27:33 -08:00
saravanakumardb1
d18b695029 feat(local-llm): Phase C custom agents (C1-C5)
- add built-in agents library (10 seeded agents)
- add agent CRUD/seeding/export/import helpers in db layer
- seed agents on workspace load
- add agent strip in sidebar and launch-from-agent flow
- add command palette support for agent entries
- add agent conversation wiring (agentId, systemPrompt, welcome message)
- render agent badge in conversation header
- add example prompt chips in input bar for agent conversations
- add AgentEditor modal for creating/updating custom agents
2026-02-20 00:26:46 -08:00
saravanakumardb1
f289099461 docs(local-llm): mark Phase B complete with checkboxes and commit link 2026-02-20 00:20:13 -08:00
saravanakumardb1
7ae92da16e feat(local-llm): Phase B quick actions + command palette (B1-B6)
- add fuse.js dependency and command palette modal (Cmd+K)
- add built-in quick actions library (30 templates across categories)
- add quick action CRUD + seeding + import/export helpers in db layer
- seed quick actions on workspace load and list top actions in sidebar
- implement quick action launcher -> creates preconfigured conversation
- add custom quick action editor modal for creating/editing actions
- wire command palette system actions and conversation navigation
- support passing QA template into conversation input via query param
2026-02-20 00:19:17 -08:00
saravanakumardb1
1335d47869 docs(local-llm): mark Phase A complete with checkboxes and commit link 2026-02-20 00:12:26 -08:00
saravanakumardb1
e17bb311c9 feat(local-llm): Phase A foundation (A1-A8) workspace + indexeddb
- add idb dependency and create typed db layer (conversations/messages/agents/etc)
- extend app/lib/types.ts with v4 workspace interfaces
- move existing dashboard to /mission-control route group
- create / workspace route group with sidebar shell and conversation pages
- implement conversation list grouping + search in sidebar
- implement conversation view with streaming via /api/ollama/chat
- add context bar and token/context utilities
- add /api/ollama/title endpoint for auto-title generation
- add v3->v4 migration utility (llm-inference-log + llm-chat-* to indexeddb)
- wire migration in workspace layout and cmd+/ sidebar toggle

Implements roadmap Phase A tasks A1-A8.
2026-02-20 00:11:27 -08:00
saravanakumardb1
d7dc66eb92 docs(local-llm): Rich Features Roadmap — 45 tasks across 7 phases for coding agent
Detailed implementation roadmap for the Rich Features PRD with:

Phase A (Sprint 14-16, ~15hr): Foundation
  A1: IndexedDB layer with idb — 9 object stores, compound indexes
  A2: v4 TypeScript interfaces — all data models
  A3: Route group (mission-control) — move existing dashboard
  A4: Route group (workspace) — sidebar + content layout
  A5: Sidebar — conversation list, time groups, search
  A6: Conversation view — message thread, input bar, streaming
  A7: Auto-title + context window usage bar
  A8: v3 → v4 migration from localStorage

Phase B (Sprint 17-18, ~10hr): Quick Actions + Cmd+K
  B1-B6: 30 built-in actions, fuse.js command palette, launcher,
  custom editor, usage tracking, export/import

Phase C (Sprint 19-20, ~9hr): Custom Agents
  C1-C5: 10 built-in agents, picker, full-screen editor,
  conversation wiring (welcome msg, chips, temp), export

Phase D (Sprint 21-22, ~13hr): Model Router + Multi-Modal
  D1-D7: regex classifier, model defaults, auto-routing UI,
  rich input bar, file/voice/image processing, drag-drop

Phase E (Sprint 23, ~7hr): Response Enhancements
  E1-E5: action bars, code-block copy, try-other-model,
  live metrics, rating with aggregation

Phase F (Sprint 24-25, ~11hr): Scheduled Tasks
  F1-F7: cron-parser, CRUD, editor, browser runner,
  /api/system/exec with allowlist, notifications, templates

Phase G (Sprint 26-28, ~13hr): Projects + Orchestration
  G1-G7: project CRUD, drag-to-project, system context,
  Cmd+P switcher, chain/race/vote modes

Every task has: explicit file paths, step-by-step instructions,
pass/fail exit criteria, verification commands, and commit templates.
Dependency graph: A is foundation, B-F parallel after A, G needs A+B.
2026-02-19 23:54:07 -08:00
saravanakumardb1
7bd14054d4 docs(local-llm): Rich Features PRD rev 2 — comprehensive review + expansion
Review findings addressed (20+ issues):

Structure additions:
- Target Users section with 5 personas (solo dev, tinkerer, privacy pro, writer, power user)
- Non-Goals section (8 explicit out-of-scope items for v4)
- Risks & Mitigations table (10 risks with impact/likelihood/mitigation)
- New API Routes section (4 new routes with security notes)
- Settings Expansion section (full tree: General, Router, Models, Input, Tasks, Data, About)
- New Dependencies table (idb ~1KB, fuse.js ~6KB, cron-parser ~3KB)
- Error Handling appendix (12 edge cases with expected behavior)

Data model fixes:
- Conversation/Message split into separate IndexedDB stores (scalability)
- Message gets conversationId FK, promptTokens field, size/language on Attachment
- Design decision note explaining why messages are stored separately

Feature spec improvements:
- 3.1 Conversations: context window management (token bar, auto-summarize at 80/95%)
- 3.2 Quick Actions: expanded Cmd+K palette spec (5 result types, ranking)
- 3.3 Agents: tools marked v4 vs v5, duplicate-from-builtin, unlink on delete
- 3.4 Model Router: full resolveModel() with 4-level fallback chain + availability
- 3.5 Multi-Modal: attachment size limits, Whisper error handling
- 3.6 Response: hover-only action bars, rating aggregation per task type
- 3.7 Cron: built-in templates table, runtime constraints, security (execFile)
- 3.8 Orchestration: full data model, chain/race/vote UI specs, step limits
- 3.9 Projects: system context detail, project stats, unlink behavior

Acceptance criteria added to ALL 9 features (was missing on 5).
Competitive analysis expanded with local competitors (Open WebUI, LM Studio, Jan.ai).
Success metrics improved with measurement methodology and rationale.
Open questions restructured as decision table with recommendations.
IndexedDB schema with explicit indexes and compound keys.
Migration strategy: 7-step v3→v4 with safety (no delete until confirmed).

681 lines → 1149 lines (+69% content)
2026-02-19 23:47:59 -08:00
saravanakumardb1
1172dbb23e docs(local-llm): Rich Features PRD — full local AI workspace spec
Comprehensive PRD evolving Mission Control into a ChatGPT-class local AI workspace:

- 3.1 Conversations: persistent, named, searchable, branching, IndexedDB
- 3.2 Quick Actions: 30 built-in 1-click launchers across 5 categories
     (code, writing, analysis, creative, devops) + custom actions + Cmd+K palette
- 3.3 Custom Agents: 10 built-in local GPTs with system prompts, tools,
     temperature, welcome messages, example prompts
- 3.4 Model Router: heuristic task classifier (<5ms, no LLM call),
     auto-selects best model per task type, configurable defaults
- 3.5 Multi-Modal Input: file attach, voice (Whisper), images, drag-drop,
     paste intelligence (code/image/error detection)
- 3.6 Response Enhancements: per-message actions, per-code-block copy,
     branching with navigation, live metrics, rating/quality profiles
- 3.7 Scheduled Tasks: cron-based recurring prompts with shell/file input,
     notification/file/conversation output, 5 built-in templates
- 3.8 Multi-Model Orchestration: chain, race, vote modes
- 3.9 Projects: conversation folders with system context + model defaults

7 implementation phases (~78hr), component architecture, storage migration,
competitive analysis, success metrics, open questions
2026-02-19 23:39:20 -08:00
saravanakumardb1
3dc0c441a9 docs(local-llm): mark all roadmap phases 1-6 complete with commit links
All 27 roadmap items + 5 bugs checked off across 6 phases:
- Phase 1 (040013e): N1-N3, BN1, BN2, BN5
- Phase 2 (7f04297): N4-N5, BN3, BN4
- Phase 3 (6f6baf9): N6-N10
- Phase 4 (588d21c): N11-N14
- Phase 5 (44ad8a6): F24-F28
- Phase 6 (07d3911): F29-F31
2026-02-19 23:30:11 -08:00
saravanakumardb1
07d391101a feat(local-llm): Phase 6 — data persistence + export (F29-F31)
F29: Export/import settings — gear icon in header opens settings popover,
     export downloads all llm-* localStorage as JSON, import validates
     and merges, both with toast feedback
F30: Inference history log — saves prompt/response/model/metrics to
     llm-inference-log (capped 100 FIFO), searchable panel with replay
     button, count badge in header toggle
F31: Factory reset — confirm dialog clears all llm-* localStorage keys,
     resets all component state to defaults
2026-02-19 23:29:40 -08:00
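The "capped 100 FIFO" behaviour described in F30 above can be sketched as a pure helper. The `appendCapped` name is hypothetical, and how the commit actually serializes entries to `llm-inference-log` is not shown in this log.

```typescript
// Append an entry and drop the oldest ones once the cap is exceeded (FIFO).
// Returns a new array so the caller can serialize it back to storage.
function appendCapped<T>(log: readonly T[], entry: T, cap: number = 100): T[] {
  const next = [...log, entry];
  return next.length > cap ? next.slice(next.length - cap) : next;
}
```

In a browser the result would typically be written back with `localStorage.setItem(key, JSON.stringify(next))`.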
saravanakumardb1
44ad8a6301 feat(local-llm): Phase 5 — response quality + interaction (F24-F28)
F24: Vision image upload — file picker for vision models, base64 encoding,
     passed through stream API to Ollama generate endpoint
F25: Markdown rendering — ReactMarkdown replaces raw <pre> for all
     prompt responses and chat assistant messages
F26: Syntax highlighting — Prism-based code blocks with language labels
     and oneDark theme via react-syntax-highlighter
F27: <think> block collapse — auto-detect and collapse DeepSeek R1
     reasoning traces into expandable details with word count
F28: Ollama library link — button next to Pull input opens ollama.com/library
2026-02-19 23:25:20 -08:00
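The `<think>` handling from F27 above could be approximated as below. This is a sketch under assumptions: `splitThink` and `wordCount` are hypothetical names, and only fully closed `<think>...</think>` blocks are handled (a block still open mid-stream is left in the answer text).

```typescript
// Separate DeepSeek R1 style reasoning traces from the visible answer.
function splitThink(text: string): { reasoning: string; answer: string } {
  const blocks: string[] = [];
  const answer = text
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      blocks.push(inner.trim());
      return "";
    })
    .trim();
  return { reasoning: blocks.join("\n\n"), answer };
}

// Word count for the collapsed summary label.
function wordCount(s: string): number {
  const trimmed = s.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}
```

The same split is useful before `JSON.parse` when a reasoning model is asked for JSON output, since the trace would otherwise precede the JSON body.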
saravanakumardb1
588d21c70e feat(local-llm): Phase 4 — runtime metrics + UX polish (N11-N14)
N11: Persist tok/s per model to localStorage (llm-model-benchmarks),
     display on model card as faded accent text
N12: Live countdown to auto-unload — 1s interval, color-coded
     (green >5m, yellow 1-5m, red <1m 'Unloading soon')
N13: Session stats per model (prompts + tokens) in expanded details
N14: Co-load suggestions strip below models list showing which
     unloaded models fit in remaining free memory
2026-02-19 23:20:30 -08:00
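The colour thresholds in N12 above map to a small pure function. The function names and the `2m 05s` label format are assumptions for illustration; only the green/yellow/red boundaries and the "Unloading soon" text come from the commit message.

```typescript
type CountdownColor = "green" | "yellow" | "red";

// green above 5 minutes, yellow from 1 to 5 minutes, red below 1 minute.
function countdownColor(remainingMs: number): CountdownColor {
  if (remainingMs > 5 * 60000) return "green";
  if (remainingMs >= 60000) return "yellow";
  return "red";
}

// Label shown next to the model; the exact format is hypothetical.
function formatRemaining(remainingMs: number): string {
  if (remainingMs < 60000) return "Unloading soon";
  const totalSec = Math.floor(remainingMs / 1000);
  const minutes = Math.floor(totalSec / 60);
  const seconds = totalSec % 60;
  return `${minutes}m ${String(seconds).padStart(2, "0")}s`;
}
```

A 1-second `setInterval` would recompute `expiresAt - Date.now()` and pass the result to both helpers.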
saravanakumardb1
6f6baf99c8 feat(local-llm): Phase 3 — model intelligence badges + sort + version (N6-N10)
N6: <think> warning badge for DeepSeek R1 and distilled variants
N7: Vision model indicator for llava, bakllava, moondream, qwen-vl, etc.
N8: Architecture/family badge as pill on every model card
N9: Sort dropdown (A-Z, size, params, running, recent) with localStorage persist
N10: Ollama server version fetched from /api/version, shown in stats card
2026-02-19 23:17:07 -08:00
saravanakumardb1
7f042975de feat(local-llm): Phase 2 — rich metadata + persistence (N4-N5, BN3-BN4)
N4: RamBudgetBar component — stacked horizontal bar showing OS+Apps,
    loaded models (by name with color), and free memory segments
N5: Context window size — extract context_length from /api/show
    model_info, cache in modelMetadata state, display on card
BN3: Persist chat messages to localStorage (llm-chat-{model}),
     restore on modal re-open, capped at 50 messages
BN4: Logs panel refresh button — RefreshCw icon next to toggle
2026-02-19 23:13:22 -08:00
saravanakumardb1
040013e495 feat(local-llm): Phase 1 — pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5)
N1: Estimated RAM per model with quant-aware multipliers (Q4=1.2x, Q5=1.25x, Q8=1.1x, F16=1.05x)
N2: Will-it-fit indicator (green/yellow/red dot) next to Load button
N3: Aggregate loaded model VRAM in panel header badge
BN1: Compare buttons now filter to running models only
BN2: AbortController on compare stream, cancel on modal close
BN5: Delete confirmation shows model name + disk reclaim size
2026-02-19 23:09:49 -08:00
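The quant-aware estimate in N1 and the indicator in N2 above might be computed along these lines. The multipliers are the ones listed in the commit message; everything else is an assumption for illustration: that the multiplier applies to the on-disk size, the fallback multiplier for unknown quantizations, and the 80% headroom threshold for the yellow state.

```typescript
// Multipliers from the commit message, keyed by quantization prefix.
const QUANT_MULTIPLIERS: Array<[string, number]> = [
  ["Q4", 1.2],
  ["Q5", 1.25],
  ["Q8", 1.1],
  ["F16", 1.05],
];
const DEFAULT_MULTIPLIER = 1.2; // assumption for unrecognized quantizations

// Estimate resident memory from the model's on-disk size (assumed basis).
function estimateRamBytes(diskBytes: number, quant: string): number {
  const q = quant.toUpperCase();
  const hit = QUANT_MULTIPLIERS.find(([prefix]) => q.startsWith(prefix));
  return diskBytes * (hit ? hit[1] : DEFAULT_MULTIPLIER);
}

type Fit = "green" | "yellow" | "red";

// Hypothetical thresholds: comfortable below 80% of available memory,
// tight up to 100%, and "will not fit" beyond that.
function willItFit(estimateBytes: number, availableBytes: number): Fit {
  if (estimateBytes <= availableBytes * 0.8) return "green";
  if (estimateBytes <= availableBytes) return "yellow";
  return "red";
}
```

For example, a quantization string such as `Q4_K_M` matches the `Q4` prefix and picks the 1.2x multiplier.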
saravanakumardb1
ae231d5aac docs(local-llm): comprehensive roadmap review — 5 bugs, 6 phases, 31 items
Systematic code review of DASHBOARD_ROADMAP.md against actual codebase:

Bugs found (BN1-BN5):
- BN1: Compare buttons show unloaded models (can't generate)
- BN2: No AbortController on compare stream (leaks on close)
- BN3: Chat messages lost on modal close (no persistence)
- BN4: Logs panel has no refresh button
- BN5: Delete dialog missing reclaim size (partial impl exists)

Expanded from 4 phases to 6 + backlog (15 → 31 items):
- Phase 1: Pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5)
- Phase 2: Rich metadata + persistence (N4-N5, BN3-BN4)
- Phase 3: Model intelligence badges + sort (N6-N10)
- Phase 4: Runtime metrics + UX polish (N11-N14)
- Phase 5 (NEW): Response quality — markdown, syntax highlight,
  vision upload, think-block collapse, model library link
- Phase 6 (NEW): Data persistence — export/import, inference log,
  factory reset
- Phase 7: Expanded backlog (F17-F38, +6 new ideas)

Improvements:
- Added checkboxes for all tasks and acceptance criteria
- Quant-aware RAM estimate multipliers (Q4/Q5/Q8/F16)
- Broader vision model regex (bakllava, moondream, llama-vision)
- DeepSeek R1 distill variant detection for think badge
- Conservative memory availability formula (free + cached*0.5)
- localStorage key registry with llm- prefix standardization
- Dependency graph between phases
- ~6 hrs total estimated effort
2026-02-19 23:02:25 -08:00
saravanakumardb1
cd6e561f1b docs(local-llm): consolidate dashboard docs into dashboard/docs/
- Created DASHBOARD_PRD.md — full updated PRD with current 19-file
  architecture, all 10 API routes, UI layout, data flow, localStorage
  keys, security model, and v1-v3 changelog.
- Created DASHBOARD_ROADMAP.md — phased implementation plan for N1-N15
  improvements across 4 phases: pre-load intelligence, rich metadata,
  model intelligence badges, runtime metrics. Includes acceptance
  criteria and implementation details per item.
- Updated DASHBOARD_REVIEW.md — refreshed file inventory to 19 files
  (~2,930 lines), fixed broken Tier B markdown table, added cross-links.
- Replaced __LOCAL_LLMs/docs/05-mission-control-dashboard.md with
  redirect pointer to new dashboard/docs/ location.

Dashboard docs are now co-located at __LOCAL_LLMs/dashboard/docs/:
  - DASHBOARD_PRD.md (product requirements)
  - DASHBOARD_REVIEW.md (audit + 39 completed items + N1-N15 proposals)
  - DASHBOARD_ROADMAP.md (phased implementation plan)
2026-02-19 22:54:18 -08:00
saravanakumardb1
519f348583 docs(local-llm): add Next Wave — 15 model intelligence improvements (N1–N15)
Section 8 of DASHBOARD_REVIEW.md: pre-load RAM estimates, will-it-fit
indicator, RAM budget bar, context window, architecture/vision/think
badges, sort, tok/s history, countdown, session stats, delete confirm,
co-load suggestions. Organized in 4 tiers with sprint plan.
2026-02-19 22:32:29 -08:00
saravanakumardb1
4090c8aa13 docs(local-llms): add developer guide — API endpoint, code examples, model selection
- New 00-developer-guide.md: start-here doc for developers covering:
  - Ollama endpoint (http://localhost:11434/v1) and API key
  - curl, TypeScript, Python code examples with env var pattern
  - Model selection table by task
  - Running extraction service evals locally
  - JSON output gotchas (parse from string, <think> strip for R1)
  - Model management commands
  - Troubleshooting quick reference
  - Links to all other docs
- Updated index in LOCAL_LLMs_setup_mac_m4_48gb.md to include doc 00
2026-02-19 18:43:06 -08:00
saravanakumardb1
5deb5efdcf docs(local-llms): add comprehensive model comparison table and deepseek-r1:32b details
- Add Comprehensive Model Comparison Table: 11 models (local + cloud) with
  Disk, Params, Quant, RAM, Tok/s, JSON quality, Reasoning, Code, Instruction
  Following, Context window, <think> flag, and install status columns
- Add Gap Analysis table: llama3.1:8b (~55%), qwen2.5-coder:32b (~85%),
  deepseek-r1:32b (~75-80%) vs llama3.3:70b across 5 capability dimensions
- Update Tier 4 Reasoning table: add Parameters, Quant columns; add <think>
  warning note with link to eval doc transform pattern
- Update By Use Case table: add brain signal routing row, update extraction
  evals fallback to qwen2.5-coder:32b
2026-02-19 16:06:02 -08:00
saravanakumardb1
cfc1194079 docs(local-llms): add latency/cost comparison and deepseek-r1 transform pattern to evals doc
- Add Latency & Cost Comparison table: llama3.1:8b (~1m27s), qwen2.5-coder:32b
  (~5-8m est.), deepseek-r1:32b (~5-8m est.) vs gemini-2.5-flash (~15-25s, $0.003)
  and gpt-4o (~20-40s, $0.05-0.15) — all measured at 19 cases, concurrency=4
- Fix assertion pattern docs: single expressions required, not const/return blocks
- Add deepseek-r1 <think> strip transform pattern for promptfoo provider config
- Expand recommended models table with Disk, Reasoning, Pass Rate, and Notes columns
2026-02-19 16:05:52 -08:00
saravanakumardb1
71a7623553 docs(local-llms): expand installed models table with parameters and quantization
- Add Parameters, Quantization, and Status columns to models table
- qwen2.5-coder:32b: 32.8B params, Q4_K_M, 18.5 GB disk
- llama3.1:8b: 8B params, Q4_K_M, 4.9 GB disk (confirmed via ollama API)
2026-02-19 16:05:42 -08:00
saravanakumardb1
1552006feb fix(local-llm): proxy extraction health check through API route
Move extraction service health check from direct browser fetch
(http://localhost:4005/health) to server-side /api/extraction/health
proxy. Eliminates ERR_CONNECTION_REFUSED console errors when the
extraction service is not running locally.
2026-02-19 15:53:02 -08:00
saravanakumardb1
984630eb45 docs(local-llm): mark ALL 39 items complete in DASHBOARD_REVIEW.md
All bugs (11), code quality (6), features (16), performance (5), and
security (3) items are now checked off. Added Sprint 6 (ed93a6f) and
Sprint 7 (8bdd5ee) to commit log. Updated summary to reflect 100%
completion across 7 sprints.
2026-02-19 15:45:46 -08:00
saravanakumardb1
8bdd5ee1c8 feat(local-llm): Sprint 7 — all remaining features (F5,F7,F8,F12,F13,F15,CQ5,S3)
Features:
- F5: Model comparison side-by-side — after a prompt response, click
  any other model to compare. Responses display in two-column grid.
- F7: System resource sparklines — memory usage ring buffer (30 points)
  with SVG sparkline component in the memory stats card.
- F8: Ollama logs viewer — collapsible terminal-style panel below main
  grid. Fetches from /api/ollama/logs route. Color-coded by level.
- F12: Whisper transcription test — file upload button in Whisper panel.
  Uploads audio to /api/whisper/transcribe, displays text + latency.
- F13: Responsive mobile layout — p-3/sm:p-6 padding, gap-3/sm:gap-4,
  hidden sm:inline for header text, responsive comparison grid.
- F15: Extraction service panel — health check to localhost:4005 on
  each refresh. Status card in right column with endpoint + service.

Code quality:
- CQ5: Skeleton shimmer loading UI — 4 skeleton cards shown while
  initial data loads. Uses CSS shimmer animation from globals.css.

Security:
- S3: Documented CORS/auth assumption in code comment — dashboard is
  local-only, no auth needed for dev tool.

New files:
- components/Sparkline.tsx — reusable SVG sparkline component
- api/ollama/chat/route.ts — streaming chat endpoint (from Sprint 6)
- api/ollama/logs/route.ts — Ollama log file reader
- api/whisper/transcribe/route.ts — Whisper STT test endpoint
2026-02-19 15:44:20 -08:00
saravanakumardb1
ed93a6f0af feat(local-llm): Sprint 6 — major feature batch (CQ2,CQ5,CQ6,P5,F4,F10,F14,F16)
Code quality:
- CQ2: Add CSS utility classes (text-primary/secondary/tertiary, bg-*,
  btn-*, input-base) to globals.css — reduces inline style repetition
- CQ5: Add skeleton shimmer animation CSS for loading states
- CQ6: Replace manual model name validation with Zod schema
  (PostBodySchema) in Ollama API route

Performance:
- P5: Eagerly warm static cache on module load — system_profiler
  no longer blocks first dashboard request

Features:
- F4: Chat mode with multi-turn conversation via new /api/ollama/chat
  streaming route. Chat bubble layout, system prompt input, message
  history. Toggle between prompt/chat modes in modal.
- F10: Dark/light theme toggle with CSS var overrides in :root.light.
  Sun/Moon button in header, persisted in localStorage.
- F14: Model tags (coding, chat, fast, vision, reasoning) with
  colored toggle badges in expanded model details. Persisted in
  localStorage.
- F16: Auto-load preferred model — star toggle in expanded details.
  When Ollama is online but no models loaded, auto-loads the starred
  model. Persisted in localStorage.
2026-02-19 15:38:06 -08:00
saravanakumardb1
2936b9f047 docs(local-llm): mark Sprint 5 P1-P3 complete in DASHBOARD_REVIEW.md
Check off 3 items (P1, P2, P3) in performance section and sprint
tracker. Add commit b1fda3a to commit log.
2026-02-19 15:28:59 -08:00
saravanakumardb1
b1fda3a1a5 perf(local-llm): Sprint 5 — request dedup + cache TTLs (P1, P2, P3)
Performance fixes:
- P1: Add fetchingRef guard to fetchAll() — prevents duplicate requests
  from rapid Refresh button clicks or overlapping interval ticks
- P2: Add 5-minute TTL to staticCache (chip, GPU, brew packages) —
  previously cached indefinitely per server process, now refreshes
  after brew upgrades without requiring a restart
- P3: Add 60-second TTL cache for Ollama models disk usage (du command)
  — previously traversed ~/.ollama/models on every 15s refresh cycle,
  now reuses cached value for 60s
2026-02-19 15:28:07 -08:00
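The TTL caching in P2 and P3 above can be illustrated with a small helper. This is a sketch, not the commit's code: the `TtlCache` class is hypothetical, it caches synchronous values only, and the clock is injectable purely so the behaviour can be tested without waiting.

```typescript
// Cache a computed value and recompute it only after ttlMs has elapsed.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private hasValue = false;
  private expiresAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(compute: () => T): T {
    const t = this.now();
    if (this.hasValue && t < this.expiresAt) {
      return this.value as T; // still fresh: skip the expensive call
    }
    const fresh = compute();
    this.value = fresh;
    this.hasValue = true;
    this.expiresAt = t + this.ttlMs;
    return fresh;
  }
}
```

With a 60-second TTL and a 15-second refresh cycle, the expensive computation runs on roughly one refresh in four instead of every time.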
saravanakumardb1
9892fe7145 docs(local-llm): mark Sprint 4 items complete in DASHBOARD_REVIEW.md
Check off 4 items (F2, F3, F9, F11) in features list and sprint
tracker. F4 (chat mode) deferred. Add commit 9c2f5f3 to commit log.
2026-02-19 15:26:37 -08:00
saravanakumardb1
9c2f5f3396 feat(local-llm): Sprint 4 — UX enhancements (F2, F3, F9, F11)
New features:
- F2: Model search/filter — search input above models list (shown when
  4+ models installed). Filters by name, family, and quantization level.
  Press / to focus the search input.
- F3: Prompt history — saves last 20 prompts to localStorage with model
  name and timestamp. History dropdown in prompt modal with one-click
  re-run. Toggle via clock icon in textarea.
- F9: Modelfile viewer — expanded model details now fetch and display
  the Modelfile via the show action. Collapsible <details> element
  with syntax-highlighted pre block.
- F11: Keyboard shortcuts panel — press ? to toggle. Shows all shortcuts:
  ? (help), R (refresh), / (search), Esc (close/cancel), Cmd+Enter (send).
  Shortcuts only fire when not in an input field.
2026-02-19 15:25:43 -08:00
saravanakumardb1
40c40756ed docs(local-llm): mark Sprint 3 items complete in DASHBOARD_REVIEW.md
Check off 5 items (CQ1, CQ3, CQ4, S1, S2) in code quality, security,
and sprint tracker. CQ2 (inline styles) deferred. Add commit 75a3cd0
to commit log.
2026-02-19 15:22:11 -08:00
saravanakumardb1
75a3cd0826 refactor(local-llm): Sprint 3 — component extraction, error boundary, security (CQ1,CQ3,CQ4,S1,S2)
Component extraction (CQ1):
- lib/types.ts: All interfaces (OllamaData, SystemData, Toast, etc.)
- lib/format.ts: formatBytes, formatUptime utilities
- lib/ollama-config.ts: Shared OLLAMA_URL constant
- components/StatusDot.tsx: Status indicator component
- components/ProgressBar.tsx: Progress bar component
- page.tsx: Now imports from extracted modules, reduced from 1180 to
  1077 lines (interfaces + utilities + sub-components removed)

Error boundary (CQ4):
- error.tsx: Next.js App Router error boundary with styled error UI,
  stack trace preview, and 'Try again' button

Shared config (CQ3):
- All 3 Ollama API routes now import OLLAMA_URL from lib/ollama-config.ts
  instead of duplicating the env var fallback

Security (S1):
- Add MODEL_NAME_RE regex validation on POST /api/ollama — rejects
  invalid model names before passing to Ollama API

Security (S2):
- Replace exec() with execFile() for brew package version check —
  prevents shell injection if targets list ever becomes dynamic
2026-02-19 15:21:22 -08:00
saravanakumardb1
7a82db4876 docs(local-llm): mark Sprint 2 items complete in DASHBOARD_REVIEW.md
Check off 5 items (B2, B7, B8, F1, F6) in bug list, features list,
and sprint tracker. Add commit 2d9475b to commit log.
2026-02-19 15:17:16 -08:00
saravanakumardb1
2d9475bd15 feat(local-llm): Sprint 2 — streaming pull progress, token metrics, fixes (B2/F1,F6,B7,B8)
New features:
- B2/F1: Streaming model pull with real-time progress bar. New
  /api/ollama/pull/route.ts pipes NDJSON from Ollama stream:true.
  UI shows status, completed/total bytes, and percentage during download.
- F6: Token/s metrics after prompt generation. Parses eval_count and
  eval_duration from the final NDJSON chunk. Displays tok/s, total
  tokens, and duration in the prompt modal footer.

Bug fixes:
- B7: Parse vm_stat page size from output instead of hardcoding 16384.
  Reads 'page size of N bytes' from the first line for portability.
- B8: Whisper model discovery now scans multiple directories:
  WHISPER_MODELS_DIR env var, ~/whisper-models, /opt/homebrew/share/
  whisper-cpp/models/, ~/.cache/whisper/. Returns the first dir with
  .bin files found.
2026-02-19 15:16:33 -08:00
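The F6 metric above follows from two fields in the final streamed chunk: Ollama reports `eval_count` in tokens and `eval_duration` in nanoseconds. The sketch below assumes the whole NDJSON body is available as one string; `tokensPerSecond` and `parseFinalMetrics` are hypothetical names, not the functions in the commit.

```typescript
// tok/s from Ollama's final chunk; eval_duration is in nanoseconds.
function tokensPerSecond(evalCount: number, evalDurationNs: number): number {
  if (evalDurationNs <= 0) return 0;
  return (evalCount * 1e9) / evalDurationNs;
}

type StreamMetrics = { tokens: number; seconds: number; tokPerSec: number };

// Scan NDJSON lines for the chunk with done=true and read its metrics.
function parseFinalMetrics(ndjson: string): StreamMetrics | null {
  for (const line of ndjson.split("\n")) {
    if (line.trim() === "") continue;
    let chunk: { done?: boolean; eval_count?: number; eval_duration?: number };
    try {
      chunk = JSON.parse(line);
    } catch {
      continue; // skip partial or malformed lines
    }
    if (
      chunk.done === true &&
      typeof chunk.eval_count === "number" &&
      typeof chunk.eval_duration === "number"
    ) {
      return {
        tokens: chunk.eval_count,
        seconds: chunk.eval_duration / 1e9,
        tokPerSec: tokensPerSecond(chunk.eval_count, chunk.eval_duration),
      };
    }
  }
  return null;
}
```

In a real stream a line can be split across network chunks, so a caller would buffer until a newline before parsing.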
saravanakumardb1
9a807f64cf docs(local-llm): mark Sprint 1 items complete in DASHBOARD_REVIEW.md
Check off 9 items (B1, B3, B4, B5, B6, B9, B10, B11, P4) in both
the bug list and sprint tracker. Add commit 2da67c2 to commit log.
2026-02-19 15:13:43 -08:00
saravanakumardb1
2da67c2f74 fix(local-llm): Sprint 1 — critical dashboard bug fixes (B1,B3-B6,B9-B11,P4)
Bug fixes:
- B4: Escape key now respects streaming state — during active stream,
  Escape aborts the generation instead of closing the modal
- B5: Auto-refresh (15s interval) pauses during streaming and pull
  operations to prevent background churn and UI flicker
- B9: Add AbortController to streaming fetch — closing modal or pressing
  Escape cancels the underlying HTTP request, saving CPU/bandwidth
- B1: Header subtitle now dynamically shows chip name and RAM from the
  system API instead of hardcoded 'Apple M4 Pro · 48 GB'
- B11: Escape handler clears promptText and promptResponse on close
- B6: Toast IDs use Date.now()+random instead of incrementing ref
  (prevents collision on HMR remount)
- B10: Brew panel distinguishes 'Loading...' (system=null) from
  'No tracked packages found' (system loaded, empty array)
- B3: Remove dead non-streaming generate action from Ollama API route
- P4: Add 5-second AbortController timeout to all fetchOllama() calls
  to prevent indefinite hangs when Ollama is unresponsive
2026-02-19 15:12:41 -08:00
saravanakumardb1
554a5137ec docs(local-llm): improve dashboard review — add checkboxes, commit log, new findings
Rewrite DASHBOARD_REVIEW.md with progress-tracking improvements:
- Add GitHub-style checkboxes to all 41 actionable items
- Add file inventory table with line counts and purposes
- Add commit log section for tracking implementation progress
- Add sprint tracker tables with effort estimates and commit columns
- New finding B11: prompt text not cleared on Escape close
- New finding CQ6: no Zod validation on API responses
- Consolidate priority matrix into sprint tables (less redundancy)
- Add deferred items section with dependency notes
- Improve item descriptions with more precise file:line references
- Add stack summary and total effort estimate (14–17 hrs)
2026-02-19 15:11:19 -08:00
saravanakumardb1
093682eace docs(local-llm): add systematic dashboard bug & improvement review
DASHBOARD_REVIEW.md — comprehensive code review of all 6 dashboard files
(1,395 lines). Organized into 7 sections:

- 10 bugs (B1–B10): hardcoded header, blocking pull, escape during stream,
  auto-refresh during streaming, no abort controller, vm_stat page size, etc.
- 5 code quality issues (CQ1–CQ5): monolithic component, inline styles,
  duplicated constants, no error boundary, no loading skeleton
- 16 feature ideas (F1–F16): pull progress, chat mode, prompt history,
  token/s metrics, model search, whisper test, extraction integration, etc.
- 5 performance items (P1–P5): request deduplication, cache TTL, du latency
- 3 security notes (S1–S3): input validation, shell injection pattern, CORS
- Priority matrix and 5-sprint implementation roadmap
2026-02-19 14:36:51 -08:00
saravanakumardb1
43f8103c5a fix(local-llm): show accurate macOS memory (app vs cached vs free)
Replace Node.js os.freemem() with vm_stat parsing for macOS. The old
approach reported ~47.7 GB / 48 GB 'used' because os.freemem() only
counts truly free pages, ignoring ~20 GB of inactive/reclaimable cache.

New memory breakdown:
- App Memory: active + wired + compressor (actual process usage)
- Cached: inactive + purgeable + speculative (reclaimable on demand)
- Available: free + cached (what apps can actually use)
- Pressure: normal/warning/critical based on app memory ratio

Dashboard UI updated to show app memory, cached (reclaimable) label,
and pressure-based color coding on progress bars.
2026-02-19 13:22:17 -08:00
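The breakdown above reduces to summing page counts from `vm_stat` and multiplying by the page size. The sketch below is an illustration rather than the commit's code: `parseVmStat` is a hypothetical name, the page size is read from the header line with 16384 as a fallback, and the pressure classification is left out because its thresholds are not stated in the log.

```typescript
type MemoryBreakdown = { app: number; cached: number; available: number };

// Parse `vm_stat` text output into byte totals.
function parseVmStat(output: string): MemoryBreakdown {
  const sizeMatch = /page size of (\d+) bytes/.exec(output);
  const sizeText = sizeMatch?.[1];
  const pageSize = sizeText !== undefined ? Number(sizeText) : 16384; // fallback

  // Collect "Pages xyz:   123." lines into label -> page count.
  const pages = new Map<string, number>();
  for (const line of output.split("\n")) {
    const m = /^([^:]+):\s+(\d+)\.?\s*$/.exec(line);
    const label = m?.[1];
    const count = m?.[2];
    if (label !== undefined && count !== undefined) {
      pages.set(label.trim(), Number(count));
    }
  }
  const p = (label: string): number => pages.get(label) ?? 0;

  // App = active + wired + compressor; Cached = inactive + purgeable + speculative.
  const appPages =
    p("Pages active") + p("Pages wired down") + p("Pages occupied by compressor");
  const cachedPages =
    p("Pages inactive") + p("Pages purgeable") + p("Pages speculative");
  const freePages = p("Pages free");

  return {
    app: appPages * pageSize,
    cached: cachedPages * pageSize,
    available: (freePages + cachedPages) * pageSize,
  };
}
```

The input would come from running `vm_stat` on the server side and passing its stdout to the parser.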
saravanakumardb1
b77afce9ae docs(local-llm): add Mission Control dashboard documentation
- docs/05-mission-control-dashboard.md: complete dashboard reference with
  architecture diagram, API route docs (request/response examples),
  UI feature descriptions, design tokens table, v1/v2 changelog,
  and future improvements roadmap
2026-02-19 13:03:30 -08:00
saravanakumardb1
970b565026 fix(local-llm): dashboard v2 — streaming prompts, model management, perf fixes
Bug fixes:
- Fix Google Fonts build error (corporate proxy blocks fonts.gstatic.com)
  by removing Geist font imports and switching to system font stack
- Fix system API 7.6s latency by caching static info (chip, GPU, brew)
  with timeouts on shell commands — now responds in ~50ms

New features:
- Streaming prompt responses via NDJSON proxy (/api/ollama/stream)
  with typing cursor animation and auto-scroll
- Model pull UI: input field + button to download new models
- Model delete with two-step confirmation dialog
- VRAM usage and expiry time display for loaded models
- Toast notifications (success/error/info) with slide-in animation
- Copy response button in prompt modal
- Escape key closes modals, backdrop click dismisses
- Pull/delete/show actions added to Ollama API route
2026-02-19 13:03:11 -08:00