42 Commits

Author SHA1 Message Date
saravanakumardb1
d7dc66eb92 docs(local-llm): Rich Features Roadmap — 45 tasks across 7 phases for coding agent
Detailed implementation roadmap for the Rich Features PRD with:

Phase A (Sprint 14-16, ~15hr): Foundation
  A1: IndexedDB layer with idb — 9 object stores, compound indexes
  A2: v4 TypeScript interfaces — all data models
  A3: Route group (mission-control) — move existing dashboard
  A4: Route group (workspace) — sidebar + content layout
  A5: Sidebar — conversation list, time groups, search
  A6: Conversation view — message thread, input bar, streaming
  A7: Auto-title + context window usage bar
  A8: v3 → v4 migration from localStorage

Phase B (Sprint 17-18, ~10hr): Quick Actions + Cmd+K
  B1-B6: 30 built-in actions, fuse.js command palette, launcher,
  custom editor, usage tracking, export/import

Phase C (Sprint 19-20, ~9hr): Custom Agents
  C1-C5: 10 built-in agents, picker, full-screen editor,
  conversation wiring (welcome msg, chips, temp), export

Phase D (Sprint 21-22, ~13hr): Model Router + Multi-Modal
  D1-D7: regex classifier, model defaults, auto-routing UI,
  rich input bar, file/voice/image processing, drag-drop

Phase E (Sprint 23, ~7hr): Response Enhancements
  E1-E5: action bars, code-block copy, try-other-model,
  live metrics, rating with aggregation

Phase F (Sprint 24-25, ~11hr): Scheduled Tasks
  F1-F7: cron-parser, CRUD, editor, browser runner,
  /api/system/exec with allowlist, notifications, templates

Phase G (Sprint 26-28, ~13hr): Projects + Orchestration
  G1-G7: project CRUD, drag-to-project, system context,
  Cmd+P switcher, chain/race/vote modes

Every task has: explicit file paths, step-by-step instructions,
pass/fail exit criteria, verification commands, and commit templates.
Dependency graph: A is foundation, B-F parallel after A, G needs A+B.
2026-02-19 23:54:07 -08:00
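The dependency note in d7dc66eb92 (A is the foundation, B-F run in parallel after A, G needs A+B) can be sketched as a small topological grouping. The phase letters come from the commit message; the `deps` map and the `phaseWaves` helper are hypothetical names used only for illustration, not code from the roadmap.

```typescript
// Hypothetical encoding of the phase dependencies stated in the commit message.
type Phase = "A" | "B" | "C" | "D" | "E" | "F" | "G";

const deps: Record<Phase, Phase[]> = {
  A: [],
  B: ["A"],
  C: ["A"],
  D: ["A"],
  E: ["A"],
  F: ["A"],
  G: ["A", "B"],
};

// Group phases into "waves": each phase only depends on phases in earlier waves,
// so everything inside one wave could be worked on in parallel.
function phaseWaves(graph: Record<Phase, Phase[]>): Phase[][] {
  const done = new Set<Phase>();
  const pending = new Set<Phase>(Object.keys(graph) as Phase[]);
  const waves: Phase[][] = [];
  while (pending.size > 0) {
    // Compute the whole wave before marking anything as done.
    const ready = Array.from(pending).filter((p) =>
      graph[p].every((d) => done.has(d))
    );
    if (ready.length === 0) throw new Error("dependency cycle");
    ready.sort();
    for (const p of ready) {
      pending.delete(p);
      done.add(p);
    }
    waves.push(ready);
  }
  return waves;
}

console.log(phaseWaves(deps)); // [["A"], ["B","C","D","E","F"], ["G"]]
```

G lands in the third wave because it waits on B, even though C-F do not block it.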
saravanakumardb1
7bd14054d4 docs(local-llm): Rich Features PRD rev 2 — comprehensive review + expansion
Review findings addressed (20+ issues):

Structure additions:
- Target Users section with 5 personas (solo dev, tinkerer, privacy pro, writer, power user)
- Non-Goals section (8 explicit out-of-scope items for v4)
- Risks & Mitigations table (10 risks with impact/likelihood/mitigation)
- New API Routes section (4 new routes with security notes)
- Settings Expansion section (full tree: General, Router, Models, Input, Tasks, Data, About)
- New Dependencies table (idb ~1KB, fuse.js ~6KB, cron-parser ~3KB)
- Error Handling appendix (12 edge cases with expected behavior)

Data model fixes:
- Conversation/Message split into separate IndexedDB stores (scalability)
- Message gets conversationId FK, promptTokens field, size/language on Attachment
- Design decision note explaining why messages are stored separately

Feature spec improvements:
- 3.1 Conversations: context window management (token bar, auto-summarize at 80/95%)
- 3.2 Quick Actions: expanded Cmd+K palette spec (5 result types, ranking)
- 3.3 Agents: tools marked v4 vs v5, duplicate-from-builtin, unlink on delete
- 3.4 Model Router: full resolveModel() with 4-level fallback chain + availability
- 3.5 Multi-Modal: attachment size limits, Whisper error handling
- 3.6 Response: hover-only action bars, rating aggregation per task type
- 3.7 Cron: built-in templates table, runtime constraints, security (execFile)
- 3.8 Orchestration: full data model, chain/race/vote UI specs, step limits
- 3.9 Projects: system context detail, project stats, unlink behavior

Acceptance criteria added to ALL 9 features (was missing on 5).
Competitive analysis expanded with local competitors (Open WebUI, LM Studio, Jan.ai).
Success metrics improved with measurement methodology and rationale.
Open questions restructured as decision table with recommendations.
IndexedDB schema with explicit indexes and compound keys.
Migration strategy: 7-step v3→v4 with safety (no delete until confirmed).

681 lines → 1149 lines (+69% content)
2026-02-19 23:47:59 -08:00
saravanakumardb1
1172dbb23e docs(local-llm): Rich Features PRD — full local AI workspace spec
Comprehensive PRD evolving Mission Control into a ChatGPT-class local AI workspace:

- 3.1 Conversations: persistent, named, searchable, branching, IndexedDB
- 3.2 Quick Actions: 30 built-in 1-click launchers across 5 categories
     (code, writing, analysis, creative, devops) + custom actions + Cmd+K palette
- 3.3 Custom Agents: 10 built-in local GPTs with system prompts, tools,
     temperature, welcome messages, example prompts
- 3.4 Model Router: heuristic task classifier (<5ms, no LLM call),
     auto-selects best model per task type, configurable defaults
- 3.5 Multi-Modal Input: file attach, voice (Whisper), images, drag-drop,
     paste intelligence (code/image/error detection)
- 3.6 Response Enhancements: per-message actions, per-code-block copy,
     branching with navigation, live metrics, rating/quality profiles
- 3.7 Scheduled Tasks: cron-based recurring prompts with shell/file input,
     notification/file/conversation output, 5 built-in templates
- 3.8 Multi-Model Orchestration: chain, race, vote modes
- 3.9 Projects: conversation folders with system context + model defaults

7 implementation phases (~78hr), component architecture, storage migration,
competitive analysis, success metrics, open questions
2026-02-19 23:39:20 -08:00
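The Model Router item in 1172dbb23e describes a heuristic classifier that avoids an LLM call. A minimal sketch of that idea follows, assuming regex rules evaluated in order. The task types, the patterns, and the `classifyTask` and `DEFAULT_MODELS` names are all assumptions for illustration; the PRD's actual rules are not shown in the commit.

```typescript
type TaskType = "code" | "writing" | "analysis" | "creative" | "devops" | "general";

// Illustrative patterns only. The first matching rule wins, so order matters.
const RULES: Array<{ task: TaskType; pattern: RegExp }> = [
  { task: "devops", pattern: /\b(docker|kubernetes|nginx|deploy|bash script)\b/i },
  { task: "code", pattern: /```|\b(function|refactor|stack trace|typescript|python|bug)\b/i },
  { task: "analysis", pattern: /\b(summari[sz]e|compare|analy[sz]e|explain)\b/i },
  { task: "creative", pattern: /\b(poem|story|brainstorm|lyrics)\b/i },
  { task: "writing", pattern: /\b(email|rewrite|proofread|blog post)\b/i },
];

// Pure string matching: no network or model call is involved.
function classifyTask(prompt: string): TaskType {
  for (const rule of RULES) {
    if (rule.pattern.test(prompt)) return rule.task;
  }
  return "general";
}

// Hypothetical configurable defaults, using model names mentioned elsewhere in this log.
const DEFAULT_MODELS: Record<TaskType, string> = {
  code: "qwen2.5-coder:32b",
  devops: "qwen2.5-coder:32b",
  analysis: "deepseek-r1:32b",
  writing: "llama3.1:8b",
  creative: "llama3.1:8b",
  general: "llama3.1:8b",
};

const task = classifyTask("Refactor this function to use async/await");
console.log(task, DEFAULT_MODELS[task]); // code qwen2.5-coder:32b
```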
saravanakumardb1
3dc0c441a9 docs(local-llm): mark all roadmap phases 1-6 complete with commit links
All 27 roadmap items (22 improvements + 5 bugs) checked off across 6 phases:
- Phase 1 (040013e): N1-N3, BN1, BN2, BN5
- Phase 2 (7f04297): N4-N5, BN3, BN4
- Phase 3 (6f6baf9): N6-N10
- Phase 4 (588d21c): N11-N14
- Phase 5 (44ad8a6): F24-F28
- Phase 6 (07d3911): F29-F31
2026-02-19 23:30:11 -08:00
saravanakumardb1
07d391101a feat(local-llm): Phase 6 — data persistence + export (F29-F31)
F29: Export/import settings — gear icon in header opens settings popover,
     export downloads all llm-* localStorage as JSON, import validates
     and merges, both with toast feedback
F30: Inference history log — saves prompt/response/model/metrics to
     llm-inference-log (capped 100 FIFO), searchable panel with replay
     button, count badge in header toggle
F31: Factory reset — confirm dialog clears all llm-* localStorage keys,
     resets all component state to defaults
2026-02-19 23:29:40 -08:00
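The F30 history log in 07d391101a is described as capped at 100 entries with FIFO eviction. A minimal sketch of that behaviour on a plain array follows; the `InferenceLogEntry` shape and the helper names are assumptions, and the localStorage read/write around them is left out.

```typescript
// Hypothetical entry shape based on the fields named in the commit message.
interface InferenceLogEntry {
  prompt: string;
  response: string;
  model: string;
  timestamp: number;
}

const MAX_LOG_ENTRIES = 100;

// Append an entry and drop the oldest ones once the cap is exceeded (FIFO).
function appendCapped<T>(log: readonly T[], entry: T, cap: number = MAX_LOG_ENTRIES): T[] {
  const next = log.concat([entry]);
  return next.length > cap ? next.slice(next.length - cap) : next;
}

// Case-insensitive substring search over prompt, response, and model.
function searchLog(log: readonly InferenceLogEntry[], query: string): InferenceLogEntry[] {
  const q = query.toLowerCase();
  return log.filter(
    (e) =>
      e.prompt.toLowerCase().indexOf(q) !== -1 ||
      e.response.toLowerCase().indexOf(q) !== -1 ||
      e.model.toLowerCase().indexOf(q) !== -1
  );
}

let historyLog: InferenceLogEntry[] = [];
for (let i = 0; i < 105; i++) {
  historyLog = appendCapped(historyLog, {
    prompt: `prompt ${i}`,
    response: `response ${i}`,
    model: "llama3.1:8b",
    timestamp: i,
  });
}
console.log(historyLog.length, historyLog[0].prompt); // 100 "prompt 5"
```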
saravanakumardb1
44ad8a6301 feat(local-llm): Phase 5 — response quality + interaction (F24-F28)
F24: Vision image upload — file picker for vision models, base64 encoding,
     passed through stream API to Ollama generate endpoint
F25: Markdown rendering — ReactMarkdown replaces raw <pre> for all
     prompt responses and chat assistant messages
F26: Syntax highlighting — Prism-based code blocks with language labels
     and oneDark theme via react-syntax-highlighter
F27: <think> block collapse — auto-detect and collapse DeepSeek R1
     reasoning traces into expandable details with word count
F28: Ollama library link — button next to Pull input opens ollama.com/library
2026-02-19 23:25:20 -08:00
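The F27 item in 44ad8a6301 detects `<think>` blocks and collapses them with a word count. One way to separate the reasoning trace from the visible answer is sketched below; `splitThinkBlocks` and its return shape are hypothetical, and the sketch only handles closed blocks (a still-streaming, unclosed `<think>` would need extra handling).

```typescript
interface ParsedResponse {
  answer: string; // text with all <think>...</think> blocks removed
  thinking: string[]; // contents of each reasoning block
  thinkingWordCount: number;
}

function splitThinkBlocks(raw: string): ParsedResponse {
  const thinking: string[] = [];
  // Non-greedy match so multiple blocks are captured separately.
  const answer = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      thinking.push(inner.trim());
      return "";
    })
    .trim();
  const thinkingWordCount = thinking
    .join(" ")
    .split(/\s+/)
    .filter((w) => w.length > 0).length;
  return { answer, thinking, thinkingWordCount };
}

const parsed = splitThinkBlocks("<think>Let me add 2 and 2.</think>\nThe answer is 4.");
console.log(parsed.answer); // "The answer is 4."
console.log(parsed.thinkingWordCount); // 6
```

The UI could then render `thinking` inside a collapsed `<details>` element and pass only `answer` to the Markdown renderer.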
saravanakumardb1
588d21c70e feat(local-llm): Phase 4 — runtime metrics + UX polish (N11-N14)
N11: Persist tok/s per model to localStorage (llm-model-benchmarks),
     display on model card as faded accent text
N12: Live countdown to auto-unload — 1s interval, color-coded
     (green >5m, yellow 1-5m, red <1m 'Unloading soon')
N13: Session stats per model (prompts + tokens) in expanded details
N14: Co-load suggestions strip below models list showing which
     unloaded models fit in remaining free memory
2026-02-19 23:20:30 -08:00
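The N12 thresholds (green above 5 minutes, yellow from 1 to 5 minutes, red below 1 minute) and the N14 co-load filter in 588d21c70e can be sketched as pure functions. All names below are hypothetical; only the thresholds and the 'Unloading soon' label come from the commit message.

```typescript
type CountdownColor = "green" | "yellow" | "red";

// green > 5 min, yellow 1-5 min, red < 1 min
function countdownColor(msRemaining: number): CountdownColor {
  const minutes = msRemaining / 60000;
  if (minutes > 5) return "green";
  if (minutes >= 1) return "yellow";
  return "red";
}

function formatCountdown(msRemaining: number): string {
  if (msRemaining < 60000) return "Unloading soon";
  const totalSeconds = Math.floor(msRemaining / 1000);
  const m = Math.floor(totalSeconds / 60);
  const s = totalSeconds % 60;
  return `${m}m ${s < 10 ? "0" : ""}${s}s`;
}

interface ModelSize {
  name: string;
  estimatedBytes: number;
  loaded: boolean;
}

// Suggest unloaded models whose estimated footprint fits in the remaining free memory.
function coLoadSuggestions(models: ModelSize[], freeBytes: number): string[] {
  return models
    .filter((m) => !m.loaded && m.estimatedBytes <= freeBytes)
    .map((m) => m.name);
}

console.log(countdownColor(90000), formatCountdown(90000)); // yellow "1m 30s"
```

In the dashboard these would be re-evaluated on the 1-second interval the commit describes.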
saravanakumardb1
6f6baf99c8 feat(local-llm): Phase 3 — model intelligence badges + sort + version (N6-N10)
N6: <think> warning badge for DeepSeek R1 and distilled variants
N7: Vision model indicator for llava, bakllava, moondream, qwen-vl, etc.
N8: Architecture/family badge as pill on every model card
N9: Sort dropdown (A-Z, size, params, running, recent) with localStorage persist
N10: Ollama server version fetched from /api/version, shown in stats card
2026-02-19 23:17:07 -08:00
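The N6 and N7 badges in 6f6baf99c8 are driven by model-name matching. A sketch under stated assumptions follows: the two patterns are guesses built from the model families named in this log (llava, bakllava, moondream, qwen-vl, llama-vision, DeepSeek R1), not the dashboard's actual regexes.

```typescript
// Assumed patterns. "llava" also covers "bakllava" because the match is unanchored.
const VISION_RE = /llava|moondream|qwen[\w.]*-?vl|llama[\w.]*-?vision/i;
// Matches deepseek-r1 tags and distill names that keep the "deepseek-r1" prefix.
const THINK_RE = /deepseek-r1/i;

interface ModelBadges {
  vision: boolean; // show the vision indicator
  think: boolean; // show the <think> warning badge
}

function modelBadges(modelName: string): ModelBadges {
  return {
    vision: VISION_RE.test(modelName),
    think: THINK_RE.test(modelName),
  };
}

for (const n of ["llava:13b", "qwen2.5vl:7b", "deepseek-r1:32b", "qwen2.5-coder:32b"]) {
  console.log(n, modelBadges(n));
}
```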
saravanakumardb1
7f042975de feat(local-llm): Phase 2 — rich metadata + persistence (N4-N5, BN3-BN4)
N4: RamBudgetBar component — stacked horizontal bar showing OS+Apps,
    loaded models (by name with color), and free memory segments
N5: Context window size — extract context_length from /api/show
    model_info, cache in modelMetadata state, display on card
BN3: Persist chat messages to localStorage (llm-chat-{model}),
     restore on modal re-open, capped at 50 messages
BN4: Logs panel refresh button — RefreshCw icon next to toggle
2026-02-19 23:13:22 -08:00
saravanakumardb1
040013e495 feat(local-llm): Phase 1 — pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5)
N1: Estimated RAM per model with quant-aware multipliers (Q4=1.2x, Q5=1.25x, Q8=1.1x, F16=1.05x)
N2: Will-it-fit indicator (green/yellow/red dot) next to Load button
N3: Aggregate loaded model VRAM in panel header badge
BN1: Compare buttons now filter to running models only
BN2: AbortController on compare stream, cancel on modal close
BN5: Delete confirmation shows model name + disk reclaim size
2026-02-19 23:09:49 -08:00
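The N1 multipliers in 040013e495 (Q4=1.2x, Q5=1.25x, Q8=1.1x, F16=1.05x) suggest an estimate of the form size × multiplier. The sketch below assumes the multiplier is applied to the on-disk model size and that the N2 dot uses an 80% headroom threshold; both are assumptions, as are the function names and the fallback multiplier.

```typescript
// Multipliers taken from the commit message, keyed by quantization prefix.
const QUANT_MULTIPLIERS: Array<[RegExp, number]> = [
  [/^Q4/i, 1.2],
  [/^Q5/i, 1.25],
  [/^Q8/i, 1.1],
  [/^F16/i, 1.05],
];
const DEFAULT_MULTIPLIER = 1.2; // assumption for unrecognised quantization levels

// Assumes the estimate is disk size scaled by the quant-specific multiplier.
function estimateRamBytes(diskBytes: number, quant: string): number {
  const hit = QUANT_MULTIPLIERS.find(([re]) => re.test(quant));
  return Math.round(diskBytes * (hit ? hit[1] : DEFAULT_MULTIPLIER));
}

type Fit = "green" | "yellow" | "red";

// Assumed thresholds: green with at least 20% headroom, yellow if it barely fits.
function willItFit(estimateBytes: number, availableBytes: number): Fit {
  if (estimateBytes <= availableBytes * 0.8) return "green";
  if (estimateBytes <= availableBytes) return "yellow";
  return "red";
}

const GB = 1024 * 1024 * 1024;
const estimate = estimateRamBytes(18.5 * GB, "Q4_K_M");
console.log((estimate / GB).toFixed(1), willItFit(estimate, 30 * GB));
```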
saravanakumardb1
ae231d5aac docs(local-llm): comprehensive roadmap review — 5 bugs, 6 phases, 31 items
Systematic code review of DASHBOARD_ROADMAP.md against actual codebase:

Bugs found (BN1-BN5):
- BN1: Compare buttons show unloaded models (can't generate)
- BN2: No AbortController on compare stream (leaks on close)
- BN3: Chat messages lost on modal close (no persistence)
- BN4: Logs panel has no refresh button
- BN5: Delete dialog missing reclaim size (partial impl exists)

Expanded from 4 phases to 6 + backlog (15 → 31 items):
- Phase 1: Pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5)
- Phase 2: Rich metadata + persistence (N4-N5, BN3-BN4)
- Phase 3: Model intelligence badges + sort (N6-N10)
- Phase 4: Runtime metrics + UX polish (N11-N14)
- Phase 5 (NEW): Response quality — markdown, syntax highlight,
  vision upload, think-block collapse, model library link
- Phase 6 (NEW): Data persistence — export/import, inference log,
  factory reset
- Phase 7: Expanded backlog (F17-F38, +6 new ideas)

Improvements:
- Added checkboxes for all tasks and acceptance criteria
- Quant-aware RAM estimate multipliers (Q4/Q5/Q8/F16)
- Broader vision model regex (bakllava, moondream, llama-vision)
- DeepSeek R1 distill variant detection for think badge
- Conservative memory availability formula (free + cached*0.5)
- localStorage key registry with llm- prefix standardization
- Dependency graph between phases
- ~6 hrs total estimated effort
2026-02-19 23:02:25 -08:00
saravanakumardb1
cd6e561f1b docs(local-llm): consolidate dashboard docs into dashboard/docs/
- Created DASHBOARD_PRD.md — full updated PRD with current 19-file
  architecture, all 10 API routes, UI layout, data flow, localStorage
  keys, security model, and v1-v3 changelog.
- Created DASHBOARD_ROADMAP.md — phased implementation plan for N1-N15
  improvements across 4 phases: pre-load intelligence, rich metadata,
  model intelligence badges, runtime metrics. Includes acceptance
  criteria and implementation details per item.
- Updated DASHBOARD_REVIEW.md — refreshed file inventory to 19 files
  (~2,930 lines), fixed broken Tier B markdown table, added cross-links.
- Replaced __LOCAL_LLMs/docs/05-mission-control-dashboard.md with
  redirect pointer to new dashboard/docs/ location.

Dashboard docs are now co-located at __LOCAL_LLMs/dashboard/docs/:
  - DASHBOARD_PRD.md (product requirements)
  - DASHBOARD_REVIEW.md (audit + 39 completed items + N1-N15 proposals)
  - DASHBOARD_ROADMAP.md (phased implementation plan)
2026-02-19 22:54:18 -08:00
saravanakumardb1
519f348583 docs(local-llm): add Next Wave — 15 model intelligence improvements (N1–N15)
Section 8 of DASHBOARD_REVIEW.md: pre-load RAM estimates, will-it-fit
indicator, RAM budget bar, context window, architecture/vision/think
badges, sort, tok/s history, countdown, session stats, delete confirm,
co-load suggestions. Organized in 4 tiers with sprint plan.
2026-02-19 22:32:29 -08:00
saravanakumardb1
4090c8aa13 docs(local-llms): add developer guide — API endpoint, code examples, model selection
- New 00-developer-guide.md: start-here doc for developers covering:
  - Ollama endpoint (http://localhost:11434/v1) and API key
  - curl, TypeScript, Python code examples with env var pattern
  - Model selection table by task
  - Running extraction service evals locally
  - JSON output gotchas (parse from string, <think> strip for R1)
  - Model management commands
  - Troubleshooting quick reference
  - Links to all other docs
- Updated index in LOCAL_LLMs_setup_mac_m4_48gb.md to include doc 00
2026-02-19 18:43:06 -08:00
saravanakumardb1
5deb5efdcf docs(local-llms): add comprehensive model comparison table and deepseek-r1:32b details
- Add Comprehensive Model Comparison Table: 11 models (local + cloud) with
  Disk, Params, Quant, RAM, Tok/s, JSON quality, Reasoning, Code, Instruction
  Following, Context window, <think> flag, and install status columns
- Add Gap Analysis table: llama3.1:8b (~55%), qwen2.5-coder:32b (~85%),
  deepseek-r1:32b (~75-80%) vs llama3.3:70b across 5 capability dimensions
- Update Tier 4 Reasoning table: add Parameters, Quant columns; add <think>
  warning note with link to eval doc transform pattern
- Update By Use Case table: add brain signal routing row, update extraction
  evals fallback to qwen2.5-coder:32b
2026-02-19 16:06:02 -08:00
saravanakumardb1
cfc1194079 docs(local-llms): add latency/cost comparison and deepseek-r1 transform pattern to evals doc
- Add Latency & Cost Comparison table: llama3.1:8b (~1m27s), qwen2.5-coder:32b
  (~5-8m est.), deepseek-r1:32b (~5-8m est.) vs gemini-2.5-flash (~15-25s, $0.003)
  and gpt-4o (~20-40s, $0.05-0.15); all figures are for 19 cases at concurrency=4
- Fix assertion pattern docs: single expressions required, not const/return blocks
- Add deepseek-r1 <think> strip transform pattern for promptfoo provider config
- Expand recommended models table with Disk, Reasoning, Pass Rate, and Notes columns
2026-02-19 16:05:52 -08:00
saravanakumardb1
71a7623553 docs(local-llms): expand installed models table with parameters and quantization
- Add Parameters, Quantization, and Status columns to models table
- qwen2.5-coder:32b: 32.8B params, Q4_K_M, 18.5 GB disk
- llama3.1:8b: 8B params, Q4_K_M, 4.9 GB disk (confirmed via ollama API)
2026-02-19 16:05:42 -08:00
saravanakumardb1
1552006feb fix(local-llm): proxy extraction health check through API route
Move extraction service health check from direct browser fetch
(http://localhost:4005/health) to server-side /api/extraction/health
proxy. Eliminates ERR_CONNECTION_REFUSED console errors when the
extraction service is not running locally.
2026-02-19 15:53:02 -08:00
saravanakumardb1
984630eb45 docs(local-llm): mark ALL 39 items complete in DASHBOARD_REVIEW.md
All bugs (11), code quality (6), features (16), performance (5), and
security (3) items are now checked off. Added Sprint 6 (ed93a6f) and
Sprint 7 (8bdd5ee) to commit log. Updated summary to reflect 100%
completion across 7 sprints.
2026-02-19 15:45:46 -08:00
saravanakumardb1
8bdd5ee1c8 feat(local-llm): Sprint 7 — all remaining features (F5,F7,F8,F12,F13,F15,CQ5,S3)
Features:
- F5: Model comparison side-by-side — after a prompt response, click
  any other model to compare. Responses display in two-column grid.
- F7: System resource sparklines — memory usage ring buffer (30 points)
  with SVG sparkline component in the memory stats card.
- F8: Ollama logs viewer — collapsible terminal-style panel below main
  grid. Fetches from /api/ollama/logs route. Color-coded by level.
- F12: Whisper transcription test — file upload button in Whisper panel.
  Uploads audio to /api/whisper/transcribe, displays text + latency.
- F13: Responsive mobile layout — p-3/sm:p-6 padding, gap-3/sm:gap-4,
  hidden sm:inline for header text, responsive comparison grid.
- F15: Extraction service panel — health check to localhost:4005 on
  each refresh. Status card in right column with endpoint + service.

Code quality:
- CQ5: Skeleton shimmer loading UI — 4 skeleton cards shown while
  initial data loads. Uses CSS shimmer animation from globals.css.

Security:
- S3: Documented CORS/auth assumption in code comment — dashboard is
  local-only, no auth needed for dev tool.

New files:
- components/Sparkline.tsx — reusable SVG sparkline component
- api/ollama/chat/route.ts — streaming chat endpoint (from Sprint 6)
- api/ollama/logs/route.ts — Ollama log file reader
- api/whisper/transcribe/route.ts — Whisper STT test endpoint
2026-02-19 15:44:20 -08:00
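The F7 sparkline in 8bdd5ee1c8 combines a fixed-size ring buffer (30 points) with an SVG line. A minimal sketch of both pieces follows; the class and function names are hypothetical, and the real `Sparkline.tsx` component may scale or render differently.

```typescript
// Fixed-capacity buffer that drops the oldest sample when full.
class RingBuffer<T> {
  private items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift();
  }

  toArray(): T[] {
    return this.items.slice();
  }
}

// Map values to an SVG <polyline points="..."> string scaled into width x height.
// SVG's y axis grows downward, so larger values get smaller y coordinates.
function sparklinePoints(values: number[], width: number, height: number): string {
  if (values.length === 0) return "";
  const min = Math.min.apply(null, values);
  const max = Math.max.apply(null, values);
  const range = max - min || 1; // avoid division by zero for flat series
  const stepX = values.length > 1 ? width / (values.length - 1) : 0;
  return values
    .map((v, i) => {
      const x = (i * stepX).toFixed(1);
      const y = (height - ((v - min) / range) * height).toFixed(1);
      return `${x},${y}`;
    })
    .join(" ");
}

const memoryHistory = new RingBuffer<number>(30);
for (let i = 0; i < 35; i++) memoryHistory.push(i);
console.log(memoryHistory.toArray().length); // 30
console.log(sparklinePoints([0, 5, 10], 100, 20)); // "0.0,20.0 50.0,10.0 100.0,0.0"
```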
saravanakumardb1
ed93a6f0af feat(local-llm): Sprint 6 — major feature batch (CQ2,CQ5,CQ6,P5,F4,F10,F14,F16)
Code quality:
- CQ2: Add CSS utility classes (text-primary/secondary/tertiary, bg-*,
  btn-*, input-base) to globals.css — reduces inline style repetition
- CQ5: Add skeleton shimmer animation CSS for loading states
- CQ6: Replace manual model name validation with Zod schema
  (PostBodySchema) in Ollama API route

Performance:
- P5: Eagerly warm static cache on module load — system_profiler
  no longer blocks first dashboard request

Features:
- F4: Chat mode with multi-turn conversation via new /api/ollama/chat
  streaming route. Chat bubble layout, system prompt input, message
  history. Toggle between prompt/chat modes in modal.
- F10: Dark/light theme toggle with CSS var overrides in :root.light.
  Sun/Moon button in header, persisted in localStorage.
- F14: Model tags (coding, chat, fast, vision, reasoning) with
  colored toggle badges in expanded model details. Persisted in
  localStorage.
- F16: Auto-load preferred model — star toggle in expanded details.
  When Ollama is online but no models loaded, auto-loads the starred
  model. Persisted in localStorage.
2026-02-19 15:38:06 -08:00
saravanakumardb1
2936b9f047 docs(local-llm): mark Sprint 5 P1-P3 complete in DASHBOARD_REVIEW.md
Check off 3 items (P1, P2, P3) in performance section and sprint
tracker. Add commit b1fda3a to commit log.
2026-02-19 15:28:59 -08:00
saravanakumardb1
b1fda3a1a5 perf(local-llm): Sprint 5 — request dedup + cache TTLs (P1, P2, P3)
Performance fixes:
- P1: Add fetchingRef guard to fetchAll() — prevents duplicate requests
  from rapid Refresh button clicks or overlapping interval ticks
- P2: Add 5-minute TTL to staticCache (chip, GPU, brew packages) —
  previously cached indefinitely per server process, now refreshes
  after brew upgrades without requiring a restart
- P3: Add 60-second TTL cache for Ollama models disk usage (du command)
  — previously traversed ~/.ollama/models on every 15s refresh cycle,
  now reuses cached value for 60s
2026-02-19 15:28:07 -08:00
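The P1-P3 fixes in b1fda3a1a5 rely on two small patterns: an in-flight guard and a time-to-live cache. A sketch of both follows. `TtlCache` and `singleFlight` are hypothetical names, the cache here is synchronous for brevity (the real `du` and `system_profiler` calls are presumably asynchronous), and the clock is injectable so that expiry can be exercised without waiting.

```typescript
// Caches one computed value for ttlMs milliseconds.
class TtlCache<T> {
  private value: T | undefined = undefined;
  private hasValue = false;
  private expiresAt = 0;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(compute: () => T): T {
    const t = this.now();
    if (!this.hasValue || t >= this.expiresAt) {
      this.value = compute();
      this.hasValue = true;
      this.expiresAt = t + this.ttlMs;
    }
    return this.value as T;
  }
}

// Wraps an async task so overlapping calls are ignored while one is running,
// similar in spirit to the fetchingRef guard described for fetchAll().
function singleFlight(taskFn: () => Promise<void>): () => Promise<void> {
  let inFlight = false;
  return async () => {
    if (inFlight) return;
    inFlight = true;
    try {
      await taskFn();
    } finally {
      inFlight = false;
    }
  };
}

// Demo with a fake clock: the expensive computation runs once per 60 s window.
let fakeNow = 0;
let computeCalls = 0;
const diskUsageCache = new TtlCache<number>(60000, () => fakeNow);
const measure = () => {
  computeCalls++;
  return 42;
};
diskUsageCache.get(measure);
fakeNow = 59999;
diskUsageCache.get(measure); // still cached
fakeNow = 60000;
diskUsageCache.get(measure); // expired, recomputed
console.log(computeCalls); // 2
```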
saravanakumardb1
9892fe7145 docs(local-llm): mark Sprint 4 items complete in DASHBOARD_REVIEW.md
Check off 4 items (F2, F3, F9, F11) in features list and sprint
tracker. F4 (chat mode) deferred. Add commit 9c2f5f3 to commit log.
2026-02-19 15:26:37 -08:00
saravanakumardb1
9c2f5f3396 feat(local-llm): Sprint 4 — UX enhancements (F2, F3, F9, F11)
New features:
- F2: Model search/filter — search input above models list (shown when
  4+ models installed). Filters by name, family, and quantization level.
  Press / to focus the search input.
- F3: Prompt history — saves last 20 prompts to localStorage with model
  name and timestamp. History dropdown in prompt modal with one-click
  re-run. Toggle via clock icon in textarea.
- F9: Modelfile viewer — expanded model details now fetch and display
  the Modelfile via the show action. Collapsible <details> element
  with syntax-highlighted pre block.
- F11: Keyboard shortcuts panel — press ? to toggle. Shows all shortcuts:
  ? (help), R (refresh), / (search), Esc (close/cancel), Cmd+Enter (send).
  Shortcuts only fire when not in an input field.
2026-02-19 15:25:43 -08:00
saravanakumardb1
40c40756ed docs(local-llm): mark Sprint 3 items complete in DASHBOARD_REVIEW.md
Check off 5 items (CQ1, CQ3, CQ4, S1, S2) in code quality, security,
and sprint tracker. CQ2 (inline styles) deferred. Add commit 75a3cd0
to commit log.
2026-02-19 15:22:11 -08:00
saravanakumardb1
75a3cd0826 refactor(local-llm): Sprint 3 — component extraction, error boundary, security (CQ1,CQ3,CQ4,S1,S2)
Component extraction (CQ1):
- lib/types.ts: All interfaces (OllamaData, SystemData, Toast, etc.)
- lib/format.ts: formatBytes, formatUptime utilities
- lib/ollama-config.ts: Shared OLLAMA_URL constant
- components/StatusDot.tsx: Status indicator component
- components/ProgressBar.tsx: Progress bar component
- page.tsx: Now imports from extracted modules, reduced from 1180 to
  1077 lines (interfaces + utilities + sub-components removed)

Error boundary (CQ4):
- error.tsx: Next.js App Router error boundary with styled error UI,
  stack trace preview, and 'Try again' button

Shared config (CQ3):
- All 3 Ollama API routes now import OLLAMA_URL from lib/ollama-config.ts
  instead of duplicating the env var fallback

Security (S1):
- Add MODEL_NAME_RE regex validation on POST /api/ollama — rejects
  invalid model names before passing to Ollama API

Security (S2):
- Replace exec() with execFile() for brew package version check —
  prevents shell injection if targets list ever becomes dynamic
2026-02-19 15:21:22 -08:00
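The S1 fix in 75a3cd0826 names a `MODEL_NAME_RE` check but does not show the pattern. The sketch below uses an assumed pattern that accepts names shaped like `qwen2.5-coder:32b` and rejects whitespace and shell metacharacters; the regex, the length limit, and the `isValidModelName` helper are illustrative, not the route's actual code.

```typescript
// Assumed pattern: alphanumeric start, then letters, digits, ".", "_", "/", "-",
// followed by an optional ":tag" suffix.
const MODEL_NAME_RE = /^[a-zA-Z0-9][a-zA-Z0-9._\/-]*(:[a-zA-Z0-9._-]+)?$/;
const MAX_MODEL_NAME_LENGTH = 128; // assumption

// Type guard so callers get a string after validation.
function isValidModelName(value: unknown): value is string {
  return (
    typeof value === "string" &&
    value.length <= MAX_MODEL_NAME_LENGTH &&
    MODEL_NAME_RE.test(value)
  );
}

console.log(isValidModelName("qwen2.5-coder:32b")); // true
console.log(isValidModelName("llama3.1:8b; rm -rf /")); // false
```

A route handler would typically respond with HTTP 400 when the check fails, before the name is forwarded to Ollama.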
saravanakumardb1
7a82db4876 docs(local-llm): mark Sprint 2 items complete in DASHBOARD_REVIEW.md
Check off 5 items (B2, B7, B8, F1, F6) in bug list, features list,
and sprint tracker. Add commit 2d9475b to commit log.
2026-02-19 15:17:16 -08:00
saravanakumardb1
2d9475bd15 feat(local-llm): Sprint 2 — streaming pull progress, token metrics, fixes (B2/F1,F6,B7,B8)
New features:
- B2/F1: Streaming model pull with real-time progress bar. New
  /api/ollama/pull/route.ts pipes NDJSON from Ollama stream:true.
  UI shows status, completed/total bytes, and percentage during download.
- F6: Token/s metrics after prompt generation. Parses eval_count and
  eval_duration from the final NDJSON chunk. Displays tok/s, total
  tokens, and duration in the prompt modal footer.

Bug fixes:
- B7: Parse vm_stat page size from output instead of hardcoding 16384.
  Reads 'page size of N bytes' from the first line for portability.
- B8: Whisper model discovery now scans multiple directories:
  WHISPER_MODELS_DIR env var, ~/whisper-models, /opt/homebrew/share/
  whisper-cpp/models/, ~/.cache/whisper/. Returns the first dir with
  .bin files found.
2026-02-19 15:16:33 -08:00
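The F6 metric in 2d9475bd15 is derived from `eval_count` and `eval_duration` in the final streamed chunk. Ollama reports durations in nanoseconds, so the rate is the token count divided by the duration in seconds. The interface and function names below are hypothetical; only the two field names come from the commit.

```typescript
// Subset of the final NDJSON chunk from Ollama's generate stream.
interface OllamaFinalChunk {
  done: boolean;
  eval_count?: number; // number of generated tokens
  eval_duration?: number; // generation time in nanoseconds
}

interface GenerationMetrics {
  tokens: number;
  seconds: number;
  tokensPerSecond: number;
}

// Returns null for non-final chunks or when the metric fields are missing or zero.
function generationMetrics(chunk: OllamaFinalChunk): GenerationMetrics | null {
  if (!chunk.done || !chunk.eval_count || !chunk.eval_duration) return null;
  const seconds = chunk.eval_duration / 1e9;
  return {
    tokens: chunk.eval_count,
    seconds,
    tokensPerSecond: chunk.eval_count / seconds,
  };
}

const finalChunk: OllamaFinalChunk = JSON.parse(
  '{"done":true,"eval_count":120,"eval_duration":4000000000}'
);
console.log(generationMetrics(finalChunk)); // { tokens: 120, seconds: 4, tokensPerSecond: 30 }
```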
saravanakumardb1
9a807f64cf docs(local-llm): mark Sprint 1 items complete in DASHBOARD_REVIEW.md
Check off 9 items (B1, B3, B4, B5, B6, B9, B10, B11, P4) in both
the bug list and sprint tracker. Add commit 2da67c2 to commit log.
2026-02-19 15:13:43 -08:00
saravanakumardb1
2da67c2f74 fix(local-llm): Sprint 1 — critical dashboard bug fixes (B1,B3-B6,B9-B11,P4)
Bug fixes:
- B4: Escape key now respects streaming state — during active stream,
  Escape aborts the generation instead of closing the modal
- B5: Auto-refresh (15s interval) pauses during streaming and pull
  operations to prevent background churn and UI flicker
- B9: Add AbortController to streaming fetch — closing modal or pressing
  Escape cancels the underlying HTTP request, saving CPU/bandwidth
- B1: Header subtitle now dynamically shows chip name and RAM from the
  system API instead of hardcoded 'Apple M4 Pro · 48 GB'
- B11: Escape handler clears promptText and promptResponse on close
- B6: Toast IDs use Date.now()+random instead of incrementing ref
  (prevents collision on HMR remount)
- B10: Brew panel distinguishes 'Loading...' (system=null) from
  'No tracked packages found' (system loaded, empty array)
- B3: Remove dead non-streaming generate action from Ollama API route
- P4: Add 5-second AbortController timeout to all fetchOllama() calls
  to prevent indefinite hangs when Ollama is unresponsive
2026-02-19 15:12:41 -08:00
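The P4 and B9 fixes in 2da67c2f74 both hinge on `AbortController`. A minimal sketch of a 5-second timeout wrapper follows, assuming a runtime with global `fetch` and `AbortController` (modern browsers, Node 18+). `timeoutSignal` and `fetchWithTimeout` are hypothetical names, not the dashboard's `fetchOllama()`.

```typescript
// Creates a signal that aborts after `ms`, plus a cancel() to clear the timer.
function timeoutSignal(ms: number): { signal: AbortSignal; cancel: () => void } {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  return {
    signal: controller.signal,
    cancel: () => clearTimeout(timer),
  };
}

// Rejects with an abort error if the server does not respond within `ms`.
async function fetchWithTimeout(url: string, ms: number = 5000): Promise<Response> {
  const { signal, cancel } = timeoutSignal(ms);
  try {
    return await fetch(url, { signal });
  } finally {
    cancel(); // always clear the timer, on success or failure
  }
}

// The same signal mechanism lets a modal-close or Escape handler cancel a stream:
// keep the controller around and call controller.abort() from the handler.
const demo = timeoutSignal(5000);
console.log(demo.signal.aborted); // false
demo.cancel();
```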
saravanakumardb1
554a5137ec docs(local-llm): improve dashboard review — add checkboxes, commit log, new findings
Rewrite DASHBOARD_REVIEW.md with progress-tracking improvements:
- Add GitHub-style checkboxes to all 41 actionable items
- Add file inventory table with line counts and purposes
- Add commit log section for tracking implementation progress
- Add sprint tracker tables with effort estimates and commit columns
- New finding B11: prompt text not cleared on Escape close
- New finding CQ6: no Zod validation on API responses
- Consolidate priority matrix into sprint tables (less redundancy)
- Add deferred items section with dependency notes
- Improve item descriptions with more precise file:line references
- Add stack summary and total effort estimate (14–17 hrs)
2026-02-19 15:11:19 -08:00
saravanakumardb1
093682eace docs(local-llm): add systematic dashboard bug & improvement review
DASHBOARD_REVIEW.md — comprehensive code review of all 6 dashboard files
(1,395 lines). Organized into 7 sections:

- 10 bugs (B1–B10): hardcoded header, blocking pull, escape during stream,
  auto-refresh during streaming, no abort controller, vm_stat page size, etc.
- 5 code quality issues (CQ1–CQ5): monolithic component, inline styles,
  duplicated constants, no error boundary, no loading skeleton
- 16 feature ideas (F1–F16): pull progress, chat mode, prompt history,
  token/s metrics, model search, whisper test, extraction integration, etc.
- 5 performance items (P1–P5): request deduplication, cache TTL, du latency
- 3 security notes (S1–S3): input validation, shell injection pattern, CORS
- Priority matrix and 5-sprint implementation roadmap
2026-02-19 14:36:51 -08:00
saravanakumardb1
43f8103c5a fix(local-llm): show accurate macOS memory (app vs cached vs free)
Replace Node.js os.freemem() with vm_stat parsing for macOS. The old
approach reported ~47.7 GB / 48 GB 'used' because os.freemem() only
counts truly free pages, ignoring ~20 GB of inactive/reclaimable cache.

New memory breakdown:
- App Memory: active + wired + compressor (actual process usage)
- Cached: inactive + purgeable + speculative (reclaimable on demand)
- Available: free + cached (what apps can actually use)
- Pressure: normal/warning/critical based on app memory ratio

Dashboard UI updated to show app memory, cached (reclaimable) label,
and pressure-based color coding on progress bars.
2026-02-19 13:22:17 -08:00
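The breakdown in 43f8103c5a (app = active + wired + compressor, cached = inactive + purgeable + speculative, available = free + cached) can be sketched as a parser over `vm_stat` output. The function name, the return shape, and the fallback page size are assumptions; the pressure thresholds are omitted because the commit does not state them.

```typescript
interface MemoryBreakdown {
  pageSize: number;
  appBytes: number; // active + wired + compressor
  cachedBytes: number; // inactive + purgeable + speculative
  availableBytes: number; // free + cached
}

function parseVmStat(output: string): MemoryBreakdown {
  // The first line reads "... (page size of N bytes)".
  const pageMatch = output.match(/page size of (\d+) bytes/);
  const pageSize = pageMatch ? Number(pageMatch[1]) : 16384; // fallback is an assumption

  // Each counter line looks like "Pages free:      12345."
  const pages = (label: string): number => {
    const m = output.match(new RegExp(label + ":\\s+(\\d+)\\."));
    return m ? Number(m[1]) : 0;
  };

  const free = pages("Pages free");
  const app = pages("Pages active") + pages("Pages wired down") + pages("Pages occupied by compressor");
  const cached = pages("Pages inactive") + pages("Pages purgeable") + pages("Pages speculative");

  return {
    pageSize,
    appBytes: app * pageSize,
    cachedBytes: cached * pageSize,
    availableBytes: (free + cached) * pageSize,
  };
}

// Synthetic sample in vm_stat's layout (the numbers are made up).
const sampleVmStat = [
  "Mach Virtual Memory Statistics: (page size of 16384 bytes)",
  "Pages free:                                1000.",
  "Pages active:                              2000.",
  "Pages inactive:                            3000.",
  "Pages speculative:                          100.",
  "Pages wired down:                           500.",
  "Pages purgeable:                            200.",
  "Pages occupied by compressor:               300.",
].join("\n");

console.log(parseVmStat(sampleVmStat));
```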
saravanakumardb1
b77afce9ae docs(local-llm): add Mission Control dashboard documentation
- docs/05-mission-control-dashboard.md: complete dashboard reference with
  architecture diagram, API route docs (request/response examples),
  UI feature descriptions, design tokens table, v1/v2 changelog,
  and future improvements roadmap
2026-02-19 13:03:30 -08:00
saravanakumardb1
970b565026 fix(local-llm): dashboard v2 — streaming prompts, model management, perf fixes
Bug fixes:
- Fix Google Fonts build error (corporate proxy blocks fonts.gstatic.com)
  by removing Geist font imports and switching to system font stack
- Fix system API 7.6s latency by caching static info (chip, GPU, brew)
  with timeouts on shell commands — now responds in ~50ms

New features:
- Streaming prompt responses via NDJSON proxy (/api/ollama/stream)
  with typing cursor animation and auto-scroll
- Model pull UI: input field + button to download new models
- Model delete with two-step confirmation dialog
- VRAM usage and expiry time display for loaded models
- Toast notifications (success/error/info) with slide-in animation
- Copy response button in prompt modal
- Escape key closes modals, backdrop click dismisses
- Pull/delete/show actions added to Ollama API route
2026-02-19 13:03:11 -08:00
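The NDJSON streaming in 970b565026 means the client receives text chunks that may end in the middle of a JSON line. A sketch of a decoder that buffers the partial tail until the next chunk arrives is shown below; `NdjsonDecoder` is a hypothetical helper, and the `response` field in the demo follows the shape of Ollama's generate stream.

```typescript
// Incrementally decodes newline-delimited JSON from arbitrary text chunks.
class NdjsonDecoder {
  private buffer = "";

  // Feed one decoded text chunk; returns a parsed object for every complete line.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as unknown);
  }
}

const decoder = new NdjsonDecoder();
const first = decoder.push('{"response":"Hel');
const second = decoder.push('lo"}\n{"response":" wor');
const third = decoder.push('ld","done":true}\n');
console.log(first.length, second, third);
// 0 [ { response: 'Hello' } ] [ { response: ' world', done: true } ]
```

In a browser, the chunks would come from `response.body.getReader()` passed through a `TextDecoder` with `{ stream: true }`.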
saravanakumardb1
2565714c52 feat(local-llm): add Mission Control dashboard v1
Next.js 16 dashboard for monitoring and managing the local LLM stack.
Runs on port 3100 with dark theme using ByteLyst design tokens.

API routes:
- GET/POST /api/ollama — model list, running status, load/unload/generate
- GET /api/whisper — binary discovery, GGML model inventory
- GET /api/system — chip info, RAM/disk usage, brew package versions

Dashboard UI:
- Top stats row: Ollama status, model count, Whisper status, RAM usage
- Ollama models panel with load/unload actions, LOADED badge, details
- System panel with progress bars for RAM and disk
- Whisper.cpp panel with binary list and model inventory
- Brew packages panel with version tracking
- Basic prompt modal with Cmd+Enter shortcut
- Auto-refresh every 15 seconds

Also excludes __LOCAL_LLMs/ from root ESLint config (dashboard has its
own config and uses browser globals not available in Node.js context).

Tech: Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript
2026-02-19 13:02:48 -08:00
saravanakumardb1
0c4210f5ff docs(local-llm): update original setup doc to redirect to docs/ structure
- LOCAL_LLMs_setup_mac_m4_48gb.md: replace 279-line monolith with quick start
  + documentation index linking to 9 topic-specific docs in docs/
- Add .gitignore for extraction-service eval logs (generated artifacts)
2026-02-19 13:01:35 -08:00
saravanakumardb1
3561deee52 docs(local-llm): add multimodal stack, model recommendations, and troubleshooting
- docs/04-multimodal-local-stack.md: vision models (llava, qwen2.5vl, moondream2),
  audio pipeline architecture, video understanding status, Kimi alternatives,
  complete local AI stack diagram
- docs/07-model-recommendations.md: 6-tier model guide (coding, fast, general,
  reasoning, vision, embeddings), recommended 10-model stack for M4 Pro 48GB,
  use-case quick reference, hardware scaling guide
- docs/08-troubleshooting.md: corporate Forcepoint proxy workarounds, MLX warning,
  JSON parse errors, slow inference, whisper-cli vs whisper-cpp naming, audio
  format conversion, proxy-corrupted downloads detection
2026-02-19 13:01:22 -08:00
saravanakumardb1
80f794dee7 docs(local-llm): add Ollama setup, extraction evals, and env vars reference
- docs/02-ollama-setup-and-models.md: installation, server config, memory management,
  idle timeout, manual load/unload, OpenAI-compatible API, native API reference,
  performance tuning flags (flash attention, KV cache)
- docs/06-extraction-service-evals.md: promptfoo eval suite against Ollama, 19 cases
  across 5 tasks, assertion patterns for JSON string output, Python sidecar config
- docs/09-environment-variables.md: comprehensive var reference for Ollama server,
  evals, Python sidecar, dashboard, whisper CLI flags, proxy/network settings
2026-02-19 13:01:05 -08:00
saravanakumardb1
464ffb92ec docs(local-llm): add docs index, hardware specs, and whisper-cpp setup
- docs/README.md: documentation index with quick start, file structure, status table
- docs/01-hardware-and-prerequisites.md: M4 Pro 48GB specs, toolchain inventory,
  disk budget, network environment (Forcepoint proxy details)
- docs/03-whisper-cpp-setup.md: whisper-cpp installation, GGML model guide,
  ffmpeg audio conversion, CLI usage, real-time streaming, LysnrAI integration
2026-02-19 13:00:48 -08:00
saravanakumardb1
dd23f6cf96 docs: add local LLM setup guide for Apple Silicon Mac (48GB)
- Add __LOCAL_LLMs/LOCAL_LLMs_setup_mac_m4_48gb.md: comprehensive reference
  for running Ollama on the dev Mac covering installation (v0.16.2 via brew),
  corp proxy handling (AT&T Forcepoint), OpenAI-compat API usage examples
  (curl/Node/Python), extraction-service eval integration, Python sidecar
  wiring, model recommendations by use case, troubleshooting, and env var
  reference
- Models documented: llama3.1:8b (4.9GB, default evals), qwen2.5-coder:32b
  (19GB, code gen / Swift / TS)
2026-02-19 12:19:44 -08:00