learning_ai_common_plat

Author	SHA1	Message	Date
saravanakumardb1	d7dc66eb92	docs(local-llm): Rich Features Roadmap — 45 tasks across 7 phases for coding agent Detailed implementation roadmap for the Rich Features PRD with: Phase A (Sprint 14-16, ~15hr): Foundation A1: IndexedDB layer with idb — 9 object stores, compound indexes A2: v4 TypeScript interfaces — all data models A3: Route group (mission-control) — move existing dashboard A4: Route group (workspace) — sidebar + content layout A5: Sidebar — conversation list, time groups, search A6: Conversation view — message thread, input bar, streaming A7: Auto-title + context window usage bar A8: v3 → v4 migration from localStorage Phase B (Sprint 17-18, ~10hr): Quick Actions + Cmd+K B1-B6: 30 built-in actions, fuse.js command palette, launcher, custom editor, usage tracking, export/import Phase C (Sprint 19-20, ~9hr): Custom Agents C1-C5: 10 built-in agents, picker, full-screen editor, conversation wiring (welcome msg, chips, temp), export Phase D (Sprint 21-22, ~13hr): Model Router + Multi-Modal D1-D7: regex classifier, model defaults, auto-routing UI, rich input bar, file/voice/image processing, drag-drop Phase E (Sprint 23, ~7hr): Response Enhancements E1-E5: action bars, code-block copy, try-other-model, live metrics, rating with aggregation Phase F (Sprint 24-25, ~11hr): Scheduled Tasks F1-F7: cron-parser, CRUD, editor, browser runner, /api/system/exec with allowlist, notifications, templates Phase G (Sprint 26-28, ~13hr): Projects + Orchestration G1-G7: project CRUD, drag-to-project, system context, Cmd+P switcher, chain/race/vote modes Every task has: explicit file paths, step-by-step instructions, pass/fail exit criteria, verification commands, and commit templates. Dependency graph: A is foundation, B-F parallel after A, G needs A+B.	2026-02-19 23:54:07 -08:00
saravanakumardb1	7bd14054d4	docs(local-llm): Rich Features PRD rev 2 — comprehensive review + expansion Review findings addressed (20+ issues): Structure additions: - Target Users section with 5 personas (solo dev, tinkerer, privacy pro, writer, power user) - Non-Goals section (8 explicit out-of-scope items for v4) - Risks & Mitigations table (10 risks with impact/likelihood/mitigation) - New API Routes section (4 new routes with security notes) - Settings Expansion section (full tree: General, Router, Models, Input, Tasks, Data, About) - New Dependencies table (idb ~1KB, fuse.js ~6KB, cron-parser ~3KB) - Error Handling appendix (12 edge cases with expected behavior) Data model fixes: - Conversation/Message split into separate IndexedDB stores (scalability) - Message gets conversationId FK, promptTokens field, size/language on Attachment - Design decision note explaining why messages are stored separately Feature spec improvements: - 3.1 Conversations: context window management (token bar, auto-summarize at 80/95%) - 3.2 Quick Actions: expanded Cmd+K palette spec (5 result types, ranking) - 3.3 Agents: tools marked v4 vs v5, duplicate-from-builtin, unlink on delete - 3.4 Model Router: full resolveModel() with 4-level fallback chain + availability - 3.5 Multi-Modal: attachment size limits, Whisper error handling - 3.6 Response: hover-only action bars, rating aggregation per task type - 3.7 Cron: built-in templates table, runtime constraints, security (execFile) - 3.8 Orchestration: full data model, chain/race/vote UI specs, step limits - 3.9 Projects: system context detail, project stats, unlink behavior Acceptance criteria added to ALL 9 features (was missing on 5). Competitive analysis expanded with local competitors (Open WebUI, LM Studio, Jan.ai). Success metrics improved with measurement methodology and rationale. Open questions restructured as decision table with recommendations. IndexedDB schema with explicit indexes and compound keys. Migration strategy: 7-step v3→v4 with safety (no delete until confirmed). 681 lines → 1149 lines (+69% content)	2026-02-19 23:47:59 -08:00
saravanakumardb1	1172dbb23e	docs(local-llm): Rich Features PRD — full local AI workspace spec Comprehensive PRD evolving Mission Control into a ChatGPT-class local AI workspace: - 3.1 Conversations: persistent, named, searchable, branching, IndexedDB - 3.2 Quick Actions: 30 built-in 1-click launchers across 5 categories (code, writing, analysis, creative, devops) + custom actions + Cmd+K palette - 3.3 Custom Agents: 10 built-in local GPTs with system prompts, tools, temperature, welcome messages, example prompts - 3.4 Model Router: heuristic task classifier (<5ms, no LLM call), auto-selects best model per task type, configurable defaults - 3.5 Multi-Modal Input: file attach, voice (Whisper), images, drag-drop, paste intelligence (code/image/error detection) - 3.6 Response Enhancements: per-message actions, per-code-block copy, branching with navigation, live metrics, rating/quality profiles - 3.7 Scheduled Tasks: cron-based recurring prompts with shell/file input, notification/file/conversation output, 5 built-in templates - 3.8 Multi-Model Orchestration: chain, race, vote modes - 3.9 Projects: conversation folders with system context + model defaults 7 implementation phases (~78hr), component architecture, storage migration, competitive analysis, success metrics, open questions	2026-02-19 23:39:20 -08:00
saravanakumardb1	3dc0c441a9	docs(local-llm): mark all roadmap phases 1-6 complete with commit links All 27 roadmap items + 5 bugs checked off across 6 phases: - Phase 1 (`040013e`): N1-N3, BN1, BN2, BN5 - Phase 2 (`7f04297`): N4-N5, BN3, BN4 - Phase 3 (`6f6baf9`): N6-N10 - Phase 4 (`588d21c`): N11-N14 - Phase 5 (`44ad8a6`): F24-F28 - Phase 6 (`07d3911`): F29-F31	2026-02-19 23:30:11 -08:00
saravanakumardb1	07d391101a	feat(local-llm): Phase 6 — data persistence + export (F29-F31) F29: Export/import settings — gear icon in header opens settings popover, export downloads all llm-* localStorage as JSON, import validates and merges, both with toast feedback F30: Inference history log — saves prompt/response/model/metrics to llm-inference-log (capped 100 FIFO), searchable panel with replay button, count badge in header toggle F31: Factory reset — confirm dialog clears all llm-* localStorage keys, resets all component state to defaults	2026-02-19 23:29:40 -08:00
saravanakumardb1	44ad8a6301	feat(local-llm): Phase 5 — response quality + interaction (F24-F28) F24: Vision image upload — file picker for vision models, base64 encoding, passed through stream API to Ollama generate endpoint F25: Markdown rendering — ReactMarkdown replaces raw <pre> for all prompt responses and chat assistant messages F26: Syntax highlighting — Prism-based code blocks with language labels and oneDark theme via react-syntax-highlighter F27: <think> block collapse — auto-detect and collapse DeepSeek R1 reasoning traces into expandable details with word count F28: Ollama library link — button next to Pull input opens ollama.com/library	2026-02-19 23:25:20 -08:00
saravanakumardb1	588d21c70e	feat(local-llm): Phase 4 — runtime metrics + UX polish (N11-N14) N11: Persist tok/s per model to localStorage (llm-model-benchmarks), display on model card as faded accent text N12: Live countdown to auto-unload — 1s interval, color-coded (green >5m, yellow 1-5m, red <1m 'Unloading soon') N13: Session stats per model (prompts + tokens) in expanded details N14: Co-load suggestions strip below models list showing which unloaded models fit in remaining free memory	2026-02-19 23:20:30 -08:00
saravanakumardb1	6f6baf99c8	feat(local-llm): Phase 3 — model intelligence badges + sort + version (N6-N10) N6: <think> warning badge for DeepSeek R1 and distilled variants N7: Vision model indicator for llava, bakllava, moondream, qwen-vl, etc. N8: Architecture/family badge as pill on every model card N9: Sort dropdown (A-Z, size, params, running, recent) with localStorage persist N10: Ollama server version fetched from /api/version, shown in stats card	2026-02-19 23:17:07 -08:00
saravanakumardb1	7f042975de	feat(local-llm): Phase 2 — rich metadata + persistence (N4-N5, BN3-BN4) N4: RamBudgetBar component — stacked horizontal bar showing OS+Apps, loaded models (by name with color), and free memory segments N5: Context window size — extract context_length from /api/show model_info, cache in modelMetadata state, display on card BN3: Persist chat messages to localStorage (llm-chat-{model}), restore on modal re-open, capped at 50 messages BN4: Logs panel refresh button — RefreshCw icon next to toggle	2026-02-19 23:13:22 -08:00
saravanakumardb1	040013e495	feat(local-llm): Phase 1 — pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5) N1: Estimated RAM per model with quant-aware multipliers (Q4=1.2x, Q5=1.25x, Q8=1.1x, F16=1.05x) N2: Will-it-fit indicator (green/yellow/red dot) next to Load button N3: Aggregate loaded model VRAM in panel header badge BN1: Compare buttons now filter to running models only BN2: AbortController on compare stream, cancel on modal close BN5: Delete confirmation shows model name + disk reclaim size	2026-02-19 23:09:49 -08:00
saravanakumardb1	ae231d5aac	docs(local-llm): comprehensive roadmap review — 5 bugs, 6 phases, 31 items Systematic code review of DASHBOARD_ROADMAP.md against actual codebase: Bugs found (BN1-BN5): - BN1: Compare buttons show unloaded models (can't generate) - BN2: No AbortController on compare stream (leaks on close) - BN3: Chat messages lost on modal close (no persistence) - BN4: Logs panel has no refresh button - BN5: Delete dialog missing reclaim size (partial impl exists) Expanded from 4 phases to 6 + backlog (15 → 31 items): - Phase 1: Pre-load intelligence + bug fixes (N1-N3, BN1-BN2, BN5) - Phase 2: Rich metadata + persistence (N4-N5, BN3-BN4) - Phase 3: Model intelligence badges + sort (N6-N10) - Phase 4: Runtime metrics + UX polish (N11-N14) - Phase 5 (NEW): Response quality — markdown, syntax highlight, vision upload, think-block collapse, model library link - Phase 6 (NEW): Data persistence — export/import, inference log, factory reset - Phase 7: Expanded backlog (F17-F38, +6 new ideas) Improvements: - Added checkboxes for all tasks and acceptance criteria - Quant-aware RAM estimate multipliers (Q4/Q5/Q8/F16) - Broader vision model regex (bakllava, moondream, llama-vision) - DeepSeek R1 distill variant detection for think badge - Conservative memory availability formula (free + cached*0.5) - localStorage key registry with llm- prefix standardization - Dependency graph between phases - ~6 hrs total estimated effort	2026-02-19 23:02:25 -08:00
saravanakumardb1	cd6e561f1b	docs(local-llm): consolidate dashboard docs into dashboard/docs/ - Created DASHBOARD_PRD.md — full updated PRD with current 19-file architecture, all 10 API routes, UI layout, data flow, localStorage keys, security model, and v1-v3 changelog. - Created DASHBOARD_ROADMAP.md — phased implementation plan for N1-N15 improvements across 4 phases: pre-load intelligence, rich metadata, model intelligence badges, runtime metrics. Includes acceptance criteria and implementation details per item. - Updated DASHBOARD_REVIEW.md — refreshed file inventory to 19 files (~2,930 lines), fixed broken Tier B markdown table, added cross-links. - Replaced __LOCAL_LLMs/docs/05-mission-control-dashboard.md with redirect pointer to new dashboard/docs/ location. Dashboard docs are now co-located at __LOCAL_LLMs/dashboard/docs/: - DASHBOARD_PRD.md (product requirements) - DASHBOARD_REVIEW.md (audit + 39 completed items + N1-N15 proposals) - DASHBOARD_ROADMAP.md (phased implementation plan)	2026-02-19 22:54:18 -08:00
saravanakumardb1	519f348583	docs(local-llm): add Next Wave — 15 model intelligence improvements (N1–N15) Section 8 of DASHBOARD_REVIEW.md: pre-load RAM estimates, will-it-fit indicator, RAM budget bar, context window, architecture/vision/think badges, sort, tok/s history, countdown, session stats, delete confirm, co-load suggestions. Organized in 4 tiers with sprint plan.	2026-02-19 22:32:29 -08:00
saravanakumardb1	4090c8aa13	docs(local-llms): add developer guide — API endpoint, code examples, model selection - New 00-developer-guide.md: start-here doc for developers covering: - Ollama endpoint (http://localhost:11434/v1) and API key - curl, TypeScript, Python code examples with env var pattern - Model selection table by task - Running extraction service evals locally - JSON output gotchas (parse from string, <think> strip for R1) - Model management commands - Troubleshooting quick reference - Links to all other docs - Updated index in LOCAL_LLMs_setup_mac_m4_48gb.md to include doc 00	2026-02-19 18:43:06 -08:00
saravanakumardb1	5deb5efdcf	docs(local-llms): add comprehensive model comparison table and deepseek-r1:32b details - Add Comprehensive Model Comparison Table: 11 models (local + cloud) with Disk, Params, Quant, RAM, Tok/s, JSON quality, Reasoning, Code, Instruction Following, Context window, <think> flag, and install status columns - Add Gap Analysis table: llama3.1:8b (~55%), qwen2.5-coder:32b (~85%), deepseek-r1:32b (~75-80%) vs llama3.3:70b across 5 capability dimensions - Update Tier 4 Reasoning table: add Parameters, Quant columns; add <think> warning note with link to eval doc transform pattern - Update By Use Case table: add brain signal routing row, update extraction evals fallback to qwen2.5-coder:32b	2026-02-19 16:06:02 -08:00
saravanakumardb1	cfc1194079	docs(local-llms): add latency/cost comparison and deepseek-r1 transform pattern to evals doc - Add Latency & Cost Comparison table: llama3.1:8b (~1m27s), qwen2.5-coder:32b (~5-8m est.), deepseek-r1:32b (~5-8m est.) vs gemini-2.5-flash (~15-25s, $0.003) and gpt-4o (~20-40s, $0.05-0.15) — all measured at 19 cases, concurrency=4 - Fix assertion pattern docs: single expressions required, not const/return blocks - Add deepseek-r1 <think> strip transform pattern for promptfoo provider config - Expand recommended models table with Disk, Reasoning, Pass Rate, and Notes columns	2026-02-19 16:05:52 -08:00
saravanakumardb1	71a7623553	docs(local-llms): expand installed models table with parameters and quantization - Add Parameters, Quantization, and Status columns to models table - qwen2.5-coder:32b: 32.8B params, Q4_K_M, 18.5 GB disk - llama3.1:8b: 8B params, Q4_K_M, 4.9 GB disk (confirmed via ollama API)	2026-02-19 16:05:42 -08:00
saravanakumardb1	1552006feb	fix(local-llm): proxy extraction health check through API route Move extraction service health check from direct browser fetch (http://localhost:4005/health) to server-side /api/extraction/health proxy. Eliminates ERR_CONNECTION_REFUSED console errors when the extraction service is not running locally.	2026-02-19 15:53:02 -08:00
saravanakumardb1	984630eb45	docs(local-llm): mark ALL 39 items complete in DASHBOARD_REVIEW.md All bugs (11), code quality (6), features (16), performance (5), and security (3) items are now checked off. Added Sprint 6 (`ed93a6f`) and Sprint 7 (`8bdd5ee`) to commit log. Updated summary to reflect 100% completion across 7 sprints.	2026-02-19 15:45:46 -08:00
saravanakumardb1	8bdd5ee1c8	feat(local-llm): Sprint 7 — all remaining features (F5,F7,F8,F12,F13,F15,CQ5,S3) Features: - F5: Model comparison side-by-side — after a prompt response, click any other model to compare. Responses display in two-column grid. - F7: System resource sparklines — memory usage ring buffer (30 points) with SVG sparkline component in the memory stats card. - F8: Ollama logs viewer — collapsible terminal-style panel below main grid. Fetches from /api/ollama/logs route. Color-coded by level. - F12: Whisper transcription test — file upload button in Whisper panel. Uploads audio to /api/whisper/transcribe, displays text + latency. - F13: Responsive mobile layout — p-3/sm:p-6 padding, gap-3/sm:gap-4, hidden sm:inline for header text, responsive comparison grid. - F15: Extraction service panel — health check to localhost:4005 on each refresh. Status card in right column with endpoint + service. Code quality: - CQ5: Skeleton shimmer loading UI — 4 skeleton cards shown while initial data loads. Uses CSS shimmer animation from globals.css. Security: - S3: Documented CORS/auth assumption in code comment — dashboard is local-only, no auth needed for dev tool. New files: - components/Sparkline.tsx — reusable SVG sparkline component - api/ollama/chat/route.ts — streaming chat endpoint (from Sprint 6) - api/ollama/logs/route.ts — Ollama log file reader - api/whisper/transcribe/route.ts — Whisper STT test endpoint	2026-02-19 15:44:20 -08:00
saravanakumardb1	ed93a6f0af	feat(local-llm): Sprint 6 — major feature batch (CQ2,CQ5,CQ6,P5,F4,F10,F14,F16) Code quality: - CQ2: Add CSS utility classes (text-primary/secondary/tertiary, bg-, btn-, input-base) to globals.css — reduces inline style repetition - CQ5: Add skeleton shimmer animation CSS for loading states - CQ6: Replace manual model name validation with Zod schema (PostBodySchema) in Ollama API route Performance: - P5: Eagerly warm static cache on module load — system_profiler no longer blocks first dashboard request Features: - F4: Chat mode with multi-turn conversation via new /api/ollama/chat streaming route. Chat bubble layout, system prompt input, message history. Toggle between prompt/chat modes in modal. - F10: Dark/light theme toggle with CSS var overrides in :root.light. Sun/Moon button in header, persisted in localStorage. - F14: Model tags (coding, chat, fast, vision, reasoning) with colored toggle badges in expanded model details. Persisted in localStorage. - F16: Auto-load preferred model — star toggle in expanded details. When Ollama is online but no models loaded, auto-loads the starred model. Persisted in localStorage.	2026-02-19 15:38:06 -08:00
saravanakumardb1	2936b9f047	docs(local-llm): mark Sprint 5 P1-P3 complete in DASHBOARD_REVIEW.md Check off 3 items (P1, P2, P3) in performance section and sprint tracker. Add commit `b1fda3a` to commit log.	2026-02-19 15:28:59 -08:00
saravanakumardb1	b1fda3a1a5	perf(local-llm): Sprint 5 — request dedup + cache TTLs (P1, P2, P3) Performance fixes: - P1: Add fetchingRef guard to fetchAll() — prevents duplicate requests from rapid Refresh button clicks or overlapping interval ticks - P2: Add 5-minute TTL to staticCache (chip, GPU, brew packages) — previously cached indefinitely per server process, now refreshes after brew upgrades without requiring a restart - P3: Add 60-second TTL cache for Ollama models disk usage (du command) — previously traversed ~/.ollama/models on every 15s refresh cycle, now reuses cached value for 60s	2026-02-19 15:28:07 -08:00
saravanakumardb1	9892fe7145	docs(local-llm): mark Sprint 4 items complete in DASHBOARD_REVIEW.md Check off 4 items (F2, F3, F9, F11) in features list and sprint tracker. F4 (chat mode) deferred. Add commit `9c2f5f3` to commit log.	2026-02-19 15:26:37 -08:00
saravanakumardb1	9c2f5f3396	feat(local-llm): Sprint 4 — UX enhancements (F2, F3, F9, F11) New features: - F2: Model search/filter — search input above models list (shown when 4+ models installed). Filters by name, family, and quantization level. Press / to focus the search input. - F3: Prompt history — saves last 20 prompts to localStorage with model name and timestamp. History dropdown in prompt modal with one-click re-run. Toggle via clock icon in textarea. - F9: Modelfile viewer — expanded model details now fetch and display the Modelfile via the show action. Collapsible <details> element with syntax-highlighted pre block. - F11: Keyboard shortcuts panel — press ? to toggle. Shows all shortcuts: ? (help), R (refresh), / (search), Esc (close/cancel), Cmd+Enter (send). Shortcuts only fire when not in an input field.	2026-02-19 15:25:43 -08:00
saravanakumardb1	40c40756ed	docs(local-llm): mark Sprint 3 items complete in DASHBOARD_REVIEW.md Check off 5 items (CQ1, CQ3, CQ4, S1, S2) in code quality, security, and sprint tracker. CQ2 (inline styles) deferred. Add commit `75a3cd0` to commit log.	2026-02-19 15:22:11 -08:00
saravanakumardb1	75a3cd0826	refactor(local-llm): Sprint 3 — component extraction, error boundary, security (CQ1,CQ3,CQ4,S1,S2) Component extraction (CQ1): - lib/types.ts: All interfaces (OllamaData, SystemData, Toast, etc.) - lib/format.ts: formatBytes, formatUptime utilities - lib/ollama-config.ts: Shared OLLAMA_URL constant - components/StatusDot.tsx: Status indicator component - components/ProgressBar.tsx: Progress bar component - page.tsx: Now imports from extracted modules, reduced from 1180 to 1077 lines (interfaces + utilities + sub-components removed) Error boundary (CQ4): - error.tsx: Next.js App Router error boundary with styled error UI, stack trace preview, and 'Try again' button Shared config (CQ3): - All 3 Ollama API routes now import OLLAMA_URL from lib/ollama-config.ts instead of duplicating the env var fallback Security (S1): - Add MODEL_NAME_RE regex validation on POST /api/ollama — rejects invalid model names before passing to Ollama API Security (S2): - Replace exec() with execFile() for brew package version check — prevents shell injection if targets list ever becomes dynamic	2026-02-19 15:21:22 -08:00
saravanakumardb1	7a82db4876	docs(local-llm): mark Sprint 2 items complete in DASHBOARD_REVIEW.md Check off 5 items (B2, B7, B8, F1, F6) in bug list, features list, and sprint tracker. Add commit `2d9475b` to commit log.	2026-02-19 15:17:16 -08:00
saravanakumardb1	2d9475bd15	feat(local-llm): Sprint 2 — streaming pull progress, token metrics, fixes (B2/F1,F6,B7,B8) New features: - B2/F1: Streaming model pull with real-time progress bar. New /api/ollama/pull/route.ts pipes NDJSON from Ollama stream:true. UI shows status, completed/total bytes, and percentage during download. - F6: Token/s metrics after prompt generation. Parses eval_count and eval_duration from the final NDJSON chunk. Displays tok/s, total tokens, and duration in the prompt modal footer. Bug fixes: - B7: Parse vm_stat page size from output instead of hardcoding 16384. Reads 'page size of N bytes' from the first line for portability. - B8: Whisper model discovery now scans multiple directories: WHISPER_MODELS_DIR env var, ~/whisper-models, /opt/homebrew/share/whisper-cpp/models/, ~/.cache/whisper/. Returns the first dir with .bin files found.	2026-02-19 15:16:33 -08:00
saravanakumardb1	9a807f64cf	docs(local-llm): mark Sprint 1 items complete in DASHBOARD_REVIEW.md Check off 9 items (B1, B3, B4, B5, B6, B9, B10, B11, P4) in both the bug list and sprint tracker. Add commit `2da67c2` to commit log.	2026-02-19 15:13:43 -08:00
saravanakumardb1	2da67c2f74	fix(local-llm): Sprint 1 — critical dashboard bug fixes (B1,B3-B6,B9-B11,P4) Bug fixes: - B4: Escape key now respects streaming state — during active stream, Escape aborts the generation instead of closing the modal - B5: Auto-refresh (15s interval) pauses during streaming and pull operations to prevent background churn and UI flicker - B9: Add AbortController to streaming fetch — closing modal or pressing Escape cancels the underlying HTTP request, saving CPU/bandwidth - B1: Header subtitle now dynamically shows chip name and RAM from the system API instead of hardcoded 'Apple M4 Pro · 48 GB' - B11: Escape handler clears promptText and promptResponse on close - B6: Toast IDs use Date.now()+random instead of incrementing ref (prevents collision on HMR remount) - B10: Brew panel distinguishes 'Loading...' (system=null) from 'No tracked packages found' (system loaded, empty array) - B3: Remove dead non-streaming generate action from Ollama API route - P4: Add 5-second AbortController timeout to all fetchOllama() calls to prevent indefinite hangs when Ollama is unresponsive	2026-02-19 15:12:41 -08:00
saravanakumardb1	554a5137ec	docs(local-llm): improve dashboard review — add checkboxes, commit log, new findings Rewrite DASHBOARD_REVIEW.md with progress-tracking improvements: - Add GitHub-style checkboxes to all 41 actionable items - Add file inventory table with line counts and purposes - Add commit log section for tracking implementation progress - Add sprint tracker tables with effort estimates and commit columns - New finding B11: prompt text not cleared on Escape close - New finding CQ6: no Zod validation on API responses - Consolidate priority matrix into sprint tables (less redundancy) - Add deferred items section with dependency notes - Improve item descriptions with more precise file:line references - Add stack summary and total effort estimate (14–17 hrs)	2026-02-19 15:11:19 -08:00
saravanakumardb1	093682eace	docs(local-llm): add systematic dashboard bug & improvement review DASHBOARD_REVIEW.md — comprehensive code review of all 6 dashboard files (1,395 lines). Organized into 7 sections: - 10 bugs (B1–B10): hardcoded header, blocking pull, escape during stream, auto-refresh during streaming, no abort controller, vm_stat page size, etc. - 5 code quality issues (CQ1–CQ5): monolithic component, inline styles, duplicated constants, no error boundary, no loading skeleton - 16 feature ideas (F1–F16): pull progress, chat mode, prompt history, token/s metrics, model search, whisper test, extraction integration, etc. - 5 performance items (P1–P5): request deduplication, cache TTL, du latency - 3 security notes (S1–S3): input validation, shell injection pattern, CORS - Priority matrix and 5-sprint implementation roadmap	2026-02-19 14:36:51 -08:00
saravanakumardb1	43f8103c5a	fix(local-llm): show accurate macOS memory (app vs cached vs free) Replace Node.js os.freemem() with vm_stat parsing for macOS. The old approach reported ~47.7 GB / 48 GB 'used' because os.freemem() only counts truly free pages, ignoring ~20 GB of inactive/reclaimable cache. New memory breakdown: - App Memory: active + wired + compressor (actual process usage) - Cached: inactive + purgeable + speculative (reclaimable on demand) - Available: free + cached (what apps can actually use) - Pressure: normal/warning/critical based on app memory ratio Dashboard UI updated to show app memory, cached (reclaimable) label, and pressure-based color coding on progress bars.	2026-02-19 13:22:17 -08:00
saravanakumardb1	b77afce9ae	docs(local-llm): add Mission Control dashboard documentation - docs/05-mission-control-dashboard.md: complete dashboard reference with architecture diagram, API route docs (request/response examples), UI feature descriptions, design tokens table, v1/v2 changelog, and future improvements roadmap	2026-02-19 13:03:30 -08:00
saravanakumardb1	970b565026	fix(local-llm): dashboard v2 — streaming prompts, model management, perf fixes Bug fixes: - Fix Google Fonts build error (corporate proxy blocks fonts.gstatic.com) by removing Geist font imports and switching to system font stack - Fix system API 7.6s latency by caching static info (chip, GPU, brew) with timeouts on shell commands — now responds in ~50ms New features: - Streaming prompt responses via NDJSON proxy (/api/ollama/stream) with typing cursor animation and auto-scroll - Model pull UI: input field + button to download new models - Model delete with two-step confirmation dialog - VRAM usage and expiry time display for loaded models - Toast notifications (success/error/info) with slide-in animation - Copy response button in prompt modal - Escape key closes modals, backdrop click dismisses - Pull/delete/show actions added to Ollama API route	2026-02-19 13:03:11 -08:00
saravanakumardb1	2565714c52	feat(local-llm): add Mission Control dashboard v1 Next.js 16 dashboard for monitoring and managing the local LLM stack. Runs on port 3100 with dark theme using ByteLyst design tokens. API routes: - GET/POST /api/ollama — model list, running status, load/unload/generate - GET /api/whisper — binary discovery, GGML model inventory - GET /api/system — chip info, RAM/disk usage, brew package versions Dashboard UI: - Top stats row: Ollama status, model count, Whisper status, RAM usage - Ollama models panel with load/unload actions, LOADED badge, details - System panel with progress bars for RAM and disk - Whisper.cpp panel with binary list and model inventory - Brew packages panel with version tracking - Basic prompt modal with Cmd+Enter shortcut - Auto-refresh every 15 seconds Also excludes __LOCAL_LLMs/ from root ESLint config (dashboard has its own config and uses browser globals not available in Node.js context). Tech: Next.js 16, React 19, TailwindCSS v4, Lucide icons, TypeScript	2026-02-19 13:02:48 -08:00
saravanakumardb1	0c4210f5ff	docs(local-llm): update original setup doc to redirect to docs/ structure - LOCAL_LLMs_setup_mac_m4_48gb.md: replace 279-line monolith with quick start + documentation index linking to 9 topic-specific docs in docs/ - Add .gitignore for extraction-service eval logs (generated artifacts)	2026-02-19 13:01:35 -08:00
saravanakumardb1	3561deee52	docs(local-llm): add multimodal stack, model recommendations, and troubleshooting - docs/04-multimodal-local-stack.md: vision models (llava, qwen2.5vl, moondream2), audio pipeline architecture, video understanding status, Kimi alternatives, complete local AI stack diagram - docs/07-model-recommendations.md: 6-tier model guide (coding, fast, general, reasoning, vision, embeddings), recommended 10-model stack for M4 Pro 48GB, use-case quick reference, hardware scaling guide - docs/08-troubleshooting.md: corporate Forcepoint proxy workarounds, MLX warning, JSON parse errors, slow inference, whisper-cli vs whisper-cpp naming, audio format conversion, proxy-corrupted downloads detection	2026-02-19 13:01:22 -08:00
saravanakumardb1	80f794dee7	docs(local-llm): add Ollama setup, extraction evals, and env vars reference - docs/02-ollama-setup-and-models.md: installation, server config, memory management, idle timeout, manual load/unload, OpenAI-compatible API, native API reference, performance tuning flags (flash attention, KV cache) - docs/06-extraction-service-evals.md: promptfoo eval suite against Ollama, 19 cases across 5 tasks, assertion patterns for JSON string output, Python sidecar config - docs/09-environment-variables.md: comprehensive var reference for Ollama server, evals, Python sidecar, dashboard, whisper CLI flags, proxy/network settings	2026-02-19 13:01:05 -08:00
saravanakumardb1	464ffb92ec	docs(local-llm): add docs index, hardware specs, and whisper-cpp setup - docs/README.md: documentation index with quick start, file structure, status table - docs/01-hardware-and-prerequisites.md: M4 Pro 48GB specs, toolchain inventory, disk budget, network environment (Forcepoint proxy details) - docs/03-whisper-cpp-setup.md: whisper-cpp installation, GGML model guide, ffmpeg audio conversion, CLI usage, real-time streaming, LysnrAI integration	2026-02-19 13:00:48 -08:00
saravanakumardb1	dd23f6cf96	docs: add local LLM setup guide for Apple Silicon Mac (48GB) - Add __LOCAL_LLMs/LOCAL_LLMs_setup_mac_m4_48gb.md: comprehensive reference for running Ollama on the dev Mac covering installation (v0.16.2 via brew), corp proxy handling (AT&T Forcepoint), OpenAI-compat API usage examples (curl/Node/Python), extraction-service eval integration, Python sidecar wiring, model recommendations by use case, troubleshooting, and env var reference - Models documented: llama3.1:8b (4.9GB, default evals), qwen2.5-coder:32b (19GB, code gen / Swift / TS)	2026-02-19 12:19:44 -08:00

42 Commits
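
The macOS memory breakdown described in commits 43f8103c5a and 2d9475bd15 (page size read from the vm_stat header, then app / cached / available totals) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the repository's code: the names `parseVmStat` and `MemoryBreakdown` are hypothetical, and mapping "wired" and "compressor" to the vm_stat labels "Pages wired down" and "Pages occupied by compressor" is an assumption about what the commit message means.

```typescript
// Hypothetical result shape; field names are illustrative only.
interface MemoryBreakdown {
  pageSize: number;       // bytes per page, read from the vm_stat header
  appBytes: number;       // active + wired + compressor
  cachedBytes: number;    // inactive + purgeable + speculative
  availableBytes: number; // free + cached
}

// Parse the text printed by macOS `vm_stat` into byte counts.
function parseVmStat(output: string): MemoryBreakdown {
  // Header looks like: "Mach Virtual Memory Statistics: (page size of 16384 bytes)"
  const sizeMatch = output.match(/page size of (\d+) bytes/);
  if (!sizeMatch) throw new Error("vm_stat output has no page size header");
  const pageSize = Number(sizeMatch[1]);

  // Counter lines look like: "Pages active:      123456."
  const pages = new Map<string, number>();
  for (const line of output.split("\n")) {
    const m = line.match(/^(.+?):\s+(\d+)\.?\s*$/);
    if (m) pages.set(m[1].trim(), Number(m[2]));
  }
  // Missing counters are treated as zero rather than as an error.
  const get = (label: string): number => pages.get(label) ?? 0;

  const app =
    get("Pages active") +
    get("Pages wired down") +
    get("Pages occupied by compressor");
  const cached =
    get("Pages inactive") + get("Pages purgeable") + get("Pages speculative");
  const free = get("Pages free");

  return {
    pageSize,
    appBytes: app * pageSize,
    cachedBytes: cached * pageSize,
    availableBytes: (free + cached) * pageSize,
  };
}

// Made-up sample input in the vm_stat layout (values are not real measurements).
const sample = [
  "Mach Virtual Memory Statistics: (page size of 16384 bytes)",
  "Pages free:                               1000.",
  "Pages active:                             2000.",
  "Pages inactive:                            500.",
  "Pages speculative:                         100.",
  "Pages wired down:                          300.",
  "Pages purgeable:                            50.",
  "Pages occupied by compressor:              200.",
].join("\n");

const breakdown = parseVmStat(sample);
console.log(breakdown);
```

Reading the page size from the header instead of hardcoding 16384 keeps the byte totals correct on machines whose page size differs, which is the portability point made in the B7 fix.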