saravanakumardb1 efd45ad86f feat(local-llms): add one-click Windows setup scripts

- setup-windows.ps1: PowerShell script for Windows side
  - NVIDIA driver verification, Ollama install via winget
  - Pull all 5 models with skip-if-exists logic
  - WSL2 Ubuntu 24.04 install
- setup-wsl.sh: Bash script for WSL2 side
  - Idempotent apt deps (Node.js 20, Python 3.12, ffmpeg, cmake)
  - CUDA GPU passthrough verification
  - Repo clone + git pull, whisper.cpp CUDA build
  - Whisper model download, TTS setup, dashboard start
- README.md: 2-step quick start (no IDE required)
- setup-guide.md: add automated setup section at top

2026-02-21 16:28:02 -08:00

3.2 KiB


Windows Setup — Local LLM Stack

Two scripts. Zero IDE required. Fresh machine to running dashboard in ~30 minutes, though the ~52 GB model download can take longer on a slow connection.

Quick Start

Step 1: Windows Side (PowerShell as Admin)

# Allow script execution for this session
Set-ExecutionPolicy -Scope Process Bypass

# Run the Windows setup (installs Ollama, pulls models, installs WSL2)
.\setup-windows.ps1

What it does: Verifies NVIDIA drivers, installs Ollama, pulls 5 models (~52 GB), installs WSL2 Ubuntu 24.04.

After the WSL2 install, you may need to reboot. Ubuntu asks you to create a username and password on first launch.

Step 2: WSL2 Side (Ubuntu terminal)

# One-liner — downloads and runs the WSL2 setup script
curl -fsSL https://raw.githubusercontent.com/saravanakumardb1/learning_ai_common_plat/main/__LOCAL_LLMs/windows_specific/setup-wsl.sh | bash

What it does: Installs Node.js, Python, ffmpeg, cmake → builds Whisper.cpp with CUDA → sets up TTS → starts the dashboard.
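Piping a remote script straight into `bash` runs it before you have seen it. If you would rather review it first, the sketch below downloads the script to disk and syntax-checks it without executing anything. `fetch_script` is a hypothetical helper written for this illustration; it is not part of the repo.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_script is a hypothetical helper, not shipped with the repo.
# Usage: fetch_script URL DEST
fetch_script() {
  local url="$1" dest="$2"
  # -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects
  curl -fsSL "$url" -o "$dest" || return 1
  # Treat an empty file as a failed download
  [ -s "$dest" ] || { echo "empty download: $dest" >&2; return 1; }
  # bash -n parses the script for syntax errors without running it
  bash -n "$dest"
}
```

With the URL from the one-liner above, a cautious run would be `fetch_script <URL> setup-wsl.sh && less setup-wsl.sh && bash setup-wsl.sh`.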

Step 3: Open Browser

http://localhost:3000

Dashboard should show all green. Done.
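If the page does not load right away, the dashboard may still be starting. A minimal sketch for polling it from the WSL2 terminal is shown here; `wait_for_http` is a hypothetical helper, and it assumes the dashboard answers a plain GET on its root URL with a non-error status.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_http is a hypothetical helper, not part of the repo.
# Usage: wait_for_http URL [TRIES]   (one attempt per second, default 30)
wait_for_http() {
  local url="$1" tries="${2:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP status >= 400; --max-time caps each attempt
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_http http://localhost:3000 60 && echo "dashboard is up"` waits up to roughly a minute before giving up.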

What Gets Installed

| Component | Where | Size |
|---|---|---|
| NVIDIA drivers | Windows | pre-installed |
| Ollama | Windows (native) | ~200 MB |
| 5 LLM models | Windows (`%USERPROFILE%\.ollama\`) | ~52 GB |
| WSL2 Ubuntu 24.04 | Windows | ~1 GB |
| Node.js 20 LTS | WSL2 | ~50 MB |
| Python 3.12 + venv | WSL2 | ~200 MB |
| whisper.cpp (CUDA) | WSL2 `/usr/local/bin/` | ~50 MB |
| Whisper model | WSL2 `~/whisper-models/` | ~1.5 GB |
| SNAC decoder | WSL2 (repo `models/`) | ~76 MB |
| Qwen3-TTS 0.6B | WSL2 (repo `models/`) | ~1.7 GB |
| PyTorch CUDA | WSL2 (`.venv-qwen-tts/`) | ~2.5 GB |
| Dashboard deps | WSL2 (`dashboard/node_modules/`) | ~200 MB |

Total: ~60 GB (mostly Ollama models)
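Adding up the table, roughly 53 GB lands on the Windows side and about 6.3 GB inside WSL2. It can save a failed run to confirm free space before starting. The sketch below checks the filesystem that holds a given path from the WSL2 side; `free_gb` and `has_free_gb` are hypothetical helpers, and the threshold you pass is your own choice.

```shell
#!/usr/bin/env bash
# Sketch only: free_gb and has_free_gb are hypothetical helpers.

# free_gb PATH: print whole GiB available on the filesystem holding PATH.
free_gb() {
  # -P forces one line per filesystem; -k reports sizes in 1024-byte blocks.
  # Field 4 of the data line is the available space.
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# has_free_gb PATH NEEDED: succeed if at least NEEDED GiB are free.
has_free_gb() {
  local avail
  avail="$(free_gb "$1")" || return 1
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

For instance, `has_free_gb "$HOME" 7 || echo "Not enough space in WSL2"` covers the WSL2-side components with a small margin. The Ollama models are stored on the Windows drive, so check that drive separately.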

Files in This Directory

| File | Purpose |
|---|---|
| README.md | This file: quick start guide |
| setup-windows.ps1 | PowerShell script for Windows-side setup |
| setup-wsl.sh | Bash script for WSL2-side setup |
| setup-guide.md | Detailed manual guide with troubleshooting |
| razer-blade-18-spec.md | Full hardware specs for the Razer Blade 18 |

After Setup

# Daily usage — start everything
cd ~/code/mygh/learning_ai_common_plat/__LOCAL_LLMs
bash start-dashboard.sh

# Check status
bash start-dashboard.sh status

# Stop
bash start-dashboard.sh stop

# Test TTS
.venv-qwen-tts/bin/python test_orpheus_tts.py
.venv-qwen-tts/bin/python test_qwen_tts.py

Troubleshooting

See setup-guide.md for common issues:

- Ollama not accessible from WSL2
- CUDA not visible in WSL2
- Slow filesystem performance
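Before opening the guide, you can probe each of these three issues from the WSL2 terminal. The checks below are a rough sketch, not the guide's own procedure. The helper names are hypothetical. The sketch assumes Ollama listens on its default port 11434 and that slow file access comes from working under a Windows drive mount such as `/mnt/c`. setup-guide.md remains the authority on causes and fixes.

```shell
#!/usr/bin/env bash
# Sketch only: all three helpers are hypothetical, not part of the repo.

# Succeeds if Ollama's HTTP API answers. /api/tags lists locally pulled models.
# Assumes the default port 11434 unless a base URL is passed.
ollama_reachable() {
  local base="${1:-http://localhost:11434}"
  curl -fsS -o /dev/null --max-time 3 "${base}/api/tags"
}

# Succeeds if nvidia-smi exists and can list at least one GPU.
gpu_visible() {
  command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

# Succeeds if PATH sits on a Windows drive mount (/mnt/c, /mnt/d, ...),
# where file access from WSL2 is typically much slower than in the Linux home.
on_windows_mount() {
  case "$1" in
    /mnt/[a-z] | /mnt/[a-z]/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A quick pass could be `ollama_reachable || echo "Ollama unreachable"`, `gpu_visible || echo "No GPU visible"`, and `on_windows_mount "$PWD" && echo "Repo is on a Windows mount"`.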
