move oss_llm/ from learning_ai_2nd_brain
parent a32978f9c3
commit 1a4b3c1fb3
383
__LOCAL_LLMs/oss_llm/README.md
Normal file
@ -0,0 +1,383 @@
# Kimi “2.5” local deployment on macOS (what’s realistically possible)

## Executive summary

- **Running the real “Kimi 2.5 / Kimi K2-class” model fully locally on a Mac is not practical**: the official open-weight Kimi K2 deployment guidance targets **multi-node NVIDIA GPU clusters** (vLLM/SGLang/TensorRT-LLM) and assumes CUDA.
- **What _is_ practical on a Mac:** use **Kimi Code CLI** (official) or the **Moonshot API** (official). That’s not “local inference” (weights on your laptop), but it is “local usage” (client runs on your laptop).
- **VPN / proxy:** yes, clients can usually work through a VPN or proxy, but you must be able to reach the provider’s API endpoints. If your network blocks access, you’ll need allowlisting or a different route.

## What GitHub shows (official sources)

### 1) Kimi K2 (open-weight model series)

- Official repo: https://github.com/MoonshotAI/Kimi-K2
- The repo includes a deployment guide at `docs/deploy_guidance.md`.
- Key reality check from the guide: **the smallest FP8 128k-seqlen deployment is described as ~16 GPUs (H200/H20-class)** for mainstream setups (vLLM/SGLang). This is fundamentally not macOS-laptop friendly.

### 2) Kimi Code CLI (best option for macOS)

- Official repo: https://github.com/MoonshotAI/kimi-cli
- Docs: https://moonshotai.github.io/kimi-cli/en/guides/getting-started.html
- Kimi Code CLI is an **agent client** (terminal tool) that talks to a remote provider.
- First-run authentication options:
  - `/login` (browser login; auto-configures models)
  - `/setup` (API key flow)

## Can we locally deploy “Kimi 2.5” on this Mac?

### Practically: no (for K2 / K2-class)

- **macOS has no CUDA**, so GPU-first inference stacks referenced for K2 deployment (vLLM / TensorRT-LLM) are not an option.
- Even if you tried CPU inference, the K2-class model scale (MoE, 1T total params / 32B active) is far beyond what is reasonable on a laptop.

### What’s feasible instead

1. **Use Kimi Code CLI on macOS** (recommended)
2. **Use Moonshot/Kimi APIs from your own scripts** (Python/Node/etc) once your network allows access
3. If you truly need **local weights on Mac**, use a Mac-friendly model/runtime (e.g., MLX/Ollama), but that would be **a different model**, not Kimi 2.5/K2.

## Options: smaller, Mac-runnable open models (local inference)

If your goal is “no network calls at runtime”, pick a **local runtime** + a **small-enough model + quantization**.

## If you can only access GitHub

If your enterprise network only allows `github.com`, that severely limits how you obtain model weights because most model hosting is **not** on GitHub.

What still works:

- **GitHub Releases assets**: some projects publish quantized model files (often `.gguf`) as release assets.
- **Git LFS inside a repo**: occasionally weights are stored in-repo via LFS (still uncommon for large models).

What usually won’t work:

- Downloading weights from external model registries / hosting sites (blocked by policy).

Reality check: GitHub has practical size limits, so **very large models are rarely hosted there**. Your best bet is to use **small models (7B–14B) in quantized form**.

### How to find downloadable models on GitHub

- Use [oss_llm/find_models_on_github.py](oss_llm/find_models_on_github.py) to search GitHub repos and list any release assets that look like model files.
- Prefer assets ending in `.gguf` if you plan to run with `llama.cpp`.

Example:

```sh
python3 oss_llm/find_models_on_github.py --query "gguf qwen 2.5" --limit 20
```

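The filtering that the script applies to release assets can be sketched as follows. This is a minimal illustration rather than the script itself: the sample payload is made up, and the field names (`tag_name`, `assets`, `name`, `size`) are assumed to follow the shape of GitHub's REST releases response.

```python
# Hypothetical sketch: pick model-like assets out of a releases payload.
MODEL_EXTENSIONS = (".gguf", ".safetensors", ".bin")


def model_assets(releases):
    """Return (tag, asset name, size in bytes) for assets with a model-like extension."""
    hits = []
    for rel in releases:
        for asset in rel.get("assets") or []:
            name = asset.get("name", "")
            # str.endswith accepts a tuple of suffixes
            if name.lower().endswith(MODEL_EXTENSIONS):
                hits.append((rel.get("tag_name"), name, asset.get("size")))
    return hits


# Made-up payload, for illustration only.
sample = [
    {
        "tag_name": "v0.1",
        "assets": [
            {"name": "tiny-model.Q4_K_M.gguf", "size": 123456789},
            {"name": "checksums.txt", "size": 512},
        ],
    },
    {"tag_name": "v0.0", "assets": []},
]

print(model_assets(sample))
```

Only the `.gguf` asset survives the filter; the checksum file and the empty release are skipped.
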
### Recommended enterprise pattern

If you need a specific model but can’t download it directly:

1. Ask security for an **approved internal mirror** / artifact store.
2. Mirror the model files there.
3. Point your local runtime (Ollama/llama.cpp/MLX) at the internal location.

### “No security review” learning path (best effort): train a tiny model yourself

If you want to avoid downloading any third-party model weights at all, the cleanest option is to:

- clone training code from GitHub
- train a small model on a local/public-domain text file

This won’t produce a state-of-the-art assistant, but it’s excellent for learning tokenization, training loops, sampling, and basic eval.

One popular repo for this is **karpathy/nanoGPT**.

High-level steps:

```sh
git clone https://github.com/karpathy/nanoGPT.git
cd nanoGPT

# create a python env (choose your preferred method)
python3 -m venv .venv && source .venv/bin/activate
# nanoGPT's README lists its dependencies
pip install torch numpy transformers datasets tiktoken wandb tqdm

# run the built-in Shakespeare example (it downloads a small text file)
python data/shakespeare_char/prepare.py
# train.py takes a config file; --key=value flags override its settings
python train.py config/train_shakespeare_char.py --device=cpu --compile=False

# sample
python sample.py --out_dir=out-shakespeare-char --device=cpu --compile=False
```

If even small downloads are restricted, replace the dataset step with your own local text file and adjust the dataset script accordingly.

### nanoGPT demo (this repo)

This workspace includes a reproducible nanoGPT workflow (CPU and Apple Silicon MPS) plus sampling demo commands:

- See [oss_llm/testNanoGPT/README.md](oss_llm/testNanoGPT/README.md)

### If you do download GGUF weights from GitHub

It can work (some repos commit a `.gguf` directly), but treat it like any third-party binary:

- Prefer repos with **clear licensing** (repo license + explicit model license/provenance)
- Prefer “original publisher” repos over re-uploads
- Keep models small (e.g., ~100M–3B) for macOS learning

### Recommended runtimes (macOS)

- **Ollama**: easiest “download + chat + local HTTP API” experience.
- **LM Studio**: easy GUI; also exposes a local API server.
- **llama.cpp**: most portable; great for CPU/Metal, quantized GGUF models.
- **MLX** (Apple): best when you want Python-native workflows on Apple Silicon.

### Model size guidance (rule of thumb)

Quantized models are what make laptops viable.

- **8–10B @ 4-bit**: typically comfortable on 16GB unified memory.
- **14B @ 4-bit**: better with 24–32GB unified memory.
- **30B+**: usually needs 64GB+ and will still be slow.

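The arithmetic behind this rule of thumb can be sketched as below. It is a rough estimate of the weights alone, assuming every parameter is stored at the stated bit width; real quantized files carry some extra metadata, and the runtime needs additional memory for the KV cache, the OS, and other apps, which is why the guidance above leaves generous headroom. The function name is made up for this sketch.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Sizes discussed in this README (weights only, no runtime overhead).
for label, params_b, bits in [
    ("8B @ 4-bit", 8, 4),
    ("14B @ 4-bit", 14, 4),
    ("30B @ 4-bit", 30, 4),
    ("1T @ 8-bit (K2-class total params)", 1000, 8),
]:
    print(f"{label:<36} ~{weight_gib(params_b, bits):,.1f} GiB")
```

The last row shows why a K2-class model is out of reach for a laptop: at 8 bits per weight, the weights alone come to roughly 930 GiB.
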
### Good “starter” model families (pick one)

These are widely supported by the runtimes above and have strong general utility:

- **Llama 3.x (8B class)**: strong general chat + coding for the size.
- **Qwen 2.5 (7B/14B class)**: strong multilingual + coding.
- **Mistral 7B class**: fast and solid baseline.
- **Gemma 2 (9B class)**: good general-purpose quality.
- **Phi-3.x (mini/small class)**: very fast and lightweight.

### Suggested picks by Mac memory

- **16GB unified memory**: start with an 8–9B model at 4-bit.
- **32GB unified memory**: 14B at 4-bit is a good sweet spot.
- **64GB unified memory**: 27–34B at 4-bit becomes feasible (still slower).

### Practical setup: Ollama (quickest)

1. Install runtime (choose the install method your enterprise allows; Homebrew is common):

```sh
brew install ollama
```

2. Start the local service:

```sh
ollama serve
```

3. Pull/run a model (example placeholder):

```sh
ollama run <model-name>
```

4. Use the local API (optional):

```sh
curl http://127.0.0.1:11434/api/tags
```

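The same endpoint can be queried from Python using only the standard library. This is a hedged sketch: it assumes the `/api/tags` response is a JSON object with a `models` list whose entries carry a `name` field, and the helper names are made up for this example.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://127.0.0.1:11434/api/tags"


def model_names(payload: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response (assumed shape)."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def fetch_model_names(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> list[str]:
    """Ask the local Ollama service which models are installed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_names(json.load(resp))


if __name__ == "__main__":
    try:
        print(fetch_model_names())
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers unexpected non-JSON replies
        print(f"Ollama not reachable on {OLLAMA_TAGS_URL}: {exc}")
```

If `ollama serve` is not running, the script prints the connection error instead of raising.
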
### Practical setup: llama.cpp (most controllable)

If you can obtain a quantized GGUF model file via an approved internal mirror:

```sh
brew install llama.cpp
llama-cli -m /path/to/model.gguf -p "Hello" -n 256
```

### Practical setup: MLX (Python-centric on Apple Silicon)

If your environment allows Python packages and you have an MLX-converted model available internally:

```sh
python3 -m venv .venv && source .venv/bin/activate
pip install mlx mlx-lm
python -m mlx_lm.generate --model /path/to/mlx-model --prompt "Hello" --max-tokens 256
```

### Enterprise note (important)

Because model hosting sites may be blocked in your network category, the usual pattern in enterprise is:

1. Security-approved model list
2. Internal artifact store / mirror for model files
3. Local runtime (Ollama/llama.cpp/MLX) pointing to those internally hosted artifacts

## Steps (macOS): run Kimi Code CLI locally (client-side)

Source: Kimi CLI docs.

1. Install

```sh
# Install via uv (Python package manager)
uv tool install --python 3.13 kimi-cli
```

2. Verify

```sh
kimi --version
```

3. Start in a project directory

```sh
cd /Users/sd9235/code/mygh/learning_ai_2nd_brain
kimi
```

4. Authenticate

- Preferred:
  - Run `/login` inside the CLI and complete the browser auth.
- Alternative:
  - Run `/setup` and choose an API platform + API key + model.

5. If models don’t show up

- Kimi CLI FAQ: verify network access to your configured provider’s API endpoints.

## Steps (NOT macOS): deploy Kimi K2 weights (server-side)

If you have access to Linux + NVIDIA GPUs, use the official K2 deployment guide:

- vLLM (requires CUDA; the guide notes vLLM v0.10.0rc1+)
- SGLang
- TensorRT-LLM

This is the realistic path if you truly need “local” (self-hosted) K2/K2-class inference: **run the model on a GPU box/cluster** and call it from your Mac.

## VPN / proxy: are we able to access through it?

### What you need

You generally need outbound access (through your VPN/proxy) to at least:

- your chosen provider’s API host (varies by provider)
- and, if downloading open weights, the model hosting site you plan to use.

### Quick connectivity checks

```sh
# DNS + HTTPS reachability
curl -I https://<YOUR_PROVIDER_API_HOST>

# If you plan to download weights later
curl -I https://<YOUR_MODEL_HOSTING_SITE>
```

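A web filter often answers with a redirect to a block page instead of an outright failure, so a plain status check can look healthy when it is not. The sketch below is one hedged way to spot that from Python: it flags a response whose final host differs from the requested one. This is only a heuristic (legitimate redirects to another host, such as a `www.` variant, trip it too), and the function names and example host are made up.

```python
import urllib.request
from urllib.parse import urlsplit


def redirected_off_host(requested_url: str, final_url: str) -> bool:
    """Heuristic: True when the final URL's host differs from the requested one."""
    return urlsplit(requested_url).hostname != urlsplit(final_url).hostname


def check(url: str, timeout: float = 10.0) -> str:
    """Fetch `url` and report whether it was reached, redirected elsewhere, or failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            final_url = resp.geturl()
    except OSError as exc:  # URLError, HTTPError and timeouts are OSError subclasses
        return f"FAILED ({exc})"
    if redirected_off_host(url, final_url):
        return f"REDIRECTED to {urlsplit(final_url).hostname} (possible block page)"
    return "OK"


# Usage (hypothetical host; replace with your provider's API host):
# print(check("https://api.example.com"))
```
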
### Configure proxy in a shell (typical)

```sh
# Example values; use your proxy's actual host and port
export HTTP_PROXY="http://127.0.0.1:7890"
export HTTPS_PROXY="http://127.0.0.1:7890"
export ALL_PROXY="socks5://127.0.0.1:7890"
export NO_PROXY="localhost,127.0.0.1"
```

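To confirm what a Python process actually picks up from these variables, a small sketch like the following can help. It relies on `urllib.request.getproxies()`, which reads the `*_PROXY` environment variables (on macOS it falls back to system proxy settings when none are set); the helper name is made up for this example.

```python
import os
import urllib.request


def proxy_summary() -> dict[str, str]:
    """Proxy settings urllib would use, keyed by scheme (e.g. 'https', 'all', 'no')."""
    return urllib.request.getproxies()


if __name__ == "__main__":
    # Raw environment variables, as exported in the shell
    raw = {k: v for k, v in os.environ.items() if k.lower().endswith("_proxy")}
    print("env vars:", raw or "(none set)")

    # What urllib resolved from them
    for scheme, value in sorted(proxy_summary().items()):
        print(f"{scheme:>6}: {value}")
```

Other HTTP clients may interpret these variables slightly differently, so treat the output as a check of urllib's view only.
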
### Configure proxy for Git

```sh
# Git uses http.proxy for both http:// and https:// remotes; there is no https.proxy setting
git config --global http.proxy "$HTTPS_PROXY"
```

### Configure proxy for Python/pip

```sh
pip config set global.proxy "$HTTPS_PROXY"
```

### Notes about your current network

From within this VS Code environment, requests to some provider/model-hosting sites were redirected to a corporate/web-filter “blockpage” URL. If you see that on your Mac too, you’ll need one of:

- VPN that routes around the filter
- proxy that’s allowed
- allowlist/exception for those domains

## Recommendation

- If your goal is to _use_ Kimi on this Mac: **install Kimi Code CLI** and make sure your VPN/proxy allows access to your configured provider.
- If your goal is “true local inference”: **host Kimi K2 on a CUDA GPU server** (or use a smaller Mac-native model instead).

## Your Mac (detected)

- macOS: 15.7.3 (24G419)
- CPU arch: arm64
- Machine: MacBook Pro (Mac16,7)
- Chip: Apple M4 Pro (14 cores)
- Memory: 48 GB
- Python: 3.13.10

## Will nanoGPT work on this laptop?

Yes for learning, with a couple of caveats.

What will work well:

- **Small CPU/MPS runs** (toy datasets like Shakespeare, short experiments, sampling).
- With 48GB RAM and Apple Silicon, you have plenty of headroom for nanoGPT-style demos.

What might block you:

- Installing dependencies (notably PyTorch) typically requires access to package indexes that may be blocked in your network.
  - In this workspace, PyTorch was successfully installed into the venv and `torch.backends.mps.is_available()` reports `True`.
  - If you can’t reach package indexes in a different environment, use an internal Python package mirror, or install from a pre-approved wheelhouse.

Practical suggestion:

- Start with CPU (`--device=cpu`) to keep it simple.
- If your PyTorch build supports Apple Metal (MPS), you can later try `--device=mps` for speed.

Quick verification (after installing torch):

```sh
python3 -c "import torch; print(torch.__version__); print('mps', torch.backends.mps.is_available())"
```

## nanoGPT: validated end-to-end in this workspace

This is a minimal, fast run that was verified on this machine.

### 1) Install deps (workspace venv)

```sh
# from the workspace root
/Users/sd9235/code/mygh/learning_ai_2nd_brain/.venv/bin/python -m pip install torch numpy transformers datasets tiktoken wandb tqdm
```

### 2) Clone nanoGPT

```sh
git clone https://github.com/karpathy/nanoGPT.git oss_llm/nanoGPT
```

### 3) Prepare dataset

```sh
cd oss_llm/nanoGPT
/Users/sd9235/code/mygh/learning_ai_2nd_brain/.venv/bin/python data/shakespeare_char/prepare.py
```

### 4) Short CPU training (writes `out-shakespeare-char/ckpt.pt`)

```sh
/Users/sd9235/code/mygh/learning_ai_2nd_brain/.venv/bin/python train.py config/train_shakespeare_char.py \
  --device=cpu --compile=False \
  --eval_interval=10 --eval_iters=10 --log_interval=10 \
  --block_size=64 --batch_size=12 \
  --n_layer=4 --n_head=4 --n_embd=128 \
  --max_iters=60 --lr_decay_iters=60 --dropout=0.0 \
  --always_save_checkpoint=True
```

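To get a feel for how small this run is, the parameter count implied by the flags above can be estimated with the usual GPT approximation. This is a back-of-the-envelope sketch, not nanoGPT's own accounting: it ignores biases and LayerNorm weights, assumes the output head shares weights with the token embedding, and assumes a vocabulary of 65 characters for the Shakespeare character dataset.

```python
def approx_gpt_params(n_layer: int, n_embd: int, vocab_size: int, block_size: int) -> int:
    """Rough GPT parameter count: 12 * d^2 per block, plus embeddings."""
    per_block = 12 * n_embd * n_embd  # attention (~4 d^2) + MLP (~8 d^2)
    embeddings = vocab_size * n_embd + block_size * n_embd  # token + position tables
    return n_layer * per_block + embeddings


# Flags from the quick CPU run above; vocab_size=65 is an assumption.
quick = approx_gpt_params(n_layer=4, n_embd=128, vocab_size=65, block_size=64)
print(f"quick run: ~{quick / 1e6:.2f}M parameters")
```

At roughly 0.8M parameters, the model trains in seconds on a CPU, which is what makes the 60-iteration run above a practical smoke test.
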
### 5) Sample

```sh
/Users/sd9235/code/mygh/learning_ai_2nd_brain/.venv/bin/python sample.py \
  --out_dir=out-shakespeare-char --device=cpu --max_new_tokens=200
```

Tip: for speed on Apple Silicon, try `--device=mps` once you’re comfortable.
251
__LOCAL_LLMs/oss_llm/find_models_on_github.py
Normal file
@ -0,0 +1,251 @@
#!/usr/bin/env python3
"""Find model-like files hosted on GitHub.

This script ONLY talks to GitHub (api.github.com) and is designed for
restricted enterprise networks where only GitHub is reachable.

It searches repositories by keyword, then inspects recent releases to find
assets that look like model files (e.g., .gguf, .safetensors, .bin, .zip).

Usage:
  python oss_llm/find_models_on_github.py --query "qwen2.5 gguf" --limit 10

Optional:
  - Set GITHUB_TOKEN to increase API rate limits.

Notes:
  - GitHub rarely hosts very large model weights due to size constraints.
  - Prefer smaller models and quantized artifacts.
"""

import argparse
import os
import sys
import textwrap
import urllib.parse
import json
import subprocess

MODEL_EXTENSIONS = (
    ".gguf",
    ".safetensors",
    ".bin",
    ".pt",
    ".pth",
    ".onnx",
    ".zip",
    ".tar.gz",
    ".tgz",
)


def _http_get_json(url: str) -> object:
    # Shell out to curl so the request honors the same proxy/TLS setup as the shell.
    token = os.environ.get("GITHUB_TOKEN") or os.environ.get("GH_TOKEN")

    cmd = [
        "curl",
        "-L",
        "--fail",
        "--silent",
        "--show-error",
        "-H",
        "Accept: application/vnd.github+json",
    ]
    if token:
        cmd += ["-H", f"Authorization: Bearer {token}"]
    cmd.append(url)

    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip() or f"curl failed with exit code {proc.returncode}")
    return json.loads(proc.stdout)


def _looks_like_model_asset(name: str) -> bool:
    lower = name.lower()
    return any(lower.endswith(ext) for ext in MODEL_EXTENSIONS)


def _safe_int(value: object) -> int | None:
    try:
        return int(value)  # type: ignore[arg-type]
    except Exception:
        return None


def _scan_repo_tree_for_models(full_name: str, default_branch: str) -> list[str]:
    """Return matching file paths from the repo tree.

    This is useful when model files are stored in-repo (often via Git LFS),
    and therefore don't show up as release assets.
    """

    # The Trees API can be large; callers should keep repo counts low.
    url = f"https://api.github.com/repos/{full_name}/git/trees/{urllib.parse.quote(default_branch)}?recursive=1"
    data = _http_get_json(url)
    if not isinstance(data, dict):
        return []
    tree = data.get("tree")
    if not isinstance(tree, list):
        return []

    hits: list[str] = []
    for node in tree:
        if not isinstance(node, dict):
            continue
        path = node.get("path")
        ntype = node.get("type")
        if ntype != "blob" or not isinstance(path, str):
            continue
        if _looks_like_model_asset(path):
            hits.append(path)
    return hits


def _human_size(num_bytes: int) -> str:
    n = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1024.0:
            return f"{n:.1f}{unit}"
        n /= 1024.0
    return f"{n:.1f}PB"


def main() -> int:
    p = argparse.ArgumentParser(
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description="Search GitHub releases for model-like assets.",
        epilog=textwrap.dedent(
            """
            Examples:
              python oss_llm/find_models_on_github.py --query "llama gguf" --limit 20
              python oss_llm/find_models_on_github.py --query "qwen 2.5 gguf" --limit 10

            Tips:
              - Add qualifiers to narrow results, e.g. `language:python`, `in:name`, `topic:llama-cpp`.
              - Set GITHUB_TOKEN to avoid low unauthenticated rate limits.
            """
        ),
    )
    p.add_argument("--query", required=True, help="GitHub search query")
    p.add_argument("--limit", type=int, default=10, help="Max repositories to inspect")
    p.add_argument(
        "--per-page",
        type=int,
        default=10,
        help="Repos per page from search API (max 100)",
    )
    p.add_argument(
        "--max-releases",
        type=int,
        default=5,
        help="Max recent releases per repo to inspect",
    )
    p.add_argument(
        "--scan-tree",
        action="store_true",
        help="Also scan repository file trees for model-like files (slower, more API calls)",
    )
    args = p.parse_args()

    query = args.query.strip()
    if not query:
        print("--query must be non-empty", file=sys.stderr)
        return 2

    limit = max(1, args.limit)
    # Only one page is fetched, so request at least `limit` repos (API max 100);
    # otherwise --limit 20 with the default --per-page 10 would inspect only 10.
    per_page = max(1, min(100, max(args.per_page, limit)))

    search_q = urllib.parse.quote(query)
    search_url = f"https://api.github.com/search/repositories?q={search_q}&per_page={per_page}"

    try:
        search = _http_get_json(search_url)
    except Exception as e:
        print(f"GitHub search failed: {e}", file=sys.stderr)
        return 1

    items = search.get("items") if isinstance(search, dict) else None
    if not items:
        print("No repositories found.")
        return 0

    inspected = 0
    found_any = False

    for repo in items:
        if inspected >= limit:
            break

        full_name = repo.get("full_name")
        html_url = repo.get("html_url")
        default_branch = repo.get("default_branch")
        if not full_name or not html_url:
            continue

        inspected += 1
        print(f"\n== {full_name} ==")
        print(html_url)

        releases_url = f"https://api.github.com/repos/{full_name}/releases?per_page={args.max_releases}"
        try:
            releases = _http_get_json(releases_url)
        except Exception as e:
            print(f"  releases: error: {e}")
            continue

        if not isinstance(releases, list) or len(releases) == 0:
            print("  releases: none")
            releases = []

        repo_hit = False
        for rel in releases[: args.max_releases]:
            tag = rel.get("tag_name") or "(no tag)"
            name = rel.get("name") or "(no name)"
            assets = rel.get("assets") or []
            model_assets = [a for a in assets if _looks_like_model_asset(a.get("name", ""))]

            if not model_assets:
                continue

            repo_hit = True
            found_any = True
            print(f"  release: {tag} — {name}")
            for a in model_assets:
                aname = a.get("name") or "(no name)"
                size = a.get("size")
                url = a.get("browser_download_url") or ""
                size_str = _human_size(int(size)) if isinstance(size, int) else "?"
                print(f"    - {aname} ({size_str})")
                if url:
                    print(f"      {url}")

        if releases and not repo_hit:
            print("  releases: present, but no model-like assets found")

        if args.scan_tree and isinstance(default_branch, str) and default_branch:
            try:
                paths = _scan_repo_tree_for_models(full_name, default_branch)
            except Exception as e:
                print(f"  repo files: error: {e}")
                continue

            if paths:
                found_any = True
                print(f"  repo files: found {len(paths)} model-like paths")
                for pth in paths[:30]:
                    print(f"    - {pth}")
                if len(paths) > 30:
                    print("    - (more omitted)")
            else:
                print("  repo files: no model-like paths found")

    if not found_any:
        print("\nNo model-like release assets found in inspected repositories.")
        print("Try refining your query (add 'gguf', model family name, or 'quant').")

    return 0


if __name__ == "__main__":
    raise SystemExit(main())
25
__LOCAL_LLMs/oss_llm/quick_checks.sh
Normal file
@ -0,0 +1,25 @@
#!/usr/bin/env bash
set -euo pipefail

echo "== Network reachability =="

: "${CHECK_URLS:=}"

if [[ -z "${CHECK_URLS}" ]]; then
  cat <<'TXT'
No URLs configured.

Set CHECK_URLS to a space-separated list of HTTPS URLs you want to test, e.g.:
  CHECK_URLS="https://github.com https://example.com" ./oss_llm/quick_checks.sh

This script intentionally does not hardcode any provider endpoints.
TXT
else
  # Intentionally unquoted: CHECK_URLS is a space-separated list.
  for url in ${CHECK_URLS}; do
    # printf, because bash's builtin echo does not expand "\n" without -e
    printf '\n--- %s\n' "$url"
    curl -I --max-time 10 "$url" | head -n 5 || true
  done
fi

printf '\n== Proxy env vars (current) ==\n'
env | grep -E '^(HTTP|HTTPS|ALL|NO)_PROXY=' || echo "(none set)"
12
__LOCAL_LLMs/oss_llm/testNanoGPT/00_setup_env.sh
Executable file
@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_venv

# nanoGPT README lists these deps.
"${VENV_PY}" -m pip install --upgrade pip
"${VENV_PY}" -m pip install torch numpy transformers datasets tiktoken wandb tqdm

# Quick sanity check
"${VENV_PY}" -c "import torch; print('torch', torch.__version__); print('mps', torch.backends.mps.is_available())"
17
__LOCAL_LLMs/oss_llm/testNanoGPT/10_clone_nanogpt.sh
Executable file
@ -0,0 +1,17 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_git

mkdir -p "${WORKSPACE_ROOT}/oss_llm"

if [[ -d "${NANOGPT_DIR}/.git" ]]; then
  echo "nanoGPT already cloned; updating…"
  cd "${NANOGPT_DIR}"
  git pull --ff-only
else
  echo "Cloning nanoGPT into ${NANOGPT_DIR}…"
  rm -rf "${NANOGPT_DIR}"
  git clone https://github.com/karpathy/nanoGPT.git "${NANOGPT_DIR}"
fi
20
__LOCAL_LLMs/oss_llm/testNanoGPT/20_prepare_shakespeare.sh
Executable file
@ -0,0 +1,20 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_venv
require_curl
cd_nanogpt

# Prefer a github.com URL (some networks block raw.githubusercontent.com).
# This writes the input file where nanoGPT's prepare script expects it.
INPUT_TXT="data/shakespeare_char/input.txt"
if [[ ! -f "${INPUT_TXT}" ]]; then
  mkdir -p "$(dirname "${INPUT_TXT}")"
  echo "Downloading tiny Shakespeare to ${INPUT_TXT}" >&2
  curl -fL --retry 3 --retry-delay 1 \
    -o "${INPUT_TXT}" \
    "https://github.com/karpathy/char-rnn/raw/master/data/tinyshakespeare/input.txt"
fi

"${VENV_PY}" data/shakespeare_char/prepare.py
16
__LOCAL_LLMs/oss_llm/testNanoGPT/30_train_cpu_quick.sh
Executable file
@ -0,0 +1,16 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_venv
cd_nanogpt

"${VENV_PY}" train.py config/train_shakespeare_char.py \
  --device=cpu --compile=False --dtype=float32 \
  --eval_interval=10 --eval_iters=10 --log_interval=10 \
  --block_size=64 --batch_size=12 \
  --n_layer=4 --n_head=4 --n_embd=128 \
  --max_iters=60 --lr_decay_iters=60 --dropout=0.0 \
  --always_save_checkpoint=True

echo "Done. Checkpoint should exist at: ${NANOGPT_DIR}/out-shakespeare-char/ckpt.pt"
24
__LOCAL_LLMs/oss_llm/testNanoGPT/31_train_mps_quick.sh
Executable file
@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_venv
cd_nanogpt

# Requires torch with MPS support (Apple Silicon). If MPS isn't available,
# fall back to CPU.
DEVICE="mps"
if ! "${VENV_PY}" -c "import torch; import sys; sys.exit(0 if torch.backends.mps.is_available() else 1)"; then
  echo "MPS not available; falling back to CPU" >&2
  DEVICE="cpu"
fi

"${VENV_PY}" train.py config/train_shakespeare_char.py \
  --device="${DEVICE}" --compile=False --dtype=float32 \
  --eval_interval=10 --eval_iters=10 --log_interval=10 \
  --block_size=64 --batch_size=12 \
  --n_layer=4 --n_head=4 --n_embd=128 \
  --max_iters=60 --lr_decay_iters=60 --dropout=0.0 \
  --always_save_checkpoint=True

echo "Done. Checkpoint should exist at: ${NANOGPT_DIR}/out-shakespeare-char/ckpt.pt"
14
__LOCAL_LLMs/oss_llm/testNanoGPT/40_sample_cpu.sh
Executable file
@ -0,0 +1,14 @@
#!/usr/bin/env bash
set -euo pipefail

source "$(cd "$(dirname "$0")" && pwd)/_common.sh"
require_venv
cd_nanogpt

if [[ ! -f "out-shakespeare-char/ckpt.pt" ]]; then
  echo "ERROR: ckpt.pt not found. Run training first:" >&2
  echo "  bash oss_llm/testNanoGPT/30_train_cpu_quick.sh" >&2
  exit 1
fi

"${VENV_PY}" sample.py --out_dir=out-shakespeare-char --device=cpu --max_new_tokens=200
32
__LOCAL_LLMs/oss_llm/testNanoGPT/98_smoke_test_mps.sh
Executable file
@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -euo pipefail

# One-shot validation that prefers Apple Silicon MPS.

bash oss_llm/testNanoGPT/10_clone_nanogpt.sh
bash oss_llm/testNanoGPT/00_setup_env.sh
bash oss_llm/testNanoGPT/20_prepare_shakespeare.sh

# Train with MPS if available (script falls back to CPU)
bash oss_llm/testNanoGPT/31_train_mps_quick.sh

# Sample with MPS if available, else CPU
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
# shellcheck source=./_common.sh
source "${SCRIPT_DIR}/_common.sh"
require_venv
cd_nanogpt

DEVICE="mps"
if ! "${VENV_PY}" -c "import torch; import sys; sys.exit(0 if torch.backends.mps.is_available() else 1)"; then
  DEVICE="cpu"
fi

if [[ ! -f "out-shakespeare-char/ckpt.pt" ]]; then
  echo "ERROR: ckpt.pt not found after training" >&2
  exit 1
fi

"${VENV_PY}" sample.py --out_dir=out-shakespeare-char --device="${DEVICE}" --dtype=float32 --max_new_tokens=200

printf '\nOK: nanoGPT MPS smoke test completed (device=%s).\n' "${DEVICE}"
11
__LOCAL_LLMs/oss_llm/testNanoGPT/99_smoke_test_all.sh
Executable file
@ -0,0 +1,11 @@
#!/usr/bin/env bash
set -euo pipefail

# One-shot end-to-end validation.
bash oss_llm/testNanoGPT/10_clone_nanogpt.sh
bash oss_llm/testNanoGPT/00_setup_env.sh
bash oss_llm/testNanoGPT/20_prepare_shakespeare.sh
bash oss_llm/testNanoGPT/30_train_cpu_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh

# printf is used because bash's echo does not expand "\n" by default.
printf '\nOK: nanoGPT smoke test completed.\n'
105
__LOCAL_LLMs/oss_llm/testNanoGPT/README.md
Normal file
@ -0,0 +1,105 @@
# nanoGPT: use & test (local)

This folder contains runnable scripts to validate **nanoGPT** end-to-end in this workspace.

## Prereqs

- `git`, `python3`, and `curl` are available.
- GitHub is reachable (for cloning nanoGPT and downloading the tiny Shakespeare text).
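
A quick way to confirm the command-line prerequisites before running anything is sketched below. `missing_tools` is a hypothetical helper, not one of the scripts in this folder; it only checks that each command is on `PATH` and does not test whether GitHub is reachable.

```sh
# Hypothetical helper (not part of this folder's scripts): print the names
# of any commands that cannot be found on PATH, separated by spaces.
missing_tools() (
  missing=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="${missing}${missing:+ }${tool}"
    fi
  done
  printf '%s\n' "$missing"
)

missing="$(missing_tools git python3 curl)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites: $missing" >&2
else
  echo "All prerequisites found."
fi
```

The function body runs in a subshell, so its working variables do not leak into the calling shell.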

## What the scripts do

- `00_setup_env.sh`: installs Python deps into the workspace venv.
- `10_clone_nanogpt.sh`: clones `karpathy/nanoGPT` into `oss_llm/nanoGPT` (or updates it).
- `20_prepare_shakespeare.sh`: downloads/prepares the tiny Shakespeare dataset.
- `30_train_cpu_quick.sh`: quick CPU training run (writes `out-shakespeare-char/ckpt.pt`).
- `31_train_mps_quick.sh`: quick MPS (Metal) training run (faster on Apple Silicon).
- `40_sample_cpu.sh`: samples from the trained checkpoint.
- `98_smoke_test_mps.sh`: runs clone → deps → prepare → train (MPS) → sample (MPS).
- `99_smoke_test_all.sh`: runs clone → deps → prepare → train → sample.

## What nanoGPT demonstrates (current capabilities)

With the default tiny Shakespeare **character-level** example, the checkpoint you train here supports:

- **Unconditional generation**: start from a newline and generate Shakespeare-ish character patterns.
- **Prompted continuation**: provide a short prompt (e.g. a phrase) and generate a continuation.
- **Sampling controls**:
  - `--temperature` controls randomness (lower is more deterministic).
  - `--top_k` clamps sampling to the top-K next-token candidates (lower is more conservative).
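
To make the two knobs concrete, the sketch below applies them to a made-up set of next-token scores (logits). It assumes the usual scheme, which nanoGPT's sampler appears to follow: logits are divided by the temperature before the softmax, and with top-k every candidate below the k-th highest score gets probability zero. The function name `next_token_probs` and the example logits are illustrative only and are not part of nanoGPT.

```sh
# Illustrative only: next_token_probs TEMPERATURE TOP_K LOGIT...
# Prints one probability per logit, in the order the logits were given.
next_token_probs() (
  temperature="$1"; top_k="$2"; shift 2
  printf '%s\n' "$@" | awk -v t="$temperature" -v k="$top_k" '
    {
      scaled[NR] = $1 / t                      # temperature scaling
      if (NR == 1 || scaled[NR] > max) max = scaled[NR]
    }
    END {
      for (i = 1; i <= NR; i++) {
        higher = 0                             # candidates that outrank i
        for (j = 1; j <= NR; j++) if (scaled[j] > scaled[i]) higher++
        # keep only the top-k candidates; subtract max for numerical stability
        w[i] = (higher < k) ? exp(scaled[i] - max) : 0
        total += w[i]
      }
      for (i = 1; i <= NR; i++) printf "%.3f\n", w[i] / total
    }'
)

echo "temperature=1.0, top_k=4:"; next_token_probs 1.0 4 2.0 1.0 0.5 0.1
echo "temperature=0.4, top_k=4:"; next_token_probs 0.4 4 2.0 1.0 0.5 0.1
echo "temperature=1.0, top_k=2:"; next_token_probs 1.0 2 2.0 1.0 0.5 0.1
```

Comparing the three printouts shows the effect described above: a lower temperature concentrates probability on the highest-scoring candidate, while a smaller `top_k` zeroes out the tail entirely.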

Reality check: with the default quick config (small model, short training), output will often look like _Shakespeare-shaped gibberish_. That’s expected; the goal is validating the end-to-end training + sampling workflow.

### Reproduce the demo generations

From the workspace root:

1. Ensure you have a checkpoint (CPU):

   ```sh
   bash oss_llm/testNanoGPT/99_smoke_test_all.sh
   ```

2. Unconditional samples (2 short samples):

   ```sh
   cd oss_llm/nanoGPT
   ../../.venv/bin/python sample.py \
     --out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
     --num_samples=2 --max_new_tokens=220 --temperature=0.8 --top_k=200
   ```

3. Prompted continuation (more conservative sampling):

   ```sh
   cd oss_llm/nanoGPT
   ../../.venv/bin/python sample.py \
     --out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
     --num_samples=1 --max_new_tokens=220 --temperature=0.4 --top_k=50 \
     --start="To be, or not to be"
   ```

## One-command smoke test

From the workspace root:

```sh
bash oss_llm/testNanoGPT/99_smoke_test_all.sh
```

## One-command MPS smoke test

From the workspace root:

```sh
bash oss_llm/testNanoGPT/98_smoke_test_mps.sh
```

## Common commands

- Install deps only:

  ```sh
  bash oss_llm/testNanoGPT/00_setup_env.sh
  ```

- Quick CPU train + sample:

  ```sh
  bash oss_llm/testNanoGPT/30_train_cpu_quick.sh
  bash oss_llm/testNanoGPT/40_sample_cpu.sh
  ```

- Quick MPS train + sample:

  ```sh
  bash oss_llm/testNanoGPT/31_train_mps_quick.sh
  bash oss_llm/testNanoGPT/40_sample_cpu.sh # sampling can still run on CPU
  ```
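
If you would rather sample on the same device you trained on, the MPS check used by the training script can be wrapped in a small function. This is a minimal sketch: `pick_device` is a hypothetical helper (not part of these scripts) that takes the Python interpreter as its argument and reports `cpu` whenever PyTorch cannot be imported or MPS is unavailable.

```sh
# Hypothetical helper: pick_device [PYTHON]
# Prints "mps" if the given interpreter has a torch build with MPS available,
# otherwise "cpu" (including when the interpreter or torch is missing).
pick_device() {
  py="${1:-python3}"
  if "$py" -c 'import sys, torch; sys.exit(0 if torch.backends.mps.is_available() else 1)' >/dev/null 2>&1; then
    echo "mps"
  else
    echo "cpu"
  fi
}

# Example, from the workspace root:
#   DEVICE="$(pick_device .venv/bin/python)"
# then pass --device="${DEVICE}" --dtype=float32 to sample.py.
```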

## Notes

- The CPU/MPS training scripts intentionally use a tiny model + small iteration count to finish quickly.
- If `ckpt.pt` is missing, training most likely never reached an eval step, which is when nanoGPT writes checkpoints; these scripts set a low `--eval_interval` and pass `--always_save_checkpoint=True` so that a checkpoint is written during the short run.
- Scripts will auto-create `./.venv` on first run if it does not exist.
- Dataset download is via a `github.com/.../raw/...` URL to avoid reliance on `raw.githubusercontent.com`.
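
The checkpoint note can be made concrete with a small sketch. It assumes nanoGPT's `train.py` evaluates whenever the iteration counter is a multiple of `--eval_interval`, runs through iteration `--max_iters` inclusive, and never saves at iteration 0; check your checkout if that behaviour has changed. `checkpoint_iters` is a hypothetical helper, not part of nanoGPT or these scripts.

```sh
# Hypothetical helper: checkpoint_iters EVAL_INTERVAL MAX_ITERS
# Prints the iterations at which a checkpoint write is expected, assuming
# --always_save_checkpoint=True and the train.py behaviour described above.
checkpoint_iters() {
  eval_interval="$1"; max_iters="$2"
  [ "$eval_interval" -gt 0 ] || return 1   # avoid looping forever on 0
  iter="$eval_interval"
  while [ "$iter" -le "$max_iters" ]; do
    echo "$iter"
    iter=$((iter + eval_interval))
  done
}

checkpoint_iters 10 60    # settings used by the quick training scripts
checkpoint_iters 250 60   # an interval longer than the run: prints nothing
```

Under these assumptions, an eval interval larger than `--max_iters` means the run ends before any checkpoint is written, which is the failure mode the note describes.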
61
__LOCAL_LLMs/oss_llm/testNanoGPT/_common.sh
Executable file
@ -0,0 +1,61 @@
#!/usr/bin/env bash
set -euo pipefail

# Resolve workspace root (this repo) regardless of where the script is called from.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
WORKSPACE_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"

# Some managed macOS environments set TMPDIR to slow/unusual locations.
# Force a local temp directory for predictable behavior.
export TMPDIR="/tmp"

VENV_PY="${WORKSPACE_ROOT}/.venv/bin/python"
NANOGPT_DIR="${WORKSPACE_ROOT}/oss_llm/nanoGPT"

require_python3() {
  if ! command -v python3 >/dev/null 2>&1; then
    echo "ERROR: python3 not found in PATH" >&2
    exit 1
  fi
}

require_curl() {
  if ! command -v curl >/dev/null 2>&1; then
    echo "ERROR: curl not found in PATH" >&2
    exit 1
  fi
}

ensure_venv() {
  if [[ -x "${VENV_PY}" ]]; then
    return 0
  fi

  require_python3
  echo "Creating workspace venv at: ${WORKSPACE_ROOT}/.venv" >&2
  python3 -m venv "${WORKSPACE_ROOT}/.venv"
}

require_venv() {
  ensure_venv
  if [[ ! -x "${VENV_PY}" ]]; then
    echo "ERROR: Failed to create venv python at: ${VENV_PY}" >&2
    exit 1
  fi
}

require_git() {
  if ! command -v git >/dev/null 2>&1; then
    echo "ERROR: git not found in PATH" >&2
    exit 1
  fi
}

cd_nanogpt() {
  if [[ ! -d "${NANOGPT_DIR}" ]]; then
    echo "ERROR: nanoGPT repo not found at ${NANOGPT_DIR}" >&2
    echo "Run: bash oss_llm/testNanoGPT/10_clone_nanogpt.sh" >&2
    exit 1
  fi
  cd "${NANOGPT_DIR}"
}