nanoGPT: use & test (local)
This folder contains runnable scripts to validate nanoGPT end-to-end in this workspace.
Prereqs
- `git`, `python3`, and `curl` are available.
- GitHub is reachable (for cloning nanoGPT and downloading the tiny Shakespeare text).
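The tool prerequisites can be checked up front with a few lines of shell. This is a minimal sketch, not part of the scripts in this folder; the function name `missing_tools` is hypothetical, and it only checks that the commands are on `PATH`, not that GitHub is reachable.

```shell
# Print the name of every listed tool that is not on PATH;
# return non-zero if at least one is missing.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Usage: check the three tools listed above.
if missing="$(missing_tools git python3 curl)"; then
  echo "all prerequisites found"
else
  echo "missing: $missing" >&2
fi
```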
What the scripts do
- `00_setup_env.sh`: installs Python deps into the workspace venv.
- `10_clone_nanogpt.sh`: clones `karpathy/nanoGPT` into `oss_llm/nanoGPT` (or updates it).
- `20_prepare_shakespeare.sh`: downloads/prepares the tiny Shakespeare dataset.
- `30_train_cpu_quick.sh`: quick CPU training run (writes `out-shakespeare-char/ckpt.pt`).
- `31_train_mps_quick.sh`: quick MPS (Metal) training run (faster on Apple Silicon).
- `40_sample_cpu.sh`: samples from the trained checkpoint.
- `98_smoke_test_mps.sh`: runs clone → deps → prepare → train (MPS) → sample (MPS).
- `99_smoke_test_all.sh`: runs clone → deps → prepare → train → sample.
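The smoke tests chain the individual steps in a fixed order. The sketch below shows one way such a chain could stop at the first failing step. It is an illustration only: `run_steps` is a hypothetical helper, and the actual contents of `99_smoke_test_all.sh` are not shown in this README.

```shell
# Run each script in order with bash; stop at the first failure
# and report which step broke.
run_steps() {
  local dir="$1" step
  shift
  for step in "$@"; do
    echo "==> $step"
    if ! bash "$dir/$step"; then
      echo "step failed: $step" >&2
      return 1
    fi
  done
}

# Usage (from the workspace root), mirroring clone → deps → prepare → train → sample:
# run_steps oss_llm/testNanoGPT 10_clone_nanogpt.sh 00_setup_env.sh \
#   20_prepare_shakespeare.sh 30_train_cpu_quick.sh 40_sample_cpu.sh
```

Stopping early keeps a failed clone or dataset download from surfacing later as a confusing training error.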
What nanoGPT demonstrates (current capabilities)
With the default tiny Shakespeare character-level example, the checkpoint you train here supports:
- Unconditional generation: start from a newline and generate Shakespeare-ish character patterns.
- Prompted continuation: provide a short prompt (e.g. a phrase) and generate a continuation.
- Sampling controls:
  - `--temperature` controls randomness (lower is more deterministic).
  - `--top_k` clamps sampling to the top-K next-token candidates (lower is more conservative).
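To see why a lower temperature is more deterministic, the toy calculation below keeps the `top_k` largest scores, divides them by the temperature, and applies a softmax. It is a sketch with made-up logits, not nanoGPT's code; `sampling_probs` is a hypothetical name, and a real implementation would also guard against overflow in `exp`.

```shell
# Toy next-token distribution: keep the top_k largest logits, divide by the
# temperature, then apply softmax. Prints one probability per kept candidate,
# largest first. Not numerically stable for large logits.
sampling_probs() {
  local temperature="$1" top_k="$2"
  shift 2
  printf '%s\n' "$@" | sort -rn | head -n "$top_k" |
    awk -v t="$temperature" '
      { e[NR] = exp($1 / t); sum += e[NR] }
      END { for (i = 1; i <= NR; i++) printf "%.3f\n", e[i] / sum }'
}

# Four hypothetical logits, top_k=3: compare the first line of each run.
sampling_probs 0.8 3 2.0 1.0 0.5 -1.0
sampling_probs 0.4 3 2.0 1.0 0.5 -1.0
```

At the lower temperature the best candidate takes a larger share of the probability mass, so samples vary less; a smaller `top_k` removes low-scoring candidates entirely.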
Reality check: with the default quick config (small model, short training), output will often look like Shakespeare-shaped gibberish. That’s expected; the goal is validating the end-to-end training + sampling workflow.
Reproduce the demo generations
From the workspace root:
- Ensure you have a checkpoint (CPU):
bash oss_llm/testNanoGPT/99_smoke_test_all.sh
- Unconditional samples (2 short samples):
cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
--num_samples=2 --max_new_tokens=220 --temperature=0.8 --top_k=200
- Prompted continuation (more conservative sampling):
cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
--num_samples=1 --max_new_tokens=220 --temperature=0.4 --top_k=50 \
--start="To be, or not to be"
One-command smoke test
From the workspace root:
bash oss_llm/testNanoGPT/99_smoke_test_all.sh
One-command MPS smoke test
From the workspace root:
bash oss_llm/testNanoGPT/98_smoke_test_mps.sh
Common commands
- Install deps only:
bash oss_llm/testNanoGPT/00_setup_env.sh
- Quick CPU train + sample:
bash oss_llm/testNanoGPT/30_train_cpu_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh
- Quick MPS train + sample:
bash oss_llm/testNanoGPT/31_train_mps_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh # sampling can still run on CPU
Notes
- The CPU/MPS training scripts intentionally use a tiny model + small iteration count to finish quickly.
- If `ckpt.pt` is missing, it usually means training didn’t run an eval step; these scripts set `--eval_interval` low to force checkpoint writes.
- Scripts will auto-create `./.venv` on first run if it does not exist.
- Dataset download is via a `github.com/.../raw/...` URL to avoid reliance on `raw.githubusercontent.com`.
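A missing checkpoint is easier to diagnose if it is caught before sampling. The guard below is a minimal sketch: `require_checkpoint` is a hypothetical helper, and the path in the usage comment assumes the checkpoint is written under the nanoGPT checkout, as the `--out_dir=out-shakespeare-char` sampling commands suggest.

```shell
# Succeed only if a checkpoint file exists in the given output directory.
require_checkpoint() {
  local out_dir="$1"
  if [ ! -f "$out_dir/ckpt.pt" ]; then
    echo "no checkpoint at $out_dir/ckpt.pt; run a training script first" >&2
    return 1
  fi
}

# Usage (from the workspace root; the checkpoint location is an assumption):
# require_checkpoint oss_llm/nanoGPT/out-shakespeare-char \
#   && bash oss_llm/testNanoGPT/40_sample_cpu.sh
```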