learning_ai_common_plat/__LOCAL_LLMs/oss_llm/testNanoGPT
2026-02-28 00:03:37 -08:00

nanoGPT: use & test (local)

This folder contains runnable scripts to validate nanoGPT end-to-end in this workspace.

Prereqs

  • git, python3, and curl are available.
  • GitHub is reachable (for cloning nanoGPT and downloading the tiny Shakespeare text).
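
The command prerequisites above can be checked before running anything. This is a minimal sketch, not one of the scripts in this folder; the function name check_prereqs is hypothetical, and it only checks that the commands are on PATH, not that GitHub is reachable.

```shell
# Hypothetical helper (not part of this folder's scripts):
# report every required command that is missing from PATH.
# Usage: check_prereqs COMMAND...
check_prereqs() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing prerequisite: $cmd" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything was found, 1 otherwise.
  return "$missing"
}
```

A possible call from the workspace root: check_prereqs git python3 curl || exit 1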

What the scripts do

  • 00_setup_env.sh: installs Python deps into the workspace venv.
  • 10_clone_nanogpt.sh: clones karpathy/nanoGPT into oss_llm/nanoGPT (or updates it).
  • 20_prepare_shakespeare.sh: downloads/prepares the tiny Shakespeare dataset.
  • 30_train_cpu_quick.sh: quick CPU training run (writes out-shakespeare-char/ckpt.pt).
  • 31_train_mps_quick.sh: quick MPS (Metal) training run (faster on Apple Silicon).
  • 40_sample_cpu.sh: samples from the trained checkpoint.
  • 98_smoke_test_mps.sh: runs clone → deps → prepare → train (MPS) → sample (MPS).
  • 99_smoke_test_all.sh: runs clone → deps → prepare → train → sample.

What nanoGPT demonstrates (current capabilities)

With the default tiny Shakespeare character-level example, the checkpoint you train here supports:

  • Unconditional generation: start from a newline and generate Shakespeare-ish character patterns.
  • Prompted continuation: provide a short prompt (e.g. a phrase) and generate a continuation.
  • Sampling controls:
    • --temperature controls randomness (lower is more deterministic).
    • --top_k restricts sampling to the K most likely next-token candidates (lower is more conservative).
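
To make the two sampling controls concrete, the sketch below applies them to a handful of made-up logits. It illustrates the general technique only: nanoGPT does this on PyTorch tensors and then draws a random sample, whereas this hypothetical topk_probs function just prints the resulting probabilities.

```shell
# Illustrative only; topk_probs is a hypothetical name.
# Usage: topk_probs TEMPERATURE TOP_K LOGIT...
# Steps:
#   1. sort the logits, highest first
#   2. top-k: keep only the first TOP_K candidates
#   3. temperature: divide each surviving logit by TEMPERATURE
#   4. softmax: turn the scaled logits into probabilities
# Prints one probability per surviving candidate, highest first.
topk_probs() {
  temperature=$1
  top_k=$2
  shift 2
  printf '%s\n' "$@" |
    sort -rn |
    head -n "$top_k" |
    awk -v t="$temperature" '
      { z[NR] = $1 / t }
      END {
        # Input is sorted, so z[1] is the largest scaled logit.
        # Subtracting it keeps exp() numerically stable.
        max = z[1]
        for (i = 1; i <= NR; i++) { e[i] = exp(z[i] - max); sum += e[i] }
        for (i = 1; i <= NR; i++) printf "%.3f\n", e[i] / sum
      }'
}
```

With logits 2 1 0, running topk_probs 1 2 2 1 0 drops the lowest candidate and prints 0.731 and 0.269. Lowering the temperature with topk_probs 0.5 2 2 1 0 prints 0.881 and 0.119, so the most likely candidate dominates more strongly.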

Reality check: with the default quick config (small model, short training), output will often look like Shakespeare-shaped gibberish. That's expected; the goal is validating the end-to-end training + sampling workflow.

Reproduce the demo generations

From the workspace root:

  1. Ensure you have a checkpoint (CPU):
bash oss_llm/testNanoGPT/99_smoke_test_all.sh
  2. Unconditional samples (2 short samples):
cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
	--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
	--num_samples=2 --max_new_tokens=220 --temperature=0.8 --top_k=200
  3. Prompted continuation (more conservative sampling), starting again from the workspace root:
cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
	--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
	--num_samples=1 --max_new_tokens=220 --temperature=0.4 --top_k=50 \
	--start="To be, or not to be"
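
Both sampling steps depend on the checkpoint from step 1, so it can help to fail early when it is absent. This is a minimal sketch; require_checkpoint is a hypothetical helper, and the path in the usage line assumes the checkpoint lands under oss_llm/nanoGPT, as the cd in the steps above suggests.

```shell
# Hypothetical helper: fail early when the checkpoint file is absent.
# Usage: require_checkpoint OUT_DIR
require_checkpoint() {
  ckpt="$1/ckpt.pt"
  if [ ! -f "$ckpt" ]; then
    echo "no checkpoint at $ckpt; run a training script first" >&2
    return 1
  fi
  echo "found checkpoint: $ckpt"
}
```

A possible call from the workspace root: require_checkpoint oss_llm/nanoGPT/out-shakespeare-char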

One-command smoke test

From the workspace root:

bash oss_llm/testNanoGPT/99_smoke_test_all.sh

One-command MPS smoke test

From the workspace root:

bash oss_llm/testNanoGPT/98_smoke_test_mps.sh

Common commands

  • Install deps only:
bash oss_llm/testNanoGPT/00_setup_env.sh
  • Quick CPU train + sample:
bash oss_llm/testNanoGPT/30_train_cpu_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh
  • Quick MPS train + sample:
bash oss_llm/testNanoGPT/31_train_mps_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh  # sampling can still run on CPU
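
Choosing between the CPU and MPS training scripts can be automated. The sketch below is an assumption-laden heuristic: pick_train_script is a hypothetical name, and it only detects Apple Silicon hardware from uname. It does not confirm that the installed PyTorch build supports MPS (torch.backends.mps.is_available() in Python is the direct check).

```shell
# Hypothetical helper: choose a training script from the OS and CPU
# architecture. Arguments default to the values reported by uname.
# Usage: pick_train_script [OS] [ARCH]
pick_train_script() {
  os=${1:-$(uname -s)}
  arch=${2:-$(uname -m)}
  if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
    # Apple Silicon Mac: MPS is likely usable.
    echo "oss_llm/testNanoGPT/31_train_mps_quick.sh"
  else
    # Anything else: fall back to the CPU run.
    echo "oss_llm/testNanoGPT/30_train_cpu_quick.sh"
  fi
}
```

A possible call from the workspace root: bash "$(pick_train_script)"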

Notes

  • The CPU/MPS training scripts intentionally use a tiny model + small iteration count to finish quickly.
  • If ckpt.pt is missing, it usually means training didn't run an eval step; these scripts set --eval_interval low to force checkpoint writes.
  • Scripts will auto-create ./.venv on first run if it does not exist.
  • Dataset download is via a github.com/.../raw/... URL to avoid reliance on raw.githubusercontent.com.
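
The auto-create behavior for ./.venv can be pictured with the sketch below. How the scripts actually implement it is not shown in this README, so ensure_venv is a hypothetical stand-in that uses the standard python3 -m venv command.

```shell
# Hypothetical helper: create the virtual environment only when it is missing.
# Usage: ensure_venv [VENV_DIR]   (defaults to .venv in the current directory)
ensure_venv() {
  venv_dir=${1:-.venv}
  if [ -x "$venv_dir/bin/python" ]; then
    # An interpreter is already in place, so leave the venv untouched.
    echo "reusing $venv_dir"
  else
    echo "creating $venv_dir"
    python3 -m venv "$venv_dir"
  fi
}
```

Calling ensure_venv twice in a row is safe: the second call finds the interpreter and skips creation.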