saravanakumardb1 1a4b3c1fb3 move oss_llm/ from learning_ai_2nd_brain		2026-02-28 00:03:37 -08:00

README.md

nanoGPT: use & test (local)

This folder contains runnable scripts to validate nanoGPT end-to-end in this workspace.

Prereqs

git, python3, and curl are available.
GitHub is reachable (for cloning nanoGPT and downloading the tiny Shakespeare text).
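A quick way to confirm the tool prerequisites before running anything is sketched below. The helper name `missing_tools` is hypothetical and not part of this folder; it only checks that each command is on `PATH` and does not test network access to GitHub.

```shell
# Hypothetical helper (not one of the scripts in this folder).
# Prints the name of every command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything missing from the prerequisites listed above.
missing=$(missing_tools git python3 curl)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing >&2
fi
```

An empty result means all three tools were found.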

What the scripts do

00_setup_env.sh: installs Python deps into the workspace venv.
10_clone_nanogpt.sh: clones karpathy/nanoGPT into oss_llm/nanoGPT (or updates it).
20_prepare_shakespeare.sh: downloads/prepares the tiny Shakespeare dataset.
30_train_cpu_quick.sh: quick CPU training run (writes out-shakespeare-char/ckpt.pt).
31_train_mps_quick.sh: quick MPS (Metal) training run (faster on Apple Silicon).
40_sample_cpu.sh: samples from the trained checkpoint.
98_smoke_test_mps.sh: runs clone → deps → prepare → train (MPS) → sample (MPS).
99_smoke_test_all.sh: runs clone → deps → prepare → train → sample.
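The smoke tests chain the individual scripts in a fixed order and are only useful if a failure stops the chain. The sketch below illustrates that pattern; `run_steps` is a hypothetical helper, and the real `99_smoke_test_all.sh` may be implemented differently.

```shell
# Hypothetical sketch of a stop-on-first-failure runner;
# the actual smoke-test scripts may differ.
# Usage: run_steps DIR SCRIPT...
run_steps() {
  dir=$1
  shift
  for step in "$@"; do
    echo "==> $step"
    bash "$dir/$step" || {
      echo "FAILED: $step" >&2
      return 1
    }
  done
}

# Example, from the workspace root (clone -> deps -> prepare -> train -> sample):
# run_steps oss_llm/testNanoGPT \
#   10_clone_nanogpt.sh 00_setup_env.sh 20_prepare_shakespeare.sh \
#   30_train_cpu_quick.sh 40_sample_cpu.sh
```

Stopping at the first failing step keeps later steps from running against a missing dataset or checkpoint.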

What nanoGPT demonstrates (current capabilities)

With the default tiny Shakespeare character-level example, the checkpoint you train here supports:

Unconditional generation: start from a newline and generate Shakespeare-ish character patterns.
Prompted continuation: provide a short prompt (e.g. a phrase) and generate a continuation.
Sampling controls:
- --temperature controls randomness (lower is more deterministic).
- --top_k restricts sampling to the K most likely next-token candidates (lower is more conservative).
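To make the effect of the two sampling controls concrete, the sketch below computes next-token probabilities for a handful of made-up logits. It is a simplified illustration rather than nanoGPT's code: `next_token_probs` is a hypothetical helper, the logits are invented, and ties at the top-K boundary are cut arbitrarily.

```shell
# Hypothetical illustration of temperature + top-k (not nanoGPT's code).
# Usage: next_token_probs TEMPERATURE TOP_K LOGIT...
# Keeps the TOP_K largest logits, divides them by TEMPERATURE,
# and prints the resulting softmax probabilities, largest first.
next_token_probs() {
  temperature=$1
  top_k=$2
  shift 2
  printf '%s\n' "$@" |
    sort -rn |
    head -n "$top_k" |
    awk -v t="$temperature" '
      NR == 1 { max = $1 }  # subtract the largest logit for numerical stability
      { w[NR] = exp(($1 - max) / t); total += w[NR] }
      END { for (i = 1; i <= NR; i++) printf "%.4f\n", w[i] / total }'
}

next_token_probs 1.0 2 2.0 1.0 0.1   # prints 0.7311 and 0.2689
next_token_probs 0.5 2 2.0 1.0 0.1   # prints 0.8808 and 0.1192
```

Lowering the temperature from 1.0 to 0.5 shifts probability toward the most likely candidate, and a top-K of 2 drops the third candidate entirely.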

Reality check: with the default quick config (small model, short training), output will often look like Shakespeare-shaped gibberish. That’s expected; the goal is validating the end-to-end training + sampling workflow.

Reproduce the demo generations

From the workspace root:

Ensure you have a checkpoint (CPU):

bash oss_llm/testNanoGPT/99_smoke_test_all.sh

Unconditional samples (2 short samples):

cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
	--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
	--num_samples=2 --max_new_tokens=220 --temperature=0.8 --top_k=200

Prompted continuation (more conservative sampling):

cd oss_llm/nanoGPT
./../.venv/bin/python sample.py \
	--out_dir=out-shakespeare-char --device=cpu --dtype=float32 \
	--num_samples=1 --max_new_tokens=220 --temperature=0.4 --top_k=50 \
	--start="To be, or not to be"
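To compare several temperatures side by side, the invocation above can be generated in a loop. The sketch below only prints the commands (a dry run) so they can be reviewed before running them from oss_llm/nanoGPT; `sample_cmd` is a hypothetical helper, and the flags are the ones already used in this README.

```shell
# Hypothetical dry-run helper: prints a sample.py command line
# for a given temperature, top_k, and prompt without executing it.
sample_cmd() {
  temperature=$1
  top_k=$2
  prompt=$3
  printf '%s sample.py --out_dir=out-shakespeare-char --device=cpu --dtype=float32 --num_samples=1 --max_new_tokens=220 --temperature=%s --top_k=%s --start="%s"\n' \
    './../.venv/bin/python' "$temperature" "$top_k" "$prompt"
}

# Print one command per temperature setting.
for t in 0.4 0.8 1.2; do
  sample_cmd "$t" 50 "To be, or not to be"
done
```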

One-command smoke test

From the workspace root:

bash oss_llm/testNanoGPT/99_smoke_test_all.sh

One-command MPS smoke test

From the workspace root:

bash oss_llm/testNanoGPT/98_smoke_test_mps.sh

Common commands

Install deps only:

bash oss_llm/testNanoGPT/00_setup_env.sh

Quick CPU train + sample:

bash oss_llm/testNanoGPT/30_train_cpu_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh

Quick MPS train + sample:

bash oss_llm/testNanoGPT/31_train_mps_quick.sh
bash oss_llm/testNanoGPT/40_sample_cpu.sh  # sampling can still run on CPU

Notes

The CPU/MPS training scripts intentionally use a tiny model and a small iteration count so that they finish quickly.
If ckpt.pt is missing, training most likely never reached an evaluation step, which is when nanoGPT writes checkpoints; these scripts set a low --eval_interval so that a checkpoint is written even in a short run.
Scripts will auto-create ./.venv on first run if it does not exist.
Dataset download is via a github.com/.../raw/... URL to avoid reliance on raw.githubusercontent.com.
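A small check for the missing-checkpoint case can be run before sampling. The sketch below is a hypothetical helper, not one of the scripts in this folder, and its default path assumes that training wrote out-shakespeare-char/ckpt.pt inside oss_llm/nanoGPT.

```shell
# Hypothetical helper: verifies that a checkpoint exists before sampling.
# The default path is an assumption based on the commands in this README.
check_checkpoint() {
  out_dir=${1:-oss_llm/nanoGPT/out-shakespeare-char}
  if [ -f "$out_dir/ckpt.pt" ]; then
    echo "found: $out_dir/ckpt.pt"
  else
    echo "missing: $out_dir/ckpt.pt (did training reach an eval step?)" >&2
    return 1
  fi
}

# Example, from the workspace root:
# check_checkpoint && bash oss_llm/testNanoGPT/40_sample_cpu.sh
```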
