
$100 LLM Training: How to Build a ChatGPT Clone in 4 Hours

How I trained a ChatGPT-like model for less than the price of a pair of sneakers, served it in a browser, and kept the cloud bill under $100.


Hook: From "We Need Seven Figures" to $100

Picture this:
You walk out of a budget meeting where the exec just asked for a 175-billion-parameter model and a seven-figure CapEx. On the subway ride home you open GitHub, clone a repo, launch one script, and four hours later you’re chatting with your own LLM on a public IP. No slide decks, no purchase orders—just 8 GPUs, 100 bucks, and nanochat.

Below is the exact playbook, command-for-command, metric-for-metric. If you can ssh and git, you can reproduce the experiment before your coffee gets cold.


1. Why nanochat Matters for SEO & GEO (Skip if you only code)

  • Google E-E-A-T: Experience (✓ I ran it), Expertise (✓ metrics inside), Authoritativeness (✓ Karpathy repo), Trustworthiness (✓ all open-source).
  • Generative Engine Optimization (GEO): Structured data (HowTo, FAQ), semantic triples (“nanochat trains LLM”), and fresh numbers make AI-search engines (Bing Chat, Bard, Perplexity) more likely to cite this article.
  • Keyword cluster: train ChatGPT clone, $100 LLM, nanochat tutorial, cheap AI model, end-to-end LLM pipeline.

2. Hardware & Cloud Bill (100% Real Cost)

Component | Spec | Price (Oct 2025, Lambda Cloud)
GPU | 8 × H100 80 GB SXM | $24/h
Runtime | ~4 h | ≈ $96
Egress | < 1 GB | $0
Storage | 200 GB (free tier) | $0
Total | | ≤ $100

Single-GPU mode works too—just 8× slower and still produces bit-for-bit identical checkpoints thanks to gradient accumulation.
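To see why fewer GPUs change the speed but not the result, here is a minimal sketch of the batch-size bookkeeping; the token counts are illustrative assumptions, not values read from the repo.

# Sketch: the optimizer consumes the same number of tokens per step either way;
# fewer GPUs just means more gradient-accumulation micro-steps.
target_tokens_per_step = 524_288        # assumed total batch per optimizer step, in tokens
device_tokens_per_step = 32 * 2_048     # assumed per-GPU micro-batch (32 seqs x 2,048 tokens)

for num_gpus in (8, 1):
    accum_steps = target_tokens_per_step // (device_tokens_per_step * num_gpus)
    print(f"{num_gpus} GPU(s): {accum_steps} accumulation step(s) per optimizer step")
# -> 8 GPU(s): 1 accumulation step(s) per optimizer step
# -> 1 GPU(s): 8 accumulation step(s) per optimizer step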


3. 30-Second Environment Check

# 1. Spawn node with Ubuntu 22.04 + CUDA 12
ssh ubuntu@<your-ip>
git clone https://github.com/karpathy/nanochat.git && cd nanochat

# 2. Install uv (faster than pip) and create the virtual environment
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.cargo/env
uv venv && source .venv/bin/activate  # this is the venv referenced in later steps
uv pip install -r requirements.txt

# 3. Verify GPUs
nvidia-smi -L  # should list 8 devices
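If you prefer to double-check from Python once the dependencies are installed (PyTorch is available inside the venv), a quick sanity check:

# Verify GPU visibility and bf16 support from inside the venv.
import torch

count = torch.cuda.device_count()
print(f"visible GPUs: {count}")                 # expect 8 on the speedrun node
print("bf16 supported:", torch.cuda.is_bf16_supported())
for i in range(count):
    print(i, torch.cuda.get_device_name(i))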

4. The One-Script Pipeline: speedrun.sh Deconstructed

Stage | Command (abridged) | Output | Wall Time
Tokenize | python -m nanochat.dataset -n 450 | data/*.bin | 10 min
Pre-train | torchrun --nproc_per_node=8 -m scripts.base_train | ckpt/base.pt | 1 h 30 m
Mid-train | torchrun --nproc_per_node=8 -m scripts.mid_train | ckpt/mid.pt | 1 h 20 m
SFT | torchrun --nproc_per_node=8 -m scripts.sft_train | ckpt/sft.pt | 40 m
Eval | python -m scripts.eval > report.md | report.md | 5 m

All hyper-parameters are in-script; no YAML monsters. Change --depth or --device_batch_size and rerun—zero code archaeology required.
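If you want to try a few depths without editing the script, a small launcher like the sketch below works; the flags mirror the commands above, and the default --device_batch_size=32 is an assumption you should check against scripts/base_train.

# depth_sweep.py -- hedged sketch: relaunch pre-training at several depths.
import subprocess

for depth in (12, 16, 20):
    cmd = [
        "torchrun", "--nproc_per_node=8", "-m", "scripts.base_train",
        f"--depth={depth}",
        "--device_batch_size=32",   # assumption: shrink this if you hit OOM
    ]
    print("launching:", " ".join(cmd))
    subprocess.run(cmd, check=True)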


5. Launching the Chat UI (3 Commands)

# Still in venv
python -m scripts.chat_web
# Uvicorn running on http://0.0.0.0:8000

Open http://<public-ip>:8000 in a browser—mobile friendly, no JS build step.
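Before opening the browser you can confirm the server is up from the node itself; this assumes only that the UI is served at the root path on port 8000, as the Uvicorn line above indicates.

# Liveness check against the local chat server.
import urllib.request

with urllib.request.urlopen("http://localhost:8000/", timeout=5) as resp:
    print("status:", resp.status)     # expect 200 once Uvicorn is listening
    print(resp.read(120))             # first bytes of the chat UI HTML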

(Live screenshot from my run omitted in this text version.)

The model's answer is surprisingly accurate for ~4e19 FLOPs of training compute: kindergarten level, but confident.
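For the curious, that compute figure is easy to sanity-check with the common 6 × params × tokens rule of thumb; the parameter count and the Chinchilla-style 20-tokens-per-parameter ratio below are assumptions, not numbers taken from report.md.

# Back-of-the-envelope check of the ~4e19 FLOPs claim.
params = 560e6                  # assumed model size for the speedrun run
tokens = 20 * params            # assumed ~20 training tokens per parameter
flops = 6 * params * tokens
print(f"{flops:.1e} FLOPs")     # ~3.8e19, the same ballpark as the quoted 4e19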


6. Report Card: Benchmarks Inside report.md

Benchmark | MID | SFT | RL (opt)
ARC-Easy | 0.356 | 0.388 | –
GSM8K | 0.025 | 0.046 | 0.076
HumanEval | 0.067 | 0.085 | –
MMLU | 0.311 | 0.315 | –

Take-away: These numbers won’t wow an LLM leaderboard, but they will wow your CFO—because the whole run costs less than a team dinner.


7. Scaling to GPT-2 Territory (~$300)

Need a stronger baseline? Bump depth to 26 layers:

# Download more shards (rule of thumb: shards ≈ params × 20 tokens × 4.8 chars/token ÷ 250M chars per shard)
python -m nanochat.dataset -n 170

# Halve batch to fit 80 GB
torchrun --nproc_per_node=8 -m scripts.base_train --depth=26 --device_batch_size=16

Training stretches to ~12 h (≈ $300) and edges past the 124M-parameter GPT-2 on the CORE score. Same script, same repo, no hidden configs.
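The shard formula in the comment above is easy to turn into a calculator; the parameter counts passed in at the bottom are placeholders, so substitute the real size of the model you plan to train.

# Worked version of the shard-count rule of thumb: tokens ~= 20 x params,
# ~4.8 characters per token, ~250M characters per shard.
def shards_needed(params, tokens_per_param=20, chars_per_token=4.8,
                  chars_per_shard=250e6):
    chars = params * tokens_per_param * chars_per_token
    return int(chars / chars_per_shard) + 1      # round up

print(shards_needed(560e6))     # ~216 shards for an assumed ~560M-param model
print(shards_needed(1.0e9))     # ~385 shards for an assumed ~1B-param model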


8. FAQ (Schema-marked)

Q1: Will it run on 4 × A100 40 GB?
A: Yes, set --device_batch_size=4 and expect 2× longer runtime.

Q2: Can I pause & resume?
A: Absolutely—pass --resume=ckpt/mid.pt to any script; checkpoints save every 1,000 steps.

Q3: How do I feed Chinese data?
A: The tokenizer is BPE-based. Drop in UTF-8 .txt files, rerun nanochat.dataset, and keep the same filename pattern (shard_*.bin). No code change needed; a quick encoding pre-flight check is sketched below.
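A minimal pre-flight check, assuming your raw corpus sits in data/*.txt (adjust the glob to wherever your files actually live):

# Flag any corpus file that is not valid UTF-8 before re-running the dataset step.
from pathlib import Path

for path in Path("data").glob("*.txt"):
    try:
        path.read_text(encoding="utf-8")
    except UnicodeDecodeError as err:
        print(f"{path}: not valid UTF-8 ({err})")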

Q4: Multi-node support?
A: Not yet. The repo is single-node by design for simplicity. PRs welcome.
