FiveTech Support Forums

FiveWin / Harbour / xBase community
Posts: 44158
Joined: Thu Oct 06, 2005 05:47 PM
Training nanochat on an NVIDIA RTX 3060!
Posted: Sun Oct 26, 2025 09:51 AM
regards, saludos

Antonio Linares
www.fivetechsoft.com
Re: Training nanochat on an NVIDIA RTX 3060!
Posted: Sun Oct 26, 2025 10:10 AM
1. Clone Repo
   Command:
       cd ~
       git clone https://github.com/karpathy/nanochat
       cd nanochat
   Description: Clone the standard nanochat repository from Karpathy.
   Expected Output/Notes: Creates ~/nanochat directory with all source files.

2. Create venv
   Command:
       python3.10 -m venv .venv
       source .venv/bin/activate
   Description: Set up Python 3.10 virtual environment.
   Expected Output/Notes: Activates .venv; use uv for faster package management.

3. Install Dependencies
   Command:
       uv sync
   Description: Install PyTorch, CUDA libs, maturin, and other deps (~3-5 min).
   Expected Output/Notes: Resolved 91 packages; CUDA: True, GPU: NVIDIA GeForce RTX 3060.

4. Verify Torch/CUDA
   Command:
       uv run python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}, GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else None}')"
   Description: Confirm CUDA and GPU detection.
   Expected Output/Notes: Output: CUDA: True, GPU: NVIDIA GeForce RTX 3060.

5. Install nanochat Editable
   Command:
       uv pip install -e .
   Description: Editable install (symlinks to source).
   Expected Output/Notes: nanochat==0.1.0 installed; enables module imports.

6. Verify nanochat Imports
   Command:
       uv run python -c "from nanochat.gpt import GPT, GPTConfig; print('nanochat installed OK')"
   Description: Test core model imports.
   Expected Output/Notes: Output: nanochat installed OK.

7. Compile Rust Tokenizer
   Command:
       uv run maturin develop --release --manifest-path rustbpe/Cargo.toml
   Description: Build RustBPE tokenizer (~2-5 min).
   Expected Output/Notes: 📦 Built wheel; RustBPETokenizer available.

8. Verify Tokenizer
   Command:
       uv run python -c "from nanochat.tokenizer import RustBPETokenizer; print('Tokenizer compiled OK')"
   Description: Confirm tokenizer module.
   Expected Output/Notes: Output: Tokenizer compiled OK.

9. Set PYTHONPATH
   Command:
       export PYTHONPATH="$(pwd):$PYTHONPATH"
   Description: Add repo root to Python path for multiprocessing.
   Expected Output/Notes: Permanent: add to ~/.bashrc; required for torchrun.

10. Download Dataset
   Command:
       PYTHONPATH="$(pwd):$PYTHONPATH" uv run python -m nanochat.dataset -n 40
   Description: Download 40 FineWeb-EDU shards (~4 GB, 5-15 min).
   Expected Output/Notes: Downloading 8 shards using 4 workers...; Target: ~/.cache/nanochat/base_data/.

11. Verify Dataset
   Command:
       ls ~/.cache/nanochat/base_data/ | wc -l
       du -sh ~/.cache/nanochat/base_data/
   Description: Check downloaded shards.
   Expected Output/Notes: 40 files; ~4 GB total (shard_00000.parquet to shard_00039.parquet).

12. Train Tokenizer (Optional)
   Command:
       uv run python -m scripts.tok_train --max_chars=2000000000
   Description: Train custom BPE tokenizer (~10-20 min CPU).
   Expected Output/Notes: Generates tokenizer.json (~65K vocab); edit vocab_size=32768 for smaller model.

13. Eval Tokenizer (Optional)
   Command:
       uv run python -m scripts.tok_eval
   Description: Test tokenizer compression (~1 min).
   Expected Output/Notes: Compression ratio ~4.8 chars/token (vs. GPT-4).

14. Pretrain Model
   Command:
       PYTHONPATH="$(pwd):$PYTHONPATH" uv run torchrun --standalone --nproc_per_node=1 scripts/base_train.py --depth=10 --device_batch_size=4 --max_seq_len=1024 --compile --num_iterations=800
   Description: Base training: depth=10 (~100M params, ~1-2 hours on RTX 3060).
   Expected Output/Notes: Overriding: depth=10...; Loss ~11.09 → ~2.0; Checkpoints in checkpoints/d10/; Monitor with watch -n 1 nvidia-smi (VRAM ~7 GB).

15. Midtrain (Post-Pretrain)
   Command:
       PYTHONPATH="$(pwd):$PYTHONPATH" uv run python scripts/mid_train.py --model_path checkpoints/d10
   Description: Add reasoning datasets (SmolTalk/MMLU/GSM8K, ~10 min, 4 GB VRAM).
   Expected Output/Notes: Improves chat coherence; Output: checkpoints/d10/midtrain.pt.

16. Supervised Fine-Tuning
   Command:
       uv run python scripts/sft.py --batch_size=2
   Description: Align to ChatGPT-style responses (~5 min, 2 GB VRAM).
   Expected Output/Notes: Enables conversational capabilities.

17. Evaluation
   Command:
       uv run python scripts/eval.py
   Description: Run benchmarks (~5 min).
   Expected Output/Notes: MMLU ~25-35%, GSM8K ~5-10%, HumanEval ~8%, ARC-Easy ~30%.

18. Launch Web UI
   Command:
       uv run python -m nanochat.webui
   Description: Start ChatGPT-like interface.
   Expected Output/Notes: localhost:8000; test with trained d10 model.

19. Full Pipeline (Alternative)
   Command:
       bash speedrun.sh
   Description: Automated script (edit base_train.py depth=10 for RTX 3060).
   Expected Output/Notes: Handles tokenizer → pretrain → midtrain → SFT → eval → UI; generates report.md.
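
The dataset check in step 11 can also be done from Python. This is only a minimal sketch and not part of nanochat: the helper name summarize_shards is made up for illustration, and it assumes the shards are .parquet files sitting directly in the target directory, as the step 11 notes suggest. Unlike ls | wc -l, it counts only the .parquet files.

```python
from pathlib import Path


def summarize_shards(directory):
    """Return (file_count, total_bytes) for the .parquet files in directory.

    Hypothetical helper for illustration; not part of nanochat.
    """
    root = Path(directory).expanduser()
    if not root.is_dir():
        # Nothing downloaded yet, or the path is wrong.
        return 0, 0
    shards = sorted(root.glob("*.parquet"))
    total_bytes = sum(p.stat().st_size for p in shards)
    return len(shards), total_bytes


if __name__ == "__main__":
    # Target directory taken from the step 10 notes.
    count, size = summarize_shards("~/.cache/nanochat/base_data")
    print(f"{count} shards, {size / 1e9:.2f} GB")
```

If the count is lower than the number requested with -n, rerunning the step 10 command is probably the first thing to try.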
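
About the starting loss of ~11.09 in step 14: that is what an untrained model should give if it spreads probability evenly over the whole vocabulary, because the cross-entropy of a uniform guess over V tokens is ln(V). The sketch below assumes the "~65K vocab" from step 12 means exactly 65536 (2**16), and it also shows the value for vocab_size=32768. The helper name uniform_loss is just illustrative.

```python
import math


def uniform_loss(vocab_size):
    """Cross-entropy (in nats) of a uniform guess over vocab_size tokens."""
    return math.log(vocab_size)


if __name__ == "__main__":
    # Assumed exact vocabulary sizes; the post only says "~65K" and 32768.
    for vocab_size in (65536, 32768):
        loss = uniform_loss(vocab_size)
        print(f"vocab_size={vocab_size}: initial loss ~{loss:.2f}")
```

Under these assumptions the 65536 case comes out at about 11.09, so a first logged loss near that value suggests the model and tokenizer are wired up as expected. With the smaller 32768 vocabulary the starting point would be about 10.40 instead.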
regards, saludos

Antonio Linares
www.fivetechsoft.com
