# Pure Nix GPU Training Guide

## Status: ✅ WORKING

Your RTX 6000 Ada 48GB is now accessible via pure Nix!

## Quick Start

```bash
# Enter GPU training shell (first time: ~5 min to install PyTorch)
cd ~/bastions && nix develop .#gpu-training

# Verify CUDA works
python3 -c "import torch; print(torch.cuda.is_available())"  # expected output: True

# Run smoke test
cd /home/workspaces/totally-spies-cultshot
python3 tools/run_wan22_train.py \
  --training-data-dir materials/training-data/iiw-english-smoke-video-only \
  --output-dir materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints \
  --model-variant ti2v-5b \
  --lora-rank 16 \
  --epochs 1 \
  --dataset-repeat 20 \
  --learning-rate 2e-5 \
  --num-frames 81 \
  --height 480 \
  --width 832 \
  --gradient-accumulation-steps 4 \
  --dry-run
```
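
If the one-line check is not enough, a slightly fuller report can help confirm that the venv's PyTorch sees the card. The following is a minimal sketch, not part of the repository; the function name `cuda_report` is made up for illustration, and it assumes only that `torch` may or may not be installed in the active environment.

```python
"""Report what PyTorch sees on the GPU; degrades gracefully if torch is missing."""


def cuda_report() -> dict:
    """Return a small dict describing torch/CUDA visibility."""
    try:
        import torch  # expected to be installed in the gpu-training venv
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version torch was built against (None on CPU-only builds)
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        report["device_name"] = props.name
        report["total_memory_gib"] = round(props.total_memory / 2**30, 1)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Inside the GPU shell, `cuda_available` should be `True` and `device_name` should mention the RTX 6000 Ada.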

## Full Training Workflow

### 1. Install DiffSynth-Studio

In the GPU shell:
```bash
git clone https://github.com/modelscope/DiffSynth-Studio.git ~/src/DiffSynth-Studio
cd ~/src/DiffSynth-Studio
pip install -e .
```
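
To confirm the editable install landed in the active venv, a quick import lookup can be run from the GPU shell. This is a sketch only: the helper `is_importable` is hypothetical, and `diffsynth` is assumed to be the top-level package name that DiffSynth-Studio installs.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """True if Python can locate the top-level module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "diffsynth" is an assumed package name; adjust if the install differs.
    for name in ("torch", "diffsynth"):
        status = "ok" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

A `MISSING` result usually means `pip install -e .` ran outside the GPU shell's venv.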

### 2. Download Wan2.2 Model

```bash
# Accept terms first: https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B
huggingface-cli login
huggingface-cli download Wan-AI/Wan2.2-TI2V-5B --local-dir ~/models/Wan2.2-TI2V-5B
```

### 3. Run Training

```bash
# Enter the GPU environment first (this starts a new shell)
cd ~/bastions && nix develop .#gpu-training

# Then switch to the project directory inside that shell
cd /home/workspaces/totally-spies-cultshot

# Run training
python3 tools/run_wan22_train.py \
  --training-data-dir materials/training-data/iiw-english-smoke-video-only \
  --output-dir materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints \
  --diffsynth-path ~/src/DiffSynth-Studio \
  --wan22-model ~/models/Wan2.2-TI2V-5B \
  --model-variant ti2v-5b \
  --lora-rank 16 \
  --epochs 1 \
  --dataset-repeat 20 \
  --learning-rate 2e-5 \
  --num-frames 81 \
  --height 480 \
  --width 832 \
  --gradient-accumulation-steps 4
```

**Expected time:** 2-3 hours for 160-clip smoke test

### 4. Monitor Training

In another terminal:
```bash
watch -n 5 nvidia-smi
```
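
For logging rather than watching, the same numbers can be pulled from `nvidia-smi`'s CSV query mode. The sketch below assumes `nvidia-smi` is on `PATH` and that every queried field is numeric (some setups report `[N/A]`, which this parser does not handle). The function names are illustrative.

```python
import shutil
import subprocess

QUERY = "memory.used,memory.total,utilization.gpu"


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line such as '24570, 49140, 87' (MiB, MiB, percent)."""
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {
        "used_mib": used,
        "total_mib": total,
        "util_pct": util,
        "mem_pct": round(100 * used / total, 1),
    }


def query_gpus() -> list:
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__" and shutil.which("nvidia-smi"):
    for index, gpu in enumerate(query_gpus()):
        print(f"GPU {index}: {gpu['used_mib']}/{gpu['total_mib']} MiB "
              f"({gpu['mem_pct']}%), util {gpu['util_pct']}%")
```

Memory usage that stays close to the card's total during training is a hint that the out-of-memory settings in Troubleshooting may be needed.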

### 5. Check Results

```bash
ls -lh materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints/
```
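
When the checkpoint directory contains subfolders, a recursive listing sorted by modification time can be easier to read than `ls`. This is a minimal sketch; it makes no assumption about checkpoint file names or extensions and simply lists every file it finds.

```python
from pathlib import Path


def list_checkpoints(directory) -> list:
    """Return (relative path, size in MiB) for every file, newest first."""
    root = Path(directory)
    if not root.is_dir():
        return []
    files = [p for p in root.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [
        (str(p.relative_to(root)), round(p.stat().st_size / 2**20, 2))
        for p in files
    ]


if __name__ == "__main__":
    ckpt_dir = "materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints"
    for name, size_mib in list_checkpoints(ckpt_dir):
        print(f"{size_mib:10.2f} MiB  {name}")
```

An empty result after a completed run suggests the `--output-dir` path was resolved relative to a different working directory.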

## Shell Persistence

The venv lives under `$TMPDIR` (in `/tmp`), which is not persistent storage, so expect it to be recreated. For persistence, point it at a stable location:

```bash
# Modify gpu-training-shell.nix to use ~/.venv/spies-gpu instead of $TMPDIR
```
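
The idea behind a persistent venv is "create only if missing". The sketch below shows that logic in Python rather than in the shell's Nix file; it is not how `gpu-training-shell.nix` is actually written, and `ensure_venv` is a hypothetical helper.

```python
import os
import venv
from pathlib import Path


def ensure_venv(path, with_pip: bool = True) -> bool:
    """Create a venv at `path` unless one already exists. Returns True if created."""
    venv_dir = Path(os.path.expanduser(str(path)))
    # pyvenv.cfg is the marker file the venv module writes into every environment
    if (venv_dir / "pyvenv.cfg").is_file():
        return False
    venv.EnvBuilder(with_pip=with_pip).create(venv_dir)
    return True
```

Calling `ensure_venv("~/.venv/spies-gpu")` twice creates the environment on the first call and leaves it untouched on the second, which is the behavior a shell hook would need so that the PyTorch install is paid for only once.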

## Troubleshooting

### CUDA Out of Memory
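Reduce the resolution by changing these flags in the training command (the example also raises gradient accumulation from 4 to 8):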
```bash
--height 384 --width 640 --gradient-accumulation-steps 8
```
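
To get a feel for how much the smaller resolution saves, compare the number of pixels per clip. This is a rough sketch: it assumes memory use grows with the pixel count of a clip, which is only an approximation, and the helper name is illustrative.

```python
def clip_pixels(height: int, width: int, num_frames: int = 81) -> int:
    """Total pixels in one training clip at the given resolution."""
    return height * width * num_frames


full = clip_pixels(480, 832)     # flags used in the training command
reduced = clip_pixels(384, 640)  # flags suggested for out-of-memory errors

print(f"full:    {full:,} pixels per clip")
print(f"reduced: {reduced:,} pixels per clip")
print(f"ratio:   {reduced / full:.3f}")
```

The reduced setting processes about 62% of the pixels per clip, so it is a moderate reduction rather than a drastic one.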

### libstdc++ Error
Make sure `gcc-unwrapped.lib` is listed in the shell's `buildInputs` and that its `lib` directory is on `LD_LIBRARY_PATH`.
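
A quick way to check the second condition from inside the shell is to scan `LD_LIBRARY_PATH` for the library file. This is a minimal sketch; `find_on_ld_path` is a hypothetical helper, and it assumes the missing file is `libstdc++.so.6`, the usual name in this kind of import error.

```python
import os
from pathlib import Path


def find_on_ld_path(lib_name: str = "libstdc++.so.6", ld_path=None) -> list:
    """Return every LD_LIBRARY_PATH directory that contains `lib_name`."""
    if ld_path is None:
        ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in ld_path.split(os.pathsep):
        if entry and (Path(entry) / lib_name).exists():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = find_on_ld_path()
    print("libstdc++.so.6 found in:", found or "no LD_LIBRARY_PATH entry")
```

If no entry is reported, the shell definition is probably not exporting the `gcc-unwrapped.lib` path.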

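### Slow Installation
The first shell entry takes about 5 minutes because PyTorch is installed into the venv. Later entries are fast only while the cached venv still exists; see Shell Persistence for keeping it across sessions.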

## Architecture

```
┌─────────────────────────────────────────────────┐
│  NixOS System                                   │
│  - hardware.opengl.enable = true               │
│  - programs.nix-ld with CUDA libs              │
└──────────────┬──────────────────────────────────┘
               │
               │ nix develop .#gpu-training
               ▼
┌─────────────────────────────────────────────────┐
│  Nix Shell                                      │
│  - python312 + pip                              │
│  - CUDA runtime (nixpkgs)                       │
│  - NVIDIA driver (linuxPackages.nvidia_x11)     │
│  - gcc-unwrapped.lib (libstdc++)                │
└──────────────┬──────────────────────────────────┘
               │
               │ pip install torch --index-url cu121
               ▼
┌─────────────────────────────────────────────────┐
│  Python venv with PyTorch CUDA                  │
│  - torch 2.5.1+cu121                            │
│  - torchvision                                  │
│  - accelerate, diffusers, transformers          │
└──────────────┬──────────────────────────────────┘
               │
               │ import torch
               ▼
┌─────────────────────────────────────────────────┐
│  NVIDIA RTX 6000 Ada 48GB                       │
│  - CUDA 12.1                                    │
│  - Driver 580.142                               │
└─────────────────────────────────────────────────┘
```

## Next Steps After Smoke Test

1. Validate LoRA quality with test inference
2. Full pilot training (601 clips, ~8-12 hours)
3. Generate validation shots (TSV-001, TSV-002, TSV-005)
4. Send results to Laurent
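
The pilot estimate can be sanity-checked against the smoke-test timing. The sketch below assumes training time scales linearly with the number of clips at unchanged settings, which ignores fixed startup costs; `scale_hours` is a hypothetical helper.

```python
def scale_hours(hours_range, from_clips: int, to_clips: int):
    """Scale a (low, high) duration linearly with the number of clips."""
    factor = to_clips / from_clips
    return tuple(round(hours * factor, 1) for hours in hours_range)


# 2-3 hours for the 160-clip smoke test, extrapolated to the 601-clip pilot
print(scale_hours((2, 3), from_clips=160, to_clips=601))  # (7.5, 11.3)
```

The extrapolated 7.5 to 11.3 hours is in line with the ~8-12 hour figure quoted for the full pilot.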

---

**Pure Nix:** ✅ Reproducible environment  
**Local GPU:** ✅ RTX 6000 Ada 48GB  
**CUDA Working:** ✅ torch.cuda.is_available() = True
