# Wan2.2 video-only smoke subset

This folder contains the first control dataset for Wan2.2 smoke-test training.

## Contents

- `manifest.json` — 160 selected video clips, 40 each from EP01/04/05/20
- `diffsynth_metadata.jsonl` — DiffSynth/Wan2.2 video training metadata
- `wan21_metadata.json` / `wan2.1_metadata.json` — compatibility metadata
- `wan22_smoke_package_manifest.json` — package summary
- `clips/` and `first_frames/` — hardlinked from `../iiw-english-pilot/`
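
A quick way to confirm the hardlink claim is to compare each media file with its counterpart in the pilot folder. The sketch below is illustrative only: `find_unlinked` is a hypothetical helper, and it assumes the pilot folder uses the same `clips/` and `first_frames/` subfolders with identical filenames, which this README does not state.

```python
import os
from pathlib import Path


def find_unlinked(subset_dir, pilot_dir, subdirs=("clips", "first_frames")):
    """Return subset files that do not share an inode with their pilot counterpart."""
    subset_dir, pilot_dir = Path(subset_dir), Path(pilot_dir)
    unlinked = []
    for sub in subdirs:
        for path in sorted((subset_dir / sub).glob("*")):
            if not path.is_file():
                continue
            # Assumption: the pilot copy lives under the same subfolder and name.
            source = pilot_dir / sub / path.name
            # samefile() compares device and inode, which hardlinks share.
            if not source.exists() or not os.path.samefile(path, source):
                unlinked.append(path)
    return unlinked
```

Calling `find_unlinked(".", "../iiw-english-pilot")` from this folder should return an empty list if every file is still a hardlink; a copied or re-encoded file would show up in the result.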

## Important

- Video-only: the training metadata contains no character plate rows.
- Identity plates should be used only for evaluation or reference in this smoke test.
- The existing YouTube-derived `materials/training-data/clips/` directory is untouched.

## Local validation

The Wan2.2 wrapper supports `--dry-run`, which validates metadata/media paths and prints the exact Accelerate command without starting training:

```bash
python tools/run_wan22_train.py \
  --training-data-dir materials/training-data/iiw-english-smoke-video-only \
  --output-dir materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints \
  --model-variant ti2v-5b \
  --lora-rank 16 \
  --epochs 1 \
  --dataset-repeat 20 \
  --learning-rate 2e-5 \
  --num-frames 81 \
  --height 480 \
  --width 832 \
  --gradient-accumulation-steps 4 \
  --dry-run
```
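
The frame count and resolution above can be sanity-checked before a run. The sketch below assumes the constraints commonly described for Wan-family video models (frame counts of the form 4n+1, height and width divisible by 16); whether the wrapper's `--dry-run` enforces these is not stated here, and `check_video_shape` is a hypothetical helper, not part of `tools/run_wan22_train.py`.

```python
def check_video_shape(num_frames, height, width, spatial_multiple=16):
    """Return a list of human-readable problems; an empty list means the shape looks fine."""
    problems = []
    # Assumption: the model expects num_frames = 4n + 1 (e.g. 81 = 4 * 20 + 1).
    if num_frames % 4 != 1:
        problems.append(f"num_frames={num_frames} is not of the form 4n+1")
    # Assumption: spatial dimensions must be divisible by spatial_multiple.
    for name, value in (("height", height), ("width", width)):
        if value % spatial_multiple != 0:
            problems.append(f"{name}={value} is not a multiple of {spatial_multiple}")
    return problems


print(check_video_shape(81, 480, 832))  # []
```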

Validation through the Nix helper is also available once `runWan22Train` has been built. Build it with:

```bash
nix build -f nix/tools.nix runWan22Train --no-link --print-out-paths
```

## GPU training command shape

On a GPU host, pass local, pinned paths for DiffSynth and the Wan2.2 model:

```bash
python tools/run_wan22_train.py \
  --training-data-dir materials/training-data/iiw-english-smoke-video-only \
  --output-dir materials/training-data/iiw-english-smoke-video-only/wan22_checkpoints \
  --diffsynth-path /path/to/DiffSynth-Studio \
  --wan22-model /path/to/Wan2.2-TI2V-5B \
  --model-variant ti2v-5b \
  --lora-rank 16 \
  --epochs 1 \
  --dataset-repeat 20 \
  --learning-rate 2e-5 \
  --num-frames 81 \
  --height 480 \
  --width 832 \
  --gradient-accumulation-steps 4
```
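
For a rough sense of run length, the flags above can be turned into an optimizer-step estimate. This is a sketch under stated assumptions: a per-device batch size of 1, a single GPU, and `--dataset-repeat` multiplying the dataset length within each epoch. None of these is confirmed by this README, and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_clips, dataset_repeat, epochs, grad_accum,
                    per_device_batch=1, num_gpus=1):
    """Estimate optimizer steps for a run (assumed batch size and GPU count)."""
    samples = num_clips * dataset_repeat * epochs
    micro_batches = math.ceil(samples / (per_device_batch * num_gpus))
    # One optimizer step happens every grad_accum micro-batches.
    return math.ceil(micro_batches / grad_accum)


# 160 clips x 20 repeats x 1 epoch = 3200 samples; 3200 / 4 = 800 steps.
print(optimizer_steps(160, 20, 1, 4))  # 800
```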

Run the training command on a GPU host; no GPU is currently visible via `nvidia-smi` in this local workspace.
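
To check a host before launching, something like the following could be used. It is a minimal sketch that shells out to `nvidia-smi -L` and treats a missing binary or a non-zero exit code as "no GPU visible"; `visible_gpus` is a hypothetical helper.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU lines reported by `nvidia-smi -L`, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # nvidia-smi is not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []  # driver not loaded or no device accessible
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if not visible_gpus():
    print("No GPU visible; use --dry-run here and train on a GPU host.")
```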
