# Totally Spies AI Pipeline — Index
*Navigation guide for all project documentation.*

---

## Quick reference

| Question | Doc |
|---|---|
| What is the current dataset state? | `dataset-state.md` |
| What are the character visual anchors? | `s7-current-understanding.md` |
| Why Wan2.2 and not FLUX/LTX/Hunyuan? | `model-licensing.md` |
| How do I run training? | `gpu-training-runbook.md` |
| Meeting prep for Banijay/Laurent? | `meeting-talking-points.md` |
| One-page executive summary? | `meeting-executive-summary.md` |

---

## Core docs

### `s7-current-understanding.md` ← authoritative reference
Everything about the franchise (character anchors, villains, gadgets, locations), the dataset state, model selection rationale, and pipeline overview. Start here.

### `dataset-state.md`
Complete reference for the training dataset: manifest field specification, training_caption format, methodology (two-pass VLM, conservative captioning, two-stage gadget naming, force-alignment), location/scene/character distributions, remaining gaps.
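
For orientation, a minimal sketch of pulling training captions out of the manifest. The per-record layout (a JSON list of clip records) is an assumption here; `dataset-state.md` holds the authoritative field specification.

```python
# Minimal sketch, not the canonical loader: assumes manifest.json is a JSON
# list of per-clip records, each carrying a "training_caption" field.
# See dataset-state.md for the real manifest field specification.
import json
from pathlib import Path

MANIFEST = Path("materials/training-data/manifest.json")

records = json.loads(MANIFEST.read_text())
for record in records:
    caption = record.get("training_caption")  # the field to train on
    if caption:
        print(caption)
```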

### `model-licensing.md`
Verified license analysis for every candidate model (FLUX, LTX, HunyuanVideo, Wan2.1, Wan2.2, Wan2.7) with direct quotes from source license documents. Explains why Wan2.2 is the only clean commercial option for a French/EU production.

### `gpu-training-runbook.md`
Step-by-step training guide: GPU provisioning, model ingestion, pipeline execution (step1 → step5), hardware specs, expected outputs, troubleshooting.
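
If the steps are wired up as devenv tasks (see `nix/tools.nix`), an orchestration sketch could look like the following. Only `step2-caption` is named in this index; the other task names are placeholders, so consult the runbook and `tools.nix` for the real ones.

```python
# Hypothetical orchestration sketch using devenv's task runner.
# Only "step2-caption" is confirmed by this index; the remaining task
# names are placeholders.
import subprocess

STEPS = [
    "step1",          # placeholder name
    "step2-caption",  # the one task name confirmed in this index
    "step3",          # placeholder name
    "step4",          # placeholder name
    "step5",          # placeholder name
]

for step in STEPS:
    # check=True aborts the run on the first failed step (fail-fast DAG)
    subprocess.run(["devenv", "tasks", "run", step], check=True)
```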

---

## Meeting docs

### `meeting-talking-points.md`
Structured talking points for the Banijay/Laurent meeting: why generic AI fails, the bible-first method, what we've solved, pipeline state, why Wan2.2, budget fallback (image tier), the two-tier offer.

### `meeting-executive-summary.md`
One-page version: what we're building, how, why, pipeline state, key numbers.

---

## Research docs (historical context)

| Doc | Contents |
|---|---|
| `s7-bible-gaps.md` | Gap analysis from the original bible build |
| `s7-benchmark-packs.md` | Three benchmark evaluation packs |
| `s7-evaluation-rubric.md` | How to evaluate generated output |
| `s7-content-source-map.md` | Where each piece of content comes from |
| `s7-three-axis-framework.md` | Three-axis content framework |
| `s7-animation-style-vs-video-diffusion.md` | Style analysis for generation |
| `s7-model-selection-and-training-priorities.md` | Early model selection analysis |
| `official-benchmark-frame-analysis.md` | Frame analysis from official materials |
| `model-comparison-gemma4-vs-qwen3vl.md` | VLM comparison results |

---

## Key file locations

```
materials/
  training-data/
    manifest.json              ← complete dataset metadata
      training_caption field   ← (in manifest.json) USE THIS for training
    ltx2_dataset.json          ← LTX-2 format
    wan21_metadata.json        ← Wan2.2 format (PRIMARY)
    clips/                     ← 1,551 video clips (720p)
    first_frames/              ← 1,551 PNG keyframes

  benchmark/youtube-s7-validation/bible/
    catalog-shots-patched.jsonl  ← 2,852 shots
    cross-reference/
      gadget-lockdown-v3.json    ← gadget visual DB
      villain-database-final.json
      villain-visual-db.json
    final/
      speaker-character-maps.json
      locations-canonical.json
    episodes/{id}/
      transcript-with-speakers.json
      diarization.json

  reference/totally-spies/youtube-official/
    inventory.json             ← all 104 reference downloads
    pure-s7/                   ← 13 full episode videos

nix/
  pipeline.nix                 ← full training DAG
  pipeline-inputs.nix          ← model paths (Wan2.2-TI2V-5B)
  tools.nix                    ← devenv task definitions

tools/
  vlm_repass_conservative.py   ← two-pass VLM caption script
  vlm_caption_job.py           ← original full caption pass
  run_wan21_train.py           ← Wan2.2 LoRA training wrapper
  caption_clips.py             ← step2-caption (pipeline)
```
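
A throwaway sanity check against the figures quoted above. It assumes the repo root as the working directory; paths and expected counts are taken verbatim from this index.

```python
# Sanity-check sketch: verify the key locations exist and match the counts
# quoted in the tree above. Run from the repo root.
from pathlib import Path

TRAIN = Path("materials/training-data")
BIBLE = Path("materials/benchmark/youtube-s7-validation/bible")

expected_counts = {
    TRAIN / "clips": 1551,         # 720p video clips
    TRAIN / "first_frames": 1551,  # PNG keyframes
}
for directory, expected in expected_counts.items():
    actual = sum(1 for _ in directory.iterdir())
    marker = "OK" if actual == expected else f"MISMATCH (found {actual})"
    print(f"{directory}: {marker}")

# JSONL is one JSON object per line, so a line count gives the shot total.
with (BIBLE / "catalog-shots-patched.jsonl").open() as f:
    shots = sum(1 for _ in f)
print(f"catalog shots: {shots} (index says 2,852)")
```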
