# Proposal scope — image-first cycle (revised 2026-04-13)

Purpose: internal record of the adjusted proposal scope following Banijay budget confirmation.

**Critical clarification:** This revision maintains the **same training corpus** (6 characters, 5 gadgets, 3 locations) as originally planned. We drop only the **custom video model head**—not the training scope. The keyframes become the primary handoff for commercial video tools (Seedance, Runway).

## Background

Banijay confirmed half the budget the agency was planning to bill (EUR 7.5K/month vs 15K). Rather than force a partial video build, we pivot to a controlled image generation layer that Laurent's team can feed into commercial services as keyed plates.

**Key efficiency point:** We're preserving the training corpus investment (character identity modeling, gadget physics, location adaptation) while removing the video model development overhead. Same dataset work, different output leverage.

## Main deliverables

- a text model and agent for brief interpretation, shot and storyboard planning
- an image model for key art, closeups, trio compositions, gadgets, and locked keyframes designed for image-to-video handoff
- **no custom video model in this cycle**—keyframes optimized for Seedance/Runway/etc

## Objective

End the first cycle with owned models trained on franchise assets (7 seasons of established visual identity) for use with commercial video tools. Character consistency anchored; motion interpolation externalized.

## Included scope for the first cycle

### Characters (6 total) — SAME AS ORIGINAL

From core cast across all 7 seasons:

- **3 main spies**: Sam, Clover, Alex (the anchor trio)
- **3 supporting**: Jerry (founder/director), Mandy, Zerlina (Season 7 director)

**Same training corpus as April 2 proposal.** Mandy and Zerlina maintained to preserve dataset value—training on 6 characters amortizes common infrastructure (preprocessing, conditioning architecture, validation) across more assets.

Additional characters (beyond 6) would be scoped separately:
- Toby, Glitterstar, Britney (spy-in-training)
- Villains: Cyberchac, The Curator, **Shmagi** (former WOOHP agent, inventor of "Ultra Fluff and Tickle Cotton Balls")
- Add-on pricing: $1,500-2,500 depending on screen time and visual complexity

### Gadgets and props (5 total) — SAME AS ORIGINAL

Given the gadget-forward nature of the show:

- **Compowder** (iconic computer-makeup hybrid device)
- **Ultra Fluff and Tickle Cotton Balls** (invented by former WOOHP agent Shmagi)
- **Catsuit** (signature spy uniform—purple/red/yellow)
- Body-swapping laser (action gadget)
- One additional prop/vehicle (TBD)

**Same training corpus as April 2 proposal.** Physics understanding, deployment mechanics, and in-hand poses trained for all 5—same overhead, same value.

Additional gadgets: $1,000-1,500 each.

### Locations (3 total) — SAME AS ORIGINAL

Franchise spans Beverly Hills, WOOHP headquarters, the WOOHP Express, plus Season 7 locations (Singapore, Paris, Amazon):

- **WOOHP interior** (headquarters or similar)
- **Field/epic location** (Amazon Rainforest, Paris streets, or Singapore)
- **Urban/residential** (Beverly Hills or Malibu University campus)

**Same training corpus as April 2 proposal.** Lighting adaptation, spatial consistency, and style modeling for all 3.

Additional locations: $1,000-1,500 each.

### Output format change — NOT training scope change

| What | April 2 (Full) | Current (Revised) |
|------|---------------|-------------------|
| Characters trained | 6 | **6** (preserved) |
| Gadgets trained | 5 | **5** (preserved) |
| Locations trained | 3 | **3** (preserved) |
| Video model | ✓ Built | ✗ Dropped |
| Image model | ✓ Built | ✓ Built |
| Keyframes | ✓ Output | ✓ **Primary deliverable** |
| **Total fee** | $25,000 | **$12,500** (-50%) |

**Efficiency note:** Per-concept investment drops from $1,786 to $893 because video model development removed—**not** because training scope reduced.

### Character coverage includes

- hero closeups
- medium shots
- full figure
- trio groupings
- object interaction
- standard promo framing

All anchored to franchise visual identity (7 seasons).

## Timing from full materials acceptance

- first `3` weeks for:
  - the agent
  - the first image model pass
  - output tests across closeups, trio shots, gadgets, and locations
- following `3` to `4` weeks for:
  - finetuning on commercial video handoff requirements
  - validation
  - final delivery of the `2` models (text + image) and workflow package
- **total: 6-7 weeks**

## Commercial structure

- total: `$12,500`
- `30%` ($3,750) upfront to prep and lock capacity
- `40%` ($5,000) when the full materials package lands and is accepted for processing
- `30%` ($3,750) on final delivery of the models and workflow package

**Note:** Same payment structure as before, halved amounts.

## Assumptions

- the full materials package lands complete and to spec (franchise-level, not just Season 7)
- the `300` minutes of runtime materially covers **all 6 characters, 5 gadgets, and 3 locations**
- the immediate output target is short-form promotional material
- visual style anchored to established franchise identity
- editorial, sound, captions, and final finishing sit outside this scope unless added separately
- **additional characters/gadgets/locations beyond 6/5/3** would be scoped separately
- output is scoped for `16:9` frames at `720p/1080p`
- upscaling is not included
- **commercial video tool costs and iteration overheads are production's responsibility**

## Materials needed to start

- the series bible (**franchise-level**, not just Season 7)
- one asset pack per included character, gadget, object, and location
  - **character packs: 6 total** (Sam, Clover, Alex, Jerry, Mandy, Zerlina)
    - multi-angle views, expressions, full body, medium, and closeup coverage
  - **gadget packs: 5 total** (Compowder, Ultra Fluff, Catsuit, laser, prop)
    - front, side, three-quarter, and in-hand references
  - **location packs: 3 total** (WOOHP, field, urban)
    - wides, mediums, empty background plates without characters/overlays
- clean environment plates where available
- at least `300` minutes of master runtime (**across seasons as available**)
- scripts and storyboards for that `300` minute package if available
- the branding pack
- a usage sheet marking what is approved for training and what is reference only, if needed

## Source delivery spec

- runtime should be original masters where possible
- deliver at original resolution and frame rate
- no burnt subtitles, logos, or graphics
- **sufficient coverage for 6 characters, 5 gadgets, 3 locations**

## Key risk factors for commercial video tool workflow

The controllability problem with off-the-shelf video tools (Seedance, Runway, etc.) is identity drift across shots. Our models solve the **starting point** (character-consistent keyframes), though production will still face:

- **Temporal coherence**: 2-4x iteration required per shot (external tool limitation)
- **Character drift between frames**: keyframe holds at frame 0, shifts by frame 60 (external tool limitation)
- **Limited editorial cutability**: modular clip workflow, not traditional ripple-edit

**Trade-off summary:** We preserve training corpus value (character/gadget/location identity) but defer motion/temporal modeling to commercial tools. Quality cost: degraded motion coherence. Efficiency gain: 50% fee reduction while maintaining dataset scope.
