> **⚠️ SUPERSEDED** — This document is a historical receipt. See `s7-current-understanding.md` for the authoritative current position.

# Gemma 3 12B pack confirmation

## Purpose

Confirm or adjust the current S7 model-selection / training-priority memo using a
fresh local `gemma3:12b` run over the three benchmark packs (dialogue / hold,
action / gadget, character / costume).

Raw output:

- `materials/benchmark/youtube-s7-validation/packs-eval-gemma3-12b.json`

## Result

### Overall

Gemma 3 12B **confirms the current prioritization**.

It does not materially overturn the existing conclusions. It mainly sharpens
how the failure modes should be described.

## What Gemma confirmed

### 1. Dialogue / hold pack is priority-critical

Gemma described the key failure modes as:

- pose degradation / wobble
- identity drift
- costume artifacts
- background instability
- gadget disappearance or distortion
- motion bleed during supposed holds

This strongly confirms the current memo's emphasis on:

- low-motion stability
- held-frame discipline
- stable faces and costumes
- static-background consistency

### 2. Action / gadget pack is still about control, not realism

Gemma emphasized:

- gadget readability
- temporal consistency under dynamic action
- background stability
- motion distortion as a main generic-model risk
- cutout-style edge / compositing artifacts as a likely failure mode

This confirms the memo's current reading that the action pack is not about
maximizing fluid realism. It is about preserving controlled, legible,
cutout-style action under higher pressure.

### 3. Character / costume consistency remains the first gate

Gemma highlighted:

- identity drift
- costume distortion
- silhouette instability
- gadget loss / alteration
- temporal jitter

This directly confirms the memo's current ordering:

1. lock identity
2. lock holds
3. lock controlled action / gadget readability
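
The ordering above can be read as a sequence of gates, where a later gate is only worth evaluating once the earlier ones pass. A minimal Python sketch of that reading follows. The gate names and the `first_failing_gate` helper are hypothetical labels introduced for illustration; they are not part of the memo or of any existing tooling.

```python
from typing import Optional

# Gate order taken from the list above. The identifiers are hypothetical labels.
GATES = ["identity", "holds", "action_gadget"]


def first_failing_gate(results: dict[str, bool]) -> Optional[str]:
    """Return the earliest gate that has not passed, or None if all passed.

    A gate missing from `results` is treated as not yet passed.
    """
    for gate in GATES:
        if not results.get(gate, False):
            return gate
    return None
```

For example, `first_failing_gate({"identity": True, "holds": False, "action_gadget": True})` returns `"holds"`: the action result is ignored until holds are locked.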

## What Gemma adjusted or sharpened

Gemma adds useful language for a few specific risks that should be carried into
future evaluations:

- **pose degradation** in held shots
- **motion bleed** during intended stillness
- **cutout artifacts** / popping at character edges during action
- **background bleed** affecting character appearance

These do not change the training priorities, but they are worth adding to
future scoring rubrics.
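
One way these four risks could be folded into a rubric is as named flags that a reviewer attaches to each clip, which are then tallied per run. The sketch below assumes that shape. The flag identifiers, the `ClipReview` class, and the `tally` helper are all hypothetical; the memo does not define a rubric format.

```python
from dataclasses import dataclass, field

# The four risks named above, keyed by hypothetical flag identifiers.
RUBRIC_FLAGS = {
    "pose_degradation": "pose degradation in held shots",
    "motion_bleed": "motion bleed during intended stillness",
    "cutout_artifacts": "cutout artifacts / popping at character edges during action",
    "background_bleed": "background bleed affecting character appearance",
}


@dataclass
class ClipReview:
    """Reviewer flags for a single clip (hypothetical structure)."""
    clip_id: str
    flags: set[str] = field(default_factory=set)

    def flag(self, name: str) -> None:
        # Reject anything that is not one of the rubric's known flags.
        if name not in RUBRIC_FLAGS:
            raise ValueError(f"unknown rubric flag: {name}")
        self.flags.add(name)


def tally(reviews: list[ClipReview]) -> dict[str, int]:
    """Count how many clips were marked with each rubric flag."""
    counts = {name: 0 for name in RUBRIC_FLAGS}
    for review in reviews:
        for name in review.flags:
            counts[name] += 1
    return counts
```

Keeping the flags as a fixed vocabulary means counts stay comparable across runs, and a mistyped flag fails loudly instead of creating a new category.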

## Final assessment

The current memo remains correct.

The current priority order still stands:

1. **character / costume consistency**
2. **dialogue / hold temporal consistency**
3. **action / gadget control**

If anything, Gemma makes the same conclusion more concrete by describing the
failure modes in more production-facing language.
