[
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/8jyTvAc-w6k/f00379.jpg",
    "episode": "Every EPISODE 3 of Totally Spies! \ud83c\udf38 Seasons 1-7 \u2728 FULL Episode Compilation #25years #anniversary",
    "source": "s1-6-comparison",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 95.2,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: near-hold\nCHARACTERS_VISIBLE: Alex, Clover, Sam (from Totally Spies!)\nSHOT_FRAMING: medium-wide\nLINEWORK: Consistent medium-thickness outlines with clean, smooth curves; no visible sketchiness or variation in line weight.\nCOLORING: flat-fill with subtle shadows; bright, saturated character colors (purple, red, teal) against a pastel-toned store background.\nEYE_STYLE: Large anime-influenced eyes (occupying ~1/3 of face height), oval-shaped with small white highlights; expressive and cartoonish.\nBODY_PROPORTIONS: Western cartoon proportions with slightly elongated limbs; head-to-body ratio ~1:6, blending anime eye style with more realistic body structure.\nBACKGROUND: Detailed but simplified store interior (clothing racks, mannequin, signage); static with no motion-blur; same production quality as characters.\nDESIGN_ERA: Early 2000s traditional/digital hybrid animation (consistent with Totally Spies! Season 1\u20132).\nNOTABLE: The \"FAS\" shopping bag held by Alex, a recurring visual gag referencing the show\u2019s fashion-obsessed tone.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. First, the scene type: it's three characters in a store, probably talking or interacting, so dialogue/talking-head. Motion state: they're standing, maybe",
    "qwen3vl_seconds": 10.8
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xxc4GQCan0U/f00638.jpg",
    "episode": "Totally Spies! Season 7 Episode 8 - Mega Moon Cheese | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 2,
    "gemma4_response": "",
    "gemma4_seconds": 89.9,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Alex, Clover, Sam\nSHOT_FRAMING: medium-wide\nLINEWORK: Consistent medium-thickness outlines with clean, smooth edges; no visible sketchiness or variation in line weight.\nCOLORING: flat-fill with minimal shadowing; bright, saturated palette (vibrant greens for foliage, pastel character outfits) typical of early 2000s Western animation.\nEYE_STYLE: Large oval eyes occupying ~1/3 of face height; prominent white highlights and detailed irises convey concern/surprise.\nBODY_PROPORTIONS: Head-to-body ratio ~1:4 with slightly elongated limbs; stylized Western cartoon proportions (less exaggerated than anime, more realistic than classic American cartoons).\nBACKGROUND: Simplified static foliage with flat coloring; matches character production quality but lacks fine detail.\nDESIGN_ERA: Early 2000s digital animation (consistent with \"Totally Spies!\" 2001\u20132008 production era, showing early digital rigging with hand-drawn aesthetic).\nNOTABLE: The pink gem (gadget) held by Clover contrasts sharply with the chaotic physical comedy of the tangled characters, blending action and humor.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. First, SCENE_TYPE: The characters are in a situation that looks like a reaction to something, maybe a comedy-reaction since they're in a tangled position",
    "qwen3vl_seconds": 21.7
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/DgZSBwIyP4o/f00236.jpg",
    "episode": "Totally Spies! Season 7 Episode 10 - Totally Pawsome | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 91.8,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Penny Proud, Trudy Proud, [male character in black suit]\nSHOT_FRAMING: wide\nLINEWORK: Clean, consistent outline thickness with smooth vector-style linework; no visible hand-drawn imperfections or variation in stroke weight.\nCOLORING: flat-with-shadow; vibrant, saturated color palette with distinct separation between character hues (red, blue, black) and cool-toned background.\nEYE_STYLE: Eyes are large relative to face, oval-shaped with small circular highlights; convey concern or curiosity through slight upward tilt and pupil placement.\nBODY_PROPORTIONS: Western cartoon proportions: head is approximately 1/4 of total height, limbs slightly elongated but not extreme, with naturalistic posture.\nBACKGROUND: detailed geometric glass ceiling structure with repeating panels; static, same high-quality digital production as characters.\nDESIGN_ERA: newer rigged/digital animation (2020s style)\nNOTABLE: Low-angle perspective through a geometric glass ceiling, emphasizing the characters' viewpoint from above while framing them within the architectural grid.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are looking down, probably reacting to something below. So it's a comedy-reaction or maybe location-establish? Wait, ",
    "qwen3vl_seconds": 36.7
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/8MtYWoNOL3Y/f00279.jpg",
    "episode": "Totally Spies! Season 7 Episode 7 - Totally Trolling, Much? | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 93.7,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction  \nMOTION_STATE: mid-motion  \nCHARACTERS_VISIBLE: Sam, Clover, Alex, [unidentified female character]  \nSHOT_FRAMING: medium-wide  \nLINEWORK: Medium-thick outlines with consistent cleanliness and sharp edges, typical of polished digital animation. No visible line variation or hand-drawn imperfections.  \nCOLORING: flat-fill with subtle shadowing (e.g., snow depth, building crevices); vibrant character palettes (neon greens/purples) contrast with cool, muted winter background tones.  \nEYE_STYLE: Large oval eyes with prominent white highlights, occupying ~1/3 of the face; expressive (wide pupils, angled brows) to convey surprise.  \nBODY_PROPORTIONS: Head-to-body ratio ~1:6, Western cartoon style with slightly elongated limbs; less exaggerated than anime but more stylized than realistic.  \nBACKGROUND: Detailed yet simplified (stylized snow-covered trees, geometric building); static with no motion-blur, matching character production quality.  \nDESIGN_ERA: Newer digital animation (2000s\u20132010s), evident from crisp vector-like linework and flat digital coloring.  \nNOTABLE: Glowing blue cylindrical device on a stand in the foreground, emitting light that contrasts sharply with the snowy environment.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: There are characters interacting, maybe reacting to something. The one on the right is facing the others, who seem to be in a state ",
    "qwen3vl_seconds": 25.2
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/fdC6OBDmQGM/f00034.jpg",
    "episode": "Totally Spies! Season 7 Episode 11 - The DAH-WHO | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 94.1,
    "qwen3vl_response": "SCENE_TYPE: location-establish\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: background-only\nSHOT_FRAMING: wide\nLINEWORK: Outline thickness is consistent and clean, with smooth, uniform lines defining tree shapes and snow layers; no visible variation or roughness, typical of digital animation.\nCOLORING: flat-with-shadow; solid color fills with subtle gradient shadows on snow and tree trunks to suggest depth, dominated by cool blues, whites, and dark greens for a wintry palette.\nEYE_STYLE: Not visible due to character distance and size.\nBODY_PROPORTIONS: N/A (character too small to discern proportions).\nBACKGROUND: Detailed and static, with the same production quality as foreground elements; no motion blur, fully rendered with consistent stylization.\nDESIGN_ERA: Newer digital animation; clean vector-like lines and flat coloring with digital shading indicate modern production (post-2000s).\nNOTABLE: Stylized, rounded snow clumps on tree branches create a whimsical yet realistic winter aesthetic, distinctive for its balance of cartoonish shape and atmospheric detail.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The image shows a snowy forest with trees and a small character (maybe a deer or animal) in the background. It's setting the locatio",
    "qwen3vl_seconds": 18.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xUo79ZckeK0/f00355.jpg",
    "episode": "Totally Spies! Season 7 Episode 6 - It's Totally a Test | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 91.3,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: subtle-motion\nCHARACTERS_VISIBLE: Clover, Sam, Alex\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines with clean, smooth edges; no visible roughness or variation, typical of digital animation.\nCOLORING: flat-fill with subtle shadowing on clothing; bright, saturated color palette dominated by pinks, blues, and yellows.\nEYE_STYLE: Large anime-influenced eyes (nearly 1/3 of face height), oval-shaped with prominent white highlights; expressive concern shown through pupil placement and eyelid angles.\nBODY_PROPORTIONS: Head-to-body ratio ~1:3 with slightly elongated limbs; Western cartoon base with anime-inspired eye proportions.\nBACKGROUND: Simplified but detailed (furniture, framed art), static with no motion-blur; matches character production quality in digital cleanliness.\nDESIGN_ERA: Newer digital animation (2010s+), evident from smooth vector-style linework and consistent digital coloring.\nNOTABLE: Character-specific spy gadgets (lipstick device, cactus-shaped tool) held in-hand, diagnostic of the \"Totally Spies!\" series' core premise.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are standing and talking, showing expressions. So dialogue/talking-head.\n\nMOTION_STATE: They're mostly still, just su",
    "qwen3vl_seconds": 13.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/4L1cJkYeaT4/f01075.jpg",
    "episode": "Totally Spies! Season 7 Episode 5 - Totally Vintage | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 4,
    "gemma4_response": "",
    "gemma4_seconds": 92.3,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: Starfire\nSHOT_FRAMING: medium-closeup\nLINEWORK: Consistent medium-thickness outlines with clean, sharp edges; no visible line variation or roughness, typical of digital animation.\nCOLORING: flat-fill with subtle shadows; vibrant palette featuring bold reds (outfit), greens (food/background), and cool blues (cityscape), with minimal shading gradients.\nEYE_STYLE: Eyes are tightly closed in a wince, with exaggerated anime-influenced size relative to the face; when open, they\u2019d feature large, expressive shapes with highlights, but here the squint emphasizes comedic discomfort.\nBODY_PROPORTIONS: Slightly anime-influenced Western cartoon style; head-to-body ratio is moderate (not extreme chibi), with stylized but proportionally balanced limbs.\nBACKGROUND: Detailed cityscape through windows (static, no motion-blur); production quality matches characters, with clean digital rendering of buildings and interior elements.\nDESIGN_ERA: Mid-2000s digital animation (consistent with *Teen Titans* series), blending traditional animation expressiveness with early digital rigging/inking.\nNOTABLE: The anthropomorphic green food item (resembling a sushi roll) with a comically distressed expression, paired with Starfire\u2019s exaggerated wince, creates a whimsical focal point.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The character is reacting to something, probably a comedic situation with the food. So comedy-reaction.\n\nMOTION_STATE: The character",
    "qwen3vl_seconds": 14.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/8MtYWoNOL3Y/f00205.jpg",
    "episode": "Totally Spies! Season 7 Episode 7 - Totally Trolling, Much? | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 92.8,
    "qwen3vl_response": "SCENE_TYPE: action/fight\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: Blossom, Bubbles, Buttercup (Powerpuff Girls)\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines with clean, smooth lines; no visible roughness or variation in stroke weight.\nCOLORING: flat-fill with subtle shadows; vibrant primary and secondary color palette (blue, purple, green, pink) typical of the series.\nEYE_STYLE: Large eyes relative to face, oval-shaped with prominent white highlights; highly expressive with strained/angry expressions visible.\nBODY_PROPORTIONS: Head-to-body ratio approximately 1:3; stylized Western cartoon proportions with slightly elongated limbs but not anime-influenced.\nBACKGROUND: Simplified but detailed enough to establish location (school interior); static with no motion-blur; same production quality as characters.\nDESIGN_ERA: Original 1990s traditional animation style (pre-digital rigging era).\nNOTABLE: The exaggerated physical comedy of the three girls in a tangled, strained struggle with matching angry expressions is diagnostic of the Powerpuff Girls' action-comedy approach.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The image shows three characters in a physical struggle, so it's action/fight. \n\nMOTION_STATE: They're in the middle of a dynamic ac",
    "qwen3vl_seconds": 14.7
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/7lA-b6ou8yc/f00141.jpg",
    "episode": "Totally Spies! Season 7 Episode 1 - Frankenpanda | FULL EPISODE \ud83d\udd75\ufe0f NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 90.2,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: subtle-motion\nCHARACTERS_VISIBLE: Sam, Clover, Alex, additional female character (yellow outfit), partial male character (right side)\nSHOT_FRAMING: medium-closeup\nLINEWORK: Clean, consistent medium-thick outlines with no roughness; typical of early 2000s Western TV animation.\nCOLORING: Flat-fill with subtle shadows; bright, saturated pastel palette (purple, green, yellow, red) with high contrast.\nEYE_STYLE: Large oval eyes relative to face, prominent white highlights, highly expressive (conveying shock/surprise in the central character).\nBODY_PROPORTIONS: Western cartoon style; head-to-body ratio ~1:3, limbs slightly stylized but not extreme.\nBACKGROUND: Simplified control room with geometric shapes/screens; static, same production quality as characters.\nDESIGN_ERA: Early 2000s traditional animation (consistent with *Totally Spies!*\u2019 2001\u20132008 production era).\nNOTABLE: The pink, flower-shaped spy gadget on the table is a signature whimsical design element of the series.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are reacting, especially the one in purple with an open mouth. So comedy-reaction makes sense.\n\nMOTION_STATE: The cha",
    "qwen3vl_seconds": 18.8
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/Jl77Vup5yHw/f00850.jpg",
    "episode": "Totally Spies! Season 7 Episode 12 - Undercover Supervillains | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 3,
    "gemma4_response": "",
    "gemma4_seconds": 89.9,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Luz Noceda, Amity Blight\nSHOT_FRAMING: closeup\nLINEWORK: Consistent medium-thickness outlines with clean, sharp edges; no visible roughness or variation, typical of polished digital animation.\nCOLORING: flat-fill with subtle shadowing on faces; warm color palette dominated by oranges and reds in the background, contrasting with characters\u2019 distinct skin tones and hair colors.\nEYE_STYLE: Large anime-influenced eyes relative to face; Luz has brown eyes with small highlights, Amity has blue eyes with prominent highlights, both conveying serious expressiveness.\nBODY_PROPORTIONS: Slightly exaggerated head-to-body ratio (Western cartoon with anime influence); limbs not visible, but facial proportions reflect modern hybrid animation style.\nBACKGROUND: Simplified and static with soft, blurred warm tones; production quality matches characters but lacks fine detail.\nDESIGN_ERA: Newer digital animation (consistent with 2020s TV series production)\nNOTABLE: The stark contrast between the characters\u2019 serious expressions and the fiery, warm background creates a dramatic visual tension unique to this scene\u2019s emotional tone.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are facing forward, likely in a dialogue scene. So dialogue/talking-head.\n\nMOTION_STATE: They're holding still, no vi",
    "qwen3vl_seconds": 15.4
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/7lA-b6ou8yc/f01240.jpg",
    "episode": "Totally Spies! Season 7 Episode 1 - Frankenpanda | FULL EPISODE \ud83d\udd75\ufe0f NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 4,
    "gemma4_response": "",
    "gemma4_seconds": 89.3,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: near-hold\nCHARACTERS_VISIBLE: one character (unidentified female with pink-streaked hair, colorful apron)\nSHOT_FRAMING: medium-closeup\nLINEWORK: Clean, consistent outline thickness with smooth digital precision; no visible texture or variation, typical of modern vector-based animation.\nCOLORING: flat-fill with subtle shadow accents; vibrant, saturated palette dominated by pinks, purples, and blues with high-contrast highlights.\nEYE_STYLE: Eyes closed in a joyful expression, suggesting large anime-influenced eyes when open\u2014implied expressiveness with likely prominent highlights and rounded shape relative to face.\nBODY_PROPORTIONS: Slightly exaggerated head-to-body ratio (cartoonish but not extreme); limbs proportionally balanced, blending Western cartoon styling with anime-inspired facial expressiveness.\nBACKGROUND: Detailed but simplified (cafe interior with plants, furniture); static with no motion-blur; matches characters\u2019 production quality in cleanliness and color vibrancy.\nDESIGN_ERA: Newer digital animation (2010s\u20132020s), indicative of rigged/vector-based production with smooth lines and flat color rendering.\nNOTABLE: The two elaborate purple drinks with whipped cream and straws on the tray, paired with the whimsical pink fish in the background, create a distinctive, playful visual focal point.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The character is holding a tray with drinks, smiling, so it's likely a dialogue or talking-head scene since she's probably interacti",
    "qwen3vl_seconds": 29.1
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/LcD5zxm4vKM/f01082.jpg",
    "episode": "Totally Spies! Season 7 Episode 2 - It Takes A Slob | FULL EPISODE \ud83d\udd75\ufe0f NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 4,
    "gemma4_response": "",
    "gemma4_seconds": 89.6,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: subtle-motion\nCHARACTERS_VISIBLE: two characters (unidentified)\nSHOT_FRAMING: medium-closeup\nLINEWORK: Consistent medium-thickness outlines with clean, sharp edges; no visible line variation or roughness, typical of polished digital animation.\nCOLORING: flat-with-shadow \u2014 solid base colors with subtle gradient shading on faces and clothing, warm earth-toned palette (browns, oranges, deep greens).\nEYE_STYLE: Large relative to face, almond-shaped with small white circular highlights; eyes convey concern through slight squint and directional gaze.\nBODY_PROPORTIONS: Western cartoon with anime-influenced eye design; head-to-body ratio approx 1:5 (visible torso suggests standard cartoon proportions), limbs not visible.\nBACKGROUND: simplified static gradient background (muted browns/oranges), less detailed than characters but same digital production quality.\nDESIGN_ERA: newer digital animation (smooth lines, consistent coloring, likely rigged character models).\nNOTABLE: Red-haired character\u2019s glowing yellow-lensed goggles with visible light reflection are highly distinctive.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are talking, so it's dialogue/talking-head. \n\nMOTION_STATE: They seem to be in a still pose, maybe subtle-motion sinc",
    "qwen3vl_seconds": 29.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/FiCclhRQbiw/f00566.jpg",
    "episode": "Totally Spies! Season 7 Episode 4 - Creepy Crawly Creature Catcher | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 2,
    "gemma4_response": "",
    "gemma4_seconds": 89.3,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Starfire\nSHOT_FRAMING: closeup\nLINEWORK: Consistent medium-thickness outlines with clean, smooth curves; no visible line variation or roughness.\nCOLORING: flat-fill for characters with a gradient-shaded background (dark blue to pink sky); palette features vibrant orange hair, green eyes, and muted green outfit against a starry gradient backdrop.\nEYE_STYLE: Large, oval-shaped eyes with prominent white highlights and thick black lashes; eyes are expressive and dominate the face, typical of stylized superhero animation.\nBODY_PROPORTIONS: Anime-influenced with a large head-to-body ratio (visible upper torso suggests slender build); long hair and stylized features align with Western superhero anime fusion.\nBACKGROUND: Simplified static gradient sky with scattered stars; production quality matches character design with smooth digital coloring.\nDESIGN_ERA: Mid-2000s traditional animation style (consistent with Teen Titans series era), showing hand-drawn aesthetic with digital coloring.\nNOTABLE: Distinctive orange hair with yellow streaks and the gradient sky background transitioning from deep blue to pink, which is characteristic of the show\u2019s stylized nighttime settings.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. First, SCENE_TYPE: it's a character speaking, so dialogue/talking-head. MOTION_STATE: the character is still, so full-hold. CHARACTERS_VISIBLE: the chara",
    "qwen3vl_seconds": 9.5
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/Jl77Vup5yHw/f01097.jpg",
    "episode": "Totally Spies! Season 7 Episode 12 - Undercover Supervillains | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 4,
    "gemma4_response": "",
    "gemma4_seconds": 93.5,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: near-hold\nCHARACTERS_VISIBLE: Loretta Callisto\nSHOT_FRAMING: closeup\nLINEWORK: Clean, consistent outline thickness with smooth, precise edges; no visible roughness or variation, typical of digital animation.\nCOLORING: flat-fill with subtle shadowing; warm palette dominated by oranges (outfit) and reds (background), creating high-contrast intensity.\nEYE_STYLE: Large relative to face, almond-shaped with prominent highlights; expressive furrowed brows and wide pupils convey anger/frustration.\nBODY_PROPORTIONS: Slightly Western cartoon with anime-influenced eye size; head-to-body ratio is standard for TV animation (not extreme), upper torso visible with proportional shoulders.\nBACKGROUND: Simplified static background with textured red wall; same digital production quality as characters, no motion-blur.\nDESIGN_ERA: Newer rigged/digital animation (mid-2010s TV style, consistent with *Miles from Tomorrowland* era).\nNOTABLE: The character\u2019s intensely expressive angry face (furrowed brows, open mouth) paired with the warm, high-contrast color scheme (orange outfit against red background) creates a striking emotional focal point.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The character is speaking or reacting, so dialogue/talking-head. \n\nMOTION_STATE: The character is mostly still, maybe a slight expre",
    "qwen3vl_seconds": 14.0
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/DgZSBwIyP4o/f00305.jpg",
    "episode": "Totally Spies! Season 7 Episode 10 - Totally Pawsome | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 92.4,
    "qwen3vl_response": "SCENE_TYPE: location-establish\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Lois Lane, Superman\nSHOT_FRAMING: medium\nLINEWORK: Consistent, clean digital outlines with uniform thickness; smooth and precise with no visible hand-drawn imperfections.\nCOLORING: flat-fill with minimal shading; bright, saturated primary colors (yellow, red, blue) dominate the palette.\nEYE_STYLE: Background character (Superman) has large, anime-influenced eyes with prominent highlights and expressive shape; foreground character\u2019s face not visible.\nBODY_PROPORTIONS: Slightly exaggerated head-to-body ratio (1:6) with stylized limbs; Western cartoon style with anime influence.\nBACKGROUND: Simplified geometric shapes with minimal detail; static and matching character production quality.\nDESIGN_ERA: Newer digital animation (2020s-era, consistent with 2023\u2019s \"My Adventures with Superman\").\nNOTABLE: Distinctive purple circular emblem on the back of Lois Lane\u2019s yellow suit.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The frame shows two characters, one in the foreground (back to us) and another in the background. It seems like they're in a setting",
    "qwen3vl_seconds": 35.4
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xxc4GQCan0U/f00636.jpg",
    "episode": "Totally Spies! Season 7 Episode 8 - Mega Moon Cheese | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 2,
    "gemma4_response": "",
    "gemma4_seconds": 82.4,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: Alex, Sam, Clover (from Totally Spies!)\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines with clean, smooth edges; no visible roughness or variation, typical of digital animation.\nCOLORING: flat-fill with subtle shadowing on clothing; bright, saturated color palette dominated by greens and pastels.\nEYE_STYLE: Large oval eyes occupying significant face space, with small circular highlights; wide-eyed expressions convey surprise/concern.\nBODY_PROPORTIONS: Western cartoon style with slightly enlarged heads (1:6 head-to-body ratio), natural limb proportions, and stylized but not exaggerated anatomy.\nBACKGROUND: Simplified static foliage with minimal detail; same digital production quality as characters but less intricate.\nDESIGN_ERA: Early 2000s digital animation (consistent with Totally Spies! 2001\u20132008 production era).\nNOTABLE: The pink heart-shaped gadget held by Sam is a distinctive series-specific prop, and the characters\u2019 tangled, off-balance poses suggest a comedic fall or surprise moment.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are in a situation that looks like they're reacting to something, maybe a comedy-reaction because of their expression",
    "qwen3vl_seconds": 13.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/FiCclhRQbiw/f01110.jpg",
    "episode": "Totally Spies! Season 7 Episode 4 - Creepy Crawly Creature Catcher | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 4,
    "gemma4_response": "",
    "gemma4_seconds": 81.6,
    "qwen3vl_response": "SCENE_TYPE: action/fight\nMOTION_STATE: high-motion\nCHARACTERS_VISIBLE: Blossom, Buttercup\nSHOT_FRAMING: wide\nLINEWORK: Clean, consistent outline thickness with smooth curves; no visible sketchiness or variation in line weight.\nCOLORING: flat-fill with subtle shadow accents; vibrant primary colors (green, orange) against a dark blue night palette with purple lens flares.\nEYE_STYLE: Large oval eyes occupying significant face space, with small circular highlights; highly expressive with defined eyelashes.\nBODY_PROPORTIONS: Anime-influenced with oversized heads (1:3 head-to-body ratio) and elongated limbs relative to torso.\nBACKGROUND: Simplified cityscape with geometric buildings; static, same digital production quality as characters.\nDESIGN_ERA: Newer digital animation (2016 reboot era), with vector-based rigging evident in smooth poses.\nNOTABLE: The contrasting lens flares on both sides of the frame create a dynamic visual anchor while emphasizing the nighttime urban setting.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. First, SCENE_TYPE: the characters are in mid-air, likely flying or leaping, so it's action/fight. MOTION_STATE: they're in dynamic poses, so high-motion.",
    "qwen3vl_seconds": 11.5
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/7lA-b6ou8yc/f00874.jpg",
    "episode": "Totally Spies! Season 7 Episode 1 - Frankenpanda | FULL EPISODE \ud83d\udd75\ufe0f NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 3,
    "gemma4_response": "",
    "gemma4_seconds": 82.1,
    "qwen3vl_response": "SCENE_TYPE: chase\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: robot (unidentified), three flying superheroes (unidentified)\nSHOT_FRAMING: wide\nLINEWORK: Outline thickness is consistent and moderately thick, with clean, smooth lines typical of digital animation; no visible sketchiness or variation in stroke weight.\nCOLORING: flat-fill with subtle shadows; vibrant color palette featuring bright primary colors for the flying characters, metallic grays for the robot, and a clear blue sky with soft white clouds.\nEYE_STYLE: The robot features a single large circular \"eye\" (a green-lensed optical sensor) with no human-like eyes; the flying characters are too distant to discern eye details, but the robot\u2019s eye is dominant and mechanical.\nBODY_PROPORTIONS: The robot has a bulky, oversized head-to-body ratio with thick limbs; the flying characters appear Western cartoon\u2013influenced with slightly exaggerated proportions but not extreme anime stylization.\nBACKGROUND: Detailed cityscape with static buildings and sky; production quality matches the characters, with clean digital rendering and no motion blur.\nDESIGN_ERA: Newer rigged/digital animation, evident from the smooth lines, consistent coloring, and polished 3D-influenced character modeling.\nNOTABLE: The stark size contrast between the massive robot and the tiny flying heroes, with the robot\u2019s single glowing green eye serving as a focal point against the bright city backdrop.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The image shows characters in mid-air, likely moving quickly, with a large robot and cityscape. It looks like a chase scene since th",
    "qwen3vl_seconds": 13.8
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xUo79ZckeK0/f00187.jpg",
    "episode": "Totally Spies! Season 7 Episode 6 - It's Totally a Test | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 82.5,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: near-hold\nCHARACTERS_VISIBLE: two characters (unidentified but distinct)\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines, clean and smooth with no visible roughness or variation in line weight.\nCOLORING: flat-with-shadow; vibrant character colors (yellow, pink) contrast with a muted, dark background, using subtle shadows for depth without gradients.\nEYE_STYLE: Large relative to face, oval-shaped with prominent highlights; expressive (left character\u2019s eyes show concern, right character\u2019s closed eyes suggest exasperation).\nBODY_PROPORTIONS: Western cartoon style; head-to-body ratio ~1:3, limbs proportionate to body with no extreme stylization.\nBACKGROUND: Simplified but detailed with shelves/boxes; static, same production quality as characters (digital flat-shading with minimal texture).\nDESIGN_ERA: Newer digital animation (rigged, clean lines, flat shading with shadows).\nNOTABLE: The bright yellow \"wet floor\" sign in the background is a distinctive contextual element indicating a store or warehouse setting.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The two characters are interacting, one is gesturing while the other looks on. It's a dialogue scene, so \"dialogue/talking-head\".\n\nM",
    "qwen3vl_seconds": 15.2
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/LcD5zxm4vKM/f00702.jpg",
    "episode": "Totally Spies! Season 7 Episode 2 - It Takes A Slob | FULL EPISODE \ud83d\udd75\ufe0f NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 2,
    "gemma4_response": "",
    "gemma4_seconds": 92.9,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: two female characters (unidentifiable by name)\nSHOT_FRAMING: medium-closeup\nLINEWORK: Thick, clean outlines with consistent weight throughout; no variation in line thickness, typical of polished digital animation.\nCOLORING: flat-fill with subtle shadows; bright, saturated color palette dominated by purples, blues, and pinks with minimal shading.\nEYE_STYLE: Large eyes partially obscured by oversized sunglasses (blue on left, pink on right); lenses have reflective highlights indicating light source, maintaining expressiveness despite coverage.\nBODY_PROPORTIONS: Western cartoon style with head-to-body ratio ~1:3; limbs proportionate to torso, not anime-influenced.\nBACKGROUND: Simplified static pattern (abstract cloud-like shapes); same digital production quality as characters but less detailed.\nDESIGN_ERA: Newer rigged/digital animation (2010s\u20132020s), evident from smooth lines and uniform coloring.\nNOTABLE: Oversized, uniquely shaped sunglasses (especially the pink geometric frame on the right character) serve as a diagnostic visual signature.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are interacting, one is pointing and the other has a hand near her mouth, so it's dialogue or talking-head. They're h",
    "qwen3vl_seconds": 43.0
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/fdC6OBDmQGM/f00663.jpg",
    "episode": "Totally Spies! Season 7 Episode 11 - The DAH-WHO | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 2,
    "gemma4_response": "",
    "gemma4_seconds": 97.9,
    "qwen3vl_response": "SCENE_TYPE: comedy-reaction\nMOTION_STATE: mid-motion\nCHARACTERS_VISIBLE: Anton\nSHOT_FRAMING: medium-closeup\nLINEWORK: Consistent medium-thick outlines with smooth, clean edges; no visible roughness or variation, typical of digital animation.\nCOLORING: flat-fill with subtle shadows; bright, saturated palette with dominant orange/yellow on the character and multicolored background audience.\nEYE_STYLE: Large oval eyes occupying significant face space, with small circular highlights; wide-eyed expression conveys surprise.\nBODY_PROPORTIONS: Western cartoon style with ~1:3 head-to-body ratio; limbs proportionate to torso, not anime-influenced.\nBACKGROUND: Simplified but detailed audience in static arrangement; no motion blur, matching character production quality.\nDESIGN_ERA: Newer digital animation (modern rigging/digital techniques evident in clean lines and consistent styling)\nNOTABLE: Distinctive white fur-trimmed ice skating helmet paired with the numbered athletic uniform, specific to the show's ice rink episode context.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The character is on an ice rink, maybe reacting or preparing for something. It's not dialogue (no speech bubble), not action/fight, ",
    "qwen3vl_seconds": 26.1
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/Jl77Vup5yHw/f00137.jpg",
    "episode": "Totally Spies! Season 7 Episode 12 - Undercover Supervillains | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 0,
    "gemma4_response": "",
    "gemma4_seconds": 83.7,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Kimber, Jerrica, Aja, Shana\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines with smooth, clean edges; no visible roughness or variation, typical of digital animation.\nCOLORING: flat-fill with subtle shadows; vibrant, saturated color palette featuring bold pinks, blues, greens, and reds.\nEYE_STYLE: Large anime-influenced eyes (nearly 1/3 of face height), oval-shaped with prominent white highlights; expressive concern visible in all characters.\nBODY_PROPORTIONS: Slightly stylized Western cartoon proportions (1:5 head-to-body ratio); limbs are proportional but with subtle anime influence in eye size and facial structure.\nBACKGROUND: Simplified static background with a large circular emblem; same digital production quality as characters, no motion-blur.\nDESIGN_ERA: Newer digital animation (2010s-era revival style, consistent with 2015's \"Jem and the Holograms\" series).\nNOTABLE: Distinctive 2015 Jem reboot character designs with modernized, diverse hairstyles (afro, beret, asymmetrical cut) and fashion-forward outfits.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are standing and looking serious, probably in a dialogue scene. So dialogue/talking-head.\n\nMOTION_STATE: They're all ",
    "qwen3vl_seconds": 19.0
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xxc4GQCan0U/f00354.jpg",
    "episode": "Totally Spies! Season 7 Episode 8 - Mega Moon Cheese | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 83.7,
    "qwen3vl_response": "SCENE_TYPE: location-establish\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: none\nSHOT_FRAMING: wide\nLINEWORK: Outline thickness is consistent and clean throughout, with smooth digital vector-style lines; no variation or roughness, indicating precise digital production.\nCOLORING: flat-fill with subtle gradient shading on architectural elements; uses a bright, pastel color palette with minimal shadowing.\nEYE_STYLE: none (no characters visible)\nBODY_PROPORTIONS: none (no characters visible)\nBACKGROUND: detailed but stylized/simplified; static; same production quality as the rest of the frame, with cohesive digital coloring and linework.\nDESIGN_ERA: newer digital animation (clean vector-like lines, flat color application, and polished aesthetic typical of modern TV animation)\nNOTABLE: The prominent \"W\" emblem on the building\u2019s archway serves as a key visual identifier for the location, suggesting a branded or iconic setting.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. First, SCENE_TYPE: it's a location-establish shot because it shows a building with a \"W\" logo, probably setting the scene for where the action takes plac",
    "qwen3vl_seconds": 17.2
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/SMoOJq93mAY/f00353.jpg",
    "episode": "Animal Antics & Creature Chaos! \ud83c\udf38 Totally Spies! \ud83c\udf38 Seasons 1 - 7 \u2728 FULL Episode Compilation",
    "source": "s1-6-comparison",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 83.5,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Clover, Sam, blue fish (gadget)\nSHOT_FRAMING: medium\nLINEWORK: Consistent medium-thickness outlines with clean, smooth edges; no visible line variation or roughness, typical of digital animation.\nCOLORING: flat-fill with subtle shadows; vibrant primary color palette (bright blue sky, yellow/green outfits, bold blue fish) emphasizing a cheerful, cartoonish tone.\nEYE_STYLE: Large relative to face, almond-shaped with prominent highlights; Sam\u2019s eyes are squinted in expression, Clover\u2019s are closed, showing high expressiveness common in early 2000s Western animation.\nBODY_PROPORTIONS: Slightly stylized Western cartoon proportions\u2014head-to-body ratio is 1:3.5, limbs moderately elongated but not extreme; less exaggerated than anime, more realistic than classic American cartoons.\nBACKGROUND: Simplified static background with soft clouds and distant shapes; same clean digital production quality as characters, no motion-blur.\nDESIGN_ERA: Early 2000s digital animation (reminiscent of \"Totally Spies!\" 2001\u20132008), with hand-drawn digital styling rather than modern rigged animation.\nNOTABLE: The anthropomorphic blue fish with yellow accents, functioning as a gadget, is unexpectedly integrated into an emotional character moment, highlighting the show\u2019s blend of action-comedy and gadget-driven storytelling.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The characters are interacting closely, one is hugging the other, so it's a dialogue or talking-head moment. They seem to be having ",
    "qwen3vl_seconds": 14.9
  },
  {
    "key": "materials/benchmark/youtube-s7-validation/dense-frames/xUo79ZckeK0/f00348.jpg",
    "episode": "Totally Spies! Season 7 Episode 6 - It's Totally a Test | Full Episode | NEW EPISODE!",
    "source": "s7-episode",
    "stratum": 1,
    "gemma4_response": "",
    "gemma4_seconds": 81.6,
    "qwen3vl_response": "SCENE_TYPE: dialogue/talking-head\nMOTION_STATE: full-hold\nCHARACTERS_VISIBLE: Trudy Proud\nSHOT_FRAMING: medium\nLINEWORK: Clean, consistent outline thickness with smooth, precise edges; no visible roughness or variation, typical of modern digital animation.\nCOLORING: flat-with-shadow; vibrant, saturated colors with subtle gradient shadows on the blazer, set against a bright blue background with a detailed holographic display.\nEYE_STYLE: Large relative to the face, almond-shaped with prominent white highlights; expressive and stylized for Western animation, conveying engagement.\nBODY_PROPORTIONS: Head-to-body ratio and limb lengths are proportional to Western cartoon standards (not exaggerated like anime or highly realistic); naturalistic for a contemporary animated sitcom.\nBACKGROUND: Simplified room elements (wall, door frame) with a highly detailed holographic screen; static, no motion blur; background matches character production quality in digital polish.\nDESIGN_ERA: Newer rigged/digital animation (2020s-era, consistent with \"The Proud Family: Louder and Prouder\").\nNOTABLE: The holographic cityscape display with the \"W\" logo on the character\u2019s blazer and screen, indicating a tech-integrated setting unique to the show\u2019s modern reboot aesthetic.",
    "qwen3vl_thinking": "Got it, let's analyze this frame step by step. \n\nFirst, SCENE_TYPE: The character is speaking, probably in a dialogue scene. There's a screen behind her, but the focus is on her talking. So dialogue/t",
    "qwen3vl_seconds": 12.9
  }
]