# THE BAKER — Flux.2 LoRA Training

Custom fine-tuned model for generating deck visuals, mood boards, and concept art in THE BAKER's "grandeur with decay" aesthetic.

---

## Quick Start

### 1. Generate Training Dataset

```bash
cd /home/workspaces/thebaker/training

# Activate the cultguard-agents environment (has diffusers, transformers, etc.)
source /home/workspaces/cultguard-agents/.devenv/state/venv/bin/activate

# Generate 54 training images + captions (requires GPU, ~30-60 minutes)
python generate_dataset.py --config config.json

# OR: generate captions only, then add your own images manually
python generate_dataset.py --config config.json --skip-generation
```

**Output:**
- `dataset/images/` — 54 PNG images (1024x1024)
- `dataset/captions/` — 54 TXT caption files
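
Training will silently skip or fail on images without captions, so it's worth verifying the pairing before moving on. A minimal sketch (the directory names follow the layout above; the helper function itself is hypothetical, not part of the scripts):

```python
from pathlib import Path

def unpaired_images(image_dir: str, caption_dir: str) -> list[str]:
    """Return stems of PNG images that have no matching .txt caption."""
    images = {p.stem for p in Path(image_dir).glob("*.png")}
    captions = {p.stem for p in Path(caption_dir).glob("*.txt")}
    return sorted(images - captions)

missing = unpaired_images("dataset/images", "dataset/captions")
if missing:
    print(f"{len(missing)} images lack captions: {missing[:5]}")
```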

### 2. Train LoRA Adapter

```bash
# Training requires ~24GB VRAM minimum (RTX 6000 Ada has 48GB)
python train_flux_lora.py --config config.json
```

**Training Details:**
- **Steps:** 2000 (~2-4 hours on RTX 6000 Ada)
- **Checkpoints:** Saved every 500 steps
- **Output:** `outputs/checkpoint-XXXX/` and `outputs/lora_weights.safetensors`
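
If a run is interrupted, the newest checkpoint should be picked by step number, not by name (lexicographically, `checkpoint-1000` sorts before `checkpoint-500`). A small sketch, assuming only the `checkpoint-XXXX` naming shown above:

```python
from pathlib import Path

def latest_checkpoint(output_dir: str):
    """Return the checkpoint-XXXX directory with the highest step, or None."""
    checkpoints = sorted(
        Path(output_dir).glob("checkpoint-*"),
        key=lambda p: int(p.name.rsplit("-", 1)[1]),  # numeric, not string, sort
    )
    return checkpoints[-1] if checkpoints else None
```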

### 3. Use the Trained LoRA

```python
from diffusers import FluxPipeline
import torch

# Load base model
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA weights
pipeline.load_lora_weights("/home/workspaces/thebaker/training/outputs/lora_weights.safetensors")

# Generate images
prompt = "the baker film style, fredric barakat portrait, aging patriarch, grandeur with decay, chiaroscuro lighting, cinematic"
image = pipeline(prompt, num_inference_steps=28, guidance_scale=3.5).images[0]
image.save("output.png")
```

---

## Visual Concepts

The LoRA is trained on 54 images across 8 categories:

| Category | Count | Description |
|----------|-------|-------------|
| Estate Interiors | 10 | Barakat estate: baroque, phoenician, decay |
| Church and Ritual | 8 | Maronite Catholic, communion, reverence |
| Nightclub and Violence | 8 | Surface opulence, hidden basement danger |
| Lebanon and Memory | 8 | Byblos, civil war, cemetery, exile |
| Family and Food | 6 | Communion reception, gatherings, tension |
| Character Portraits | 10 | Fredric, Magda, Billy, Vincent, Aida, etc. |
| Symbolic Objects | 8 | Crucifix, gold Beretta, bread and blood |
| Title Cards | 6 | Deck visuals, typography, dividers |

**Total:** 54 images, ~2-3 hours generation time, ~2-4 hours training

---

## Configuration

Edit `config.json` to customize:

```json
{
  "training": {
    "rank": 16,
    "max_train_steps": 2000,
    "learning_rate": 1e-4,
    "batch_size": 1,
    "mixed_precision": "fp16"
  },
  "dataset": {
    "resolution": 1024
  }
}
```

- `rank` — LoRA rank (higher = more capacity, more VRAM)
- `resolution` — image size in pixels (1024 is Flux.2's native resolution)

**VRAM Requirements:**
- Rank 16, fp16: ~18-22GB
- Rank 32, fp16: ~24-28GB
- Rank 16, fp32: ~32-36GB
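
The VRAM figures scale with rank because each adapted linear layer of shape `(in, out)` gains `rank × (in + out)` trainable parameters (the two low-rank factor matrices). A rough sketch of how adapter size grows; the layer shapes and count below are illustrative, not Flux.2's actual architecture:

```python
def lora_param_count(layer_shapes, rank):
    """Trainable LoRA parameters: rank * (in_features + out_features)
    per adapted linear layer (the A and B factor matrices)."""
    return sum(rank * (i + o) for i, o in layer_shapes)

# Illustrative only: 38 square attention projections of width 3072
shapes = [(3072, 3072)] * 38
for r in (8, 16, 32):
    print(f"rank {r}: {lora_param_count(shapes, r) / 1e6:.1f}M params")
```

Doubling the rank doubles the adapter's parameters and optimizer state, which is why rank 32 needs roughly 6GB more than rank 16 in the table above.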

---

## Style Keywords

Always include these in prompts for best results:

**Required:**
- `the baker film style` (trigger word)

**Recommended:**
- `grandeur with decay`
- `chiaroscuro lighting`
- `cinematic`
- `baroque catholic weight`
- `phoenician memory`
- `lebanese diaspora`
- `morally compromised luxury`

**Example:**
```
the baker film style, barakat estate interior, baroque catholic weight with phoenician memory, grandeur with decay, chiaroscuro lighting, cinematic
```
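
Since the trigger word must come first and the tag order is easy to get wrong by hand, a small helper can assemble prompts consistently. A sketch; the function name and tuple layout are hypothetical, only the keywords come from the lists above:

```python
TRIGGER = "the baker film style"
STYLE_TAGS = ("grandeur with decay", "chiaroscuro lighting", "cinematic")

def build_prompt(subject, extra_tags=()):
    """Assemble a prompt: trigger word first, then subject, then style tags."""
    return ", ".join((TRIGGER, subject, *STYLE_TAGS, *extra_tags))

print(build_prompt(
    "barakat estate interior",
    ("baroque catholic weight", "phoenician memory"),
))
```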

---

## File Structure

```
thebaker/training/
├── config.json                 # Training configuration
├── generate_dataset.py         # Dataset generation script
├── train_flux_lora.py          # LoRA training script
├── README.md                   # This file
├── visual_concepts.md          # Detailed concept documentation
├── dataset/
│   ├── images/                 # Training images (PNG/JPG)
│   └── captions/               # Caption files (TXT)
├── outputs/
│   ├── checkpoint-500/         # Training checkpoints
│   ├── checkpoint-1000/
│   ├── checkpoint-1500/
│   ├── checkpoint-2000/
│   └── lora_weights.safetensors # Final LoRA weights
└── logs/                       # Training logs
```

---

## Troubleshooting

### Out of Memory (OOM)

Lower the LoRA rank and use gradient accumulation (the batch size is already at its minimum of 1):
```json
{
  "training": {
    "batch_size": 1,
    "rank": 8,
    "gradient_accumulation_steps": 8
  }
}
```
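
Raising `gradient_accumulation_steps` trades speed for memory without shrinking the effective batch: gradients from several micro-batches are summed before each optimizer step, so peak VRAM stays at the micro-batch level. A quick illustration:

```python
def effective_batch_size(batch_size, grad_accum_steps):
    """Samples contributing to each optimizer step. VRAM scales with
    batch_size; wall-clock time per step scales with grad_accum_steps."""
    return batch_size * grad_accum_steps

# The OOM config above: micro-batch of 1, accumulated 8x
print(effective_batch_size(1, 8))  # → 8
```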

### Slow Generation

- Use `gemma4:31b-cloud` for faster VLM work
- Reduce resolution to 768x768 for testing
- Use fewer inference steps (20 instead of 28)
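
For iteration it helps to keep the draft settings above (768×768, 20 steps) next to the final ones so they're easy to switch between. A sketch; the profile dictionary and function are hypothetical, only the numbers come from this section and the Quick Start:

```python
PROFILES = {
    # Fast draft settings for testing prompts
    "draft": {"height": 768, "width": 768,
              "num_inference_steps": 20, "guidance_scale": 3.5},
    # Full-quality settings from the Quick Start
    "final": {"height": 1024, "width": 1024,
              "num_inference_steps": 28, "guidance_scale": 3.5},
}

def gen_kwargs(mode="final"):
    """Keyword arguments for the Flux pipeline call, copied so the
    caller can tweak them without mutating the shared profile."""
    return dict(PROFILES[mode])

# image = pipeline(prompt, **gen_kwargs("draft")).images[0]
```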

### Poor Quality Results

- Increase training steps to 3000
- Increase rank to 32
- Refine captions with more specific details
- Add more training images to weak categories

---

## Integration with Deck Workflow

### Generate Deck Visuals

```bash
# Generate a visual for a specific slide
python generate_deck_visual.py \
  --prompt "the baker film style, holy communion ceremony, maronite church, family gathering" \
  --output assets/decks/visuals/communion.png \
  --lora /home/workspaces/thebaker/training/outputs/lora_weights.safetensors
```

### Update Typst Deck

Add generated images to your Typst deck:

```typst
#image("assets/decks/visuals/estate.png", width: 100%)
```

---

## Next Steps

1. **Generate dataset** with `generate_dataset.py`
2. **Review images** and refine captions if needed
3. **Train LoRA** with `train_flux_lora.py`
4. **Test outputs** with sample prompts
5. **Generate deck visuals** for Cannes pitch
6. **Iterate**: regenerate weak concepts, retrain

---

## Credits

**Base Model:** Flux.2-dev by Black Forest Labs  
**Training Framework:** Diffusers + PEFT + Accelerate  
**Visual Direction:** THE BAKER pitch deck (Ronny Mouawad & Nicholas Lathouris)  
**LoRA Training:** Custom script for "grandeur with decay" aesthetic

---

## License

This training pipeline is for THE BAKER project internal use.  
Base model (Flux.2) subject to Black Forest Labs license terms.
