cue.image_preprocess

Image preprocessing for digest input.

Downscales a keyframe to the local-vision model's native resolution, strips EXIF + ICC metadata, and re-encodes to JPEG. Returns bytes — the caller (LocalVisionBackend) hands them straight to llama-server as a data URL. Keyframe paths never appear in the prompt or in any log line that crosses the local→remote boundary; only the bytes do.
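As a rough illustration of the hand-off described above, the returned bytes can be wrapped in a data URL with the standard library alone. The helper name `jpeg_data_url` is hypothetical and is not part of this module; how LocalVisionBackend actually builds its request is not shown here.

```python
import base64


def jpeg_data_url(jpeg_bytes: bytes) -> str:
    """Hypothetical helper: wrap already-encoded JPEG bytes in a data URL."""
    # Base64 output is pure ASCII, so decoding it as ASCII is always safe.
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{encoded}"
```

Only the image bytes end up in the URL, which is consistent with the rule that the keyframe path never crosses the local→remote boundary.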

prepare_for_digest

prepare_for_digest(path: Path, *, max_dim: int = DEFAULT_MAX_DIM, quality: int = DEFAULT_QUALITY) -> bytes

Open path, downscale it to fit within max_dim × max_dim while preserving the aspect ratio, strip EXIF and ICC metadata, and re-encode as JPEG at the given quality. Returns the encoded bytes; never writes to disk.
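The full path (open, downscale, strip, re-encode) might look roughly like the sketch below. It assumes Pillow, which this page does not confirm the module uses. `prepare_sketch` is a hypothetical stand-in, not the module's implementation. `max_dim` and `quality` are required here because the values of DEFAULT_MAX_DIM and DEFAULT_QUALITY are not stated.

```python
from io import BytesIO
from pathlib import Path

from PIL import Image


def prepare_sketch(path: Path, *, max_dim: int, quality: int) -> bytes:
    """Hypothetical stand-in for prepare_for_digest (assumes Pillow)."""
    with Image.open(path) as img:
        # JPEG has no alpha or palette support, so normalise to RGB first.
        rgb = img.convert("RGB")
    # thumbnail() resizes in place, keeps the aspect ratio, and never upscales.
    rgb.thumbnail((max_dim, max_dim))
    # Paste the pixels into a fresh image so nothing in .info rides along.
    clean = Image.new("RGB", rgb.size)
    clean.paste(rgb)
    buf = BytesIO()
    # No exif= or icc_profile= argument is passed, so neither segment is written.
    clean.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()
```

Encoding into a `BytesIO` buffer keeps the result in memory, matching the "never writes to disk" contract.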

EXIF stripping matters because some screen captures uploaded from cameras or phones carry GPS or device-id metadata that the model would see even if the visible pixels are clean. The pruner already writes EXIF-free JPEGs at ≤ max_dim, so the common case takes the short-circuit at the top of the function: no decode, no re-encode, no generation loss.
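This page does not show how the short-circuit recognises an already-clean file. One possible approach, sketched here with only the standard library, is to walk the JPEG segment headers without decoding any pixels, and fall back to the full path whenever an EXIF or ICC segment, an oversized frame, or anything unexpected turns up. The name `needs_reencode` and the exact checks are assumptions for illustration.

```python
import struct

# Start-of-frame markers carry the dimensions; 0xC4, 0xC8 and 0xCC are not SOF.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def needs_reencode(data: bytes, max_dim: int) -> bool:
    """Hypothetical check: False only for a JPEG that is clean and small enough."""
    if not data.startswith(b"\xff\xd8"):
        return True  # no SOI marker, so not a JPEG: take the full path
    size = None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return True  # malformed stream: be conservative
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker
            continue
        if marker < 0xC0 or 0xD0 <= marker <= 0xD9:
            return True  # markers without a length field are unexpected here
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        if length < 2:
            return True
        payload = data[pos + 4 : pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True  # APP1 segment holding EXIF
        if marker == 0xE2 and payload.startswith(b"ICC_PROFILE\x00"):
            return True  # APP2 segment holding an ICC profile
        if marker in SOF_MARKERS and len(payload) >= 5:
            # SOF payload layout: precision (1 byte), height (2), width (2), ...
            height, width = struct.unpack(">HH", payload[1:5])
            size = (width, height)
        if marker == 0xDA:
            # Start of scan: all headers have been seen, compressed data follows.
            return size is None or max(size) > max_dim
        pos += 2 + length
    return True  # ran out of data before the scan started
```

When this returns False, the original bytes could be passed through unchanged, which is what avoids the generation loss of a second JPEG encode.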