cue.local_models

On-device model manifest + downloader for LocalVisionBackend.

Each entry in MANIFEST pins the (repo_id, filename, revision, sha256) tuple for both the model GGUF and its multimodal projector GGUF. The download path uses huggingface_hub.hf_hub_download for resumability + revision pinning. After download, we hash the file locally (when sha256 is set) so a corrupted cache or a swapped HF revision doesn't silently break vision input.
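The post-download hash check can be sketched as follows. This is a minimal illustration, not the module's actual code: sha256_of and matches_pin are hypothetical names, and treating an unset sha256 as "nothing to verify" is an assumption drawn from the "when sha256 is set" wording above.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so a multi-GB GGUF never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path: Path, expected: str | None) -> bool:
    """Return True when the file matches the pinned digest.

    Assumption: an unpinned artifact (expected is None) skips verification.
    """
    if expected is None:
        return True
    return sha256_of(path) == expected.lower()
```

Reading in chunks keeps memory use flat regardless of model size; the trade-off is that the check still takes time proportional to the file, which is why a cheaper presence probe is useful on the UI thread.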

Privacy posture:

- Models live under CONFIG_DIR/models/ (owner-only 0o700 via cue.fs.secure_dir).
- No image bytes ever flow back into HF; the only HF traffic is the GGUF download.
- Failures surface as typed exceptions; the LocalVisionBackend (commit 6) treats LocalUnavailable as a skip-this-cycle signal, never a silent cloud fallback.

The plan calls for revision and sha256 to be pinned literals so that a compromised HF can't swap the file behind the same tag. In the initial commit, however, revision is "main" and sha256 is None; both values get locked in a follow-up after the first verified download. Until then, callers should be aware that HF itself is the trust anchor.

LocalModelError

Bases: Exception

Base class for any failure of the local-model machinery.

LicenseGated

Bases: LocalModelError

The HF repo requires authentication and no token is available.

InsufficientDisk

Bases: LocalModelError

Not enough free space for model + mmproj + a small margin.
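A disk-space precheck of this kind could look like the sketch below. The helper name has_room, the explicit byte-size parameters, and the 512 MiB default margin are all assumptions for illustration; the source only says "a small margin".

```python
import shutil
from pathlib import Path

# Assumed margin; the real value used by the module is not documented here.
MARGIN_BYTES = 512 * 1024 * 1024


def has_room(
    target_dir: Path,
    model_bytes: int,
    mmproj_bytes: int,
    margin_bytes: int = MARGIN_BYTES,
) -> bool:
    """Check that target_dir's filesystem can hold model + mmproj + margin."""
    # shutil.disk_usage reports total/used/free for the filesystem holding the path.
    free = shutil.disk_usage(target_dir).free
    return free >= model_bytes + mmproj_bytes + margin_bytes
```

A caller would raise InsufficientDisk when this returns False, before starting any network transfer.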

DownloadFailed

Bases: LocalModelError

The download itself raised — network error, 4xx/5xx, etc.

ChecksumMismatch

Bases: LocalModelError

The downloaded file's sha256 doesn't match the manifest.

MmprojMissing

Bases: LocalModelError

Model was downloaded but the multimodal projector wasn't — LocalVisionBackend can't run vision inference without it.

Cancelled

Bases: LocalModelError

Caller cancelled the download via the cancel_event.
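Because every exception above derives from LocalModelError, a caller can handle the whole family with one except clause and still single out specific cases. The sketch below redeclares the hierarchy as described so that it runs standalone; try_download and its return strings are hypothetical, not part of cue.local_models.

```python
from typing import Callable


class LocalModelError(Exception):
    """Base class for any failure of the local-model machinery."""


class LicenseGated(LocalModelError): ...
class InsufficientDisk(LocalModelError): ...
class DownloadFailed(LocalModelError): ...
class ChecksumMismatch(LocalModelError): ...
class MmprojMissing(LocalModelError): ...
class Cancelled(LocalModelError): ...


def try_download(action: Callable[[], object]) -> str:
    """Run action and turn typed failures into a short status string."""
    try:
        action()
    except Cancelled:
        # User-initiated, so report it separately from real failures.
        return "cancelled"
    except LocalModelError as exc:
        # Catches every other subclass; unrelated exceptions still propagate.
        return f"failed: {type(exc).__name__}"
    return "ok"
```

Ordering matters here: the Cancelled clause must come before the LocalModelError clause, since Python picks the first matching handler.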

Artifact dataclass

A single GGUF file (either the model or its mmproj).
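One plausible shape for this dataclass, assuming its fields mirror the (repo_id, filename, revision, sha256) tuple described in the module overview. The defaults, the is_pinned helper, the slug "example-vision", and the repo and file names are all hypothetical placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """A single GGUF file (either the model or its mmproj)."""

    repo_id: str
    filename: str
    revision: str = "main"      # unpinned until a verified download locks it
    sha256: str | None = None   # None means "no checksum to verify yet"

    @property
    def is_pinned(self) -> bool:
        """True once both the revision and the digest are fixed literals."""
        return self.revision != "main" and self.sha256 is not None


# Hypothetical manifest: slug -> (model artifact, mmproj artifact).
MANIFEST: dict[str, tuple[Artifact, Artifact]] = {
    "example-vision": (
        Artifact("example-org/example-model-GGUF", "model.gguf"),
        Artifact("example-org/example-model-GGUF", "mmproj.gguf"),
    ),
}
```

Making the dataclass frozen keeps manifest entries immutable at runtime, which fits the goal of treating pins as literals.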

models_dir

models_dir() -> Path

CONFIG_DIR/models/ — owner-only, opted out of search indexing.
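A rough sketch of creating an owner-only directory on a POSIX system. It stands in for cue.fs.secure_dir, whose real behaviour is not shown here; secure_models_dir and the config_dir parameter are hypothetical, and the search-indexing opt-out is not covered.

```python
import os
from pathlib import Path


def secure_models_dir(config_dir: Path) -> Path:
    """Create config_dir/models with 0o700 permissions and return it."""
    path = config_dir / "models"
    path.mkdir(mode=0o700, parents=True, exist_ok=True)
    # mkdir's mode is masked by the umask and ignored when the directory
    # already exists, so set the permissions explicitly as well.
    os.chmod(path, 0o700)
    return path
```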

is_present

is_present(slug: str) -> bool

Cheap readiness probe used by the Settings banner: do BOTH artifacts exist on disk? No checksum work, no network. Safe to call from the UI thread. Caller must run is_ready() before treating the model as trusted — is_present exists only so the banner doesn't freeze on a multi-GB hash.

is_ready

is_ready(slug: str) -> bool

Full readiness check: both files present + checksums (when pinned) match. No network calls. Updates get_state(slug) so the Settings banner can read the latest result.

Result is cached: once _state[slug] reaches READY, subsequent calls return True without re-hashing as long as both file paths still exist. The first-cycle verify after a download still runs the slow path.
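The split between the cheap probe and the cached full check might be sketched like this. It is an illustration under assumptions, not the module's code: the real functions take only a slug and consult MANIFEST and _state, whereas this version takes explicit (path, sha256) pairs and uses a plain set named _verified as the cache.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

# Hypothetical stand-in for the module's per-slug state cache.
_verified: set[str] = set()


def _sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_present(paths: list[Path]) -> bool:
    """Cheap probe: existence only, no hashing, no network."""
    return all(p.is_file() for p in paths)


def is_ready(slug: str, artifacts: list[tuple[Path, str | None]]) -> bool:
    """Full check: files exist and every pinned checksum matches."""
    if not is_present([p for p, _ in artifacts]):
        _verified.discard(slug)  # a missing file invalidates the cached result
        return False
    if slug in _verified:
        return True  # fast path: verified earlier and both files still exist
    for path, expected in artifacts:
        if expected is not None and _sha256(path) != expected:
            return False
    _verified.add(slug)
    return True
```

One consequence of this caching scheme is that a file modified in place after a successful check is not re-hashed until one of the paths disappears or the process restarts.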

download

download(slug: str, *, progress: Callable[[str, str], None] | None = None, cancel_event: Event | None = None) -> ModelState

Resumable download of model + mmproj for slug. Idempotent — if both files are already cached and pass the checksum check, returns immediately with status=READY.

progress(stage, message) is invoked on milestones so the Settings banner can render coarse status (stage ∈ {"checking", "downloading_model", "downloading_mmproj", "verifying", "ready"}). Per-byte progress is deferred to a follow-up; huggingface_hub already prints a tqdm bar to stderr today.

cancel_event is checked between artifacts (not mid-download — the sync HF API doesn't expose a cancel hook). Setting it interrupts after the current artifact completes.
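The between-artifacts cancellation and the milestone callbacks can be illustrated with the sketch below. To keep it runnable offline, the actual transfer is injected as a fetch callable rather than calling huggingface_hub; download_all, ARTIFACTS, the message strings, and the "READY" return value are hypothetical, while the stage names come from the description above.

```python
from __future__ import annotations

import threading
from typing import Callable


class Cancelled(Exception):
    """Stand-in for the module's Cancelled exception."""


# (stage name, file name) pairs; the file names are placeholders.
ARTIFACTS = [
    ("downloading_model", "model.gguf"),
    ("downloading_mmproj", "mmproj.gguf"),
]


def download_all(
    artifacts: list[tuple[str, str]],
    fetch: Callable[[str], object],
    *,
    progress: Callable[[str, str], None] | None = None,
    cancel_event: threading.Event | None = None,
) -> str:
    notify = progress or (lambda stage, message: None)
    notify("checking", "looking for cached files")
    for stage, name in artifacts:
        # Cancellation is only observed here, between artifacts: a blocking
        # fetch() call cannot be interrupted once it has started.
        if cancel_event is not None and cancel_event.is_set():
            raise Cancelled(f"cancelled before {name}")
        notify(stage, f"fetching {name}")
        fetch(name)
    notify("verifying", "checking sha256 where pinned")
    notify("ready", "model and mmproj available")
    return "READY"
```

With this shape, setting the event during the model download lets that file finish and then stops before the mmproj transfer begins.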

preflight

preflight(slug: str) -> ModelState

Boot-time helper: idempotently brings the model to READY. Safe to call from a daemon thread — disk-space + license errors raise typed exceptions; the caller (typically cue.main) logs and moves on so a missing local model never blocks startup.
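The "log and move on" pattern described here might look roughly like the following. The preflight function is injected so the sketch runs standalone; start_preflight, the logger name, and the local LocalModelError stand-in are assumptions rather than the actual cue.main code.

```python
import logging
import threading
from typing import Callable

log = logging.getLogger("preflight_sketch")


class LocalModelError(Exception):
    """Stand-in for cue.local_models.LocalModelError."""


def start_preflight(preflight: Callable[[str], object], slug: str) -> threading.Thread:
    """Run preflight(slug) on a daemon thread so startup never waits on it."""

    def _run() -> None:
        try:
            preflight(slug)
        except LocalModelError as exc:
            # Typed failures (disk space, license gating, ...) are logged only;
            # the application keeps starting without the local model.
            log.warning("local model %r unavailable: %s", slug, exc)

    thread = threading.Thread(target=_run, name=f"preflight-{slug}", daemon=True)
    thread.start()
    return thread
```

Marking the thread as a daemon means a slow or stuck download will not keep the process alive at shutdown.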