● cached · ○ not downloaded
refresh
Calculating cache size…
Refresh to latest version?
If actalithic.github.io has shipped an update, this clears the service worker cache so the new version loads on next visit — just like a hard refresh on a new device.
All downloaded models will also be deleted.
Every AI model cached locally will be removed and must be re-downloaded before you can chat again. Only do this if you need the latest app version or want to free up space.
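Under the hood, a "refresh to latest version" action like the one described above is typically built on the Cache Storage and Service Worker APIs. A minimal sketch, not the app's actual code: the helper name is illustrative, and the two parameters are injected (in a browser you would pass `caches` and `navigator.serviceWorker`) so the logic is testable outside a browser.

```javascript
// Sketch of a "refresh to latest version" helper.
// In the browser: clearAppCaches(caches, navigator.serviceWorker)
async function clearAppCaches(cachesApi, swContainer) {
  // Delete every named cache (app shell, downloaded model shards, etc.).
  const names = await cachesApi.keys();
  await Promise.all(names.map((name) => cachesApi.delete(name)));

  // Unregister service workers so the next visit installs the new version.
  const registrations = await swContainer.getRegistrations();
  await Promise.all(registrations.map((reg) => reg.unregister()));

  return names.length; // number of caches removed
}
```

After this resolves, a normal reload behaves like a first visit: the service worker re-installs and every asset, including model weights, is fetched again.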
Model Source
Active model
Downloaded from
Models are quantized by MLC AI and served via Hugging Face. They are cached in your browser after the first download. 100% local — no data ever leaves your device.
no model
wifi_off
Offline — inference works normally; new downloads are unavailable.
smartphone
Phone detected — Llama 3.2 3B selected automatically for best speed.
Heavier models may freeze your device. Tap ■ Stop anytime to abort a response.
LocalLLM
100% local. Zero server. Zero API.
Runs entirely in your browser via WebGPU. First download is cached permanently — subsequent loads are instant.
Initializing…
0%
Engine Options
ActalithicCore Actalithic
Fast hybrid GPU+CPU processing. Offloads excess layers to RAM — runs 7B+ on 4–6 GB VRAM. Targets ~5 tok/s on 8B, 7–12 tok/s on 7B, 15–20 tok/s on 3B. Best for mid-range GPUs.
CPU / WASM fallback Experimental
No GPU required. Runs via WebAssembly. Expect ~1–3 tok/s.
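The choice between the WebGPU path and the CPU/WASM fallback described above can be made with a simple feature check. A minimal sketch, not the app's actual code: the function name is illustrative, and a navigator-like object is passed in (in a browser you would call it with `navigator`) so the check is testable outside a browser.

```javascript
// Pick an inference backend based on WebGPU availability.
// In the browser: pickEngine(navigator)
function pickEngine(nav) {
  if (nav && "gpu" in nav) {
    return "webgpu"; // GPU path: full-speed inference
  }
  return "wasm"; // CPU / WASM fallback, ~1–3 tok/s
}
```

A real implementation would also call `navigator.gpu.requestAdapter()` and fall back if no adapter is returned, since `navigator.gpu` can exist on a device that has no usable GPU.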
Enable thinking DeepSeek R1
Shows a collapsible reasoning block before answers. More accurate on complex questions, slightly slower.
Expected speed by device (tokens / sec)
Dedicated GPU (RTX 3060+)
—
Steam Deck (RDNA2 iGPU)
—
Laptop integrated GPU
—
Phone (Snapdragon)
—
CPU / WASM fallback
—
ActalithicCore (hybrid)
—
Models are created by Meta, Mistral AI, and DeepSeek. Actalithic does not own these models and is not responsible for their outputs. Downloads are sourced via MLC AI and Hugging Face.