● cached · ○ not downloaded
refresh
Calculating cache size…
Refresh to latest version?
If actalithic.github.io has shipped an update, this clears the service worker cache so the new version loads on next visit — just like a hard refresh on a new device.
All downloaded models will also be deleted.
Every AI model cached locally will be removed and must be re-downloaded before you can chat again. Only do this if you need the latest app version or want to free up space.
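Under the hood, a "refresh to latest version" action like the one described above is typically built on the Cache Storage and Service Worker APIs. A minimal sketch, not the app's actual code: the helper name is illustrative, and the two parameters are injected (in a browser you would pass `caches` and `navigator.serviceWorker`) so the logic is testable outside a browser.

```javascript
// Sketch of a "refresh to latest version" helper.
// In the browser: clearAppCaches(caches, navigator.serviceWorker)
async function clearAppCaches(cachesApi, swContainer) {
  // Delete every named cache (app shell, downloaded model shards, etc.).
  const names = await cachesApi.keys();
  await Promise.all(names.map((name) => cachesApi.delete(name)));

  // Unregister service workers so the next visit installs the new version.
  const registrations = await swContainer.getRegistrations();
  await Promise.all(registrations.map((reg) => reg.unregister()));

  return names.length; // number of caches removed
}
```

After this resolves, a normal reload behaves like a first visit: the service worker re-installs and every asset, including model weights, is fetched again.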
Model Source
Active model
Downloaded from
Models are quantized by MLC AI and served via Hugging Face. They are cached in your browser after the first download. 100% local — no data ever leaves your device.
no model
wifi_off
Offline — inference works normally; new downloads are unavailable.
smartphone
Phone detected — Llama 3.2 3B selected automatically for best speed.
Heavier models may freeze your device. Tap ■ Stop anytime to abort a response.
LocalLLM
100% local. Zero server. Zero API.
Runs entirely in your browser via WebGPU. First download is cached permanently — subsequent loads are instant.
Initializing…
0%
Engine Options
ActalithicCore Actalithic
Fast hybrid GPU+CPU processing. Offloads excess layers to RAM — runs 7B+ on 4–6 GB VRAM. Targets ~5 tok/s on 8B, 7–12 tok/s on 7B, 15–20 tok/s on 3B. Best for mid-range GPUs.
CPU / WASM fallback Experimental
No GPU required. Runs via WebAssembly. Expect ~1–3 tok/s.
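The choice between the WebGPU path and the CPU/WASM fallback described above can be made with a simple feature check. A minimal sketch, not the app's actual code: the function name is illustrative, and a navigator-like object is passed in (in a browser you would call it with `navigator`) so the check is testable outside a browser.

```javascript
// Pick an inference backend based on WebGPU availability.
// In the browser: pickEngine(navigator)
function pickEngine(nav) {
  if (nav && "gpu" in nav) {
    return "webgpu"; // GPU path: full-speed inference
  }
  return "wasm"; // CPU / WASM fallback, ~1–3 tok/s
}
```

A real implementation would also call `navigator.gpu.requestAdapter()` and fall back if no adapter is returned, since `navigator.gpu` can exist on a device that has no usable GPU.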
Enable thinking DeepSeek R1
Shows a collapsible reasoning block before answers. More accurate on complex questions, slightly slower.
Expected speed by device (tokens / sec)
Dedicated GPU (RTX 3060+)
—
Steam Deck (RDNA2 iGPU)
—
Laptop integrated GPU
—
Phone (Snapdragon)
—
CPU / WASM fallback
—
ActalithicCore (hybrid)
—
Models are created by Meta, Mistral AI, and DeepSeek. Actalithic does not own these models and is not responsible for their outputs. Downloads are sourced via MLC AI and Hugging Face.