Install

Draft — verify against the current release.

Prerequisites

Castwright runs the synthesis engines locally, so it needs a reasonably modern machine.

GPU. An NVIDIA GPU with 8 GB of VRAM is the benchmarked path; a gaming PC or gaming laptop generates audio at roughly real-time speed. On Apple-silicon Macs the engines run on the Mac’s GPU automatically (device selection is built in and tries CUDA, then Apple’s MPS, then CPU); this works, but more slowly than the NVIDIA path. Without any GPU the app falls back to CPU, which works but is the slowest option: treat it as a fallback, not a plan. The Kokoro engine (the default, English-only engine) and the Qwen voice-design engine both use whichever device is selected.
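
The CUDA, then MPS, then CPU order can be sketched in a few lines of Python. This is an illustration only, not Castwright’s actual code: it assumes a PyTorch-based engine, and pick_device and detect_device are hypothetical names.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the machine, assuming PyTorch is the backend (an assumption)."""
    try:
        import torch  # optional here; a missing install is treated as CPU-only
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)
```

Keeping the ordering in a pure function separates the decision from the hardware probe, so the fallback order can be checked on any machine.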

AMD GPU (experimental preview). Castwright detects an AMD GPU and runs Qwen and Coqui on ROCm while Kokoro stays on the CPU — DirectML can’t run the Kokoro model, so that’s expected, not a fault. It needs a ROCm-supported AMD card and a recent driver (on Windows, the latest Adrenalin). Because the ROCm wheels are still alpha previews, the install is best-effort: if any AMD step fails, Castwright completes a working CPU setup instead of erroring and tells you it’s running on CPU. To stay on CPU and silence the warning, set the Accelerator to CPU in Advanced settings — changing the accelerator rebuilds the Python environment, so it isn’t instant. Your books and designed voices are untouched either way; they live in the workspace, not the venv.
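
The best-effort behaviour can be pictured as a try-then-fall-back loop. The Python sketch below is hypothetical: setup_accelerator, the step callables, and the "rocm" and "cpu" labels are illustrative names, not Castwright’s installer API.

```python
from typing import Callable, Iterable, Tuple


def setup_accelerator(amd_steps: Iterable[Callable[[], None]]) -> Tuple[str, str]:
    """Run each AMD setup step in order.

    Returns (accelerator, warning). If any step raises, the result is a
    CPU setup plus a message for the user, rather than a hard error.
    """
    try:
        for step in amd_steps:
            step()
    except Exception as exc:  # any failed step triggers the CPU fallback
        return "cpu", f"AMD setup failed ({exc}); running on CPU"
    return "rocm", ""
```

The point of the shape is that a failure anywhere in the AMD path still ends in a usable setup, with the reason surfaced as a warning.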

Operating system. Windows, macOS, and Linux are all supported — Windows or Linux for the NVIDIA path, macOS 12+ on Apple silicon. Windows is the most tested platform during early access.

Audio assembly. The last step of every audiobook stitches your generated voice clips into one properly-levelled file. Castwright does this with a free tool called ffmpeg — install it once and you’re set:

Windows: winget install ffmpeg
macOS: brew install ffmpeg
Linux (Debian/Ubuntu): sudo apt install ffmpeg

Castwright checks for it on first run (the setup wizard’s Audio assembly step) and will generate voices fine without it — it just can’t assemble the finished audiobook until ffmpeg is there.
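
To check for ffmpeg yourself before the first run, a PATH lookup is enough. The Python sketch below uses the standard library’s shutil.which; find_ffmpeg and can_assemble are hypothetical helper names, and Castwright’s own check may work differently.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def can_assemble(executable: str = "ffmpeg") -> bool:
    """Assembly needs ffmpeg; voice generation does not."""
    return find_ffmpeg(executable) is not None


if __name__ == "__main__":
    path = find_ffmpeg()
    print(f"ffmpeg found at {path}" if path else "ffmpeg not found on PATH")
```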

Python environment. The synthesis sidecar runs in a local Python virtual environment. The installer sets this up for you on first run — you do not need to manage it manually.

Download

There are two ways in, both free — pick by comfort level.

One click with Pinokio

If you use Pinokio, Castwright installs with no terminal and no prerequisites — Pinokio provides its own Python, ffmpeg, and Node. Open Pinokio, paste the Castwright repo URL (https://github.com/dudarenok-maker/Castwright), and click Install; it builds the latest published release (around ten to twenty minutes). Click Start, then Open Web UI — the first launch runs the same setup wizard as the native install. Update, Stop, and Reset are one click each, and your books and designed voices are preserved across all of them.

Direct download

The canonical place to get the zip is the site’s download page; the GitHub releases page carries the same artifact, so use whichever you prefer. Download it, extract it, and run the launcher; one build runs on Windows, macOS, and Linux. At public launch the zip will be replaced by proper installers: signed for Windows, notarised for macOS.

First run

On first launch Castwright downloads the Kokoro voice weights (about 1 GB) and sets up the analysis model: locally via Ollama if you have it installed, otherwise through a Gemini API key if you supply one. The in-app Model Manager (under the account menu) shows installation progress and lets you add optional engines such as Coqui XTTS or the Qwen voice-design model.
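
The choice of analysis backend follows a simple preference order, sketched below in Python. Everything here is an assumption made for illustration: the function names, the "none" result, and the GEMINI_API_KEY environment variable are hypothetical, and Castwright may take the key through its setup wizard instead.

```python
import os
import shutil
from typing import Optional


def choose_analysis_backend(ollama_path: Optional[str],
                            gemini_key: Optional[str]) -> str:
    """Prefer a local Ollama install; otherwise use Gemini if a key exists."""
    if ollama_path:
        return "ollama"
    if gemini_key:
        return "gemini"
    return "none"  # neither option is available yet


def detect_analysis_backend() -> str:
    """Probe this machine. The environment variable name is an assumption."""
    return choose_analysis_backend(
        shutil.which("ollama"),
        os.environ.get("GEMINI_API_KEY"),
    )
```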

If you run into trouble, see Troubleshooting or email hello@castwright.ai.