How-to guide
How to remove vocals with Demucs vs VocalSplit
Demucs, from Meta (Facebook AI Research), is an open-source separator whose Hybrid Transformer release is among the strongest open-source models available. It is free to use: you need a working Python setup, and a decent GPU if you want it to run quickly. VocalSplit is a hosted alternative with zero setup.
Demucs on CPU is slow, taking minutes per song. With an NVIDIA GPU it becomes fast enough for batch work. VocalSplit runs entirely on cloud GPUs, so it is consistently fast regardless of the machine you upload from.
Step-by-step
- Demucs method: Set up a Python environment. Run pip install -U demucs, which also installs PyTorch as a dependency. For GPU use, install a CUDA-enabled PyTorch build first.
- Demucs method: Run the separator. Run demucs song.mp3 --two-stems=vocals to split the track into a vocal stem and an accompaniment stem. The first run downloads the Hybrid Transformer model (about 650 MB).
- VocalSplit method: Upload at vocalsplit.io. Drop the file on the upload area. There is nothing to install and no model to download.
- VocalSplit method: Pay and process. Each split costs $0.99 and returns in about 15 seconds, regardless of your hardware.
- Pick based on use case. Demucs locally: free, and best suited to big batch work on a GPU you already own. VocalSplit: no setup, comparable output quality on most material, and a flat $0.99 per split.
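For batch work, the two Demucs steps above can be wrapped in a short Python script. This is a minimal sketch, not an official Demucs interface: the helper names (pick_device, build_demucs_command, separate_folder) and the songs/ and separated/ folder names are hypothetical. The -d (device) and -o (output folder) options are standard Demucs command-line flags added here for explicitness.

```python
import shutil
import subprocess
from pathlib import Path


def pick_device() -> str:
    """Return "cuda" if a CUDA-enabled PyTorch is importable, else "cpu"."""
    try:
        import torch  # normally installed as a Demucs dependency
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_demucs_command(song: Path, out_dir: Path, device: str) -> list[str]:
    """Build the two-stem command from the steps above, with explicit -d and -o."""
    return [
        "demucs",
        "--two-stems=vocals",  # vocals plus everything else
        "-d", device,          # "cuda" or "cpu"
        "-o", str(out_dir),    # folder that receives the separated stems
        str(song),
    ]


def separate_folder(folder: Path, out_dir: Path) -> None:
    """Run Demucs once per .mp3 file in `folder` (hypothetical batch helper)."""
    if shutil.which("demucs") is None:
        raise RuntimeError("demucs not found on PATH; run `pip install -U demucs` first")
    device = pick_device()
    for song in sorted(folder.glob("*.mp3")):
        # check=True stops the batch on the first failed separation
        subprocess.run(build_demucs_command(song, out_dir, device), check=True)
```

Calling separate_folder(Path("songs"), Path("separated")) would process every MP3 in songs/ in turn. With the two-stem option, Demucs should write a vocals.wav and a no_vocals.wav for each track into a subfolder of the output directory named after the model and the track.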
Tips for better results
- Demucs Hybrid Transformer is currently one of the top open-source models on most leaderboards.
- For Demucs, expect 1–3 minutes per song on CPU and 5–15 seconds per song on a decent GPU.
- If you only need occasional separation, VocalSplit is usually cheaper than the electricity plus hardware cost of running Demucs yourself.
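The time and cost trade-off in these tips can be worked through with some simple arithmetic. The sketch below uses the guide's own figures ($0.99 per split, 1–3 minutes per song on CPU, 5–15 seconds on GPU). The local hardware and electricity costs are inputs you must supply yourself, and any dollar figure passed to break_even_songs is an assumption rather than a measured value.

```python
import math

VOCALSPLIT_PRICE_USD = 0.99  # per split, as quoted in this guide


def batch_minutes(n_songs: int, seconds_per_song: float) -> float:
    """Total processing time in minutes for a batch run one song at a time."""
    return n_songs * seconds_per_song / 60


def vocalsplit_cost(n_songs: int) -> float:
    """Total VocalSplit cost in USD for n_songs splits."""
    return round(n_songs * VOCALSPLIT_PRICE_USD, 2)


def break_even_songs(local_fixed_cost_usd: float,
                     local_cost_per_song_usd: float = 0.0) -> int:
    """Smallest song count at which running locally costs less than paying per split.

    local_fixed_cost_usd: hardware cost attributed to separation (assumption).
    local_cost_per_song_usd: electricity per song (assumption, default 0).
    """
    saving_per_song = VOCALSPLIT_PRICE_USD - local_cost_per_song_usd
    if saving_per_song <= 0:
        raise ValueError("local per-song cost must be below the per-split price")
    return math.floor(local_fixed_cost_usd / saving_per_song) + 1
```

For example, batch_minutes(100, 180) gives 300 minutes for 100 songs at the slow end of the CPU range, while batch_minutes(100, 15) gives 25 minutes at the slow end of the GPU range. With a purely hypothetical $300 of hardware cost and electricity ignored, break_even_songs(300) returns 304, so below roughly 300 songs paying per split would be the cheaper option under that assumption.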
Try VocalSplit
Upload a song and get clean vocal and instrumental stems in about 15 seconds. Each split is $0.99.