From Waxy:
Fast and Free Music Separation with Deezer’s Machine Learning Library
Cleanly isolating vocals from drums, bass, piano, and other musical accompaniment is the dream of every mashup artist, karaoke fan, and producer. Commercial solutions exist, but can be expensive and unreliable. Techniques like phase cancellation have very mixed results.
The engineering team behind streaming music service Deezer just open-sourced Spleeter, their audio separation library built on Python and TensorFlow that uses machine learning to quickly and freely separate music into stems. (Read more in today’s announcement.)
You can train it yourself if you have the resources, but the pre-trained models they released already far surpass any free tool that I know of, and rival commercial plugins and services. The library ships with three of them:
- Two stems – Vocals and Other Accompaniment
- Four stems – Vocals, Drums, Bass, Other
- Five stems – Vocals, Drums, Bass, Piano, Other
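As a rough sketch of how the three models might be selected from Python: the configuration names `spleeter:2stems`, `spleeter:4stems`, and `spleeter:5stems` and the `Separator` class come from Spleeter itself, while the helper names `model_for` and `separate` are hypothetical and only for illustration. It assumes Spleeter is already installed.

```python
# Minimal sketch, assuming Spleeter is installed.
# The helpers model_for() and separate() are illustrative, not part of Spleeter.

STEM_MODELS = {
    2: "spleeter:2stems",  # vocals / accompaniment
    4: "spleeter:4stems",  # vocals / drums / bass / other
    5: "spleeter:5stems",  # vocals / drums / bass / piano / other
}


def model_for(stems: int) -> str:
    """Return the Spleeter configuration name for a given stem count."""
    try:
        return STEM_MODELS[stems]
    except KeyError:
        raise ValueError(f"no pre-trained model with {stems} stems") from None


def separate(audio_path: str, output_dir: str, stems: int = 2) -> None:
    """Split one audio file into stems written under output_dir."""
    # Imported lazily so model_for() can be used without loading TensorFlow.
    from spleeter.separator import Separator

    separator = Separator(model_for(stems))
    separator.separate_to_file(audio_path, output_dir)
```

Calling something like `separate("song.mp3", "output", stems=4)` (both paths are placeholders) should write one audio file per stem under the output directory. The pre-trained model is downloaded the first time it is used, so the first run is slower.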
It took a couple minutes to install the library, which includes installing Conda, and processing audio was much faster than expected.
On my five-year-old MacBook Pro using the CPU only, Spleeter ran about 5.5x faster than real-time for the simplest two-stem separation, or about one minute of processing time for every 5.5 minutes of audio. Five-stem separation took around three minutes for the same 5.5 minutes of audio.
When running on a GPU, the Deezer team report speeds 100x faster than real-time for four stems, converting 3.5 hours of music in less than 90 seconds on a single GeForce GTX 1080.
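To put those figures on one scale, here is a quick back-of-the-envelope calculation of the real-time factor (seconds of audio processed per second of wall-clock time). The function name is just for illustration, and the inputs are the rounded numbers quoted above, so the results are approximate.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds / processing_seconds


# CPU, two stems: 5.5 minutes of audio in about 1 minute
two_stem_cpu = realtime_factor(5.5 * 60, 1 * 60)

# CPU, five stems: 5.5 minutes of audio in about 3 minutes
five_stem_cpu = realtime_factor(5.5 * 60, 3 * 60)

# GPU, four stems: 3.5 hours of audio in (at most) 90 seconds
four_stem_gpu = realtime_factor(3.5 * 3600, 90)

print(f"two-stem CPU:  {two_stem_cpu:.1f}x")   # 5.5x
print(f"five-stem CPU: {five_stem_cpu:.1f}x")  # 1.8x
print(f"four-stem GPU: {four_stem_gpu:.0f}x")  # 140x
```

Since the GPU run took less than 90 seconds, 140x is a lower bound, which is consistent with the "100x faster than real-time" claim.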
They have some samples and the effect is really good. Something to play with this winter.