This page contains audio samples for our paper “MedleyVox: An Evaluation Dataset for Multiple Singing Voices Separation”, submitted to ICASSP 2023. Code is here.

Pop samples

In this section, the accompaniment of each song was first removed with GSEP (https://studio.gaudiolab.io), provided by GAUDIO Lab, Inc.

Duet

Jason Mraz and Colbie Caillat - Lucky
Shawn Mendes and Camila Cabello - Señorita
Lady Gaga and Tony Bennett - I’ve Got You Under My Skin
Verandah Project (Dong Ryul Kim and Sang-soon Lee) - Bike Riding

Main vs. rest

Queen - Bohemian Rhapsody

Duet audio samples in MedleyVox

Sample 1
Sample 2
Sample 3

Unison audio samples in MedleyVox

Sample 1
Sample 2
Sample 3

Main vs. rest audio samples in MedleyVox

Sample 1
Sample 2
Sample 3

STFT vs. Learnable basis

We observed that the STFT/iSTFT basis provides perceptually better output than the learnable encoder-decoder framework, which was originally used in Conv-TasNet and in much of the speech separation literature. Since we did not use a mixture consistency loss when training the models in this comparison, the model outputs were loudness-normalized to -27 LUFS to prevent the output scale from exploding.
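As a rough illustration of the loudness normalization step, the sketch below computes the linear gain that moves a signal from a measured integrated loudness to the -27 LUFS target. It is not the code used for the paper. The function names and the example loudness value are hypothetical, and the measured loudness is assumed to come from an ITU-R BS.1770 style meter, which is not implemented here.

```python
import math

TARGET_LUFS = -27.0  # target loudness stated on this page


def gain_to_target(measured_lufs, target_lufs=TARGET_LUFS):
    """Return the linear gain that shifts measured_lufs to target_lufs.

    Scaling a waveform by g changes its loudness by 20*log10(g) dB,
    so the required gain is 10 ** (difference_in_dB / 20).
    """
    return 10.0 ** ((target_lufs - measured_lufs) / 20.0)


def apply_gain(samples, gain):
    """Scale every sample of a mono signal (a list of floats) by gain."""
    return [s * gain for s in samples]


# Hypothetical example: a separated output measured at -21 LUFS
# (assumed value; in practice it would come from a loudness meter).
measured = -21.0
gain = gain_to_target(measured)
scaled = apply_gain([0.5, -0.25, 0.1], gain)
print(f"gain = {gain:.4f} ({20 * math.log10(gain):+.1f} dB)")
```

Because the gain is a single scalar per output, it only rescales the waveform. This makes outputs comparable in level when the model's output scale is not otherwise constrained.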

Sample 1