VocalData is a corpus of singing voice with lyrics, separated from songs by professional singers. Its purpose is to enable the training and testing of automatic lyrics recognition (ALR) systems.
More information about the corpus and experiments can be found in our paper (in the paper, we used only the train_clean and test_clean data).
One can also take a look at the Demo.
If you need this corpus for academic use, please email your purpose and institution to:
- Pascal: [email protected]
- Che-Ping: [email protected]
- A total of 110 songs with 62 different singers
- The total duration is about 313.8 minutes
- Every song is compressed in `flac` format: `(singer-id)-(song-id)-(clip-id).flac`
- train_clean, test_clean: WER < 95% by ASR trained with LibriSpeech
- train_other, test_other: WER >= 95% by ASR trained with LibriSpeech
- Directory `alignment/` contains 13 clips with labeled alignments
- Our experiments on this data are in another repository, Lyric_ASR
- SONGS.TXT: singer and clustered genre information for each song.
- SONG_ori.TXT: raw genre information.
- SETS.TXT: the ID list of songs in `train_clean/`, `test_clean/`, `train_other/`, `test_other/`.
- SINGERS.TXT: singer information.
- CLIPS.TXT: duration, singing-speed label, and harmony presence for each clip.
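
The clip naming scheme `(singer-id)-(song-id)-(clip-id).flac` makes it possible to recover the singer and song of a clip from its file name alone. The following is a minimal sketch of such a parser, not part of the corpus tooling. It assumes that none of the three IDs contains a hyphen; the names `ClipName`, `parse_clip_name`, and `group_by_singer`, and the example file names, are hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, NamedTuple


class ClipName(NamedTuple):
    singer_id: str
    song_id: str
    clip_id: str


def parse_clip_name(path: str) -> ClipName:
    """Split '(singer-id)-(song-id)-(clip-id).flac' into its three IDs.

    Assumption: the IDs themselves contain no hyphens.
    """
    p = Path(path)
    if p.suffix.lower() != ".flac":
        raise ValueError(f"not a .flac file: {path}")
    parts = p.stem.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected three hyphen-separated IDs: {path}")
    return ClipName(*parts)


def group_by_singer(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each singer ID to the clip paths that belong to that singer."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        groups[parse_clip_name(path).singer_id].append(path)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    clips = ["train_clean/12-34-5.flac", "train_clean/12-34-6.flac"]
    print(parse_clip_name(clips[0]))
    print(group_by_singer(clips))
```

Grouping by singer ID in this way can be useful, for example, to check that a custom train/test split does not share singers across sets.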
@inproceedings{tsai2018transcribing,
title={Transcribing lyrics from commercial song audio: the first step towards singing content processing},
author={Tsai, Che-Ping and Tuan, Yi-Lin and Lee, Lin-shan},
booktitle={2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={5749--5753},
year={2018},
organization={IEEE}
}
- DAMP: cover songs corpus
- Kara1k: cover songs corpus
- Singing Voice Audio Dataset: Opera songs by professional and amateur singers
- A list of data related to music information retrieval