[1] Cosentino J, Pariente M, Cornell S, et al. LibriMix: An Open-Source Dataset for Generalizable Speech Separation[J]. arXiv preprint arXiv:2005.11262, 2020.
[2] Hershey J R, Chen Z, Le Roux J, et al. Deep clustering: Discriminative embeddings for segmentation and separation[C]//2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016: 31-35.
[3] Kolbæk M, Yu D, Tan Z H, et al. Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(10): 1901-1913.
[4] Luo Y, Mesgarani N. Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256-1266.
[5] Tzinis E, Wang Z, Smaragdis P. Sudo rm -rf: Efficient networks for universal audio source separation[C]//2020 IEEE 30th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2020: 1-6.
[6] Luo Y, Chen Z, Yoshioka T. Dual-Path RNN: Efficient long sequence modeling for time-domain single-channel speech separation[C]//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020: 46-50.