🧰 Project
📂 Datasets

LRS3-For-Speech-Separation
Kai Li
- Open-source script for processing an audio-visual dataset. It generates training and testing data in several steps, and several parameters can be adjusted to suit different purposes.
🎤 Audio-only Speech Separation Methods

DPRNN-Pytorch
Kai Li
- Dual-path RNN: efficient long-sequence modeling for time-domain single-channel speech separation, implemented in PyTorch.


Conv-TasNet
GitHub Repo | Zhihu: Conv-TasNet reading notes
Kai Li
- A PyTorch implementation of Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

UtterancePIT
Kai Li
- Based on funcwj’s uPIT, with newly written training code that supports multi-GPU training and a restructured Dataloader.

Deep Clustering
Kai Li
- Deep clustering for speech separation, implemented in PyTorch.

AFRCNN
GitHub Repo | Zhihu: AFRCNN reading notes
Kai Li
- Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network.
🎬 Audio-visual Speech Separation Methods

Looking to Listen at the Cocktail Party
Kai Li
- A reproduction of the audio-visual model from the paper Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation.
📖 Tutorial

Speech-Separation-Paper-Tutorial
Kai Li
- A must-read paper and tutorial list for speech separation based on neural networks.