Deep CASA for talker-independent monaural speaker separation

Introduction

This is the TensorFlow implementation of "Divide and conquer: A deep CASA approach to talker-independent monaural speaker separation", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, pp. 2092-2102. Demos and the keynote lecture slides from ASRU-19 are available here.

Contents

./feat/exp_prepare_folder.sh: prepares folders for experiments.
./feat/feat_gen.py: generates STFT features for training, validation and test.
./feat/stft.py: defines STFT and iSTFT.
./nn/simul_group.py: training/validation/test of the simultaneous grouping stage.
./nn/seq_group.py: training/validation/test of the sequential grouping stage.
./nn/utility.py: defines helper functions used by simul_group.py and seq_group.py.

Experimental setup

This codebase has been tested on AWS EC2 p3.2xlarge nodes with Deep Learning AMI (Ubuntu 16.04) Version 26.0.

Follow the instructions below in order to set up the environment and run the experiments.

Requirements:
- TensorFlow 1.15.0.
  Activate the environment on EC2:
```
source activate tensorflow_p27
```
- gflags
```
pip install python-gflags
```
- If you are not using the AWS Deep Learning AMI (Ubuntu 16.04) Version 26.0, install any other Python packages the scripts require.
Before running experiments, activate the tensorflow environment on EC2 using:
```
source activate tensorflow_p27
```
Generate the WSJ0-2mix dataset using http://www.merl.com/demos/deep-clustering/create-speaker-mixtures.zip. Copy the generated files to the EC2 instance.
Start feature extraction by running the following command in the main directory:
```
python feat/feat_gen.py
```
There are two arguments in feat_gen.py, data_folder and wav_list_folder. Set them to the locations of your WSJ0-2mix dataset and file lists.
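Before starting feature extraction, it can help to confirm that the copied dataset is complete. The sketch below is not part of this repository. It assumes the layout that the MERL mixture scripts usually produce, with tr, cv and tt subfolders that each contain mix, s1 and s2. The function names count_wavs and find_problems are hypothetical, and feat_gen.py may expect a different layout, so adjust the arguments if it does.

```python
import os


def count_wavs(data_folder, subsets=("tr", "cv", "tt"), streams=("mix", "s1", "s2")):
    """Return {(subset, stream): number of .wav files}; a missing folder counts as 0.

    The subset/stream folder names are an assumption about the dataset layout.
    """
    counts = {}
    for subset in subsets:
        for stream in streams:
            folder = os.path.join(data_folder, subset, stream)
            if os.path.isdir(folder):
                n = sum(1 for name in os.listdir(folder)
                        if name.lower().endswith(".wav"))
            else:
                n = 0
            counts[(subset, stream)] = n
    return counts


def find_problems(counts):
    """Describe subsets with empty folders or with unequal mix/s1/s2 file counts."""
    problems = []
    for subset in sorted(set(s for s, _ in counts)):
        per_stream = dict((stream, n) for (s, stream), n in counts.items()
                          if s == subset)
        empty = sorted(k for k, n in per_stream.items() if n == 0)
        if empty:
            problems.append("%s: missing or empty folder(s): %s" % (subset, empty))
        elif len(set(per_stream.values())) > 1:
            problems.append("%s: file counts differ: %s" % (subset, per_stream))
    return problems
```

Calling find_problems(count_wavs("/path/to/your/data_folder")) returns an empty list when every subset has the same non-zero number of mixture and source files.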
Train the simultaneous grouping stage using:
```
TIME_STAMP=train_simul_group
python nn/simul_group.py --time_stamp $TIME_STAMP --is_deploy 0 --batch_size 1 
```
- Because training is done at the utterance level and GPU memory is limited, set batch_size to 1 or 2.
- Change data_folder and wav_list_folder accordingly.
- You can also change other hyperparameters, e.g., the number of epochs and learning rate, using gflags arguments.
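If you launch several runs with different settings, a small Python wrapper can keep the command lines consistent. This is a minimal sketch rather than part of the repository. The helper names build_command and run_stage are hypothetical, and only flags that appear in this README (time_stamp, is_deploy, batch_size) are used in the example; check the gflags definitions in the scripts for the names of other hyperparameters.

```python
import subprocess


def build_command(script, **flags):
    """Turn keyword arguments into a gflags-style command line: --name value."""
    cmd = ["python", script]
    for name in sorted(flags):  # sorted so the command is reproducible
        cmd += ["--" + name, str(flags[name])]
    return cmd


def run_stage(script, **flags):
    """Run one stage; raises CalledProcessError if the script exits with an error."""
    cmd = build_command(script, **flags)
    print(" ".join(cmd))
    subprocess.check_call(cmd)
```

For example, run_stage("nn/simul_group.py", time_stamp="train_simul_group", is_deploy=0, batch_size=1) issues the same command as the shell snippet above. Run it from the main directory with the TensorFlow environment activated.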
Run inference of simultaneous grouping (tt set) using:
```
RESUME_MODEL=exp/deep_casa_wsj/models/train_simul_group/deep_casa_wsj_model.ckpt_step_1
python nn/simul_group.py --is_deploy 1 --resume_model $RESUME_MODEL
```
- $RESUME_MODEL is the model to be loaded for inference. Change it accordingly.
- Mixtures, clean references and Dense-UNet estimates will be generated and saved in folder ./exp/deep_casa_wsj/output_tt/files/.
- Use your own scripts to compute evaluation metrics on these files.
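To avoid typing checkpoint names by hand, the value for $RESUME_MODEL can be looked up in the model folder. The sketch below assumes that checkpoint files follow the naming seen in the example path (a prefix ending in ckpt_step_N, optionally followed by TensorFlow suffixes such as .index or .meta) and that a larger N means a later checkpoint. The function name latest_checkpoint is hypothetical. The latest checkpoint is not necessarily the one with the best validation score.

```python
import os
import re

# Matches e.g. "deep_casa_wsj_model.ckpt_step_12.index" and captures the
# checkpoint prefix and the step number. The naming scheme is an assumption.
STEP_PATTERN = re.compile(r"^(?P<prefix>.+\.ckpt_step_(?P<step>\d+))(?:\..*)?$")


def latest_checkpoint(model_dir):
    """Return the checkpoint prefix with the largest step number, or None."""
    best_step, best_prefix = -1, None
    for name in os.listdir(model_dir):
        match = STEP_PATTERN.match(name)
        if match and int(match.group("step")) > best_step:
            best_step = int(match.group("step"))
            best_prefix = os.path.join(model_dir, match.group("prefix"))
    return best_prefix
```

The returned string can be passed directly as the --resume_model argument.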
Generate temporary .npy files for the next stage (sequential grouping):
```
RESUME_MODEL=exp/deep_casa_wsj/models/train_simul_group/deep_casa_wsj_model.ckpt_step_1
python nn/simul_group.py --is_deploy 2 --resume_model $RESUME_MODEL
```
- Setting is_deploy to 2 generates unorganized Dense-UNet estimates and saves them as .npy files for the sequential grouping stage.
- tr, cv and tt data are generated in turn, and saved in ./exp/deep_casa_wsj/feat/.
Train the sequential grouping stage using:
```
TIME_STAMP=train_seq_group
python nn/seq_group.py --time_stamp $TIME_STAMP --is_deploy 0
```
Change data_folder and wav_list_folder accordingly. You can also change other hyperparameters, e.g., the number of epochs and learning rate, using gflags arguments.
Run inference of sequential grouping (tt set) using:
```
RESUME_MODEL=exp/deep_casa_wsj/models/train_seq_group/deep_casa_wsj_model.ckpt_step_1
python nn/seq_group.py --is_deploy 1 --resume_model $RESUME_MODEL
```
- $RESUME_MODEL is the model to be loaded for inference. Change it accordingly.
- Mixtures, clean references and estimates will be saved in folder ./exp/deep_casa_wsj/output_tt/files/.
- Use your own scripts to compute evaluation metrics on these files.
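As a starting point for such a script, here is a minimal sketch of scale-invariant SNR (SI-SNR), a metric commonly reported for two-talker separation. It is an illustration and not the evaluation code behind the published results. It uses only the standard library and operates on plain sequences of samples, so loading the wav files is left to you, and numpy arrays would be faster for full utterances. Because the order of the two estimated talkers relative to the references may be arbitrary, best_pairing (a hypothetical helper) scores both assignments and keeps the better one.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    n = len(reference)
    if len(estimate) != n:
        raise ValueError("estimate and reference must have the same length")
    # Remove the mean of each signal.
    mean_e = sum(estimate) / float(n)
    mean_r = sum(reference) / float(n)
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]
    # Project the estimate onto the reference to get the target component.
    scale = sum(a * b for a, b in zip(e, r)) / (sum(b * b for b in r) + eps)
    target = [scale * b for b in r]
    noise = [a - t for a, t in zip(e, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def best_pairing(estimates, references):
    """For two talkers, return (mean SI-SNR, order) for the better pairing.

    order[i] is the index of the reference matched to estimates[i].
    """
    straight = (si_snr(estimates[0], references[0]) +
                si_snr(estimates[1], references[1])) / 2.0
    swapped = (si_snr(estimates[0], references[1]) +
               si_snr(estimates[1], references[0])) / 2.0
    if straight >= swapped:
        return straight, (0, 1)
    return swapped, (1, 0)
```

To report an improvement over the unprocessed input, compute the same score with the mixture in place of each estimate and subtract it.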
