dcunettorchsound's Introduction

Phase-aware speech enhancement with DC U-Net

Implementation of paper Phase-aware speech enhancement with deep complex U-Net

Train

All four architectures from the paper can be trained with the commands below.

DCUnet_10

python3 train_unet.py -m_f 32 -e_d 5 -epochs 10

DCUnet_16

python3 train_unet.py -m_f 32 -e_d 8 -epochs 10

DCUnet_20

python3 train_unet.py -m_f 32 -e_d 10 -epochs 10

Large-DCUnet_20

python3 train_unet.py -m_f 45 -e_d 10 -epochs 10

If save_best=False, the model is saved after every epoch; if save_best=True, it is saved only when PESQ on the validation data has improved. To resume training from a checkpoint, pass its name via -from_checkpoint.
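The saving rule above can be sketched as follows. This is only an illustration of the described behaviour, not the code from train_unet.py: the function name should_save_checkpoint and the PESQ values are hypothetical.

```python
def should_save_checkpoint(save_best, val_pesq, best_pesq):
    """Decide whether to write a checkpoint for the current epoch.

    Hypothetical helper: mirrors the rule described in this README,
    not the actual implementation in train_unet.py.
    """
    if not save_best:
        # save_best=False: save after every epoch
        return True
    # save_best=True: save only when validation PESQ improved
    return best_pesq is None or val_pesq > best_pesq


# Made-up validation PESQ values, one per epoch, for illustration only.
val_pesq_per_epoch = [2.10, 2.35, 2.30, 2.41]

best_pesq = None
saved_epochs = []
for epoch, val_pesq in enumerate(val_pesq_per_epoch, start=1):
    if should_save_checkpoint(True, val_pesq, best_pesq):
        saved_epochs.append(epoch)
    if best_pesq is None or val_pesq > best_pesq:
        best_pesq = val_pesq

print(saved_epochs)
```

With save_best=True, epoch 3 is skipped in this example because its PESQ is lower than the best value seen so far.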

Inference

Option 1: inference on Voice Bank + DEMAND with a specified voice, noise, and desired SNR

python3 inference_one_audio.py \
-chp chp_model_32_8_epoch_5_-0.97_2.98.pth \
-srn 0 \
-speaker_id p295 \
-utterance_id p295_168.wav \
-noise_origin SCAFE \
-noise_id ch14.wav

speaker_id, utterance_id, noise_origin, and noise_id may be None; if they are None, they are chosen at random.

chp - name of a checkpoint in the 'models' directory. All checkpoints written during training are saved in the 'models' directory.
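For readers unfamiliar with mixing at a desired SNR, here is a minimal sketch of the usual approach: scale the noise so that the ratio of speech power to noise power matches the target in dB, then add it to the speech. Whether inference_one_audio.py does exactly this is an assumption; mix_at_snr is a hypothetical name and the sample values are made up.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the given SNR in dB.

    Hypothetical helper using plain lists of samples; the script in this
    repo may implement the mixing differently.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise is silent, cannot scale it to a target SNR")
    # SNR_dB = 10 * log10(speech_power / scaled_noise_power)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals: speech power is 1.0, noise power is 0.25.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
mixture = mix_at_snr(speech, noise, snr_db=0)
print(mixture)
```

At 0 dB the noise is scaled to the same power as the speech, so the gain in this toy example is 2.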

Option 2: inference from custom file

python3 inference_one_audio.py \
-chp chp_model_32_10_epoch_3_-0.98_2.99.pth \
-custom_file results/live_1.wav

-custom_file - path to a custom audio file for the model to read and process

Some experiments (after 10 epochs of training)

SNR         Initial sound     DCUnet-10       DCUnet-16       DCUnet-20
live audio  live.wav          live_10.wav     live_16.wav     live_20.wav
0           init_sound_1.wav  sound_1_10.wav  sound_1_16.wav  sound_1_20.wav
10          init_sound_2.wav  sound_2_10.wav  sound_2_16.wav  sound_2_20.wav

dcunettorchsound's People

Contributors

mhlevgen, jonashaag, dependabot[bot]


dcunettorchsound's Issues

loss and pesq is nan

I want to add some noise data for training, but when I do, the project fails with a warning like "RuntimeWarning: Mean of empty slice". After the warning, loss and PESQ become NaN. How should I process the noise data to eliminate the error?

Pre-trained model

Hi @mhlevgen, thanks for your work.
Would it be possible for you to provide a pre-trained model? I would like to run inference to verify the denoising results.

Looking forward to your reply.

PESQ computation takes long time

In my setup, computing PESQ of all batches takes ~50% of total training time (with DCUNet-20).

Maybe we can compute PESQ in the background (it's on CPU anyways), or compute PESQ of a subset only.

For my experiments I've changed the code to only compute PESQ of 10% of samples.
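A rough sketch of the subset idea, assuming samples are addressed by index: pick a random fraction of the indices and compute PESQ only for those. The helper pick_pesq_indices and its parameters are hypothetical and not part of this repo; the actual PESQ call is left out.

```python
import random


def pick_pesq_indices(num_samples, fraction=0.1, seed=None):
    """Return a sorted random subset of sample indices to score with PESQ.

    Hypothetical helper: `fraction` is the share of samples to evaluate,
    `seed` makes the selection reproducible.
    """
    if num_samples <= 0:
        return []
    rng = random.Random(seed)
    # Always score at least one sample so the metric is never empty.
    subset_size = max(1, round(num_samples * fraction))
    return sorted(rng.sample(range(num_samples), subset_size))


# Example: score 10% of a 200-sample validation set.
indices = pick_pesq_indices(200, fraction=0.1, seed=0)
print(len(indices))
```

Keeping at least one index avoids averaging over an empty list when the validation set is very small.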
