DeepFilterNet

A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. Audio samples from the Voice Bank/DEMAND test set can be found at https://rikorose.github.io/DeepFilterNet-Samples/

  • libDF contains Rust code used for data loading and augmentation.
  • DeepFilterNet contains Python code including a libDF wrapper for data loading, DeepFilterNet training, testing and visualization.
  • models contains DeepFilterNet model weights and config.

Usage

This framework is currently only tested under Linux.

PyPI

Install the DeepFilterNet python package via pip:

# Install cpu/cuda pytorch (>=1.8) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install DeepFilterNet
pip install deepfilternet

To enhance noisy audio files using DeepFilterNet, run:

# Specify an output directory with --output-dir [OUTPUT_DIR]
deepFilter path/to/noisy_audio.wav
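
To process several files from a script, the command above can be assembled in Python. This is a minimal sketch, not part of the package: it assumes that the deepFilter command accepts more than one input file and the --output-dir option mentioned above. The helper names build_enhance_command and enhance are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_enhance_command(noisy_files, output_dir):
    """Return the argument list for a single deepFilter call (hypothetical helper)."""
    if not noisy_files:
        raise ValueError("at least one noisy audio file is required")
    # Assumption: deepFilter takes --output-dir followed by one or more files.
    cmd = ["deepFilter", "--output-dir", str(output_dir)]
    cmd.extend(str(Path(f)) for f in noisy_files)
    return cmd


def enhance(noisy_files, output_dir):
    """Run deepFilter on the given files, failing early if it is not installed."""
    if shutil.which("deepFilter") is None:
        raise RuntimeError("deepFilter not found on PATH; run `pip install deepfilternet`")
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if deepFilter exits with an error.
    subprocess.run(build_enhance_command(noisy_files, output_dir), check=True)
```

Calling enhance(["a.wav", "b.wav"], "enhanced") would then run deepFilter once for both files.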

Manual Installation

Install cargo via rustup. Using a conda environment or a virtualenv is recommended.

Install the Python dependencies and libDF:

cd path/to/DeepFilterNet/  # cd into repository
# Recommended: Install or activate a python env
# Mandatory: Install cpu/cuda pytorch (>=1.8) dependency from pytorch.org, e.g.:
pip install torch torchaudio -f https://download.pytorch.org/whl/cpu/torch_stable.html
# Install build dependencies used to compile libdf and DeepFilterNet python wheels
pip install maturin poetry
# Build and install libdf python package required for enhance.py
maturin develop --release -m pyDF/Cargo.toml
# Optional: Install libdfdata python package with dataset and dataloading functionality for training
maturin develop --release -m pyDF-data/Cargo.toml
# Install remaining DeepFilterNet python dependencies
cd DeepFilterNet
poetry install

To enhance noisy audio files using DeepFilterNet, run:

# usage: enhance.py [-h] [--output-dir OUTPUT_DIR] [--model_base_dir MODEL_BASE_DIR] noisy_audio_files [noisy_audio_files ...]
python DeepFilterNet/df/enhance.py --model_base_dir DeepFilterNet/pretrained_models/DeepFilterNet/ path/to/noisy_audio.wav

Training

The entry point is DeepFilterNet/df/train.py. It expects a data directory containing HDF5 datasets as well as a dataset configuration JSON file.

You therefore first need to create your datasets in HDF5 format. Each dataset typically holds only the training, validation, or test set of one type: noise, speech, or RIRs.

# Install additional dependencies for dataset creation
pip install h5py librosa soundfile
# Go to DeepFilterNet python package
cd path/to/DeepFilterNet/DeepFilterNet
# Prepare text file (e.g. called training_set.txt) containing paths to .wav files
#
# usage: prepare_data.py [-h] [--num_workers NUM_WORKERS] [--max_freq MAX_FREQ] [--sr SR] [--dtype DTYPE]
#                        [--codec CODEC] [--mono] [--compression COMPRESSION]
#                        type audio_files hdf5_db
#
# where:
#   type: One of `speech`, `noise`, `rir`
#   audio_files: Text file containing paths to audio files to include in the dataset
#   hdf5_db: Output HDF5 dataset.
python df/prepare_data.py --sr 48000 speech training_set.txt TRAIN_SET_SPEECH.hdf5
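
The text file passed as audio_files can be generated rather than written by hand. The following is a minimal sketch, assuming one path per line (the usage text above only says the file contains paths to audio files). The function name write_file_list is hypothetical.

```python
from pathlib import Path


def write_file_list(audio_dir, list_path, pattern="*.wav"):
    """Write the absolute path of every matching audio file, one per line.

    Returns the number of files written. Assumption: prepare_data.py reads
    one path per line from the text file.
    """
    # rglob searches audio_dir recursively; sorting keeps the list reproducible.
    files = sorted(p.resolve() for p in Path(audio_dir).rglob(pattern))
    Path(list_path).write_text("".join(f"{p}\n" for p in files), encoding="utf-8")
    return len(files)
```

For example, write_file_list("path/to/speech_wavs", "training_set.txt") would produce the training_set.txt used in the command above.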

All datasets should be made available in one dataset folder for the train script.

The dataset configuration file should contain three entries: "train", "valid", and "test". Each entry holds a list of datasets (e.g. a speech, a noise, and a RIR dataset). Optionally, each dataset may specify a sampling factor that over- or under-samples it. For example, if you have a dataset of transient noises and want the training data to contain more non-stationary noise, you can oversample that dataset.

File dataset.cfg:

{
  "train": [
    [
      "TRAIN_SET_SPEECH.hdf5",
      1.0
    ]
  ]
}
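
The example above lists only the "train" entry. A configuration with all three entries can be generated as in the following sketch. All file names other than TRAIN_SET_SPEECH.hdf5 are hypothetical, as is the helper make_dataset_config, and the factor 2.0 only illustrates oversampling; how the factor is interpreted is up to the data loader.

```python
import json


def make_dataset_config(train, valid, test):
    """Build the three-entry config; each argument is a list of (file, factor)."""
    config = {}
    for split, datasets in (("train", train), ("valid", valid), ("test", test)):
        # Each dataset becomes a [file_name, sampling_factor] pair.
        config[split] = [[name, float(factor)] for name, factor in datasets]
    return config


config = make_dataset_config(
    train=[
        ("TRAIN_SET_SPEECH.hdf5", 1.0),
        ("TRAIN_SET_NOISE.hdf5", 1.0),            # hypothetical file name
        ("TRAIN_SET_NOISE_TRANSIENT.hdf5", 2.0),  # hypothetical, oversampled
    ],
    valid=[("VALID_SET_SPEECH.hdf5", 1.0), ("VALID_SET_NOISE.hdf5", 1.0)],  # hypothetical
    test=[("TEST_SET_SPEECH.hdf5", 1.0), ("TEST_SET_NOISE.hdf5", 1.0)],     # hypothetical
)

# Print the JSON text that would be saved as dataset.cfg.
print(json.dumps(config, indent=2))
```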

Finally, start the training script. If the model base_dir does not exist, the training script may create it; it is used for logging, some audio samples, model checkpoints, and the config. If no config file is found, a default config is created. See DeepFilterNet/pretrained_models/DeepFilterNet for an example config file.

# usage: train.py [-h] [--debug] data_config_file data_dir base_dir
python df/train.py path/to/dataset.cfg path/to/data_dir/ path/to/base_dir/
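
Since all datasets must sit in one data directory, it can help to check the configuration against that directory before starting a long training run. This is a minimal sketch that is not part of the repository; it assumes the configuration is plain JSON in the layout shown above and that an entry without a sampling factor is either a one-element list or a bare file name. The function name missing_datasets is hypothetical.

```python
import json
from pathlib import Path


def missing_datasets(config_file, data_dir):
    """Return the datasets listed in the config that are absent from data_dir."""
    config = json.loads(Path(config_file).read_text(encoding="utf-8"))
    missing = []
    for split in ("train", "valid", "test"):
        for entry in config.get(split, []):
            # Assumption: an entry is [file_name, factor], [file_name], or file_name.
            name = entry[0] if isinstance(entry, (list, tuple)) else entry
            if not (Path(data_dir) / name).is_file():
                missing.append(f"{split}: {name}")
    return missing
```

An empty result from missing_datasets("path/to/dataset.cfg", "path/to/data_dir/") means every listed file was found.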

Citation

This code accompanies the paper 'DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering'.

@misc{schröter2021deepfilternet,
      title={DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering}, 
      author={Hendrik Schröter and Alberto N. Escalante-B. and Tobias Rosenkranz and Andreas Maier},
      year={2021},
      eprint={2110.05588},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

License

DeepFilterNet is free and open source! All code in this repository is dual-licensed under either:

  • MIT License
  • Apache License, Version 2.0

at your option. This means you can select the license you prefer!

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
