Asteroid: Audio Source Separation on steroids

Asteroid is a PyTorch-based source separation and speech enhancement API that enables fast experimentation on common datasets. It comes with source code written to support a large range of architectures and a set of recipes to reproduce some papers.
Asteroid is intended to be a community-based project, so hop on and help us!

Guiding principles

User friendliness. Asteroid's API offers simple solutions for most common use cases.
Modularity. Building blocks are designed to be seamlessly plugged together. Filterbanks, encoders, maskers, decoders and losses are all common building blocks that can be combined flexibly to create new systems.
Extensibility. Extending Asteroid with new features is simple. Add a new filterbank, separator, architecture, dataset or even recipe very easily.
Reproducibility. Recipes provide an easy way to reproduce results, with data preparation, training and evaluation in a single script.
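
To make the modularity principle concrete, here is a minimal sketch of how an encoder, a masker and a decoder can be plugged together. It uses only plain torch.nn modules and is not Asteroid's actual API: the class names EncoderMaskerDecoder and SigmoidMasker, the layer sizes and the random placeholder targets are all assumptions made for illustration.

```python
import torch
from torch import nn


class SigmoidMasker(nn.Module):
    """Hypothetical masker: a 1x1 convolution producing one mask per source."""

    def __init__(self, n_filters, n_src):
        super().__init__()
        self.n_src = n_src
        self.net = nn.Conv1d(n_filters, n_src * n_filters, kernel_size=1)

    def forward(self, tf_rep):
        batch, n_filters, frames = tf_rep.shape
        scores = self.net(tf_rep).reshape(batch, self.n_src, n_filters, frames)
        return torch.sigmoid(scores)  # mask values in (0, 1)


class EncoderMaskerDecoder(nn.Module):
    """Hypothetical wrapper: encode the mixture, mask it per source, decode."""

    def __init__(self, encoder, masker, decoder):
        super().__init__()
        self.encoder = encoder
        self.masker = masker
        self.decoder = decoder

    def forward(self, mixture):
        # mixture: (batch, time); Conv1d expects a channel axis.
        tf_rep = self.encoder(mixture.unsqueeze(1))   # (batch, n_filters, frames)
        masks = self.masker(tf_rep)                   # (batch, n_src, n_filters, frames)
        masked = masks * tf_rep.unsqueeze(1)          # broadcast over sources
        batch, n_src, n_filters, frames = masked.shape
        # Fold the source axis into the batch axis so one decoder serves all sources.
        decoded = self.decoder(masked.reshape(batch * n_src, n_filters, frames))
        return decoded.reshape(batch, n_src, -1)      # (batch, n_src, time)


# Assumed sizes, chosen only so the shapes are easy to follow.
n_filters, kernel_size, stride, n_src = 64, 16, 8, 2
encoder = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size, stride=stride, bias=False)
model = EncoderMaskerDecoder(encoder, SigmoidMasker(n_filters, n_src), decoder)

mixture = torch.randn(4, 8000)                        # 4 random "mixtures"
estimates = model(mixture)                            # (4, 2, 8000)
# Random targets stand in for reference sources; any loss could be swapped in.
loss = nn.functional.mse_loss(estimates, torch.randn_like(estimates))
```

Because each block is an independent module, swapping the learned encoder for another filterbank, or the masker for a deeper network, only changes the arguments passed to the wrapper.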

🚧 ⚠️ Under development ⚠️ 🚧

Installation

To install Asteroid, clone the repo and install it using pip or python:

git clone https://github.com/mpariente/AsSteroid
cd AsSteroid
# Install with pip (in editable mode)
pip install -e .
# Install with python
python setup.py install

Running a recipe

cd egs/wham/ConvTasNet
./run.sh

More information in egs/README.md.

Recipes

ConvTasnet (Luo et al.)
Tasnet (Luo et al.)
Deep clustering (Hershey et al. and Isik et al.)
Chimera ++ (Luo et al. and Wang et al.)
FurcaNeXt (Shi et al.)
DualPathRNN (Luo et al.)
Two step learning (Tzinis et al.)

Writing your own recipe

Contributing

See our contributing guidelines.

Codebase structure

├── asteroid                 # Python package / Source code
│   ├── data                 # Data classes, DataLoaders maker.
│   ├── engine               # Training classes: losses, optimizers and trainer.
│   ├── filterbanks          # Common filterbanks and related classes.
│   ├── masknn               # Separation building blocks and architectures.
│   └── utils.py
├── examples                 # Simple asteroid examples
└── egs                      # Recipes for all datasets and systems.
    ├── wham                 # Recipes for one dataset (WHAM)
    │   ├── ConvTasNet       # ConvTasNet system on the WHAM dataset.
    │   │   └── ...          # Recipe's structure. See egs/README.md for more info
    │   ├── Your recipe      # More recipes on the same dataset (Including yours)
    │   ├── ...
    │   └── DualPathRNN
    └── Your dataset         # More datasets (Including yours)

Why Asteroid?

Audio source separation and speech enhancement are fast-evolving fields with a growing number of papers submitted to conferences each year. While datasets such as wsj0-{2, 3}mix, WHAM or MS-SNSD are being shared, there has been little effort to create common codebases for the development and evaluation of source separation and speech enhancement algorithms. Here is one!
