Self-Supervised VQ-VAE for One-Shot Music Style Transfer

This is the code repository for the ICASSP 2021 paper Self-Supervised VQ-VAE for One-Shot Music Style Transfer by Ondřej Cífka, Alexey Ozerov, Umut Şimşekli, and Gaël Richard.

Copyright 2020 InterDigital R&D and Télécom Paris.

Links

🔬 Paper preprint [pdf]
🎵 Supplementary website with audio examples
🎤 Demo notebook
🧠 Trained model parameters (212 MB)

Contents

  • src – the main codebase (the ss-vq-vae package); install with pip install ./src; usage details below
  • data – Jupyter notebooks for data preparation (details below)
  • experiments – model configuration, evaluation, and other experiment-related files

Setup

pip install -r requirements.txt
pip install ./src

Usage

To train the model, go to experiments, then run:

python -m ss_vq_vae.models.vqvae_oneshot --logdir=model train

This assumes that the training data has already been prepared (see Datasets below).

To run the trained model on a dataset, substitute run for train and specify the input and output paths as arguments (use run --help for more information). Alternatively, see the colab_demo.ipynb notebook for how to run the model from Python code.
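
As a rough illustration of the two invocations described above, the following sketch builds the command lines from Python. The helper name build_command and its parameters are hypothetical and not part of the ss-vq-vae package; only the module name, the --logdir option, the train and run sub-commands, and run --help come from this README. The input and output path arguments are deliberately left out, since their exact form should be checked with run --help.

```python
import sys

# Module name as given in the training command in this README.
MODULE = "ss_vq_vae.models.vqvae_oneshot"


def build_command(action, logdir="model", extra_args=()):
    """Return an argv list for the given sub-command (hypothetical helper).

    `action` is "train" or "run"; `extra_args` are appended unchanged
    after the sub-command (for example ["--help"]).
    """
    if action not in ("train", "run"):
        raise ValueError(f"unsupported action: {action!r}")
    return [sys.executable, "-m", MODULE, f"--logdir={logdir}", action, *extra_args]


# Command for training, and command for listing the options of `run`.
train_cmd = build_command("train")
run_help_cmd = build_command("run", extra_args=["--help"])
```

Either list can be passed to subprocess.run(cmd, cwd="experiments", check=True), which mirrors the instruction to run the command from the experiments directory.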

Datasets

Each dataset used in the paper has a corresponding directory in data, containing a Jupyter notebook called prepare.ipynb for preparing the dataset:

  • the entire training and validation dataset: data/comb; combined from LMD and RT (see below)
  • Lakh MIDI Dataset (LMD), rendered as audio using SoundFonts
    • the part used as training and validation data: data/lmd/audio_train
    • the part used as the 'artificial' test set: data/lmd/audio_test
    • both require downloading the raw data and pre-processing it using data/lmd/note_seq/prepare.ipynb
    • the following SoundFonts are required (available here and here): FluidR3_GM.sf2, TimGM6mb.sf2, Arachno SoundFont - Version 1.0.sf2, Timbres Of Heaven (XGM) 3.94.sf2
  • RealTracks (RT) from Band-in-a-Box UltraPAK 2018 (not freely available): data/rt
  • Mixing Secrets data
    • the 'real' test set: data/mixing_secrets/test
    • the set of triplets for training the timbre metric: data/mixing_secrets/metric_train
    • both require downloading and pre-processing the data using data/mixing_secrets/download.ipynb
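
The dependencies listed above suggest an order in which the notebooks could be run. The sketch below records that order and reports which notebooks or SoundFont files are missing. The ordering is inferred from the list above and has not been checked against the notebooks themselves. The function names and the soundfont_dir parameter are hypothetical, because this README does not say where the SoundFonts should be stored.

```python
from pathlib import Path

# Notebook paths relative to the repository root. Each notebook is listed
# after the ones it appears to depend on (order inferred from the README).
PREPARATION_ORDER = [
    "data/lmd/note_seq/prepare.ipynb",    # raw LMD data and pre-processing
    "data/lmd/audio_train/prepare.ipynb",
    "data/lmd/audio_test/prepare.ipynb",
    "data/rt/prepare.ipynb",              # needs Band-in-a-Box UltraPAK 2018
    "data/comb/prepare.ipynb",            # combines LMD and RT
    "data/mixing_secrets/download.ipynb",
    "data/mixing_secrets/test/prepare.ipynb",
    "data/mixing_secrets/metric_train/prepare.ipynb",
]

# SoundFont file names exactly as listed in the README.
REQUIRED_SOUNDFONTS = [
    "FluidR3_GM.sf2",
    "TimGM6mb.sf2",
    "Arachno SoundFont - Version 1.0.sf2",
    "Timbres Of Heaven (XGM) 3.94.sf2",
]


def missing_notebooks(repo_root):
    """Return the notebooks from PREPARATION_ORDER not found under repo_root."""
    root = Path(repo_root)
    return [p for p in PREPARATION_ORDER if not (root / p).is_file()]


def missing_soundfonts(soundfont_dir):
    """Return the required SoundFont file names not found in soundfont_dir."""
    directory = Path(soundfont_dir)
    return [n for n in REQUIRED_SOUNDFONTS if not (directory / n).is_file()]
```

Calling missing_notebooks(".") from the repository root and missing_soundfonts on the chosen SoundFont directory gives a quick checklist before starting the data preparation.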

Acknowledgment

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 765068.
