Scalable Factorized Hierarchical Variational Autoencoders

This repository contains (refactored) code to reproduce the core results from the two papers cited below.

A previous version of the code can be found here

If you find the code useful, please cite

@inproceedings{hsu2017learning,
  title={Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data},
  author={Hsu, Wei-Ning and Zhang, Yu and Glass, James},
  booktitle={Advances in Neural Information Processing Systems},
  year={2017},
}
@article{hsu2018scalable,
  title={Scalable Factorized Hierarchical Variational Autoencoder Training},
  author={Hsu, Wei-Ning and Glass, James},
  journal={arXiv preprint arXiv:1804.03201},
  year={2018},
  arxiv={1804.03201},
}

Dependencies

This project uses Python 2.7.6. Before running the code, you have to install

The first nine dependencies can be installed with pip by running

pip install -r requirements.txt

The last one requires a version of Kaldi from before a specific commit (d1e1e3b). If you don't have such a version of Kaldi, you can install both Kaldi and Kaldi-Python by running

make all

Getting Started

The main source code is in ./fhvae/. ./scripts/ contains runnable Python scripts. Example preprocessing scripts are in ./examples/.

Two dataset formats are supported: Kaldi and Numpy. Each dataset should be stored in ./datasets/<dataset_name>/<set_name>/, where <set_name> is one of {train,dev,test}. Each set folder should contain a feats.scp and a len.scp. *.scp files follow Kaldi's script-file format, where each line is:

sequence-id value

The value in len.scp is an integer giving the length of the feature sequence, and the value in feats.scp is either *.npy (Numpy format) or *.ark:<offset> (Kaldi format).
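
As an illustration of the format above, here is a minimal sketch of reading such files in Python. The helper names read_scp and read_lengths are hypothetical and are not part of this repository. The sketch assumes the sequence-id contains no whitespace and is separated from the value by whitespace.

```python
import io


def read_scp(path):
    """Parse a Kaldi-style script file into a {sequence-id: value} dict.

    Hypothetical helper for illustration; not part of this repository.
    """
    entries = {}
    with io.open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the first run of whitespace only, so the value
            # (for example a path) is kept intact.
            parts = line.split(None, 1)
            if len(parts) != 2:
                raise ValueError(
                    "%s:%d: expected 'sequence-id value'" % (path, lineno))
            seq_id, value = parts
            entries[seq_id] = value
    return entries


def read_lengths(path):
    """Read a len.scp file, converting each value to an integer."""
    return {seq_id: int(value) for seq_id, value in read_scp(path).items()}
```

For example, read_lengths("./datasets/<dataset_name>/train/len.scp") would return a dict mapping each sequence-id to its length.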

Given a wav.scp file, these files can be prepared with ./scripts/preprocess/prepare_kaldi_data.py (Kaldi format) or ./scripts/preprocess/prepare_numpy_data.py (Numpy format).
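
Once the files are prepared, a quick check of the directory layout may catch missing files before training. The following is only a sketch: check_dataset is a hypothetical helper, not part of this repository. It assumes that feats.scp and len.scp in a set should list the same sequence ids, which the description above does not state explicitly.

```python
import io
import os


def check_dataset(root, set_names=("train", "dev", "test")):
    """Return a list of problems found under a dataset directory.

    Hypothetical helper for illustration; not part of this repository.
    `root` is expected to look like ./datasets/<dataset_name>.
    """
    problems = []
    for set_name in set_names:
        set_dir = os.path.join(root, set_name)
        ids = {}
        for fname in ("feats.scp", "len.scp"):
            path = os.path.join(set_dir, fname)
            if not os.path.isfile(path):
                problems.append("missing " + path)
                continue
            with io.open(path, encoding="utf-8") as f:
                # The sequence-id is the first whitespace-separated field.
                ids[fname] = set(
                    line.split(None, 1)[0] for line in f if line.strip())
        # Assumption: both files should describe the same sequences.
        if len(ids) == 2 and ids["feats.scp"] != ids["len.scp"]:
            problems.append(
                "sequence ids differ between feats.scp and len.scp in "
                + set_dir)
    return problems
```

An empty list means every checked set folder contains both files with matching sequence ids. Pass a shorter set_names tuple if the dataset has no dev or test split.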

Before running any code, source the environment script to update $PYTHONPATH:

. ./env.sh

Preprocessing

We now provide Numpy preprocessing recipes for TIMIT and LibriSpeech, starting from a raw data directory:

python ./examples/prepare_timit_numpy <TIMIT_DIR>	# TIMIT
python ./examples/prepare_librispeech_numpy <LIBRISPEECH_DIR> # LibriSpeech

Use -h to see more options.

Training with Hierarchical Sampling

python ./scripts/train/run_hs_train.py --dataset=timit_np_fbank --is_numpy --nmu2=2000

Experiments will be saved to ./exp/timit_np_fbank/<exp_name>.

Training without Hierarchical Sampling (original FHVAE training)

python ./scripts/train/run_train.py --dataset=timit_np_fbank --is_numpy

Experiments will be saved to ./exp/timit_np_fbank/<exp_name>.

Evaluation

python scripts/eval/run_eval.py ./exp/timit_np_fbank/<exp_name> --seqlist=./misc/timit_eval.txt

Use --seqlist to specify which sequences to use for qualitative evaluation. Results will be saved to ./exp/timit_np_fbank/<exp_name>/img.
