ToyADMOS dataset

The ToyADMOS dataset is a collection of machine operating sounds: approximately 540 hours of normal operating sounds and over 12,000 samples of anomalous sounds, recorded with four microphones at a 48 kHz sampling rate. It was prepared by Yuma Koizumi and colleagues at NTT Media Intelligence Laboratories for research on anomaly detection in machine operating sounds (ADMOS). We collected normal operating sounds from miniature machines and produced anomalous sounds by deliberately damaging their components. The dataset covers three ADMOS tasks: product inspection (toy car), fault diagnosis for a fixed machine (toy conveyor), and fault diagnosis for a moving machine (toy train). For more information, refer to the paper [1]. If you use the ToyADMOS dataset in your work, please cite this paper, where it was introduced.

[1] Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of the Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. Paper URL: https://arxiv.org/abs/1908.03299

Download:

The dataset can be downloaded at https://zenodo.org/record/3351307#.XT-JZ-j7QdU.

Because the uncompressed ToyADMOS dataset exceeds 440 GB, each sub-dataset is split into 7 to 9 archive files in 7z format (created with 7-Zip). The compressed dataset totals approximately 180 GB, or about 60 GB per sub-dataset. Download the archive files for the sub-datasets you need and extract the split archives with any tool that supports the 7z format.
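
As a rough sketch of the extraction step, the following Python snippet locates the first volume of each split archive in a download directory and passes it to the 7-Zip command-line tool. It assumes that the volumes follow the common `name.7z.001`, `name.7z.002`, ... naming scheme and that a `7z` executable is on the PATH. Neither is confirmed by this README, so adjust the pattern and the command to match the files you actually downloaded. The function names are illustrative only.

```python
import subprocess
from pathlib import Path


def first_volumes(download_dir):
    """Return the first volume of every split archive in download_dir.

    Assumes volumes are named like 'name.7z.001', 'name.7z.002', ...
    (an assumption; check the actual file names on Zenodo).
    """
    return sorted(Path(download_dir).glob("*.7z.001"))


def extract_command(first_volume, out_dir):
    """Build the 7-Zip command line for one split archive.

    Given the first volume, 7z reads the remaining volumes by itself.
    """
    return ["7z", "x", str(first_volume), "-o" + str(out_dir)]


def extract_all(download_dir, out_dir):
    """Extract every split archive found in download_dir into out_dir."""
    for volume in first_volumes(download_dir):
        subprocess.run(extract_command(volume, out_dir), check=True)
```

For example, `extract_all("downloads", "ToyADMOS")` would extract every archive found in a hypothetical `downloads` directory into `ToyADMOS`. Make sure all volumes of a sub-dataset are in the same directory before extracting.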

Detailed description of dataset

See the file named DETAIL.pdf

Usage examples

To illustrate how the dataset can be used, Python scripts for data generation, training, and testing are provided.

Tutorial on the small training/test datasets described in [1]:
- Download "C01_create_small_INT_dataset", "E01_simple_AE_test", and "anomaly_conditions".
- Run "make_dataset_for_car_and_conveyor.py" and "make_dataset_for_train.py" in "C01_create_small_INT_dataset" to create the dataset.
- Run "01_train.py" in "E01_simple_AE_test" to train a model.
- Run "02_test.py" in "E01_simple_AE_test" to evaluate the model.
- Note that the paths in each script need to be changed to match your environment.

[20 Feb. 2020] Note that the description of the gain parameters in our paper was wrong.
Original: To control the signal-to-noise ratio, we multiplied 3.16 (+10 dB) by the waveforms of target sounds in the toy-car and toy-conveyor sub-datasets and by the waveforms of noise sounds in the toy-train sub-dataset.
Correct: To control the signal-to-noise ratio, we multiplied 3.16 (+10 dB) by the waveforms of target sounds in the toy-train sub-dataset and by the waveforms of noise sounds in the toy-car and toy-conveyor sub-datasets.
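
The corrected gain description can be illustrated with a small sketch. The factor 3.16 is the linear amplitude ratio for +10 dB, since 10^(10/20) is approximately 3.16. The function names, the sub-dataset labels, the additive mixing, and the plain-list waveforms below are illustrative assumptions. They are not taken from the provided scripts, which may implement the mixing differently.

```python
def db_to_amplitude(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def mix(target, noise, sub_dataset, gain_db=10.0):
    """Mix target and noise waveforms sample by sample.

    Following the corrected description, the gain is applied to the target
    sound for the toy-train sub-dataset and to the noise for the toy-car and
    toy-conveyor sub-datasets. The labels used here are hypothetical.
    """
    gain = db_to_amplitude(gain_db)
    if sub_dataset == "train":
        target_gain, noise_gain = gain, 1.0
    elif sub_dataset in ("car", "conveyor"):
        target_gain, noise_gain = 1.0, gain
    else:
        raise ValueError("unknown sub-dataset: " + sub_dataset)
    return [target_gain * t + noise_gain * n for t, n in zip(target, noise)]


print(round(db_to_amplitude(10.0), 2))  # 3.16
```

With real recordings, the same scaling would normally be applied to NumPy arrays rather than Python lists.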

We have tested these scripts in the following environment:

Python: 3.6.8
Chainer: 4.5.0
NumPy: 1.16.2
CuPy:
  CuPy Version          : 4.1.0
  CUDA Build Version    : 9000
  CUDA Driver Version   : 10000
  CUDA Runtime Version  : 9000
  cuDNN Build Version   : 7104
  cuDNN Version         : 7600
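
To compare a local setup against the versions listed above, a minimal check along the following lines may help. It is only a sketch: the `TESTED` table simply restates the versions from this README, the helper names are made up for illustration, and it does not inspect the CUDA or cuDNN versions.

```python
import importlib
import sys

# Versions listed in this README as the tested environment.
TESTED = {"chainer": "4.5.0", "numpy": "1.16.2", "cupy": "4.1.0"}


def installed_versions(names):
    """Return {module name: version string, or None if unavailable}."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", None)
    return found


def report():
    """Print installed versions next to the tested ones."""
    print("Python {} (tested: 3.6.8)".format(sys.version.split()[0]))
    for name, version in installed_versions(TESTED).items():
        print("{}: {} (tested: {})".format(
            name, version or "not installed", TESTED[name]))
```

Calling `report()` prints one line per package. Newer versions of these libraries may or may not run the scripts unchanged, since only the environment above is stated as tested.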

License:

See the file named LICENSE.pdf

Authors and Contact
