- Original version:
pip install tensorflow==1.9 keras==2.1.5
- Updated 2020.04.20:
pip install tensorflow==1.15.2 keras
- or, if you have pipenv:
pipenv install
- Download ffmpeg from [ffmpeg](https://ffmpeg.org/); select static linking and you should get a zip file.
- Extract the zip file into the `ffmpeg` folder, so that `ffmpeg/bin/ffmpeg.exe` exists.
- Download sox from SoX (Sound eXchange); you should get a zip file.
- Extract the zip file into the `sox` folder, so that `sox/sox.exe` exists.
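Before running any of the scripts below, you can sanity-check that both tools landed in the expected places. A minimal sketch (the helper name is ours, not part of the repo):

```python
from pathlib import Path

def check_tools(root="."):
    """Return the expected tool paths that are missing under root.

    The paths follow the Windows layout described above; the helper
    name is illustrative, not part of the repo.
    """
    expected = ("ffmpeg/bin/ffmpeg.exe", "sox/sox.exe")
    return [p for p in expected if not (Path(root) / p).is_file()]
```

An empty returned list means both binaries are where the scripts expect them.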
Convert recorded audio files to *.wav files
python ./convert_file.py <Data Folder>
The `<Data Folder>` should contain subfolders where your audio files reside. Typically, one of your audio files could be `<Data Folder>/group1/a.mp3`.
The results of the conversion are written to `./data/train/`. You should manually move some of them to `./data/test` to create a training-validation split. The fraction of files you move is up to you.
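The manual move described above can also be scripted. A minimal sketch, assuming the converted `.wav` files sit under `data/train/` (the function name and the 10% default are our choices, not the repo's):

```python
import random
import shutil
from pathlib import Path

def split_validation(train_dir="data/train", test_dir="data/test",
                     fraction=0.1, seed=0):
    """Move a random fraction of .wav files from train_dir to test_dir.

    Returns the number of files moved. The fraction is up to you;
    0.1 is just an illustrative default.
    """
    wavs = sorted(Path(train_dir).rglob("*.wav"))
    random.Random(seed).shuffle(wavs)  # seeded for reproducibility
    n_move = int(len(wavs) * fraction)
    dest = Path(test_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for wav in wavs[:n_move]:
        shutil.move(str(wav), str(dest / wav.name))
    return n_move
```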
The data augmentation server is implemented with gRPC.
pip install grpcio
or, for some Python 3 installations:
pip3 install grpcio
Training involves two parts: `train.py` and `augmentation/`.
`python -m augmentation` will start an augmentation server that provides training and test data.
`train.py` will connect to the augmentation server and request data.
`augmentation/config.py` is used to configure the batch size/thread size/data source/...
Before training, there are several things you should check. You already did them in Data preparation; now check them again.
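The kind of options `augmentation/config.py` holds might look like the following. This is a hypothetical sketch; the names and values are illustrative only, so consult the real file for the actual settings:

```python
# Hypothetical sketch of augmentation/config.py settings.
# Names and values here are illustrative, not the real file.
BATCH_SIZE = 32                 # batch size served to train.py
THREAD_COUNT = 4                # worker threads feeding augmentation
TRAIN_DATA_DIR = "data/train"   # data source for training batches
TEST_DATA_DIR = "data/test"     # data source for validation batches
```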
- Put training data into `data/train/`.
- Put validation data into `data/test/`.
- NOTE: the wav files must be encoded as 16-bit signed integers, mono-channel, at a sampling rate of 16000 Hz.
- These will already be correct if you obtained the files from `convert_file.py`.
- You should have sox in `sox/`; check it again now.
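The encoding requirement in the checklist can be verified programmatically with the stdlib `wave` module. A minimal sketch (the helper name is ours):

```python
import wave

def is_valid_wav(path):
    """Check that a wav file matches the required encoding:
    16-bit signed PCM (2-byte samples), mono, 16000 Hz."""
    with wave.open(path, "rb") as w:
        return (w.getsampwidth() == 2 and
                w.getnchannels() == 1 and
                w.getframerate() == 16000)
```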
Server side: `python -m augmentation`
- This will start an augmentation server utilizing `sox`.
Client side: `python train.py`
- This will start training with data requested from the augmentation server.
- NOTE: run it from the `audioNet` folder.
** Resume an interrupted training process.
You can resume from a certain checkpoint: modify the last line of `train.py` and set `-1` (negative 1) as your start point.
Modify `webfront.py` and change `MODEL_ID` to yours.
Open a web browser and go to the URL http://127.0.0.1:5000/predict.
* It requires [ffmpeg](https://ffmpeg.org/) for audio file format conversion.
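If you prefer to hit the endpoint from a script instead of a browser, a stdlib-only sketch follows. The form field name `file` and the multipart upload format are assumptions about `webfront.py`, not documented behavior:

```python
import uuid
import urllib.request

def build_multipart(field, filename, data):
    """Build a multipart/form-data body using only the stdlib."""
    boundary = uuid.uuid4().hex
    head = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ) % (boundary, field, filename)
    body = head.encode() + data + ("\r\n--%s--\r\n" % boundary).encode()
    return body, "multipart/form-data; boundary=" + boundary

def predict(wav_path, url="http://127.0.0.1:5000/predict"):
    """POST an audio file to the webfront and return the raw response.

    The field name "file" is an assumption about webfront.py.
    """
    with open(wav_path, "rb") as f:
        body, ctype = build_multipart("file", wav_path, f.read())
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": ctype})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()
```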
** Select Checkpoint for Evaluation
Modify `webfront.py` and change `MODEL_ID` to yours.
See Run `python webfront.py`.
- Choose a checkpoint `ID` yourself from `models/save_<ID>.h5`.
- Run `python ./create_pb.py <ID>`. This will create the file `models/model.pb`.
- Place your `model.pb` file where you want to deploy it. Typically, see the Android mobile example: androidAudioRecg.