Python 3.8, Linux (macOS is not supported either, unless the system compiler is gcc rather than clang)
pip install -r ./requirements.txt
pip install https://github.com/kpu/kenlm/archive/master.zip
wget https://github.com/karoldvl/ESC-50/archive/master.zip
unzip master.zip
wget https://www.openslr.org/resources/11/3-gram.arpa.gz
gunzip 3-gram.arpa.gz
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1hC0HbwZ3dfhEYYvozUa-8rHjZl9QZwf6' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1hC0HbwZ3dfhEYYvozUa-8rHjZl9QZwf6" -O default_test_model/checkpoint.pth && rm -rf /tmp/cookies.txt
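After running the commands above, it can help to confirm that the downloads actually landed where the rest of the project expects them. The following is a minimal sketch, not part of the template. The helper name `missing_artifacts` is hypothetical, and the directory name `ESC-50-master` is an assumption about what unzipping the ESC-50 archive produces.

```python
from pathlib import Path

# Paths the installation commands are expected to produce, relative to the
# repository root. "ESC-50-master" is an assumed name for the unzipped folder.
EXPECTED_ARTIFACTS = (
    "ESC-50-master",
    "3-gram.arpa",
    "default_test_model/checkpoint.pth",
)


def missing_artifacts(root=".", expected=EXPECTED_ARTIFACTS):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing after installation:", ", ".join(missing))
    else:
        print("All expected files are in place.")
```

Run it from the repository root; an empty result means every listed path exists.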
You might be a little intimidated by the number of folders and classes. Try to follow these steps to gradually understand the workflow.
- Run hw_asr/tests/test_dataset.py and hw_asr/tests/test_config.py and make sure everything works for you
- Implement missing functions to fix tests in hw_asr/tests/test_text_encoder.py
- Implement missing functions to fix tests in hw_asr/tests/test_dataloader.py
- Implement functions in hw_asr/metric/utils.py
- Implement the missing function to run train.py with a baseline model
- Write your own model and try to overfit it on a single batch
- (Pain and suffering) Implement your own models and train them. You've mastered this template when you can tune your experimental setup just by tuning the configs.json file and running train.py
- Don't forget to write a report about your work
- Get hired by Google the next day
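For the metric step in the list above, speech recognition quality is commonly measured with word error rate (WER) and character error rate (CER), both built on edit distance. The source does not say which functions hw_asr/metric/utils.py expects, so the names `edit_distance`, `word_error_rate` and `char_error_rate` below are hypothetical, and the handling of an empty reference is one possible convention. Treat this as a sketch of the idea rather than the expected solution.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def word_error_rate(target, predicted):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = target.split(), predicted.split()
    if not ref:
        # Assumed convention: 0.0 if both are empty, 1.0 otherwise.
        return float(bool(hyp))
    return edit_distance(ref, hyp) / len(ref)


def char_error_rate(target, predicted):
    """Character-level edit distance divided by the reference length."""
    if not target:
        # Same assumed convention as for the word-level metric.
        return float(bool(predicted))
    return edit_distance(target, predicted) / len(target)
```

For example, `word_error_rate("the cat sat", "the dog sat")` counts one substitution over three reference words. Note that both rates can exceed 1.0 when the prediction is much longer than the reference.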
- Make sure your project runs on a new machine after completing the installation guide
- Search the project for # TODO: your code here and implement the missing functionality
- Make sure all tests run without errors:
python -m unittest discover hw_asr/tests
- Make sure test.py runs and behaves as expected. You should create the file default_test_config.json, and your installation guide should download your model checkpoint and config to default_test_model/checkpoint.pth and default_test_model/config.json.
python test.py \
   -c default_test_config.json \
   -r default_test_model/checkpoint.pth \
   -t test_data \
   -o test_result.json
- Use train.py for training
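The checklist above asks you to search the project for the TODO marker. Any editor or grep will do; the sketch below shows the same search in Python in case you want it as part of a pre-submission script. The function name `find_todos` is hypothetical, and it only looks at .py files, which is an assumption about where the markers live.

```python
from pathlib import Path

MARKER = "# TODO: your code here"


def find_todos(root=".", marker=MARKER):
    """Return (path, line_number, stripped_line) for each line containing `marker`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip files that cannot be read as UTF-8 text
        for lineno, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_todos("."):
        print(f"{path}:{lineno}: {line}")
```

If you save this script inside the project tree, it will report its own MARKER line, so keep it outside the scanned directory or ignore that hit.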
This repository is based on a heavily modified fork of the pytorch-template repository.
This barebones template could use more tests. We highly encourage students to create pull requests to add more tests or new functionality. Currently wanted:
- Tests for beam search
- W&B logger backend
- README section to describe folders
- Notebook to show how to work with ConfigParser and config_parser.init_obj(...)
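Until such a notebook exists, the general idea behind an init_obj-style helper can be illustrated in isolation. The sketch below is a standalone stand-in and not this repository's ConfigParser. It assumes the convention used in pytorch-template configs, where an entry names a class under "type" and its keyword arguments under "args". The actual signature of config_parser.init_obj in this fork may differ, so check the source before relying on it.

```python
import datetime


def init_obj(obj_config, default_module, *args, **kwargs):
    """Instantiate the class named by obj_config["type"] from `default_module`.

    Hypothetical illustration of the pattern: keyword arguments come from
    obj_config["args"], and extra ones can be supplied from code.
    """
    cls = getattr(default_module, obj_config["type"])
    config_args = dict(obj_config.get("args", {}))
    overlap = set(config_args) & set(kwargs)
    if overlap:
        # Refuse to silently override values that the config already sets.
        raise ValueError(f"Given both in config and in code: {sorted(overlap)}")
    config_args.update(kwargs)
    return cls(*args, **config_args)


if __name__ == "__main__":
    # A standard-library class stands in for a model or optimizer here.
    config = {"type": "timedelta", "args": {"hours": 1}}
    delta = init_obj(config, datetime, minutes=30)
    print(delta.total_seconds())
```

The appeal of this design is that swapping a component becomes a config change: only the "type" string and its "args" need editing, while the calling code stays the same.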