Comments (8)

popcornell commented on August 23, 2024

@mpariente and I have discussed this since the beginning of Asteroid.
Basically, it is now super flexible, but in my opinion it is also confusing.
To run the full recipe, however, you also have to pass the paths to where the .json data is saved, which the user can modify in run.sh (otherwise the user has to modify paths both in the yaml and in run.sh).

My suggestion is to keep the paths in run.sh and every hyperparameter in the yaml file. Paths are common to multiple runs, so one just specifies them in run.sh and then moves to the yaml to modify hyperparameters for successive runs.

mpariente commented on August 23, 2024

Yes, I agree that this is still confusing and not optimal, and we have discussed it several times.

One of the most important features to have, IMO, is being able to change the hparams of an experiment from the command line, so that you can use a for loop to launch experiments with different hparams without needing to duplicate your conf.yml file.
I personally never duplicate conf files, but we can change the logic to make this simpler.

@jonashaag can you have a look at this PR (#104) and tell me what you think?
I think this is close to what I'd like:

  • There is no duplication between bash and yaml
  • Every yml field can be modified from the command line
  • You could easily pass a config file to start from, I guess

The only thing holding me back from doing that is that it's "complicated" and the code can be confusing.
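
For context, the rough idea is something like this (a minimal sketch, not the actual code from #104; the helper name and the flat-key handling are just illustrative):

import argparse
import yaml

def load_conf_with_overrides(conf_path="conf.yml", cli_args=None):
    # Hypothetical helper, only to illustrate the idea: every leaf of the
    # yaml becomes an argparse option, so any field can be overridden.
    with open(conf_path) as f:
        conf = yaml.safe_load(f)
    parser = argparse.ArgumentParser()
    for section, params in conf.items():
        for key, value in params.items():
            # Naive typing: booleans, None values and duplicate keys across
            # sections would need special handling in real code.
            parser.add_argument(f"--{key}", default=value, type=type(value))
    args = parser.parse_args(cli_args)
    for section, params in conf.items():
        for key in params:
            params[key] = getattr(args, key)
    return conf

# e.g. for lr in 0.001 0.0001; do python train.py --lr $lr; done

With that, a bash for loop over --lr or --batch_size is enough to launch several runs from the same conf.yml.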

jonashaag commented on August 23, 2024

Looks good to me, but I wonder: why roll our own? We could use something like https://github.com/bw2/ConfigArgParse instead (I've never used it, but it looks good).
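
From skimming its README (untested, treat this as a sketch; the option names here are made up), usage would look roughly like:

import configargparse

# Values can come from a config file, environment variables or the command
# line, with the command line taking precedence.
parser = configargparse.ArgumentParser(
    default_config_files=["conf.yml"],
    config_file_parser_class=configargparse.YAMLConfigFileParser,
)
parser.add_argument("-c", "--my-config", is_config_file=True, help="extra config file")
parser.add_argument("--lr", type=float, default=1e-3)
parser.add_argument("--batch_size", type=int, default=8)
options = parser.parse_args()
print(parser.format_values())  # shows where each value came from

That would cover the "every field overridable from the CLI" part without custom code.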

For my use cases, I think the best case would be something along these lines:

  • Have a single "model command runner" Python file that knows how to preprocess, train, eval, and infer for a certain model. So, more or less a mixture of run.sh, train.py and eval.py, with the addition of inference (e.g. directly from .wav files).
  • The runner file reads a model YAML file (like conf.yaml, but without the strictly "localhost" things like data paths). The values read from the YAML file can be changed using CLI args.
  • The runner file also parses "localhost" things like data paths from CLI args. These do not make sense in the model YAML file.
  • Get rid of run.sh entirely: everything is then covered by the Python script, and this way we also do not have to deal with keeping arg parsing consistent between Bash and Python code.
  • (Optional: move dataset preprocessing code to the dataset code itself, since it is currently duplicated for every model/dataset combination.)

mpariente commented on August 23, 2024

I also thought about removing the run.sh file completely and replacing it with an omnipotent Python file.
And yes, in this case, argument parsing gets much simpler.
It's also related to #183, because unit testing Python files would be easier. And we could refactor the evaluation to be part of the lightning module.

Would you be willing to put some time into this, @jonashaag?
There are a lot of things to improve in Asteroid's code, particularly recipe-related things IMO, and your feedback is much appreciated, thanks!

@popcornell, what would you say about going full Python?

mpariente commented on August 23, 2024

Were you thinking about something like this?

python run.py dataprep  # run dataprep, train and eval (default)
python run.py train --someargs  # run train and eval
python run.py eval --otherargs  # run eval only
python run.py predict --files_glob  # perform inference
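
A bare-bones sketch of what such a run.py could look like (just to fix ideas; the stage functions are placeholders, not existing Asteroid code):

import argparse
import yaml

def run_dataprep(conf, args): ...
def run_train(conf, args): ...
def run_eval(conf, args): ...
def run_predict(conf, files_glob): ...

def main():
    parser = argparse.ArgumentParser(description="All-in-one recipe runner (sketch)")
    parser.add_argument("stage", choices=["dataprep", "train", "eval", "predict"])
    parser.add_argument("--conf", default="conf.yml", help="model/hparams yaml")
    # "localhost" things stay on the CLI rather than in the model yaml.
    parser.add_argument("--data_dir", default="data")
    parser.add_argument("--files_glob", default=None, help="wav files to run inference on")
    args = parser.parse_args()

    with open(args.conf) as f:
        conf = yaml.safe_load(f)

    if args.stage == "predict":
        run_predict(conf, args.files_glob)
        return
    # Each stage chains into the following ones, as in the commands above.
    pipeline = ["dataprep", "train", "eval"]
    for stage in pipeline[pipeline.index(args.stage):]:
        {"dataprep": run_dataprep, "train": run_train, "eval": run_eval}[stage](conf, args)

if __name__ == "__main__":
    main()

Per-stage extras (the --someargs above) could then be handled with subparsers or parse_known_args; that's a detail to settle later.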

popcornell commented on August 23, 2024

IDK, for simple things full Python is nice, and one advantage of going full Python is debugging: it's easier, especially with an IDE. But what if we have to check some dependencies in one recipe and pip install them? Sure, there are workarounds, but they are not as immediate as they are in bash. Data download is also not immediate: you always have to use subprocess.
And this comes from someone who is not a bash super fan. I prefer to avoid it when I can, but for some things it is the best tool.

mpariente commented on August 23, 2024

@popcornell is right, parts of the data preparation are a mess in Python. How about this?

  • A centralized bash script for each dataset, registered as a package script (so it is accessible everywhere)
  • A Python wrapper around that script, using subprocess, inside asteroid.data.dataset_name (see the sketch below)
  • The recipe, in Python, just calls asteroid.data.librimix.dataprep in dataprep mode

If we want to rewrite all the dataprep in Python later, we still can.
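
The wrapper part could be as thin as this (a sketch; librimix_dataprep.sh and the dataprep signature are hypothetical, just to show the subprocess call):

# asteroid/data/librimix.py (sketch)
import subprocess

def dataprep(storage_dir, librispeech_dir=None):
    # Call the centralized bash script registered as a package script.
    # The script name and its arguments are hypothetical here.
    cmd = ["librimix_dataprep.sh", str(storage_dir)]
    if librispeech_dir is not None:
        cmd.append(str(librispeech_dir))
    subprocess.run(cmd, check=True)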

jonashaag commented on August 23, 2024

I will try to spend some time understanding the similarities and differences between the models and datasets, and come up with a concept.
