
Comments (12)

popcornell commented on August 23, 2024

I think we will also do an inference engine, with a CLI interface something like:

$ asteroid infer --experiment exp/train_convtasnet_exp7/ --folder /media/sam/separate --output /folder/separated --window_size 32000

The idea is to parse everything recursively, then use overlap-add to separate every file and save the results to an output folder.
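
A rough sketch of what that inference loop could look like (overlap_add_separate and infer_folder are illustrative names, not an existing Asteroid API, and the model's call signature is assumed):

from pathlib import Path

import soundfile as sf
import torch


def overlap_add_separate(model, mix, window_size):
    """Separate a long 1-D mixture chunk by chunk, merging chunks with overlap-add."""
    hop = window_size // 2
    window = torch.hann_window(window_size)  # Hann at 50% overlap sums to unity
    chunks = mix.unfold(-1, window_size, hop)  # (n_chunks, window_size); tail is dropped
    with torch.no_grad():
        est = model(chunks)  # assumed output shape: (n_chunks, n_src, window_size)
    out = torch.zeros(est.shape[1], mix.shape[-1])
    for i in range(est.shape[0]):
        out[:, i * hop : i * hop + window_size] += est[i] * window
    return out


def infer_folder(model, in_dir, out_dir, window_size=32000):
    """Recursively separate every wav under in_dir, mirroring the tree in out_dir."""
    for wav_path in Path(in_dir).rglob("*.wav"):
        mix, sr = sf.read(wav_path, dtype="float32")
        sources = overlap_add_separate(model, torch.from_numpy(mix), window_size)
        for i, src in enumerate(sources):
            out_path = Path(out_dir) / wav_path.relative_to(in_dir).with_suffix(f".s{i}.wav")
            out_path.parent.mkdir(parents=True, exist_ok=True)
            sf.write(out_path, src.numpy(), sr)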


popcornell commented on August 23, 2024

Are you going to use click for the CLI interface?


jonashaag commented on August 23, 2024

Let's discuss implementation later, but yes, my current prototype involves dataclasses (Python 3.7+), click, autoclick (optional; to derive CLI options from dataclasses), and Lightning's DataModules.
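
To make that concrete, here is a rough sketch (not the actual prototype) of deriving click options from a dataclass without autoclick; TrainConfig and its fields are made up for illustration:

import dataclasses

import click


@dataclasses.dataclass
class TrainConfig:
    n_filters: int = 64
    batch_size: int = 8
    lr: float = 1e-3


def options_from_dataclass(cls):
    """Add one click option per dataclass field, reusing its type and default."""
    def decorator(f):
        for field in reversed(dataclasses.fields(cls)):
            f = click.option(
                f"--{field.name.replace('_', '-')}",
                type=field.type,
                default=field.default,
                show_default=True,
            )(f)
        return f
    return decorator


@click.command()
@options_from_dataclass(TrainConfig)
def train(**kwargs):
    """Run like: python train.py --n-filters 128"""
    config = TrainConfig(**kwargs)
    click.echo(config)


if __name__ == "__main__":
    train()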


mpariente commented on August 23, 2024

First, thanks a lot for the detailed design explanation, and for the time and effort you put into it!

I really love the idea of having a powerful CLI in asteroid, and I'm sure it can be beneficial.
I'm convinced that having a CLI for dataset preparation and inference will only be beneficial; I'm not 100% sure about evaluation, and I have more than mixed feelings about training. I can detail more tomorrow, but it has to do with the genericity/expressivity compromise.

But first, I have a few questions. In your opinion:

  • Would this design replace the recipes? Or live next to the recipes?
  • Who is this interface intended for? What type of users?
  • What would be the steps to include a new model?
  • Is the underlying training script shared by all the training experiments?

Actually, it all boils down to listing the use cases we want to support in Asteroid and what type of users we are "targeting", and this is not an easy question.


jonashaag commented on August 23, 2024

(Added the dataset: and model: strings to the YAML files above, which I forgot.)

It would replace most recipes (all that we can convert to the new design).

Users: People running inference on pretrained models; people training their own models; people working on new models to contribute to Asteroid; people working on new private models/datasets. So I guess everyone :-)

In the design there isn't really a training script anymore. The logic has been split entirely between dataset and model. Maybe this assumption doesn't work well with some use cases; if so, can you give an example?

Below are some ideas on what the code can look like.

Steps to include a new model

Create a new Python module (either in asteroid.* or somewhere else*) with the Torch code; a model config definition (what params the model expects); a bit of "registration" code to make it available to the CLI; done. Sketch:

class MyModel(...):
    name = "mymodel"

    def __init__(self, config: MyModelConfig):
        self.config = config

    def forward(self, ...):
        ...


class MyModelConfig:
    # Note: This is also the single source of truth for what the model's config schema is.
    # CLI and YAML format is derived from this
    def __init__(self, n_filters=64, ...):
        self.n_filters = n_filters
        ...


asteroid.register_model(MyModel, MyModelConfig)

*) If outside of asteroid.*, I'm not sure yet how we make the CLI find the module in the first place. Maybe something like asteroid --load-module myprivatestuff.models train --model-name "myprivatestuff.models.MyModel" ....
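
For the registration part, a plain dict-based registry would probably be enough. A sketch (hypothetical Asteroid internals, not existing code):

import importlib

_MODELS = {}


def register_model(model_cls, config_cls):
    """Store the model under its `name` attribute so the CLI can look it up."""
    _MODELS[model_cls.name] = (model_cls, config_cls)


def get_model(name):
    try:
        return _MODELS[name]
    except KeyError:
        raise ValueError(f"Unknown model {name!r}; registered: {sorted(_MODELS)}")


def load_user_module(module_name):
    """For --load-module: importing the module runs its register_model() calls."""
    importlib.import_module(module_name)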

Steps to include a new dataset

Same as with the model: create a new module with the Torch code; a dataset config definition (e.g. number of speakers); registration code; optionally preparation code (download + mix); optionally additional CLI commands that are specific to the dataset. Sketch:

class MyDataset(...):
    name = "mydata"

    def __init__(self, config: MyDatasetConfig):
        self.config = config

    # Required for all datasets
    def get_datamodule(self) -> LightningDataModule:
        ...

    # Optional but implemented by Asteroid
    def download(self):
        ...

    # Custom and unknown to Asteroid
    def custom_print_stats(self):
        ...


class MyDatasetConfig:
    def __init__(self, n_speakers=2, ...):
        self.n_speakers = n_speakers
        ...


cli = asteroid.register_dataset(MyDataset, MyDatasetConfig)

# Register custom additional command
@cli.command()
def print_stats(...):
    """Can be run with: asteroid data mydata print-stats"""
    dataset = asteroid_load_dataset_from_cli_args()
    dataset.custom_print_stats()


mpariente commented on August 23, 2024

This sounds really cool!

In the design there isn't really a training script anymore. The logic has been split entirely between dataset and model. Maybe this assumption doesn't work well with some use cases; if so, can you give an example?

I didn't mean a training script per se, but we need to do the training at some point. Where does this live? I guess in a general and configurable class in asteroid?

Just to be clear, I see the advantage in having generic tools that enable training any model on any dataset with no code duplication; I'm just a bit worried about how generality might hurt flexibility and the ease of getting into the code. So I'm trying to find the edge cases where having such a general framework would hurt us.

  • How would you handle a custom training loop?
  • A dataset can have any number of outputs: some that have to be passed to the model, some to the loss, others that might control the training loop, others for logging. How do you pass these to the right objects?
  • How do you differentiate between single/multi channel models? Between single/multi channel datasets?
  • How would you handle the TwoStep recipe, for example? Is it in the training config so that every model can use it? Or is it not supported because it's too specific?


jonashaag commented on August 23, 2024

Good questions. I'll try to come up with solutions for these.


mpariente commented on August 23, 2024

I'm afraid of becoming the Keras of source separation, where research gets difficult and it's harder and harder to get into the source code.

I understand that the current code has a lot of duplication, but it has one advantage: it's pretty easy to get into, because the organization is trivial and not too abstracted; you can easily change things from inside the recipes to make new ones.

Yes, training TasNet, ConvTasNet or DPRNNTasNet on wsj0-mix, WHAM, WHAMR or LibriMix doesn't require large/complex code because the recipes are really the same, and writing a CLI for it would be very nice for the users.
How would this core code grow with time? How easy would it be to modify?
I'm a bit afraid of creating an API that is far from Modules and Datasets, because it takes time to learn.

In my opinion, having a CLI for "easy" use cases would be great, but it would be hard to make it general, flexible, and modifiable (with ease).

If we are able to achieve this, with easy strategies to add custom datasets/models/losses/training loops, we can replace all the recipes with the CLI.

If not, how many of the recipes can we translate? Does it make sense to have two different modes of operation for Asteroid (verbose recipes with duplicated code, which we can also try to reduce, vs. CLI recipes)? How much effort would it require?


jonashaag commented on August 23, 2024
  • How would you handle a custom training loop?

Register a custom CLI command from a custom module and use that. It may reuse some parts of the default training loop. For example, if the "official" train command is defined as:

# asteroid.cli.default_commands

def train(model: ModelConfig, data: DataConfig, continue_from: Experiment = None, ...):
    ...

asteroid.register_cmd("train")

You can define your own:

# myprivatestuff.cli_commands
import typing

TwoStepStep = typing.Literal["filterbank", "separator"]

def mytrain(step: TwoStepStep, ...):
    """Use like this: asteroid train-twostep --step filterbank|separator ..."""

asteroid.register_cmd("train-twostep")

Btw, I don't insist on doing this via the CLI. Instead of asteroid data librimix download we could also say that users should run import asteroid.data.librimix; asteroid.data.librimix.download("/target/path") in a Python shell. But then I'm not sure how to easily change params from other scripts. Pass them via env variables? Tell users to roll their own CLI parsing?

  • Dataset can have any number of outputs, ones that have to be passed to the model, ones to the loss, others that might control the training loop, others for logging. How do you pass these to the right objects?
  • How do you differentiate between single/multi channel models? Between single/multi channel datasets?

I think we can cover the common cases in the general training code, and special cases need to be special-cased in the recipes.
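
To illustrate, the "common case" training glue could stay as thin as a registry lookup plus a Lightning fit call. A sketch, assuming the registered model class yields a LightningModule (loss wiring and logging are glossed over; get_model refers to the registry sketch above and get_dataset to a hypothetical counterpart):

import pytorch_lightning as pl


def train(model_name, model_config, data_name, data_config):
    """Generic train command: resolve registered classes, then delegate to Lightning."""
    model_cls, model_config_cls = get_model(model_name)
    dataset_cls, data_config_cls = get_dataset(data_name)  # hypothetical dataset registry

    model = model_cls(model_config_cls(**model_config))  # assumed to be a LightningModule
    datamodule = dataset_cls(data_config_cls(**data_config)).get_datamodule()

    trainer = pl.Trainer(max_epochs=200)
    trainer.fit(model, datamodule=datamodule)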

  • How would you handle the TwoStep recipe for example? Is it in the training config so that every model can use this? Or it is not supported because too specific?

I think things like that should live in their own recipes. But those recipes could be written in a way that allows for any dataset to be used.

For TwoStep in particular, another idea would be to split it into two models, since it essentially is a two-model approach:

$ asteroid model twostep-filterbank configure --n-filters 1234 > twostep-filterbank.yml
$ asteroid train --model twostep-filterbank.yml --data ~/asteroid-datasets/librimix2/dataset.yml
Creating experiment folder exp/blabla1/...
Training TwoStep filterbank on LibriMix2...
...
Wrote filterbank to exp/blabla1/filterbank.
$ asteroid model twostep-separator configure --filterbank-path exp/blabla1/filterbank > twostep-separator.yml
$ asteroid train --model twostep-separator.yml --data ~/asteroid-datasets/librimix2/dataset.yml
Creating experiment folder exp/blabla2/...
Training TwoStep separator on LibriMix2...
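
In code, that split could simply be two registered models whose configs chain via a path, following the config-class convention above (a sketch; names hypothetical):

class TwoStepFilterbankConfig:
    def __init__(self, n_filters=1234):
        self.n_filters = n_filters


class TwoStepSeparatorConfig:
    def __init__(self, filterbank_path):
        # Points at a trained filterbank experiment, e.g. exp/blabla1/filterbank
        self.filterbank_path = filterbank_path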


mpariente commented on August 23, 2024

But then I'm not sure how to easily change params from other scripts. Pass them via env variables? Tell users to roll their own CLI parsing?

What do you mean by that?

I think we can cover the common cases in the general training code and special cases need to be special cased in the recipes.

This is also what I believe. While having a CLI for the common cases will be extremely useful, I think we cannot completely replace recipes.

So, should the CLI and recipes share the same abstraction? I.e. should recipes reuse the config organization you mentioned?

I would suggest starting with:

  • Models

    • TasNet
    • ConvTasNet
    • DPRNN
    • Sudormrf
    • DPTNet
  • Datasets

    • wsj0-mix
    • wham
    • whamr
    • LibriMix
    • DNS Challenge's dataset
    • VoiceBank + DEMAND
    • Maybe musdb?
    • Maybe FUSS?


jonashaag commented on August 23, 2024

What do you mean by that?

I was just thinking out loud about how users would implement a convenient way to change params when they implement recipes without the new Asteroid CLI “framework” (i.e. recipes in the current form, as simple scripts). Like changing the number of filters in an entirely custom recipe. Usually you’d use CLI params for that, but that means you’ll have to use a CLI parser, ... so as soon as you start adding a non-trivial number of options to your recipe, you’ll want to use the Asteroid CLI “framework”.
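
For reference, the hand-rolled baseline in such a standalone recipe would be plain argparse, which is exactly the duplication that gets tedious as options grow (the options here are made up):

# train_custom.py -- standalone recipe with hand-rolled option parsing
import argparse

parser = argparse.ArgumentParser(description="Entirely custom recipe")
parser.add_argument("--n-filters", type=int, default=64)
parser.add_argument("--batch-size", type=int, default=8)
parser.add_argument("--lr", type=float, default=1e-3)
# ...and one more line per option, duplicated in every such script
args = parser.parse_args()

print(f"Training with {args.n_filters} filters")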

Anyway, I think we agree sufficiently for me to come up with a first implementation that covers some of the common use cases, models and datasets. So my goal would be to make these use cases really simple and the “framework” reasonably extensible; anything that’s too complex to integrate into the “framework” for now we’ll simply keep as is. (Maybe we can move some duplicated code from these special recipes into the Asteroid “core” anyway, without changing their code.)


mpariente commented on August 23, 2024

Thanks for the clarification.
I do agree with you and cannot wait to see the first implementation!

