
Comments (4)

simonalexanderson commented on July 17, 2024

I can confirm Gustav's comment; the LSTM system is really a bottom baseline and converges fast. It proves the point of MSE loss collapsing to the mean pose. There is no code for it in the repository.
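To illustrate the mean-collapse point, here is a toy sketch (not the experiment from the paper). It assumes a hypothetical one-to-many situation in which the same input is paired with a joint angle of either -1 or +1. Under MSE, the best single prediction is the average of the targets, which lies between the two plausible poses and matches neither of them. The names `best_constant_under_mse` and `mse` are made up for this example.

```python
import random


def best_constant_under_mse(targets):
    """Return the constant prediction that minimises mean squared error (the mean)."""
    return sum(targets) / len(targets)


def mse(prediction, targets):
    """Mean squared error of one constant prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


random.seed(0)
# Toy data: the same input is observed with a target of either -1 or +1.
targets = [random.choice([-1.0, 1.0]) for _ in range(1000)]

mean_pose = best_constant_under_mse(targets)

# The averaged pose has lower MSE than either of the two realistic poses.
print("mean pose:", mean_pose)
print("MSE of mean pose:", mse(mean_pose, targets))
print("MSE of pose +1:  ", mse(1.0, targets))
print("MSE of pose -1:  ", mse(-1.0, targets))
```

A deterministic network trained with MSE is pushed towards this averaged output, which is one way to read the "collapsing to the mean pose" behaviour described above.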

from stylegestures.

ghenter commented on July 17, 2024

> I am guessing maybe it is because I have over-trained the model.

By "over-training", do you mean "overfitting", or something else? The term "over-training" is not standard in the literature I have been reading.

If you suspect overfitting, I would recommend that you generate output motion at regular intervals during training and see how the quality changes. If the motion quality first improves but then starts getting worse again, then you have some form of overfitting; otherwise you do not. But if the quality is never good enough, then your issues cannot be attributed to overfitting alone.
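The check described above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: it assumes you have reduced each checkpoint to a single error score where lower is better (for example a held-out loss or a subjective rating), and the function name `diagnose_checkpoints` and the `good_enough` threshold are hypothetical.

```python
def diagnose_checkpoints(errors, good_enough, tolerance=0.0):
    """Classify a sequence of per-checkpoint error scores (lower is better).

    Returns (best_step, label), where label is one of
    "never_good_enough", "overfitting" or "no_overfitting".
    """
    # Index of the checkpoint with the lowest error.
    best_step = min(range(len(errors)), key=lambda i: errors[i])
    best_error = errors[best_step]

    if best_error > good_enough:
        # Quality was never acceptable, so overfitting alone cannot explain it.
        return best_step, "never_good_enough"
    if errors[-1] > best_error + tolerance:
        # Quality improved first and then degraded again.
        return best_step, "overfitting"
    return best_step, "no_overfitting"


# Example: error falls until the third checkpoint, then rises again.
print(diagnose_checkpoints([5.0, 3.0, 2.0, 3.0, 4.0], good_enough=2.5))
```

In practice the scores would come from the motion generated at each saved checkpoint, and `best_step` indicates which checkpoint is worth keeping.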

Example motion from our trained system should be included in our supplementary material, which is available in the Computer Graphics Forum digital repository. I personally don't know the exact motivation for the specific training duration etc. that Simon chose. However, I think it will be more efficient and instructive for you to run your own experiments (e.g., as suggested in the previous paragraph) than to wait for Simon to weigh in with his perspective.


kelvinqin commented on July 17, 2024

Hi, Gustav,
Thanks for your answer. Yes, what I meant is "overfitting", thanks!

Yes, I will wait for more information from Simon about why he ran so few epochs for the baseline system (LSTM).

BTW, what is the exact LSTM/GRU Simon is using for that baseline experiment? I can find two classes defined (LSTM and GRU) in glow/modules.py, not sure if that is the exact model he is using.

Have a nice day,


ghenter commented on July 17, 2024

> what is the exact LSTM/GRU Simon is using for that baseline experiment? I can find two classes defined (LSTM and GRU) in glow/modules.py, not sure if that is the exact model he is using.

The LSTM baseline is a very simple network (a single-layer, 350 unit unidirectional LSTM, as described in Sec. 4.4 of our paper), without any autoregression or normalising flows. At least in the original MoGlow paper, the RNN (LSTM) baseline system used a completely different codebase than the MG systems, and I would suspect the same is true here. I did a quick search for "LSTM" in glow/modules.py, and nothing in there seems relevant; I only spot references to using LSTMs (or GRUs) within the affine coupling in MoGlow, which is not relevant to the baseline since the baseline doesn't use normalising flows at all!

Because the LSTM baseline from the paper is trivial to implement in something like Keras, I don't think it is included in this repository.
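For reference, a baseline along these lines might look like the sketch below. It is a guess at the general shape, not the code used for the paper: only the single unidirectional 350-unit LSTM layer and the MSE loss come from this thread, while the function name `build_lstm_baseline`, the Adam optimiser, the linear output layer, and the feature dimensions are assumptions that you would need to adapt to your data.

```python
def build_lstm_baseline(n_inputs, n_outputs, n_units=350):
    """Build a single-layer unidirectional LSTM regressor trained with MSE.

    n_inputs and n_outputs are placeholders for the per-frame input-feature
    and pose dimensionalities; they are not values taken from the paper.
    """
    # Imported inside the function so the module loads without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(None, n_inputs)),                # variable-length sequences
        keras.layers.LSTM(n_units, return_sequences=True),  # one output per frame
        keras.layers.Dense(n_outputs),                      # linear pose output (assumed)
    ])
    # MSE loss as discussed in this thread; the optimiser is an assumption.
    model.compile(optimizer="adam", loss="mse")
    return model
```

Calling, for example, `build_lstm_baseline(n_inputs=27, n_outputs=45)` (made-up dimensions) returns a compiled model that can be trained with `model.fit` on arrays of shape `(batch, frames, features)`.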

> I will wait for more information from Simon about why he ran so few epochs for the baseline system (LSTM).

Like I emphasised in my previous post, you can test your hypothesis regarding overfitting without waiting for Simon to respond.

My guess is that he ran so few epochs because the simplistic baseline system – which has just a single layer, no autoregression, and no normalising flows at all – converges much faster than the MoGlow model.

