Comments (4)
I can confirm Gustav's comment: the LSTM system really is a bottom baseline and converges fast. It illustrates the point that an MSE loss collapses to the mean pose. There is no code for it in the repository.
from stylegestures.
I am guessing maybe it is because I have over-trained the model.
By "over-training", do you mean "overfitting", or something else? The term "over-training" is not standard in the literature I have been reading.
If you suspect overfitting, I would recommend that you generate output motion at regular intervals during training and see how the quality changes. If the motion quality first improves but then starts getting worse again, then you have some form of overfitting; otherwise you do not. But if the quality is never good enough, then your issues cannot be attributed to overfitting alone.
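The diagnostic described above (quality first improves, then degrades) can also be applied to a logged validation-loss curve. Here is a minimal sketch in plain Python; the function name, the `tol` parameter, and the idea of using per-checkpoint validation losses as a proxy for motion quality are my own illustration, not something from the repository:

```python
def diagnose_overfitting(val_losses, tol=0.0):
    """Given per-checkpoint validation losses (one per saved checkpoint),
    return (best_checkpoint_index, overfitting_detected).

    Overfitting is flagged when the loss improved up to some checkpoint
    and then ended up worse than that best value by more than `tol`."""
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    overfit = best < len(val_losses) - 1 and val_losses[-1] > val_losses[best] + tol
    return best, overfit

# A curve that improves and then degrades suggests overfitting,
# and the best checkpoint is the one to keep:
print(diagnose_overfitting([5.0, 3.0, 2.0, 2.5, 3.1]))  # → (2, True)

# A curve that is still improving at the end does not:
print(diagnose_overfitting([5.0, 4.0, 3.0, 2.0]))       # → (3, False)
```

Note that a flat-but-bad curve returns `overfit=False`, matching the point above: if the quality is never good enough, the problem is not overfitting alone.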
Example motion from our trained system should be available in our supplementary material, available in the Computer Graphics Forum digital repository. I personally don't know the exact motivation for the specific training duration etc. that Simon chose. However, I think it will be more efficient and instructive for you to run your own experiments (e.g., as suggested in the previous paragraph) than to wait for Simon to weigh in with his perspective.
Hi, Gustav,
Thanks for your answer. Yes, I meant "overfitting", thanks!
Yes, I will wait for more information from Simon about why he ran so few epochs for the baseline (LSTM) system.
BTW, what is the exact LSTM/GRU Simon used for that baseline experiment? I can find two classes (LSTM and GRU) defined in glow/modules.py, but I'm not sure whether either is the model he used.
Have a nice day,
what is the exact LSTM/GRU Simon used for that baseline experiment? I can find two classes (LSTM and GRU) defined in glow/modules.py, but I'm not sure whether either is the model he used.
The LSTM baseline is a very simple network (a single-layer, 350-unit unidirectional LSTM, as described in Sec. 4.4 of our paper), without any autoregression or normalising flows. At least in the original MoGlow paper, the RNN (LSTM) baseline system used a completely different codebase than the MG systems, and I would suspect the same is true here. I did a quick search for "LSTM" in glow/modules.py, and nothing in there seems relevant; I only spot references to using LSTMs (or GRUs) within the affine coupling in MoGlow, which is not relevant to the baseline since the baseline doesn't use normalising flows at all!
Because the LSTM baseline from the paper is trivial to implement in something like Keras, I don't think it is included in this repository.
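For reference, a baseline of this shape (single-layer, 350-unit unidirectional LSTM, trained with an MSE loss, no autoregression, no normalising flows) could be sketched in PyTorch roughly as follows. This is my own illustration, not code from the paper or the repository: the class name and the input/output feature sizes (`cond_dim`, `pose_dim`) are placeholders, and the paper's actual data dimensions and training setup may differ:

```python
import torch
import torch.nn as nn

class LSTMBaseline(nn.Module):
    """Sketch of a minimal MSE-trained pose regressor: a single-layer,
    350-unit unidirectional LSTM followed by a linear readout.
    Feature sizes are illustrative placeholders."""

    def __init__(self, cond_dim=27, pose_dim=45, hidden=350):
        super().__init__()
        # One unidirectional LSTM layer, 350 hidden units
        self.lstm = nn.LSTM(cond_dim, hidden, num_layers=1, batch_first=True)
        # Linear map from hidden state to pose features at every frame
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, cond):
        h, _ = self.lstm(cond)   # (batch, time, hidden)
        return self.out(h)       # (batch, time, pose_dim)

model = LSTMBaseline()
cond = torch.randn(2, 10, 27)            # (batch, frames, control features)
pred = model(cond)                       # predicted poses, (2, 10, 45)
loss = nn.MSELoss()(pred, torch.zeros_like(pred))  # trained with plain MSE
```

Since there is no stochastic component, minimising the MSE drives such a model toward the conditional mean pose, which is exactly the "collapse to the mean" behaviour mentioned at the top of the thread.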
I will wait for more information from Simon about why he ran so few epochs for the baseline (LSTM) system.
Like I emphasised in my previous post, you can test your hypothesis regarding overfitting without waiting for Simon to respond.
My guess is that he ran so few epochs because the simplistic baseline system (just a single layer, no autoregression, and no normalising flows at all) converges much faster than the MoGlow model.
from stylegestures.