Comments (4)
I can confirm Gustav's comment: the LSTM system really is a bottom baseline and converges fast. It illustrates the point that an MSE loss collapses to the mean pose. There is no code for it in the repository.
from stylegestures.
I am guessing maybe it is because I have over-trained the model.
By "over-training", do you mean "overfitting", or something else? The term "over-training" is not standard in the literature I have been reading.
If you suspect overfitting, I would recommend that you generate output motion at regular intervals during training and see how the quality changes. If the motion quality first improves but then starts getting worse again, then you have some form of overfitting; otherwise you do not. But if the quality is never good enough, then your issues cannot be attributed to overfitting alone.
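The diagnostic described above (quality first improves, then degrades) can also be applied to a logged validation-loss curve. Here is a minimal sketch in plain Python; the function name, the `tol` parameter, and the idea of using per-checkpoint validation losses as a proxy for motion quality are my own illustration, not something from the repository:

```python
def diagnose_overfitting(val_losses, tol=0.0):
    """Given per-checkpoint validation losses (one per saved checkpoint),
    return (best_checkpoint_index, overfitting_detected).

    Overfitting is flagged when the loss improved up to some checkpoint
    and then ended up worse than that best value by more than `tol`."""
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    overfit = best < len(val_losses) - 1 and val_losses[-1] > val_losses[best] + tol
    return best, overfit

# A curve that improves and then degrades suggests overfitting,
# and the best checkpoint is the one to keep:
print(diagnose_overfitting([5.0, 3.0, 2.0, 2.5, 3.1]))  # → (2, True)

# A curve that is still improving at the end does not:
print(diagnose_overfitting([5.0, 4.0, 3.0, 2.0]))       # → (3, False)
```

Note that a flat-but-bad curve returns `overfit=False`, matching the point above: if the quality is never good enough, the problem is not overfitting alone.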
Example motion from our trained system should be available in our supplementary material, available in the Computer Graphics Forum digital repository. I personally don't know the exact motivation for the specific training duration etc. that Simon chose. However, I think it will be more efficient and instructive for you to run your own experiments (e.g., as suggested in the previous paragraph) than to wait for Simon to weigh in with his perspective.
Hi, Gustav,
Thanks for your answer. Yes, I meant "overfitting", thanks!
Yes, I will wait for more information from Simon about why he ran so few epochs for the baseline (LSTM) system.
BTW, what is the exact LSTM/GRU Simon used for that baseline experiment? I can find two classes (LSTM and GRU) defined in glow/modules.py, but I'm not sure whether either is the model he used.
Have a nice day,
what is the exact LSTM/GRU Simon used for that baseline experiment? I can find two classes (LSTM and GRU) defined in glow/modules.py, but I'm not sure whether either is the model he used.
The LSTM baseline is a very simple network (a single-layer, 350-unit unidirectional LSTM, as described in Sec. 4.4 of our paper), without any autoregression or normalising flows. At least in the original MoGlow paper, the RNN (LSTM) baseline system used a completely different codebase than the MG systems, and I would suspect the same is true here. I did a quick search for "LSTM" in glow/modules.py, and nothing in there seems relevant; I only spot references to using LSTMs (or GRUs) within the affine coupling in MoGlow, which is not relevant to the baseline since the baseline doesn't use normalising flows at all!
Because the LSTM baseline from the paper is trivial to implement in something like Keras, I don't think it is included in this repository.
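For reference, a baseline of this shape (single-layer, 350-unit unidirectional LSTM, trained with an MSE loss, no autoregression, no normalising flows) could be sketched in PyTorch roughly as follows. This is my own illustration, not code from the paper or the repository: the class name and the input/output feature sizes (`cond_dim`, `pose_dim`) are placeholders, and the paper's actual data dimensions and training setup may differ:

```python
import torch
import torch.nn as nn

class LSTMBaseline(nn.Module):
    """Sketch of a minimal MSE-trained pose regressor: a single-layer,
    350-unit unidirectional LSTM followed by a linear readout.
    Feature sizes are illustrative placeholders."""

    def __init__(self, cond_dim=27, pose_dim=45, hidden=350):
        super().__init__()
        # One unidirectional LSTM layer, 350 hidden units
        self.lstm = nn.LSTM(cond_dim, hidden, num_layers=1, batch_first=True)
        # Linear map from hidden state to pose features at every frame
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, cond):
        h, _ = self.lstm(cond)   # (batch, time, hidden)
        return self.out(h)       # (batch, time, pose_dim)

model = LSTMBaseline()
cond = torch.randn(2, 10, 27)            # (batch, frames, control features)
pred = model(cond)                       # predicted poses, (2, 10, 45)
loss = nn.MSELoss()(pred, torch.zeros_like(pred))  # trained with plain MSE
```

Since there is no stochastic component, minimising the MSE drives such a model toward the conditional mean pose, which is exactly the "collapse to the mean" behaviour mentioned at the top of the thread.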
I will wait for more information from Simon about why he ran so few epochs for the baseline (LSTM) system.
Like I emphasised in my previous post, you can test your hypothesis regarding overfitting without waiting for Simon to respond.
My guess is that he ran so few epochs because the simplistic baseline system (just a single layer, no autoregression, and no normalising flows at all) converges much faster than the MoGlow model.
from stylegestures.