Thanks a lot for your implementation! Can this tokenizer be trained on video dataset i

train on video dataset about magvit2-pytorch HOT 4 CLOSED

lucidrains commented on July 30, 2024

train on video dataset

from magvit2-pytorch.

Comments (4)

lucidrains commented on July 30, 2024

your x-axis, is that number of steps? you only did 600 training steps?


Y-ichen commented on July 30, 2024

Yes, these are the results after only 600 training steps. I trained magvit in an unconditional manner on the UCF101 dataset.

During the training, I noticed the initial recon_loss value was very large (2e+4), so I checked the tensor value ranges when calculating recon_loss between the video and reconstructed video. I found the video values were between 0.0-255.0, while the reconstructed video values were around -1.0 to 1.0.

Therefore, I additionally normalized the data when loading videos to rescale the tensor range to -1.0 to 1.0. With this, the initial recon_loss is around 0.3, but the discr_loss is still around 2.0, much larger than recon_loss. I'm not sure if this will affect training, so I shrank discr_loss a bit by adding a discr_weight of 0.1 to balance it with recon_loss. (The initial loss values then become roughly recon_loss=0.3 and discr_loss=0.2.) Here are my new results after 3k training steps with these settings:

I'm retraining as above now - should I increase the training steps to at least 20k? And should I apply this normalization of the loaded video tensor range?
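For reference, the rescaling described above can be sketched roughly as follows. This is only an illustration, not code from magvit2-pytorch: the helper names `to_signed_unit_range`, `to_pixel_range`, and `squared_error` are made up here, and the functions use plain arithmetic so they should apply elementwise to a float value, a NumPy array, or a torch tensor alike. The last lines show why the unnormalized loss starts out on the order of 1e4.

```python
def to_signed_unit_range(video, max_value=255.0):
    """Map pixel values from [0, max_value] to [-1, 1] (hypothetical helper)."""
    half = max_value / 2.0
    return video / half - 1.0


def to_pixel_range(video, max_value=255.0):
    """Inverse mapping from [-1, 1] back to [0, max_value], e.g. for viewing."""
    return (video + 1.0) * (max_value / 2.0)


def squared_error(target, prediction):
    """Per-element squared error, the quantity an MSE-style loss averages."""
    return (target - prediction) ** 2


# Illustrative values only: a mid-grey input pixel and a reconstruction
# that sits near zero, as a model with outputs in [-1, 1] might produce.
raw_pixel = 128.0
recon = 0.0

# Comparing a 0-255 input against a [-1, 1] output gives a huge error ...
error_raw = squared_error(raw_pixel, recon)

# ... while rescaling the input first puts both on the same scale.
error_scaled = squared_error(to_signed_unit_range(raw_pixel), recon)

print(error_raw, error_scaled)
```

If the loader already returns frames in 0.0-1.0 instead of 0.0-255.0, the same helper would be called with `max_value=1.0`.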


Y-ichen commented on July 30, 2024

Fixed


xesdiny commented on July 30, 2024

How did you do it?

