Comments (17)
I did not publish checkpoints for vox, so you cannot try it now. You can try to retrain the network with 64x64 vox; this should work, however larger resolutions are not that great with the current code. I'm working on an improved version of this work and will publish all the checkpoints when it is ready.
from monkey-net.
What is the size of source2.png?
The original size is 534x471; I resized it to 64x64 using !convert source2.png -resize 64x64 source2.png
It says the size is (56, 64, 3).
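For what it's worth, the (56, 64, 3) shape follows from aspect-ratio-preserving resizing; a quick check of the arithmetic (pure Python, no dependencies):

```python
# ImageMagick's `-resize 64x64` fits the image INSIDE a 64x64 box while
# preserving aspect ratio, so a 534x471 image does not come out as 64x64.
w, h = 534, 471
scale = min(64 / w, 64 / h)           # limiting dimension is the width
new_w, new_h = round(w * scale), round(h * scale)
print((new_h, new_w, 3))              # (56, 64, 3) -- the shape reported above
```

If my reading of ImageMagick's geometry flags is right, adding `!` (i.e. `convert source2.png -resize 64x64! source2.png`) forces the exact 64x64 size, ignoring aspect ratio.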
I am trying to make this into a Google Colab notebook; I can share it if you can take a look to see the problem.
Can you send me the resized image and driving video?
@AliaksandrSiarohin see this, I am getting a different error now: https://colab.research.google.com/drive/12FNlYX_inn3j-9fyxCW_ibs0i4kcNvEL
Tried it with a different image:
Traceback (most recent call last):
File "demo.py", line 62, in <module>
source_image = VideoToTensor()(read_video(opt.source_image, opt.image_shape + (3,)))['video'][:, :1]
File "/content/monkey-net/frames_dataset.py", line 28, in read_video
video_array = video_array.reshape((-1,) + image_shape)
ValueError: cannot reshape array of size 2964000 into shape (64,64,3)
Link to the image:
https://upload.wikimedia.org/wikipedia/commons/thumb/5/51/Brad_Pitt_Fury_2014.jpg/800px-Brad_Pitt_Fury_2014.jpg
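A quick factorization of the failing size suggests the frame was never resized at all (the 1235px height is inferred from the error, not checked against the file):

```python
# The reshape fails because the frame still has its original resolution:
# 2964000 values = 800 (width) * 1235 (height) * 3 (channels),
# which cannot be viewed as (64, 64, 3).
size = 2964000
assert size == 800 * 1235 * 3
assert size != 64 * 64 * 3
print(size // (800 * 3))  # 1235
```

So the 800px-wide Wikimedia thumbnail needs to be resized to exactly 64x64 before passing it to demo.py.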
@AliaksandrSiarohin now getting this error with that image:
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=663 error=11 : invalid argument
Traceback (most recent call last):
File "demo.py", line 67, in <module>
out = transfer_one(generator, kp_detector, source_image, driving_video, config['transfer_params'])
File "/content/monkey-net/transfer.py", line 68, in transfer_one
kp_driving = cat_dict([kp_detector(driving_video[:, :, i:(i + 1)]) for i in range(d)], dim=1)
File "/content/monkey-net/transfer.py", line 68, in <listcomp>
kp_driving = cat_dict([kp_detector(driving_video[:, :, i:(i + 1)]) for i in range(d)], dim=1)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/content/monkey-net/modules/keypoint_detector.py", line 107, in forward
out = gaussian2kp(heatmap, self.kp_variance, self.clip_variance)
File "/content/monkey-net/modules/keypoint_detector.py", line 58, in gaussian2kp
var = torch.matmul(mean_sub.unsqueeze(-1), mean_sub.unsqueeze(-2))
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:411
Seems like a problem with the CUDA library; the PyTorch version probably does not match. Try removing pytorch from requirements.txt and running again.
I removed both torch==0.4.1 and torchvision==0.2.1 from requirements.txt; will try again now.
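A minimal sketch of that edit, assuming requirements.txt sits in the repo root and the pins are the two quoted above:

```python
def strip_torch_pins(path="requirements.txt"):
    """Drop the pinned torch/torchvision lines so pip resolves versions
    that match the Colab CUDA runtime instead of the old pinned build."""
    with open(path) as f:
        lines = f.readlines()
    kept = [l for l in lines if not l.startswith(("torch==", "torchvision=="))]
    with open(path, "w") as f:
        f.writelines(kept)
```

The same thing can of course be done by hand in an editor; the point is only to let pip pick compatible torch/torchvision versions.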
@AliaksandrSiarohin seems it worked, but the demo.gif result is weird.
This can happen because the model is trained on the nemo dataset. It most likely does not generalise outside of this dataset: it expects a black background and a proper image crop. To validate this, you can check whether it works with an image from the test part of the nemo dataset. If you want a model that works on arbitrary faces, a dataset like VoxCeleb should be used.
@AliaksandrSiarohin I tried it with vox.yaml and vox-full.yaml and get this error:
Traceback (most recent call last):
File "demo.py", line 52, in <module>
Logger.load_cpk(opt.checkpoint, generator=generator, kp_detector=kp_detector)
File "/content/monkey-net/logger.py", line 54, in load_cpk
generator.load_state_dict(checkpoint['generator'])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for MotionTransferGenerator:
Missing key(s) in state_dict: "appearance_encoder.down_blocks.5.conv.weight", "appearance_encoder.down_blocks.5.conv.bias", "appearance_encoder.down_blocks.5.norm.weight", "appearance_encoder.down_blocks.5.norm.bias", "appearance_encoder.down_blocks.5.norm.running_mean", "appearance_encoder.down_blocks.5.norm.running_var", "appearance_encoder.down_blocks.6.conv.weight", "appearance_encoder.down_blocks.6.conv.bias", "appearance_encoder.down_blocks.6.norm.weight", "appearance_encoder.down_blocks.6.norm.bias", "appearance_encoder.down_blocks.6.norm.running_mean", "appearance_encoder.down_blocks.6.norm.running_var", "video_decoder.up_blocks.5.conv.weight", "video_decoder.up_blocks.5.conv.bias", "video_decoder.up_blocks.5.norm.weight", "video_decoder.up_blocks.5.norm.bias", "video_decoder.up_blocks.5.norm.running_mean", "video_decoder.up_blocks.5.norm.running_var", "video_decoder.up_blocks.6.conv.weight", "video_decoder.up_blocks.6.conv.bias", "video_decoder.up_blocks.6.norm.weight", "video_decoder.up_blocks.6.norm.bias", "video_decoder.up_blocks.6.norm.running_mean", "video_decoder.up_blocks.6.norm.running_var".
size mismatch for appearance_encoder.down_blocks.4.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 3, 3]).
size mismatch for appearance_encoder.down_blocks.4.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for appearance_encoder.down_blocks.4.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 512, 1, 3, 3]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.encoder.down_blocks.4.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for dense_motion_module.hourglass.decoder.up_blocks.0.conv.weight: copying a param with shape torch.Size([512, 512, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 1024, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.0.conv.weight: copying a param with shape torch.Size([512, 522, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 1034, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.0.conv.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.running_mean: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.0.norm.running_var: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.conv.weight: copying a param with shape torch.Size([256, 1034, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1024, 2058, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.1.conv.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.weight: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.bias: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.running_mean: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.1.norm.running_var: copying a param with shape torch.Size([256]) from checkpoint, the shape in current model is torch.Size([1024]).
size mismatch for video_decoder.up_blocks.2.conv.weight: copying a param with shape torch.Size([128, 522, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([512, 2058, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.2.conv.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.running_mean: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.2.norm.running_var: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([512]).
size mismatch for video_decoder.up_blocks.3.conv.weight: copying a param with shape torch.Size([64, 266, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 1034, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.3.conv.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.3.norm.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for video_decoder.up_blocks.4.conv.weight: copying a param with shape torch.Size([32, 138, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 522, 1, 3, 3]).
size mismatch for video_decoder.up_blocks.4.conv.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.weight: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.bias: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.running_mean: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for video_decoder.up_blocks.4.norm.running_var: copying a param with shape torch.Size([32]) from checkpoint, the shape in current model is torch.Size([128]).
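Those missing keys and size mismatches mean the vox yaml builds a bigger network (more down/up blocks, wider channels) than the one the checkpoint was trained with, so a checkpoint can only be loaded with the same config it was trained under. A torch-free sketch of the diagnosis, with small hypothetical shape dicts standing in for the two state_dicts:

```python
# Compare parameter shapes between a checkpoint and a freshly built model;
# any missing key or differing shape means the yaml config does not match
# the config the checkpoint was trained with.
ckpt_shapes = {  # one shape taken from the error above
    "appearance_encoder.down_blocks.4.conv.weight": (512, 512, 1, 3, 3),
}
model_shapes = {  # shapes the current (larger) config would build; hypothetical values
    "appearance_encoder.down_blocks.4.conv.weight": (1024, 512, 1, 3, 3),
    "appearance_encoder.down_blocks.5.conv.weight": (1024, 1024, 1, 3, 3),
}

problems = []
for name, shape in model_shapes.items():
    if name not in ckpt_shapes:
        problems.append(("missing in checkpoint", name))
    elif ckpt_shapes[name] != shape:
        problems.append(("size mismatch", name))
print(problems)
```

Since no vox checkpoint was published (see the first comment), the available checkpoint presumably has to be loaded with the config it was trained on, not vox.yaml or vox-full.yaml.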
If it is urgent and you do not have access to a GPU, I can try to train 64x64 vox for you.
@AliaksandrSiarohin I am trying to set up a Colab notebook that uses a T4 GPU. If I can get training running on it as well, I can start training different models and share them. I also suggest adding a Google Colab notebook to the repo, as it makes the results much easier to reproduce.