adymaharana / vlcstorygan
Official code repository for the EMNLP 2021 paper
License: MIT License
Hello,
I'd like to fine-tune your model. Could you provide the checkpoints for netD_im and netD_st, which are not provided?
Thanks for your help!
I get an error when resuming training: "Unexpected key(s) in state_dict: 'epoch', 'netG_state_dict', 'optimizer_state_dict'." (The full error is below, and my trainer_vlc.py code is at the bottom.)
Could you let me know how to load the model correctly?
File "/project/6057220/xianzhen/storygan/vlcgan/trainer_vlc.py", line 110, in load_network_stageI
netG.load_state_dict(state_dict)
File "/home/xianzhen/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1482, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for StoryMarttGAN:
Missing key(s) in state_dict: "recurrent.weight_ih", "recurrent.weight_hh", "recurrent.bias_ih", "recurrent.bias_hh", "moconn.layer.0.attention.self.query.weight", "moconn.layer.0.attention.self.query.bias", "moconn.layer.0.attention.self.key.weight", "moconn.layer.0.attention.self.key.bias", "moconn.layer.0.attention.self.value.weight", "moconn.layer.0.attention.self.value.bias", "moconn.layer.0.attention.output.dense.weight", "moconn.layer.0.attention.output.dense.bias", "moconn.layer.0.attention.output.LayerNorm.weight", "moconn.layer.0.attention.output.LayerNorm.bias", "moconn.layer.0.memory_initilizer.init_memory_bias", "moconn.layer.0.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.0.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.0.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.0.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.0.memory_updater.memory_update_attention.query.weight", "moconn.layer.0.memory_updater.memory_update_attention.query.bias", "moconn.layer.0.memory_updater.memory_update_attention.key.weight", "moconn.layer.0.memory_updater.memory_update_attention.key.bias", "moconn.layer.0.memory_updater.memory_update_attention.value.weight", "moconn.layer.0.memory_updater.memory_update_attention.value.bias", "moconn.layer.0.memory_updater.mc.weight", "moconn.layer.0.memory_updater.sc.weight", "moconn.layer.0.memory_updater.sc.bias", "moconn.layer.0.memory_updater.mz.weight", "moconn.layer.0.memory_updater.sz.weight", "moconn.layer.0.memory_updater.sz.bias", "moconn.layer.0.memory_augmented_attention.query.weight", "moconn.layer.0.memory_augmented_attention.query.bias", "moconn.layer.0.memory_augmented_attention.key.weight", "moconn.layer.0.memory_augmented_attention.key.bias", "moconn.layer.0.memory_augmented_attention.value.weight", "moconn.layer.0.memory_augmented_attention.value.bias", "moconn.layer.0.hidden_intermediate.dense.weight", "moconn.layer.0.hidden_intermediate.dense.bias", 
"moconn.layer.0.memory_projection.weight", "moconn.layer.0.memory_projection.bias", "moconn.layer.0.output.dense.weight", "moconn.layer.0.output.dense.bias", "moconn.layer.0.output.LayerNorm.weight", "moconn.layer.0.output.LayerNorm.bias", "moconn.layer.1.attention.self.query.weight", "moconn.layer.1.attention.self.query.bias", "moconn.layer.1.attention.self.key.weight", "moconn.layer.1.attention.self.key.bias", "moconn.layer.1.attention.self.value.weight", "moconn.layer.1.attention.self.value.bias", "moconn.layer.1.attention.output.dense.weight", "moconn.layer.1.attention.output.dense.bias", "moconn.layer.1.attention.output.LayerNorm.weight", "moconn.layer.1.attention.output.LayerNorm.bias", "moconn.layer.1.memory_initilizer.init_memory_bias", "moconn.layer.1.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.1.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.1.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.1.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.1.memory_updater.memory_update_attention.query.weight", "moconn.layer.1.memory_updater.memory_update_attention.query.bias", "moconn.layer.1.memory_updater.memory_update_attention.key.weight", "moconn.layer.1.memory_updater.memory_update_attention.key.bias", "moconn.layer.1.memory_updater.memory_update_attention.value.weight", "moconn.layer.1.memory_updater.memory_update_attention.value.bias", "moconn.layer.1.memory_updater.mc.weight", "moconn.layer.1.memory_updater.sc.weight", "moconn.layer.1.memory_updater.sc.bias", "moconn.layer.1.memory_updater.mz.weight", "moconn.layer.1.memory_updater.sz.weight", "moconn.layer.1.memory_updater.sz.bias", "moconn.layer.1.memory_augmented_attention.query.weight", "moconn.layer.1.memory_augmented_attention.query.bias", "moconn.layer.1.memory_augmented_attention.key.weight", "moconn.layer.1.memory_augmented_attention.key.bias", "moconn.layer.1.memory_augmented_attention.value.weight", "moconn.layer.1.memory_augmented_attention.value.bias", 
"moconn.layer.1.hidden_intermediate.dense.weight", "moconn.layer.1.hidden_intermediate.dense.bias", "moconn.layer.1.memory_projection.weight", "moconn.layer.1.memory_projection.bias", "moconn.layer.1.output.dense.weight", "moconn.layer.1.output.dense.bias", "moconn.layer.1.output.LayerNorm.weight", "moconn.layer.1.output.LayerNorm.bias", "moconn.layer.2.attention.self.query.weight", "moconn.layer.2.attention.self.query.bias", "moconn.layer.2.attention.self.key.weight", "moconn.layer.2.attention.self.key.bias", "moconn.layer.2.attention.self.value.weight", "moconn.layer.2.attention.self.value.bias", "moconn.layer.2.attention.output.dense.weight", "moconn.layer.2.attention.output.dense.bias", "moconn.layer.2.attention.output.LayerNorm.weight", "moconn.layer.2.attention.output.LayerNorm.bias", "moconn.layer.2.memory_initilizer.init_memory_bias", "moconn.layer.2.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.2.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.2.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.2.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.2.memory_updater.memory_update_attention.query.weight", "moconn.layer.2.memory_updater.memory_update_attention.query.bias", "moconn.layer.2.memory_updater.memory_update_attention.key.weight", "moconn.layer.2.memory_updater.memory_update_attention.key.bias", "moconn.layer.2.memory_updater.memory_update_attention.value.weight", "moconn.layer.2.memory_updater.memory_update_attention.value.bias", "moconn.layer.2.memory_updater.mc.weight", "moconn.layer.2.memory_updater.sc.weight", "moconn.layer.2.memory_updater.sc.bias", "moconn.layer.2.memory_updater.mz.weight", "moconn.layer.2.memory_updater.sz.weight", "moconn.layer.2.memory_updater.sz.bias", "moconn.layer.2.memory_augmented_attention.query.weight", "moconn.layer.2.memory_augmented_attention.query.bias", "moconn.layer.2.memory_augmented_attention.key.weight", "moconn.layer.2.memory_augmented_attention.key.bias", 
"moconn.layer.2.memory_augmented_attention.value.weight", "moconn.layer.2.memory_augmented_attention.value.bias", "moconn.layer.2.hidden_intermediate.dense.weight", "moconn.layer.2.hidden_intermediate.dense.bias", "moconn.layer.2.memory_projection.weight", "moconn.layer.2.memory_projection.bias", "moconn.layer.2.output.dense.weight", "moconn.layer.2.output.dense.bias", "moconn.layer.2.output.LayerNorm.weight", "moconn.layer.2.output.LayerNorm.bias", "moconn.layer.3.attention.self.query.weight", "moconn.layer.3.attention.self.query.bias", "moconn.layer.3.attention.self.key.weight", "moconn.layer.3.attention.self.key.bias", "moconn.layer.3.attention.self.value.weight", "moconn.layer.3.attention.self.value.bias", "moconn.layer.3.attention.output.dense.weight", "moconn.layer.3.attention.output.dense.bias", "moconn.layer.3.attention.output.LayerNorm.weight", "moconn.layer.3.attention.output.LayerNorm.bias", "moconn.layer.3.memory_initilizer.init_memory_bias", "moconn.layer.3.memory_initilizer.init_memory_fc.0.weight", "moconn.layer.3.memory_initilizer.init_memory_fc.0.bias", "moconn.layer.3.memory_initilizer.init_memory_fc.1.weight", "moconn.layer.3.memory_initilizer.init_memory_fc.1.bias", "moconn.layer.3.memory_updater.memory_update_attention.query.weight", "moconn.layer.3.memory_updater.memory_update_attention.query.bias", "moconn.layer.3.memory_updater.memory_update_attention.key.weight", "moconn.layer.3.memory_updater.memory_update_attention.key.bias", "moconn.layer.3.memory_updater.memory_update_attention.value.weight", "moconn.layer.3.memory_updater.memory_update_attention.value.bias", "moconn.layer.3.memory_updater.mc.weight", "moconn.layer.3.memory_updater.sc.weight", "moconn.layer.3.memory_updater.sc.bias", "moconn.layer.3.memory_updater.mz.weight", "moconn.layer.3.memory_updater.sz.weight", "moconn.layer.3.memory_updater.sz.bias", "moconn.layer.3.memory_augmented_attention.query.weight", "moconn.layer.3.memory_augmented_attention.query.bias", 
"moconn.layer.3.memory_augmented_attention.key.weight", "moconn.layer.3.memory_augmented_attention.key.bias", "moconn.layer.3.memory_augmented_attention.value.weight", "moconn.layer.3.memory_augmented_attention.value.bias", "moconn.layer.3.hidden_intermediate.dense.weight", "moconn.layer.3.hidden_intermediate.dense.bias", "moconn.layer.3.memory_projection.weight", "moconn.layer.3.memory_projection.bias", "moconn.layer.3.output.dense.weight", "moconn.layer.3.output.dense.bias", "moconn.layer.3.output.LayerNorm.weight", "moconn.layer.3.output.LayerNorm.bias", "pooler.context_vector", "pooler.fc.0.weight", "pooler.fc.0.bias", "pooler.fc.1.weight", "pooler.fc.1.bias", "pooler.fc.1.running_mean", "pooler.fc.1.running_var", "embeddings.word_embeddings.weight", "embeddings.word_fc.0.weight", "embeddings.word_fc.0.bias", "embeddings.word_fc.2.weight", "embeddings.word_fc.2.bias", "embeddings.word_fc.4.weight", "embeddings.word_fc.4.bias", "embeddings.position_embeddings.pe", "embeddings.LayerNorm.weight", "embeddings.LayerNorm.bias", "tag_embeddings.weight", "map_embed.weight", "map_embed.bias", "ca_net.fc.weight", "ca_net.fc.bias", "fc.0.weight", "fc.1.weight", "fc.1.bias", "fc.1.running_mean", "fc.1.running_var", "filter_net.0.weight", "filter_net.0.bias", "filter_net.1.weight", "filter_net.1.bias", "filter_net.1.running_mean", "filter_net.1.running_var", "image_net.0.weight", "image_net.0.bias", "image_net.1.weight", "image_net.1.bias", "image_net.1.running_mean", "image_net.1.running_var", "mart_fc.0.weight", "mart_fc.0.bias", "mart_fc.1.weight", "mart_fc.1.bias", "mart_fc.1.running_mean", "mart_fc.1.running_var", "upsample1.1.weight", "upsample1.2.weight", "upsample1.2.bias", "upsample1.2.running_mean", "upsample1.2.running_var", "upsample2.1.weight", "upsample2.2.weight", "upsample2.2.bias", "upsample2.2.running_mean", "upsample2.2.running_var", "upsample3.1.weight", "upsample3.2.weight", "upsample3.2.bias", "upsample3.2.running_mean", "upsample3.2.running_var", 
"next_g.att.conv_context.weight", "next_g.att.conv_sentence_vis.weight", "next_g.att.linear.weight", "next_g.att.linear.bias", "next_g.residual.0.block.0.weight", "next_g.residual.0.block.1.weight", "next_g.residual.0.block.1.bias", "next_g.residual.0.block.1.running_mean", "next_g.residual.0.block.1.running_var", "next_g.residual.0.block.3.weight", "next_g.residual.0.block.4.weight", "next_g.residual.0.block.4.bias", "next_g.residual.0.block.4.running_mean", "next_g.residual.0.block.4.running_var", "next_g.residual.1.block.0.weight", "next_g.residual.1.block.1.weight", "next_g.residual.1.block.1.bias", "next_g.residual.1.block.1.running_mean", "next_g.residual.1.block.1.running_var", "next_g.residual.1.block.3.weight", "next_g.residual.1.block.4.weight", "next_g.residual.1.block.4.bias", "next_g.residual.1.block.4.running_mean", "next_g.residual.1.block.4.running_var", "next_g.residual.2.block.0.weight", "next_g.residual.2.block.1.weight", "next_g.residual.2.block.1.bias", "next_g.residual.2.block.1.running_mean", "next_g.residual.2.block.1.running_var", "next_g.residual.2.block.3.weight", "next_g.residual.2.block.4.weight", "next_g.residual.2.block.4.bias", "next_g.residual.2.block.4.running_mean", "next_g.residual.2.block.4.running_var", "next_g.residual.3.block.0.weight", "next_g.residual.3.block.1.weight", "next_g.residual.3.block.1.bias", "next_g.residual.3.block.1.running_mean", "next_g.residual.3.block.1.running_var", "next_g.residual.3.block.3.weight", "next_g.residual.3.block.4.weight", "next_g.residual.3.block.4.bias", "next_g.residual.3.block.4.running_mean", "next_g.residual.3.block.4.running_var", "next_g.upsample.1.weight", "next_g.upsample.2.weight", "next_g.upsample.2.bias", "next_g.upsample.2.running_mean", "next_g.upsample.2.running_var", "next_g.conv.weight", "next_img.0.weight", "next_img_.0.weight", "m_net.0.weight", "m_net.0.bias", "m_net.1.weight", "m_net.1.bias", "m_net.1.running_mean", "m_net.1.running_var", "c_net.0.weight", 
"c_net.0.bias", "c_net.1.weight", "c_net.1.bias", "c_net.1.running_mean", "c_net.1.running_var".
Unexpected key(s) in state_dict: "epoch", "netG_state_dict", "optimizer_state_dict".
def load_network_stageI(self):
    from .model import StoryGAN, STAGE1_D_IMG, STAGE1_D_STY_V2, StoryMarttGAN
    if self.use_martt:
        netG = StoryMarttGAN(self.cfg, self.video_len)
    else:
        netG = StoryGAN(self.cfg, self.video_len)
    netG.apply(weights_init)
    print(netG)
    if self.cfg.NET_G != '':
        state_dict = \
            torch.load(self.cfg.NET_G,
                       map_location=lambda storage, loc: storage)
        netG.load_state_dict(state_dict)
        print('Load from: ', self.cfg.NET_G)
    if self.use_image_disc:
        if self.cfg.DATASET_NAME == 'youcook2':
            use_categories = False
        else:
            use_categories = True
        netD_im = STAGE1_D_IMG(self.cfg, use_categories=use_categories)
        netD_im.apply(weights_init)
        print(netD_im)
        if self.cfg.NET_D_IM != '':
            state_dict = \
                torch.load(self.cfg.NET_D_IM,
                           map_location=lambda storage, loc: storage)
            netD_im.load_state_dict(state_dict)
            print('Load from: ', self.cfg.NET_D_IM)
    else:
        netD_im = None
    if self.use_story_disc:
        netD_st = STAGE1_D_STY_V2(self.cfg)
        netD_st.apply(weights_init)
        # for m in netD_st.modules():
        #     print(m.__class__.__name__)
        print(netD_st)
        if self.cfg.NET_D_ST != '':
            state_dict = \
                torch.load(self.cfg.NET_D_ST,
                           map_location=lambda storage, loc: storage)
            netD_st.load_state_dict(state_dict)
            print('Load from: ', self.cfg.NET_D_ST)
    else:
        netD_st = None
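A possible cause, judging only from the "Unexpected key(s)" line in the error: the file at cfg.NET_G looks like a full training checkpoint (a dict holding 'epoch', 'netG_state_dict', and 'optimizer_state_dict') rather than a bare state dict, so the generator weights would sit under 'netG_state_dict'. Below is a minimal sketch of unwrapping it before loading. The helper name extract_generator_state is hypothetical and not part of the repository, and the checkpoint dict is a stand-in so the snippet runs without PyTorch.

```python
def extract_generator_state(checkpoint, key="netG_state_dict"):
    """Return the generator weights from either checkpoint layout.

    Hypothetical helper: if `checkpoint` is a training checkpoint that
    stores the weights under `key`, return that entry; otherwise assume
    `checkpoint` already is a bare state dict and return it unchanged.
    """
    if isinstance(checkpoint, dict) and key in checkpoint:
        return checkpoint[key]
    return checkpoint


# Stand-in for the object returned by torch.load(cfg.NET_G, ...);
# the values are placeholders, not real tensors.
checkpoint = {
    "epoch": 120,
    "netG_state_dict": {"fc.0.weight": [0.1, 0.2]},
    "optimizer_state_dict": {},
}
weights = extract_generator_state(checkpoint)
print(sorted(weights))  # ['fc.0.weight']
```

In load_network_stageI this would amount to calling netG.load_state_dict(extract_generator_state(state_dict)) instead of netG.load_state_dict(state_dict).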
Can I enter 5 statements instead of the CSV file (description) and have the model generate the images for those statements?
Hi, thank you for your great work!
I have a question about how to evaluate the FID score on your generated images.
I tried to reproduce the FID score using your pre-trained weights for DuCo-StoryGAN, but I couldn't reproduce the results shown in Table 1 of your paper.
Could you elaborate on how to reproduce your FID score?
Thanks!
I made some changes to the Desdescription file because I need to enter only 5 statements to make one story,
so I would like to know how you created the desdescription_vec, desdescription_attr, and desdescription.npy files.
Hi,
When I run training as specified in the README, I encounter the following error: mat1 and mat2 shapes cannot be multiplied. While debugging, I traced the problem to algo.train() -> netG.sample_videos -> self.ca_net. The cause is that the content_input shape is [12x1780], while the fc layer in self.ca_net has an in/output size of [640x248].
I'm unsure whether the dimension is incorrect, but I modified the fc layer dimension to run the code to the end. Unfortunately, the same errors have occurred several times after that, and since there is a case that the same Class is used in another function, I can't modify the layer anymore.
How can I solve it?
Or do I have a problem prior to training, such as 'prepare repository, extract constituency parses or dense captions', despite the fact that no issue has been shown to me?
Hello, there!
In VLCStoryGan/vlcgan/miscc/utils.py (line 163 and line 186 in 7411240):
Why is the loss calculated like this? I couldn't figure out what it means. And how does it differ from the loss below?
loss = loss_fct(logits_per_image, labels)
I am getting a KeyError in the story loader. The missing key is ':'.
File "train_vlcgan.py", line 206, in
algo.train(imageloader, storyloader, testloader, cfg.STAGE)
File "/home/dwivedi7/VLCStoryGan/vlcgan/trainer_vlc.py", line 246, in train
for i, data in tqdm(enumerate(storyloader, 0)):
File "/opt/conda/lib/python3.7/site-packages/tqdm/std.py", line 1195, in iter
for obj in iterable:
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 681, in next
data = self._next_data()
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
raise exception
KeyError: Caught KeyError in DataLoader worker process 0..
Please let me know if this issue can be resolved.
Hello,
I'd like to train on multiple GPUs, so I changed the GPU ID from 0 to 0,1 in pororo_s1_vlc.yml:
GPU_ID: '0,1'
I also used nn.parallel.data_parallel (VLCStoryGan/vlcgan/trainer_vlc.py, line 310 in 7411240), but training fails with:
AttributeError: 'DataParallel' object has no attribute 'sample_videos'
Traceback (most recent call last):
File "train_vlcgan.py", line 225, in <module>
PIL.Image.fromarray,
File "/project/6057220/xianzhen/storygan/vlcgan/trainer_vlc.py", line 353, in train
lr_st_fake, st_fake, m_mu, m_logvar, c_mu, c_logvar, s_word = netG.sample_videos(*st_inputs)
File "/home/xianzhen/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1177, in __getattr__
raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DataParallel' object has no attribute 'sample_videos'
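For context on this error: torch.nn.DataParallel keeps the original network in its .module attribute and only parallelizes forward(), so custom methods such as sample_videos have to be called on the wrapped module. A minimal sketch follows; Generator and FakeDataParallel are hypothetical stand-ins so that the snippet runs without PyTorch, and unwrap is an illustrative helper, not repository code.

```python
def unwrap(model):
    """Return the underlying network if `model` is wrapped, else `model`."""
    return getattr(model, "module", model)


class Generator:
    # Stand-in for the real generator; only the method name matters here.
    def sample_videos(self, *inputs):
        return "sampled %d inputs" % len(inputs)


class FakeDataParallel:
    # Mimics only the one relevant detail of torch.nn.DataParallel:
    # the wrapped network is stored as `.module`.
    def __init__(self, module):
        self.module = module


netG = FakeDataParallel(Generator())
print(unwrap(netG).sample_videos(1, 2, 3))  # sampled 3 inputs
```

Note that a call made through .module runs on the wrapped network directly and is not split across GPUs; only calls that go through the wrapper's forward() are.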
I am trying to run the training, which should work, but it fails with the error below.
AttributeError: Can't get attribute 'video_transform' on <module 'mp_main' from 'C:\Users\eng_a------\VLC_StoryGAN\VLCStoryGan-master\train_vlcgan.py'>
Could you help, please?
Thanks!
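One common cause of this kind of error on Windows: DataLoader worker processes are started with the spawn method, which re-imports the script and looks functions up by name, so a function defined inside the `if __name__ == "__main__":` block cannot be found by the workers. Assuming video_transform is defined there in train_vlcgan.py (not verified here), moving it to module level is the usual fix. A sketch with a simplified stand-in body:

```python
import functools


def video_transform(video, image_transform):
    # Module-level definition: spawned worker processes can import it by name.
    # Simplified stand-in body; the real function may do more than this.
    return [image_transform(frame) for frame in video]


def main():
    # str.upper is a placeholder for the real per-image transform.
    transform = functools.partial(video_transform, image_transform=str.upper)
    return transform(["a", "b"])


if __name__ == "__main__":
    # Only the entry point stays under the guard.
    print(main())
```

Setting num_workers=0 on the DataLoader avoids worker processes altogether and is a quick way to check whether this is the cause.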
Hello.
I am getting the error below:
NameError: name 'parser' is not defined
Hi, very nice work. Thank you for sharing your code.
When I run the code, it reports an error that const_tag2idx.json is missing.
How can I get this file?
Thank you
Hi, I'm eager to evaluate FID, but I'm not sure how to run eval_vfid.py, because flintstones_data, which it imports, is missing.
I also tried using eval_vfid.py from StoryViz, but there is another issue: it expects files named 'img-%s-%s.png' % (item, k), and my result folder contains no such files.
Could you tell me how to run the eval_vfid.py file?
I have run the following command in the terminal but it is not working.
python train_vlcgan.py --cfg ./cfg/pororo_s1_vlc.yml --data_dir ./data --dataset pororo\
May I know what I am missing?
Thanks
Hello.
Thanks for sharing this superb and promising work!
I am facing some problems when loading the parser.
I have tried to keep the parse.py as it is but I got the following:
Loading parser
usage: [-h] [--sum] N [N ...]
: error: the following arguments are required: N
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2
But when I changed the parser.add_argument calls from positional arguments to a different kind of argument, I got the following:
"args has no attribute dataset".
My args object looks like this:
Namespace(**{'data_dir <path_to_data_directory>': None, 'dataset pororo': 'pororo'})
which seems unworkable.
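The Namespace shown above suggests that the whole string 'data_dir <path_to_data_directory>' was registered as an option name, which is why args.dataset does not exist. A minimal sketch of declaring the two options separately and passing an explicit argument list, which also keeps argparse from reading the notebook's own command line. The flag names --data_dir and --dataset are inferred from the Namespace and may differ from what parse.py actually expects.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="constituency parsing (sketch)")
    # The placeholder <path_to_data_directory> is a value, not part of the name.
    parser.add_argument("--data_dir", required=True, help="path to the data directory")
    parser.add_argument("--dataset", default="pororo", help="dataset name")
    return parser


# In a notebook, pass the arguments explicitly instead of relying on sys.argv.
args = build_parser().parse_args(["--data_dir", "./data", "--dataset", "pororo"])
print(args.data_dir, args.dataset)  # ./data pororo
```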
I am not able to load the parser correctly when running the parse.py code. I am using benepar_en3 (integrated with spaCy 3.2.0).
Error:
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <benepar.integrations.spacy_plugin.BeneparComponent object at 0x0000026D190BAB50> (name: 'None').
If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.
If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').
If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.
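With spaCy 3, benepar is added by its registered factory name rather than as a BeneparComponent instance. Below is a minimal sketch following the pattern in the benepar README. The function name load_constituency_parser is hypothetical, the spaCy model name en_core_web_md is an assumption, and both that model and benepar_en3 must be downloaded beforehand.

```python
def load_constituency_parser(spacy_model="en_core_web_md", benepar_model="benepar_en3"):
    # Imports are local so the snippet can be loaded without the packages installed.
    import spacy
    import benepar  # importing benepar registers the "benepar" factory

    nlp = spacy.load(spacy_model)
    if spacy.__version__.startswith("2"):
        # spaCy 2.x accepted a component object.
        nlp.add_pipe(benepar.BeneparComponent(benepar_model))
    else:
        # spaCy 3.x expects the string name plus a config dict.
        nlp.add_pipe("benepar", config={"model": benepar_model})
    return nlp


# Usage (requires spaCy, benepar, and both models to be installed):
# nlp = load_constituency_parser()
# sentence = list(nlp("The time for action is now.").sents)[0]
# print(sentence._.parse_string)
```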