woozzu / dong_iccv_2017

A PyTorch implementation of the paper "Semantic Image Synthesis via Adversarial Learning" in ICCV 2017

License: MIT License

Python 94.14% Shell 5.86%

dong_iccv_2017's Issues

Request for a trained model checkpoint

Hi, we are running experiments that compare different text-guided image manipulation models. For a fair comparison, could you provide the checkpoint file of your trained model?

How to get 'txt_feat', 'txt_feat_mismatch', and 'txt_feat_relevant'?

Hi, I read your paper and am trying to implement it myself. I don't know how to get 'txt_feat_mismatch' and 'txt_feat_relevant'. I don't understand the step in '/train-preprocess': why can't we get them using 'np.roll()'? Am I missing something? Please help. Thanks for your patience.
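
For readers unsure what a roll along the batch axis does, here is a minimal pure-Python sketch. It assumes each row of the batch belongs to a different image, so shifting the text features by one position pairs every image with another image's text. The helper `roll_rows` and the toy values are hypothetical; this only illustrates the idea and is not confirmed to match the repository's preprocessing.

```python
def roll_rows(rows, shift=1):
    """Cyclically shift a list of rows, like np.roll(rows, shift, axis=0)."""
    n = len(rows)
    if n == 0:
        return []
    shift %= n
    return rows[-shift:] + rows[:-shift] if shift else list(rows)


# Toy batch: one text feature vector per image (values are made up).
txt_feat = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Row i now holds the text feature of a different image than image i.
txt_feat_mismatch = roll_rows(txt_feat, 1)
```

If two neighbouring rows in a batch describe the same image or class, a plain roll would not produce a true mismatch, which is one reason a more careful pairing may be needed.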

train_text_embedding.py: which .txt file should 'trainclasses_file' point to in the config?

There is a classes.txt in the CUB_200_2011 dataset from your link. I used classes.txt as the 'trainclasses_file' but got the error below:
Traceback (most recent call last):
  File "/home/xin/PycharmProjects/colorchage2/train_text_embedding.py", line 81, in <module>
    std=[0.229, 0.224, 0.225])
  File "/home/xin/PycharmProjects/colorchage2/data.py", line 31, in __init__
    self.data = self._load_dataset(img_root, caption_root, classes_fllename, word_embedding)
  File "/home/xin/PycharmProjects/colorchage2/data.py", line 39, in _load_dataset
    filenames = os.listdir(os.path.join(caption_root, cls))
OSError: [Errno 2] No such file or directory: '/home/xin/PycharmProjects/colorchage2/datasets/CUB_200_2011/cub_icml/1 001.Black_footed_Albatross'
I think this is because every row in classes.txt starts with an index number, so the file cannot be used directly.
Where did you get the 'trainclasses_file'? Can you give a link?
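
As a possible workaround for the path error above, the leading index could be stripped from each row so that only the directory name remains. The sketch below assumes rows of the form `<index> <class_dir>`, as in the failing path; the helper `strip_class_index` is hypothetical and not part of the repository, and it does not answer which classes belong to the training split.

```python
def strip_class_index(lines):
    """Drop the leading integer id from rows like '1 001.Black_footed_Albatross'."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty rows
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            names.append(parts[1])  # keep only the class directory name
        else:
            names.append(line)  # row has no index prefix; keep it unchanged
    return names


# Sample rows in the format of CUB_200_2011's classes.txt.
sample = ["1 001.Black_footed_Albatross", "2 002.Laysan_Albatross"]
class_dirs = strip_class_index(sample)
```

The cleaned names could then be written to a new text file, one per line, and that file passed as the class list.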

A mistake in modified vgg encoder?

dong_iccv_2017/model.py, lines 77 to 83 in e7f371a:

    nn.Sequential(*(self.encoder.features[i] for i in range(23) + range(24, 33)))
    self.encoder[24].dilation = (2, 2)
    self.encoder[24].padding = (2, 2)
    self.encoder[27].dilation = (2, 2)
    self.encoder[27].padding = (2, 2)
    self.encoder[30].dilation = (2, 2)
    self.encoder[30].padding = (2, 2)

It seems that you used VGG16bn and changed the conv4 layers into dilated convolutions. But I found that encoder[24], encoder[27], and encoder[30] are batch normalization layers. This looks like an error.
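
The index shift can be checked without loading any weights. The sketch below rebuilds only the layer-type order of torchvision's `vgg16_bn` feature extractor (each convolution is followed by batch norm and ReLU, and "M" is a max-pooling layer) and then mimics the quoted slicing. The list-of-strings representation and the helper name are assumptions made for illustration.

```python
# Channel configuration of VGG16 ("M" marks a max-pooling layer).
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_bn_layer_types(cfg=VGG16_CFG):
    """Return the layer-type names in the order used by vgg16_bn's features."""
    layers = []
    for v in cfg:
        if v == "M":
            layers.append("MaxPool2d")
        else:
            layers.extend(["Conv2d", "BatchNorm2d", "ReLU"])
    return layers


features = vgg16_bn_layer_types()

# Mimic the quoted slicing: keep layers 0..22 and 24..32, dropping layer 23.
kept = list(range(23)) + list(range(24, 33))
encoder = [features[i] for i in kept]

dropped = features[23]                               # the layer that was removed
patched = {i: encoder[i] for i in (24, 27, 30)}      # layers the code modifies
conv_positions = [i for i, t in enumerate(encoder) if i >= 23 and t == "Conv2d"]
```

Under these assumptions, dropping the pooling layer at index 23 shifts every later layer down by one, so the convolutions sit at indices 23, 26, and 29 of the new sequence, while 24, 27, and 30 are batch normalization layers, as the report says.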

Code changes required for image size 128

Hi,
I want to run the code on input images of size 128 instead of 64.
I see that the image size 64 is hardcoded in "train.py" where images are transformed into tensors.

Is there any other place in the code where I need to change the image size (or other parameters that depend on it) to run on 128-pixel input images?

fastText error: load_model fails even after trying your fixed code

Environment: Azure NC6, 56 GB RAM, Python 2.7
After running './scripts/train_birds.sh', I got the error below:

Loading a pretrained fastText model...
Traceback (most recent call last):
  File "train.py", line 88, in <module>
    word_embedding = fasttext.load_model(args.fasttext_model)
  File "fasttext/fasttext.pyx", line 154, in fasttext.fasttext.load_model
Exception: fastText: Cannot load /home/zijie/research/data/fastText/wiki.en.bin due to C++ extension failed to allocate the memory

Cannot generate realistic images

I am new to GANs. I ran your code and the generated images are not realistic.
These are the generated images after 150 epochs:

And these are the generated images after 570 epochs:

They look similar, and there is no improvement after training for many more epochs. Can you give me some advice?
By the way, can you share your pre-trained word-embedding model? The link you gave before is unavailable.

An error when loading the dataset

Sorry to bother you again, but I encountered an error when loading the data:
Loading a dataset...
Traceback (most recent call last):
  File "train_text_embedding.py", line 106, in <module>
    img = img[indices, ...]
  File "/home/tjl/anaconda3/envs/tjl/lib/python3.6/site-packages/torch/autograd/variable.py", line 76, in __getitem__
    return Index.apply(self, key)
  File "/home/tjl/anaconda3/envs/tjl/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 16, in forward
    result = i.index(ctx.index)
IndexError: When performing advanced indexing the indexing objects must be LongTensors or convertible to LongTensors
My PyTorch version is 0.2.0 and the rest of the environment is configured. I hope you can give me some suggestions on how to deal with this problem!

Computational Details

Hi,

Thanks for creating this project.
I am trying to train this model and would like to know the configuration of the machine you trained it on (GPU type and memory, CPU cores, RAM, etc.). It would also be great if you could tell me how long the training took.

Thanks a lot.

The KLD loss in the UPDATE GENERATOR step

I noticed that you use kld = torch.mean(-z_log_stddev + 0.5 * (torch.exp(2 * z_log_stddev) + torch.pow(z_mean, 2) - 1)) in the UPDATE GENERATOR step. I don't understand why you chose this as part of your loss function, and it does not seem to be mentioned in the original paper. Could you please explain its purpose?
Apart from this, z_log_stddev and z_mean each come from a separate Linear+LeakyReLU layer. Why did you use Linear+LeakyReLU layers rather than calculating the mean and std directly?

Thanks for your help~
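
For reference, the quoted expression has the same form as the closed-form KL divergence between a diagonal Gaussian and a standard normal. The derivation below reads z_mean as \mu and z_log_stddev as s = \log\sigma; that reading is an assumption based on the variable names, and it says nothing about why the term was added to the generator loss.

```latex
\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,1)\big)
  = -\log\sigma + \tfrac{1}{2}\left(\sigma^2 + \mu^2 - 1\right)
  = -s + \tfrac{1}{2}\left(e^{2s} + \mu^2 - 1\right),
  \qquad s = \log\sigma .
```

The last form matches the code term by term: -z_log_stddev, torch.exp(2 * z_log_stddev), torch.pow(z_mean, 2), and the constant -1, with torch.mean averaging over all elements.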

Error in train_[birds/flowers].sh

Hello! I have completed the first step, train_text_embedding_birds.sh.
When I run train_birds.sh, the error below occurs:

Traceback (most recent call last):
  File "train.py", line 150, in <module>
    preprocess(img, desc, len_desc, txt_encoder)
  File "train.py", line 67, in preprocess
    desc[sorted_indices, ...].transpose(0, 1),
  File "/media/server009/seagate/liuhan/anaconda2/envs/dongiccv/lib/python2.7/site-packages/torch/autograd/variable.py", line 78, in __getitem__
    return Index.apply(self, key)
  File "/media/server009/seagate/liuhan/anaconda2/envs/dongiccv/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 87, in forward
    result = i.index(ctx.index)
IndexError: When performing advanced indexing the indexing objects must be LongTensors or convertible to LongTensors. The indexing object at position 0 is of type numpy.ndarray and cannot be converted

Could you please help me solve this problem? @woozzu
Thank you very much!
I'm looking forward to your reply.
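
The error message says the index passed to `desc[...]` is a numpy.ndarray. One possible way around it is to compute the descending-length order as a plain list of Python ints and wrap that list in a LongTensor before indexing. The sketch below shows only the index computation; the helper `descending_length_order` and the toy lengths are hypothetical, and the PyTorch lines are left as comments because they were not run against the affected version.

```python
def descending_length_order(lengths):
    """Return (sorted_indices, original_indices) as plain Python int lists.

    sorted_indices orders the batch from longest to shortest description;
    original_indices maps each original position to its place in that order,
    so the original batch order can be restored afterwards.
    """
    sorted_indices = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    original_indices = [0] * len(lengths)
    for pos, idx in enumerate(sorted_indices):
        original_indices[idx] = pos
    return sorted_indices, original_indices


len_desc = [3, 7, 5]  # toy description lengths
sorted_indices, original_indices = descending_length_order(len_desc)

# With PyTorch available, the list could be wrapped before indexing, e.g.:
#   index = torch.LongTensor(sorted_indices)
#   desc_sorted = desc[index]
```

Because the result is an ordinary list rather than a reversed NumPy view, it also has no negative strides.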

Can I train a text encoder without a pretrained embedding?

Can I train a text encoder without a pretrained embedding? I am wondering what fastText is used for.

ValueError: some of the strides of a given numpy array are negative.

I have successfully run train_text_embedding_flowers.sh, but when I run train_flowers.sh I get the following error:
Traceback (most recent call last):
  File "/home/xin/PycharmProjects/newcolorchage/train.py", line 149, in <module>
    preprocess(img, desc, len_desc, txt_encoder)
  File "/home/xin/PycharmProjects/newcolorchage/train.py", line 68, in preprocess
    desc[sorted_indices, ...].transpose(0, 1),
ValueError: some of the strides of a given numpy array are negative. This is currently not supported, but will be added in future releases.

Could you please tell me why this happened? Thank you very much!

Severe mode collapse

Hi all,

I have been training the model on the birds dataset as well as on my own data. On both datasets, very early in training (e.g. from epoch 5), I start to see severe mode collapse that continues to epoch 100 and beyond.

Example from my data (dresses):

Epoch 3:

Epoch 5:

Epoch 23:

Epoch 97:

Do you have any ideas how to improve the training?

Could you provide the pre-trained text embedding?

Hi, thanks for your great implementation.
I ran into some problems when training the visual-semantic embedding.
Would you mind sharing the pre-trained text embedding so that I can train the model directly?
Thanks!
