dong_iccv_2017's People

Contributors

woozzu

dong_iccv_2017's Issues

training model request

Hi, we are running experiments on different text-guided image manipulation models. For a fair comparison, could you provide your fine-tuned model checkpoint file?

how to get the 'txt_feat, txt_feat_mismatch, txt_feat_relevant'?

Hi, I read your paper and am trying to implement it myself. I don't know how to get 'txt_feat_mismatch' and 'txt_feat_relevant', and I can't follow your step in '/train-preprocess'. Why can't we get them using 'np.roll()'? Am I missing something? Please help. Thanks for your patience.
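For readers with the same question, here is a minimal sketch of how a mismatched text batch could be built with np.roll. It is an assumption about one workable approach, not a confirmed description of the repository's preprocessing: the helper name make_mismatched and the toy values are hypothetical, and 'txt_feat_relevant' is not covered because the thread does not say how it is constructed.

```python
import numpy as np

def make_mismatched(txt_feat):
    """Pair each sample with the text feature of the previous sample in the batch.

    Hypothetical helper: rolling along the batch axis (axis 0) shifts every row
    down by one, and the last row wraps around to the first position.
    """
    return np.roll(txt_feat, 1, axis=0)

# Toy batch: 4 samples with 3-dimensional text features (illustrative values only).
txt_feat = np.arange(12, dtype=np.float32).reshape(4, 3)
txt_feat_mismatch = make_mismatched(txt_feat)
```

Rolling only yields a true mismatch when neighbouring samples in the batch describe different images, so a shuffled batch is assumed here.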

train_text_embedding.py: which .txt file should 'trainclasses_file' in the config point to?

There is a classes.txt in the CUB_200_2011 dataset from your link. I used that classes.txt as the 'trainclasses_file' but got the error below:
Traceback (most recent call last):
File "/home/xin/PycharmProjects/colorchage2/train_text_embedding.py", line 81, in
std=[0.229, 0.224, 0.225])
File "/home/xin/PycharmProjects/colorchage2/data.py", line 31, in __init__
self.data = self._load_dataset(img_root, caption_root, classes_fllename, word_embedding)
File "/home/xin/PycharmProjects/colorchage2/data.py", line 39, in _load_dataset
filenames = os.listdir(os.path.join(caption_root, cls))
OSError: [Errno 2] No such file or directory: '/home/xin/PycharmProjects/colorchage2/datasets/CUB_200_2011/cub_icml/1 001.Black_footed_Albatross'
I think this is because every row in classes.txt starts with an index number, so the file cannot be used as it is.
So where did you get the 'trainclasses_file'? Can you give a link?
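As a possible workaround for the leading index numbers, the class names could be extracted before they are used as directory names. The sketch below is an assumption, not the author's answer: the helper names strip_class_index and load_class_names are hypothetical, and it does not settle which classes belong in the training split.

```python
def strip_class_index(line):
    """Turn '1 001.Black_footed_Albatross' into '001.Black_footed_Albatross'."""
    parts = line.strip().split(None, 1)
    # Drop the first token only when it is a bare integer index.
    if len(parts) == 2 and parts[0].isdigit():
        return parts[1]
    return line.strip()

def load_class_names(lines):
    """Return cleaned class names, skipping empty lines."""
    return [strip_class_index(line) for line in lines if line.strip()]

# Two rows in the format shown in the traceback above.
sample = ["1 001.Black_footed_Albatross\n", "2 002.Laysan_Albatross\n"]
class_names = load_class_names(sample)
```

Passing an open file object instead of the sample list works the same way, since both iterate line by line.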

A mistake in modified vgg encoder?

dong_iccv_2017/model.py

Lines 77 to 83 in e7f371a

nn.Sequential(*(self.encoder.features[i] for i in range(23) + range(24, 33)))
self.encoder[24].dilation = (2, 2)
self.encoder[24].padding = (2, 2)
self.encoder[27].dilation = (2, 2)
self.encoder[27].padding = (2, 2)
self.encoder[30].dilation = (2, 2)
self.encoder[30].padding = (2, 2)

It seems that you used VGG16bn and changed the conv4 layers into dilated convolution layers. But I found that encoder[24], encoder[27], and encoder[30] are batch normalization layers, so this looks like an error.
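The index shift can be checked without loading any weights. The sketch below rebuilds the layer order of a VGG16 feature stack with batch normalization from its standard configuration (each convolution followed by batch norm and ReLU, which is assumed to match torchvision's vgg16_bn), then applies the same index selection as the quoted lines. The names VGG16_CFG and vgg16_bn_layer_kinds are hypothetical.

```python
# VGG16 configuration: numbers are conv output channels, "M" is max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def vgg16_bn_layer_kinds(cfg=VGG16_CFG):
    """List the layer kinds in order, assuming conv -> bn -> relu per conv entry."""
    kinds = []
    for v in cfg:
        if v == "M":
            kinds.append("pool")
        else:
            kinds.extend(["conv", "bn", "relu"])
    return kinds

features = vgg16_bn_layer_kinds()

# Same selection as the quoted code: keep 0..22 and 24..32, dropping the pool at 23.
# list(...) is needed on Python 3, where range objects cannot be added with +.
kept = list(range(23)) + list(range(24, 33))
encoder = [features[i] for i in kept]

# Positions of the conv4 convolutions inside the new, shorter sequence.
conv4_positions = [i for i, kind in enumerate(encoder) if kind == "conv" and i >= 23]
```

Under these assumptions the convolutions land one position earlier than in the original feature stack, because removing the pooling layer shifts every later index down by one.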

Code changes required for image size 128

Hi,
I wanted to run the code for images of size 128 as input instead of 64.
I see that image size 64 is hardcoded in "train.py" while transforming into tensors.

Is there any other place in the code where I need to change the image size (or other parameters that depend on it) to run on 128-pixel input images?

fastText error: could not load_model after trying your fixed code

Environment: Azure NC6, 56GB RAM, Python 2.7.
After running './scripts/train_birds.sh', I got the error below:

Loading a pretrained fastText model...
Traceback (most recent call last):
File "train.py", line 88, in
word_embedding = fasttext.load_model(args.fasttext_model)
File "fasttext/fasttext.pyx", line 154, in fasttext.fasttext.load_model
Exception: fastText: Cannot load /home/zijie/research/data/fastText/wiki.en.bin due to C++ extension failed to allocate the memory

Cannot generate realistic images

I am new to GANs. I ran your code and the generated images are not realistic.
These are the generated images after 150 epochs:
(image: epoch_150)

And these are the generated images after 570 epochs:
(image: epoch_570)

They look similar, and there is no improvement after training for many more epochs. Can you give me some advice?
By the way, can you share your pre-trained word embedding model? The link you gave before is unavailable.

An error when loading the dataset

Sorry to bother you again, but I encountered an error loading the data:
Loading a dataset...
Traceback (most recent call last):
File "train_text_embedding.py", line 106, in
img = img[indices, ...]
File "/home/tjl/anaconda3/envs/tjl/lib/python3.6/site-packages/torch/autograd/variable.py", line 76, in __getitem__
return Index.apply(self, key)
File "/home/tjl/anaconda3/envs/tjl/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 16, in forward
result = i.index(ctx.index)
IndexError: When performing advanced indexing the indexing objects must be LongTensors or convertible to LongTensors
My PyTorch version is 0.2.0 and the rest of the environment has been configured. I hope you can give me some suggestions on how to deal with this problem!

Computational Details

Hi,

Thanks for creating this project.
I was trying to train this model and wanted to know the configuration details of the machine on which you trained this (GPU type and memory, CPU cores, RAM used, etc). Also, it would be great if you can tell how much time it took for you to train this model.

Thanks a lot.

The kld loss in UPDATE GENERATOR process

I noticed that you use kld = torch.mean(-z_log_stddev + 0.5 * (torch.exp(2 * z_log_stddev) + torch.pow(z_mean, 2) - 1)) in the UPDATE GENERATOR step. I don't understand why you chose this as part of your loss function, and it does not seem to be mentioned in the original paper. Could you please explain its purpose here?
Apart from this, z_log_stddev and z_mean each come from a separate Linear+LeakyReLU layer. Why did you use Linear+LeakyReLU layers rather than calculating the mean and std directly?

Thanks for your help~
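For reference, the quoted expression has the same form as the closed-form KL divergence between a diagonal Gaussian and a standard normal. Reading z_mean as the mean and z_log_stddev as the log standard deviation is an assumption drawn from the variable names; the thread does not confirm the author's intent.

```latex
% Per dimension, with \mu = z\_mean and s = \log\sigma = z\_log\_stddev:
\begin{aligned}
D_{\mathrm{KL}}\bigl(\mathcal{N}(\mu,\sigma^{2}) \,\|\, \mathcal{N}(0,1)\bigr)
  &= -\log\sigma + \tfrac{1}{2}\bigl(\sigma^{2} + \mu^{2} - 1\bigr) \\
  &= -s + \tfrac{1}{2}\bigl(e^{2s} + \mu^{2} - 1\bigr)
\end{aligned}
```

Under that reading, torch.mean averages this per-dimension term over the batch and the latent dimensions, which would pull the learned distribution toward a standard normal.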

Error in train_[birds/flowers].sh

Hello! I've finished the first step: train_text_embedding_birds.sh
When I run train_birds.sh, the error below occurs:

Traceback (most recent call last):
File "train.py", line 150, in
preprocess(img, desc, len_desc, txt_encoder)
File "train.py", line 67, in preprocess
desc[sorted_indices, ...].transpose(0, 1),
File "/media/server009/seagate/liuhan/anaconda2/envs/dongiccv/lib/python2.7/site-packages/torch/autograd/variable.py", line 78, in __getitem__
return Index.apply(self, key)
File "/media/server009/seagate/liuhan/anaconda2/envs/dongiccv/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 87, in forward
result = i.index(ctx.index)
IndexError: When performing advanced indexing the indexing objects must be LongTensors or convertible to LongTensors. The indexing object at position 0 is of type numpy.ndarray and cannot be converted

Could you please help me solve this problem? @woozzu
Thank you very much !
I'm looking forward to your reply.

ValueError: some of the strides of a given numpy array are negative.

I have successfully run train_text_embedding_flowers.sh, but when I run train_flowers.sh I get the following error:
Traceback (most recent call last):
File "/home/xin/PycharmProjects/newcolorchage/train.py", line 149, in
preprocess(img, desc, len_desc, txt_encoder)
File "/home/xin/PycharmProjects/newcolorchage/train.py", line 68, in preprocess
desc[sorted_indices, ...].transpose(0, 1),
ValueError: some of the strides of a given numpy array are negative. This is currently not supported, but will be added in future releases.

Could you please tell me why this happened? Thank you very much!
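One common source of this message is an index array produced by reversing another array, which NumPy represents as a view with a negative stride. The sketch below assumes that sorted_indices is built that way (the traceback does not show how it is built) and shows that a contiguous copy removes the negative stride. The values in len_desc are made up for illustration.

```python
import numpy as np

# Illustrative description lengths for a batch of four samples.
len_desc = np.array([5, 9, 3, 7])

# Reversing an argsort result gives a descending order, but as a view
# whose stride is negative.
sorted_indices = np.argsort(len_desc)[::-1]
has_negative_stride = sorted_indices.strides[0] < 0

# A contiguous copy holds the same values with a positive stride.
sorted_indices_fixed = np.ascontiguousarray(sorted_indices)
```

The copy can then be handed to torch.from_numpy, which is the conversion that rejects negatively strided arrays. Whether this resolves the error in train.py depends on how sorted_indices is actually computed there.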

Severe mode collapse

Hi all,

I was training the model on the birds dataset as well as on my own data. Very early in training on both datasets (e.g. from epoch 5) I start to see severe mode collapse that continues until epoch 100 and beyond.

Example from my data (dresses):

Epoch 3:
Screenshot 2019-06-07 at 15 33 01

Epoch 5:
Screenshot 2019-06-07 at 15 31 42

Epoch 23:
Screenshot 2019-06-07 at 15 32 00

Epoch 97:
Screenshot 2019-06-07 at 15 25 10

Do you have any ideas on how to improve the training?

Could you provide the pre-trained text embedding?

Hi, thanks for your great implementation.
I ran into some problems when training the visual-semantic embedding.
Would you mind sharing the pre-trained text embedding so that I can train the model directly?
Thanks!
