
pise's People

Contributors

zhangjinso


pise's Issues

About the training process

Hi,
To reproduce your results, how many iterations does training take, and how long should the training process run in your experiments?

Thanks!

Problems with the target parsing results generated by your pre-trained model

Hi,
I have a question about the target parsing results generated by your pre-trained model for human pose transfer.
Using your pre-trained checkpoint, I visualized the generated target parsing results (i.e., self.parsav in class Painet(BaseModel)).
However, as shown in the figure, there seem to be some problems.
[image]

1. It seems that the ParsingNet can only effectively generate parsing maps for a few regions (e.g., '3': upper clothes and '5': lower clothes (pants, shorts)), but cannot handle other regions (e.g., skin, face, hair, etc.).
2. The generated target parsing result appears shifted to the left relative to the GT, which is centered in the image. In other words, the generated target parsing result is not spatially aligned with the input target pose (i.e., self.input_BP2).
In fact, using your pre-trained checkpoint, the generated target image is also shifted to the left relative to the GT, as shown in the figure.
[image]

I'm not sure whether this is a problem with your model; please check. I would be very grateful if you could provide your visualization results!

Thanks!

Regarding paper

I tried to read your paper, but it is tough to follow. For example, what is a decoupled GAN? Is it a type of GAN? If so, nothing is mentioned about the discriminator. I couldn't find any search results on "decoupled GAN". Are there any video explanations of your paper, or of decoupled GANs? If there are, could you please share them?

H36m keypoints

Hi,
Can I generate images in target poses if my poses are in the H36M format?

questions about the data size

Great work! I have some questions about the data sizes.
1. In my opinion, the load size of the input image, pose map, and parsing map is 256x256 in your method. However, the keypoint annotations are obtained from cropped images with a resolution of 176x256, which means old_size should be 176x256. Why, then, do you set parser.set_defaults(old_size=(256, 256)) in fashion_dataset.py?
2. Your parsing maps are obtained from cropped images at a resolution of 176x256 and then padded to 256x256. Is that right? (See the padding sketch at the end of this issue.)
3. The original DeepFashion images (256x256) have backgrounds with inconsistent colors. Will this hurt training if they are used directly? Do I need to crop them to 176x256 and then pad them to 256x256?
[image]
Thanks very much!
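
For illustration of point 2 above, a minimal padding sketch, assuming a 176x256 crop is center-padded to 256x256; the repository's actual preprocessing may differ:

    import numpy as np

    # Center-pad a 256(H) x 176(W) crop to 256 x 256 (sketch only).
    crop = np.zeros((256, 176, 3), dtype=np.uint8)
    pad = (256 - 176) // 2                      # 40 px on each side
    padded = np.pad(crop, ((0, 0), (pad, pad), (0, 0)), mode='constant')
    print(padded.shape)                         # (256, 256, 3)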

Improvement

Hi,
I have read your paper and tried to implement it. To make it usable for virtual try-on, I want to take this project to the next level by adding person re-identification between the generated image in the new pose and a real image of the same person, in order to measure the model's final accuracy.
Can you give me any ideas on how to approach this?

A question about the paper

First, thank you for the great job and for sharing it with the community.

I have a question about the pipeline of the method. As far as I can tell, training proceeds in two stages: one for generating the parsing map, and a second for generating the final image. My question is: does the second stage provide any gradients to the first one, or are the two steps completely disconnected from each other?

Thank you in advance

generating samples by the pretrained model

Hello,

Thanks for sharing this great work. I'm trying to generate samples using the pretrained model, but unfortunately my results come out very dim, like this:
Mine:
fashionMENJackets_Vestsid0000488201_7additional_fashionMENJackets_Vestsid0000488201_1front_vis
Previously made available by the authors:
fashionMENJackets_Vestsid0000488201_7additional_fashionMENJackets_Vestsid0000488201_1front_vis

Any help is greatly appreciated by the entire community.

Invalid syntax

In painet_model.py I am getting an invalid syntax error on async = True. I am using Google Colab.
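
For context: 'async' became a reserved keyword in Python 3.7, so older PyTorch calls like .cuda(async=True) no longer parse. A minimal sketch of the usual fix, assuming the error comes from such a call:

    import torch

    x = torch.randn(4, 3, 256, 256)
    # Old PyTorch code:  x = x.cuda(async=True)
    # This is a SyntaxError on Python >= 3.7, where 'async' is a
    # reserved keyword; the argument was renamed to 'non_blocking'.
    if torch.cuda.is_available():
        x = x.cuda(non_blocking=True)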

The metric

Hi, nice job!

Sorry to bother you. When I run the test instructions, I can get the eval_results, but I am confused about how to reproduce the numbers reported in your paper. Should I use a different program to calculate the metrics?

Thank you very much!

How to test the texture transfer model?

I want to test only the texture transfer model. It is mentioned that the model needs to be changed; how should I change it, and what inputs are needed to test the texture transfer model?

Unable to reproduce the pre-trained model's results

Hi,

When I run prediction with the "Pre-trained checkpoint of human pose transfer" from the README.md, the results differ from the prediction results you provide and look noticeably worse. What could be the reason? Besides placing the model at the specified path, is any other setup required?

When I use a model I trained myself, the results are acceptable.

Finally, is there a demo that can directly transfer an arbitrary image?

Many thanks!

problems about training the texture model

Hi, I have some questions about the difference between the pose transfer model and the texture model.
1. As you said, we can uncomment lines 162-176 to train a texture model. In my opinion, this mainly changes the predicted par2 from float to int via the torch.argmax operation. What is the advantage of, or motivation for, doing this rather than using the predicted par2 directly?
2. Since the argmax operation is non-differentiable, if we uncomment lines 162-176 to train a texture model, the image generator cannot provide gradients to the parsing generator, so the pre-trained parsing generator will not be updated during training. Will this affect the quality of the final generated images? (See the sketch below.)
Besides, since the parsing generator is disconnected from the image generator, do we still need to compute parsing losses such as loss_par and loss_par1 when training the image generator?

Thanks!
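
For reference, a standalone sketch of the non-differentiability point raised in question 2 (not the repository's code): torch.argmax cuts the gradient path, whereas a softmax relaxation keeps it.

    import torch

    logits = torch.randn(1, 8, 4, 4, requires_grad=True)  # parsing logits

    hard = torch.argmax(logits, dim=1)   # integer labels
    print(hard.requires_grad)            # False: gradients cannot flow back

    soft = torch.softmax(logits, dim=1)  # differentiable relaxation
    print(soft.requires_grad)            # True: gradient path preserved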

Do you have a trained model?

I only want to run testing and I don't have any GPUs, so could you please provide a model that can be used for testing?
Thanks a lot!

Error while training

While training we get the following error, even though we downloaded the datasets and put them in the fashion data directory under train:
FileNotFoundError: [Errno 2] No such file or directory: './dataset/fashion_data/train/fashionWOMENJackets_Coatsid0000658203_3back.jpg'
Here, 'WOMEN' and 'Jackets_Coats' are directories but appear flattened into a single name, and '03' should be inside the folder id_00006582.
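
For illustration, a sketch of how such a flattened pair-list name might be mapped back to the DeepFashion directory layout. The regex and the resulting path are assumptions based on the example above, not the repository's code; verify against your local dataset:

    import re

    def unflatten(name):
        # Hypothetical helper. Assumes names flatten
        # <GENDER>/<Category>/id_<8 digits>/<2 digits>_<pose>_<view>.jpg
        m = re.match(r'fashion(MEN|WOMEN)(.+?)id(\d{8})(\d{2})_(\d)(.+)\.jpg', name)
        gender, category, pid, img, pose, view = m.groups()
        return '%s/%s/id_%s/%s_%s_%s.jpg' % (gender, category, pid, img, pose, view)

    print(unflatten('fashionWOMENJackets_Coatsid0000658203_3back.jpg'))
    # WOMEN/Jackets_Coats/id_00006582/03_3_back.jpg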

Quality of the par_sav output

I am curious about the quality of the output parsing map of p2, so I trained for several days on a V100.
The final PNG looks good, but the par_sav differs from the SPL2.
[images: generated parsing maps (par_sav) vs. SPL2]

I have seen the constraints imposed on them, which should make them very similar. So how does this happen, with no head and no skin in the image?
I also want to ask whether the CoordConv is useful.

error

While training the model, we get this error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/zjs/.torch/models/vgg19-dcbb9e9d.pth'
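
For reference, a minimal sketch of one way to obtain the missing weights: vgg19-dcbb9e9d.pth is the standard torchvision VGG19 checkpoint, so it can be downloaded through torchvision and copied to whatever path the code expects (the '/home/zjs/...' path above is the author's machine; adjust for yours):

    import os
    import torch
    from torchvision import models

    vgg = models.vgg19(pretrained=True)  # downloads vgg19-dcbb9e9d.pth

    # Copy the weights to the path the code expects (adjust as needed).
    target = os.path.expanduser('~/.torch/models/vgg19-dcbb9e9d.pth')
    os.makedirs(os.path.dirname(target), exist_ok=True)
    torch.save(vgg.state_dict(), target)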

I couldn't find the code locations for "comment lines 177 and 178, and uncomment lines 162-176" mentioned in the README.

Hi Jinsong, I'm really interested in PISE and admire your work.
At your convenience, I'd like to confirm the code locations for "comment lines 177 and 178, and uncomment lines 162-176" mentioned in the Train section of the README, because I couldn't find these two pieces of code and I really want to study them.
Looking forward to your reply. Thanks a lot!

The correspondence loss is actually not used

Hi,

The correspondence loss for the image generator described in Section 3.3 of the paper is not used in your code:

self.loss_names = ['app_gen', 'content_gen', 'style_gen',  # 'reg_gen',
                   'ad_gen', 'dis_img_gen', 'par', 'par1']

Does the correspondence loss not affect the final generated image? Thanks for your reply.

Respective / end-to-end training

Hello,

First of all, thank you for your work. I read in your paper that you first train the parsing generator and the image generator separately, and then perform end-to-end training. However, I was not able to locate the parts of your code that handle this switch in training strategy.
The reason for this question is that I would like to train only the parsing generator first, and I would rather not change everything in the code if the option already exists.

Thank you for your help

About the exact correspondence between parsing labels and indexes in your provided parsing data

Hi!
Could you tell me the exact correspondence between the parsing labels and the indexes in your provided parsing data?
The parsing labels mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'Leg', and 'Background', which seems to differ from those in your provided parsing data.
For example, in your provided parsing data, 'Shoes' is a separate category, while the arm and leg skin are combined into the same category.
[image]

Besides, when I visualize your provided parsing data, the area of the region with index == 1 always seems to be 0. Please check.
PS: The similar issue #2 did not solve my problem.

Thanks!

new issue

Hello author, I recently read your paper and found it very inspiring. I would like to reproduce your code. Is your training dataset provided? The README does not seem to explain this clearly.

About the LPIPS

Hi, great work!
I have a question about the LPIPS results. Since you split the dataset in the same way as GFLA, why do the LPIPS results in your paper (GFLA: 0.2219, PATN: 0.2520) differ from those reported in the GFLA paper (GFLA: 0.2341, PATN: 0.2533)? What is your test setup?
Thanks!
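
For what it's worth, LPIPS scores depend on the backbone and on input normalization, which is one plausible source of such discrepancies. A minimal sketch with the official lpips package (https://github.com/richzhang/PerceptualSimilarity); inputs must be normalized to [-1, 1]:

    import lpips
    import torch

    loss_fn = lpips.LPIPS(net='alex')          # 'alex', 'vgg', or 'squeeze'
    img0 = torch.rand(1, 3, 256, 256) * 2 - 1  # normalize to [-1, 1]
    img1 = torch.rand(1, 3, 256, 256) * 2 - 1
    print(loss_fn(img0, img1).item())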

Pretrained model quality is not as good as the paper results

Hi.
Thank you for providing your excellent paper and its code to the world. It's exciting to explore your work.

Now, I am using your code and the pre-trained checkpoint of human pose transfer from here, and the results do not look as good as those reported in the paper.

Is there something I have to do to get better results? Do I have to retrain the model from the pre-trained one?

Below are results from the pre-trained model:
[images]

There are also good ones:
[images]

How to test my own images?

Hi, I'm very interested in your project.
Now I have my own dataset, which includes pairs of images and keypoints. Could I use your pre-trained model to test my own images (256 x 192)?

How to edit the parsing map

Hi, for region editing, given a certain parsing map, how do you edit it to obtain the desired parsing maps? Can you share the scripts or tools you used? Thanks!
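
One simple way to edit a parsing map without a dedicated tool is to relabel regions directly in the label image; a minimal sketch (file names and label indexes are hypothetical, and the authors may have used an interactive tool instead):

    import numpy as np
    from PIL import Image

    par = np.array(Image.open('parsing.png'))   # H x W integer labels
    par[par == 6] = 5                           # hypothetical: relabel region 6 as 5
    Image.fromarray(par.astype(np.uint8)).save('parsing_edited.png')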

About the extraction of "Fp"

Hi,
In your paper, you "concatenate the source image Is, the source parsing map Ss, the generated parsing map Sg and the target pose Pt in the depth (channel) dimension and extract its feature Fp", as shown in the figure.
[image]

However, in my opinion, Fp should aim to provide the target pose information. Why do you additionally use the source image Is and the source parsing map Ss as input? Did you try using only Sg and Pt to extract Fp?
Thanks!
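
For reference, the concatenation quoted above amounts to something like the following sketch (the channel counts are assumptions; PISE's exact shapes may differ):

    import torch

    Is = torch.randn(1, 3, 256, 256)    # source image
    Ss = torch.randn(1, 8, 256, 256)    # source parsing map (one-hot)
    Sg = torch.randn(1, 8, 256, 256)    # generated parsing map
    Pt = torch.randn(1, 18, 256, 256)   # target pose heatmaps

    x = torch.cat([Is, Ss, Sg, Pt], dim=1)  # concatenate along channels
    print(x.shape)                          # torch.Size([1, 37, 256, 256])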

What is the mapping of the semantic map of person image to the merged K=8 attribute?

Could you kindly tell me how to map the 20 different attributes into K=8 attributes? (A hypothetical grouping is sketched below.)

The current indexes are 'Hat', 'Hair', 'Glove', 'Sunglasses', 'Upper Clothes', 'Dress', 'Coat', 'Socks', 'Pants', 'Jumpsuits', 'Scarf', 'Skirt', 'Face', 'Left-arm', 'Right-arm', 'Left-leg', 'Right-leg', 'Left-shoe' and 'Right-shoe'.

And the merged indexes mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'leg' and 'background'.
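
For illustration, a hypothetical grouping of the 20 LIP-style labels into K=8 classes, inferred only from the paper's class list; the repository's actual mapping may differ (see also the parsing-label issue above):

    import numpy as np

    # Hypothetical 20 -> 8 label mapping (0 = background in both).
    merge = {
        0: 0,                              # background
        1: 1, 2: 1,                        # hat, hair -> hair
        5: 2, 7: 2, 11: 2,                 # upper clothes, coat, scarf -> upper clothes
        6: 3, 10: 3, 12: 3,                # dress, jumpsuits, skirt -> dress
        9: 4,                              # pants -> pants
        4: 5, 13: 5,                       # sunglasses, face -> face
        3: 6, 14: 6, 15: 6,                # glove, arms -> upper skin
        8: 7, 16: 7, 17: 7, 18: 7, 19: 7,  # socks, legs, shoes -> leg
    }

    parsing = np.random.randint(0, 20, size=(256, 256))
    merged = np.vectorize(merge.get)(parsing)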

a question about the code

I have a question about the code. In my opinion, as shown in the figure, the code on lines 627-638 should be aligned with line 605.
In other words, I think you should first finish the for loop over b_size to obtain middle_avg before performing the subsequent operations. Please check. Thanks!
[image]
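
Schematically, the indentation point raised above is the difference between consuming an average inside the loop and after it; a toy sketch, not the repository's code:

    b_size = 4
    feats = [1.0, 2.0, 3.0, 4.0]    # stand-in per-sample features

    total = 0.0
    for f in feats:
        total += f
        # Using total / b_size here would act on a partial sum.
    middle_avg = total / b_size     # correct: computed after the loop ends
    print(middle_avg)               # 2.5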
