
pise's People

Contributors

zhangjinso


pise's Issues

About the training process

Hi,
To reproduce your results, how many iterations does training take, and how long should the training process run in your experiments?

Thanks!

Problems with the target parsing results generated by your pre-trained model

Hi,
I have a question about the target parsing results generated by your pre-trained model for human pose transfer.
Using your pre-trained checkpoint, I visualized the generated target parsing results (i.e., self.parsav in class Painet(BaseModel)).
However, as shown in the figure, there seem to be some problems.
[image]

1. It seems that the ParsingNet can only effectively generate parsing maps for a few regions (e.g., '3': upper clothes and '5': lower clothes (pants, shorts)), but cannot handle other regions (e.g., skin, face, hair, etc.).
2. The generated target parsing result appears shifted to the left relative to the GT, which is centered in the image. In other words, the generated target parsing result is not spatially aligned with the input target pose (i.e., self.input_BP2).
In fact, using your pre-trained checkpoint, the generated target image is also shifted to the left relative to the GT, as shown in the figure.
[image]

I'm not sure whether this is a problem with your model; please check. I would be very grateful if you could provide your visualization results!

Thanks!

Regarding paper

I tried to read your paper, but it is tough to follow. For example, what is a decoupled GAN? Is it a type of GAN? If so, nothing is mentioned about the discriminator. I couldn't find any search results on "decoupled GAN". Are there any video explanations of your paper, or of decoupled GANs? If there are, could you please share them?

H36m keypoints

Hi,
Can I generate images in target poses if my poses are in the H36M format?

questions about the data size

Great work! I have some questions about the data sizes.
1. In my opinion, the load size of the input image, pose map, and parsing map is 256x256 in your method. However, the keypoint annotations are obtained from cropped images with a resolution of 176x256, which means old_size should be 176x256. Why, then, do you set parser.set_defaults(old_size=(256, 256)) in fashion_dataset.py?
2. Your parsing maps are obtained from cropped images at a resolution of 176x256 and then padded to 256x256. Is that right? (See the padding sketch at the end of this issue.)
3. The original DeepFashion images (256x256) have backgrounds with inconsistent colors. Will this hurt training if they are used directly? Do I need to crop them to 176x256 and then pad them to 256x256?
[image]
Thanks very much!
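
For illustration of point 2 above, a minimal padding sketch, assuming a 176x256 crop is center-padded to 256x256; the repository's actual preprocessing may differ:

    import numpy as np

    # Center-pad a 256(H) x 176(W) crop to 256 x 256 (sketch only).
    crop = np.zeros((256, 176, 3), dtype=np.uint8)
    pad = (256 - 176) // 2                      # 40 px on each side
    padded = np.pad(crop, ((0, 0), (pad, pad), (0, 0)), mode='constant')
    print(padded.shape)                         # (256, 256, 3)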

Improvement

Hi,
I have read your paper and tried to implement it. To make it usable for virtual try-on, I want to take this project to the next level by adding person re-identification between the generated image in the new pose and a real image of the same person, in order to measure the model's final accuracy.
Can you give me any ideas on how to approach this?

A question about the paper

First, thank you for the great job and for sharing it with the community.

I have a question about the pipeline of the method. As far as I can tell, training proceeds in two stages: one for generating the parsing map, and a second for generating the final image. My question is: does the second stage provide any gradients to the first one, or are the two steps completely disconnected from each other?

Thank you in advance

generating samples by the pretrained model

Hello,

Thanks for sharing this great work. I'm trying to generate samples using the pretrained model, but unfortunately my results come out very dim, like this:
Mine:
fashionMENJackets_Vestsid0000488201_7additional_fashionMENJackets_Vestsid0000488201_1front_vis
Previously made available by the authors:
fashionMENJackets_Vestsid0000488201_7additional_fashionMENJackets_Vestsid0000488201_1front_vis

Any help is greatly appreciated by the entire community.

Invalid syntax

In painet_model.py I am getting an invalid syntax error on async = True. I am using Google Colab.
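
For context: 'async' became a reserved keyword in Python 3.7, so older PyTorch calls like .cuda(async=True) no longer parse. A minimal sketch of the usual fix, assuming the error comes from such a call:

    import torch

    x = torch.randn(4, 3, 256, 256)
    # Old PyTorch code:  x = x.cuda(async=True)
    # This is a SyntaxError on Python >= 3.7, where 'async' is a
    # reserved keyword; the argument was renamed to 'non_blocking'.
    if torch.cuda.is_available():
        x = x.cuda(non_blocking=True)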

The metric

Hi, nice job!

Sorry to bother you. When I run the test instructions, I can get the eval_results, but I am confused about how to reproduce the numbers reported in your paper. Should I use a different program to calculate the metrics?

Thank you very much!

How to test the texture transfer model?

I want to test only the texture transfer model. It is mentioned that the model needs to be changed; how should I change it, and what inputs are needed to test the texture transfer model?

Unable to reproduce the pre-trained model's results

Hi,

When I run prediction with the "Pre-trained checkpoint of human pose transfer" from the README.md, the results differ from the prediction results you provide and look noticeably worse. What could be the reason? Besides placing the model at the specified path, is any other setup required?

When I use a model I trained myself, the results are acceptable.

Finally, is there a demo that can directly transfer an arbitrary image?

Many thanks!

problems about training the texture model

Hi, I have some questions about the difference between the pose transfer model and the texture model.
1. As you said, we can uncomment lines 162-176 to train a texture model. In my opinion, this mainly changes the predicted par2 from float to int via the torch.argmax operation. What is the advantage of, or motivation for, doing this rather than using the predicted par2 directly?
2. Since the argmax operation is non-differentiable, if we uncomment lines 162-176 to train a texture model, the image generator cannot provide gradients to the parsing generator, so the pre-trained parsing generator will not be updated during training. Will this affect the quality of the final generated images? (See the sketch below.)
Besides, since the parsing generator is disconnected from the image generator, do we still need to compute parsing losses such as loss_par and loss_par1 when training the image generator?

Thanks!
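
For reference, a standalone sketch of the non-differentiability point raised in question 2 (not the repository's code): torch.argmax cuts the gradient path, whereas a softmax relaxation keeps it.

    import torch

    logits = torch.randn(1, 8, 4, 4, requires_grad=True)  # parsing logits

    hard = torch.argmax(logits, dim=1)   # integer labels
    print(hard.requires_grad)            # False: gradients cannot flow back

    soft = torch.softmax(logits, dim=1)  # differentiable relaxation
    print(soft.requires_grad)            # True: gradient path preserved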

Do you have a trained model?

I only want to run testing and I don't have any GPUs, so could you please provide a model that can be used for testing?
Thanks a lot!

Error while training

While training we get the following error, even though we downloaded the datasets and put them in the fashion data directory under train:
FileNotFoundError: [Errno 2] No such file or directory: './dataset/fashion_data/train/fashionWOMENJackets_Coatsid0000658203_3back.jpg'
Here, 'WOMEN' and 'Jackets_Coats' are directories but appear flattened into a single name, and '03' should be inside the folder id_00006582.
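
For illustration, a sketch of how such a flattened pair-list name might be mapped back to the DeepFashion directory layout. The regex and the resulting path are assumptions based on the example above, not the repository's code; verify against your local dataset:

    import re

    def unflatten(name):
        # Hypothetical helper. Assumes names flatten
        # <GENDER>/<Category>/id_<8 digits>/<2 digits>_<pose>_<view>.jpg
        m = re.match(r'fashion(MEN|WOMEN)(.+?)id(\d{8})(\d{2})_(\d)(.+)\.jpg', name)
        gender, category, pid, img, pose, view = m.groups()
        return '%s/%s/id_%s/%s_%s_%s.jpg' % (gender, category, pid, img, pose, view)

    print(unflatten('fashionWOMENJackets_Coatsid0000658203_3back.jpg'))
    # WOMEN/Jackets_Coats/id_00006582/03_3_back.jpg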

Quality of the par_sav output

I am curious about the quality of the output parsing map of p2, so I trained for several days on a V100.
The final PNG looks good, but the par_sav differs from the SPL2.
[images: generated parsing maps (par_sav) vs. SPL2]

I have seen the constraints imposed on them, which should make them very similar. So how does this happen, with no head and no skin in the image?
I also want to ask whether the CoordConv is useful.

error

While training the model, we get this error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/zjs/.torch/models/vgg19-dcbb9e9d.pth'
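
For reference, a minimal sketch of one way to obtain the missing weights: vgg19-dcbb9e9d.pth is the standard torchvision VGG19 checkpoint, so it can be downloaded through torchvision and copied to whatever path the code expects (the '/home/zjs/...' path above is the author's machine; adjust for yours):

    import os
    import torch
    from torchvision import models

    vgg = models.vgg19(pretrained=True)  # downloads vgg19-dcbb9e9d.pth

    # Copy the weights to the path the code expects (adjust as needed).
    target = os.path.expanduser('~/.torch/models/vgg19-dcbb9e9d.pth')
    os.makedirs(os.path.dirname(target), exist_ok=True)
    torch.save(vgg.state_dict(), target)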

I couldn't find the code locations for "comment lines 177 and 178, and uncomment lines 162-176" mentioned in the README.

Hi Jinsong, I'm really interested in PISE and admire your work.
At your convenience, I'd like to confirm the code locations for "comment lines 177 and 178, and uncomment lines 162-176" mentioned in the Train section of the README, because I couldn't find these two pieces of code and I really want to study them.
Looking forward to your reply. Thanks a lot!

The correspondence loss is actually not used

Hi,

The correspondence loss for the image generator described in Section 3.3 of the paper is not used in your code:

self.loss_names = ['app_gen', 'content_gen', 'style_gen',  # 'reg_gen',
                   'ad_gen', 'dis_img_gen', 'par', 'par1']

Does the correspondence loss not affect the final generated image? Thanks for your reply.

Respective / end-to-end training

Hello,

First of all, thank you for your work. I read in your paper that you first train the parsing generator and the image generator separately, and then perform end-to-end training. However, I was not able to locate the parts of your code that handle this switch in training strategy.
The reason for this question is that I would like to train only the parsing generator first, and I would rather not change everything in the code if the option already exists.

Thank you for your help

About the exact correspondence between parsing labels and indexes in your provided parsing data

Hi!
Could you tell me the exact correspondence between the parsing labels and the indexes in your provided parsing data?
The parsing labels mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'Leg', and 'Background', which seems to differ from those in your provided parsing data.
For example, in your provided parsing data, 'Shoes' is a separate category, while the arm and leg skin are combined into the same category.
[image]

Besides, when I visualize your provided parsing data, the area of the region with index == 1 always seems to be 0. Please check.
PS: The similar issue #2 did not solve my problem.

Thanks!

new issue

Hello author, I recently read your paper and found it very inspiring. I would like to reproduce your code. Is your training dataset provided? The README does not seem to explain this clearly.

About the LPIPS

Hi, great work!
I have a question about the LPIPS results. Since you split the dataset in the same way as GFLA, why do the LPIPS results in your paper (GFLA: 0.2219, PATN: 0.2520) differ from those reported in the GFLA paper (GFLA: 0.2341, PATN: 0.2533)? What is your test setup?
Thanks!
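
For what it's worth, LPIPS scores depend on the backbone and on input normalization, which is one plausible source of such discrepancies. A minimal sketch with the official lpips package (https://github.com/richzhang/PerceptualSimilarity); inputs must be normalized to [-1, 1]:

    import lpips
    import torch

    loss_fn = lpips.LPIPS(net='alex')          # 'alex', 'vgg', or 'squeeze'
    img0 = torch.rand(1, 3, 256, 256) * 2 - 1  # normalize to [-1, 1]
    img1 = torch.rand(1, 3, 256, 256) * 2 - 1
    print(loss_fn(img0, img1).item())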

Pretrained model quality is not as good as the paper results

Hi.
Thank you for providing your excellent paper and its code to the world. It's exciting to explore your work.

Now, I am using your code and the pre-trained checkpoint of human pose transfer from here, and the results do not look as good as those reported in the paper.

Is there something I have to do to get better results? Do I have to retrain the model from the pre-trained one?

Below are results from the pre-trained model:
[images]

There are also good ones:
[images]

How to test my own images?

Hi, I'm very interested in your project.
Now I have my own dataset, which includes pairs of images and keypoints. Could I use your pre-trained model to test my own images (256 x 192)?

How to edit the parsing map

Hi, for region editing, given a certain parsing map, how do you edit it to obtain the desired parsing maps? Can you share the scripts or tools you used? Thanks!
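
One simple way to edit a parsing map without a dedicated tool is to relabel regions directly in the label image; a minimal sketch (file names and label indexes are hypothetical, and the authors may have used an interactive tool instead):

    import numpy as np
    from PIL import Image

    par = np.array(Image.open('parsing.png'))   # H x W integer labels
    par[par == 6] = 5                           # hypothetical: relabel region 6 as 5
    Image.fromarray(par.astype(np.uint8)).save('parsing_edited.png')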

About the extraction of "Fp"

Hi,
In your paper, you "concatenate the source image Is, the source parsing map Ss, the generated parsing map Sg and the target pose Pt in the depth (channel) dimension and extract its feature Fp", as shown in the figure.
[image]

However, in my opinion, Fp should aim to provide the target pose information. Why do you additionally use the source image Is and the source parsing map Ss as input? Did you try using only Sg and Pt to extract Fp?
Thanks!
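
For reference, the concatenation quoted above amounts to something like the following sketch (the channel counts are assumptions; PISE's exact shapes may differ):

    import torch

    Is = torch.randn(1, 3, 256, 256)    # source image
    Ss = torch.randn(1, 8, 256, 256)    # source parsing map (one-hot)
    Sg = torch.randn(1, 8, 256, 256)    # generated parsing map
    Pt = torch.randn(1, 18, 256, 256)   # target pose heatmaps

    x = torch.cat([Is, Ss, Sg, Pt], dim=1)  # concatenate along channels
    print(x.shape)                          # torch.Size([1, 37, 256, 256])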

What is the mapping of the semantic map of person image to the merged K=8 attribute?

Could you kindly tell me how to map the 20 different attributes into K=8 attributes? (A hypothetical grouping is sketched below.)

The current indexes are 'Hat', 'Hair', 'Glove', 'Sunglasses', 'Upper Clothes', 'Dress', 'Coat', 'Socks', 'Pants', 'Jumpsuits', 'Scarf', 'Skirt', 'Face', 'Left-arm', 'Right-arm', 'Left-leg', 'Right-leg', 'Left-shoe' and 'Right-shoe'.

And the merged indexes mentioned in your paper are 'Hair', 'Upper Clothes', 'Dress', 'Pants', 'Face', 'Upper skin', 'leg' and 'background'.
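
For illustration, a hypothetical grouping of the 20 LIP-style labels into K=8 classes, inferred only from the paper's class list; the repository's actual mapping may differ (see also the parsing-label issue above):

    import numpy as np

    # Hypothetical 20 -> 8 label mapping (0 = background in both).
    merge = {
        0: 0,                              # background
        1: 1, 2: 1,                        # hat, hair -> hair
        5: 2, 7: 2, 11: 2,                 # upper clothes, coat, scarf -> upper clothes
        6: 3, 10: 3, 12: 3,                # dress, jumpsuits, skirt -> dress
        9: 4,                              # pants -> pants
        4: 5, 13: 5,                       # sunglasses, face -> face
        3: 6, 14: 6, 15: 6,                # glove, arms -> upper skin
        8: 7, 16: 7, 17: 7, 18: 7, 19: 7,  # socks, legs, shoes -> leg
    }

    parsing = np.random.randint(0, 20, size=(256, 256))
    merged = np.vectorize(merge.get)(parsing)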

a question about the code

I have a question about the code. In my opinion, as shown in the figure, the code on lines 627-638 should be aligned with line 605.
In other words, I think you should first finish the for loop over b_size to obtain middle_avg before performing the subsequent operations. Please check. Thanks!
[image]
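
Schematically, the indentation point raised above is the difference between consuming an average inside the loop and after it; a toy sketch, not the repository's code:

    b_size = 4
    feats = [1.0, 2.0, 3.0, 4.0]    # stand-in per-sample features

    total = 0.0
    for f in feats:
        total += f
        # Using total / b_size here would act on a partial sum.
    middle_avg = total / b_size     # correct: computed after the loop ends
    print(middle_avg)               # 2.5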
