yumingj / talk-to-edit
Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.
Home Page: https://www.mmlab-ntu.com/project/talkedit/
Hi, I'd like to train on my own dataset (256x256 resolution), but the 128x128 images in your work don't seem to include the forehead. Did you run a face detector on a larger image and then resize the detected face to 128?
Could you provide a link to the pretrained arcface_resnet18_110.pth model? I couldn't find a trained model on the GitHub page for "ArcFace: Additive Angular Margin Loss for Deep Face Recognition."
Hi,
I'm trying to use your pre-trained classifier for the five CelebA attributes that you use (Bangs, Eyeglasses, No_Beard, Smiling, Young). I'm building the model that you provide (the modified ResNet) using attributes_5.json
and I load the weights given in eval_predictor.pth.tar.
As far as I can tell, you have a classification head for each of the above five attributes. For instance, classifier32Smiling has a linear layer with 6 outputs at its top. This is determined by the sub-dictionary
"32": {
"name": "Smiling",
"value":[0, 1, 2, 3, 4, 5],
"idx_scale": 1,
"idx_bias": 0
}
found in attributes_5.json. Similarly, you build the rest of the classifiers. My question is: why do you use these value lists (i.e., "value": [0, 1, 2, 3, 4, 5])? What do those classes represent?
I'd like to use this model to predict a score for each of the five attributes for a batch of images. Do you think this is possible?
As a side note, the function you use for post-processing the predictions, i.e., output_to_label, gives NaNs in many cases. This is because large prediction values (in my case) cause exp(·) to overflow to Inf, making the softmax NaN. This is just to say that you could shift the predictions so the maximum is zero before computing the softmax.
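On the NaN point above, the fix is the standard max-shift trick. The sketch below is plain NumPy, not the repo's code; it also shows one possible way to collapse the six-way head into a scalar score, assuming the "value" list [0, 1, 2, 3, 4, 5] encodes attribute degrees. `degree_score` is a hypothetical helper, not part of Talk-to-Edit.

```python
import numpy as np

def stable_softmax(logits):
    # Shift so the maximum logit is zero; exp() then never overflows to Inf.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def degree_score(logits, values=np.arange(6, dtype=float)):
    # Hypothetical helper: expected attribute degree, i.e. the
    # softmax-weighted average of the class values [0..5].
    return stable_softmax(logits) @ values

# Logits large enough to overflow a naive softmax (exp(1000) -> Inf):
logits = np.array([[100.0, 200.0, 300.0, 400.0, 500.0, 1000.0]])
probs = stable_softmax(logits)   # no NaNs, rows sum to 1
score = degree_score(logits)     # close to degree 5 for this input
```

A naive `np.exp(logits) / np.exp(logits).sum()` on the same input returns NaN, which matches the behavior described above.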
Thank you!
In https://github.com/yumingj/Talk-to-Edit/issues/5 I saw a question about custom datasets, but it doesn't mention the structure of the input data. I'd like to train the attribute predictor on a custom dataset; what exactly does the training data look like, and is there a sample I could refer to? Many thanks.
Thank you for the great work. I'm currently trying to use editing_wo_dialog on real images, but the algorithm doesn't produce any results. I use exactly the same settings and models you provided and only change the input image. I get the following message at the end:
2021-10-25 12:42:50,212.212 - INFO: Sample 000 is already at the target class, skip.
2021-10-25 12:42:50,212.212 - INFO: This attribute is already at the degree that you want. Let's try a different attribute degree or another attribute.
Sometimes it also gives something like this:
total: 0.2573; perceptual: 0.2314; mse: 0.0259; lr: 0.0000: 100%|█| 600/600 [00:59<00:00, 10
2021-10-25 12:40:12,199.199 - INFO: Sorry, we are unable to edit this attribute. Perhaps we can try something else.
Here I upload the input and output for your reference.
How can I run the code properly on real images?
I revised the code, and I am wondering whether to encode the text-guided feature into the semantic field, so that it can be modified into a cross-modal semantic field conditioned on the text.
In the training code you provide, the loaded data includes latent_codes, labels, and scores. The labels seem to represent the degree of each attribute in the face image (is that right?), and the scores don't appear to be used in any computation; what are they for? Finally, could you provide your trained attribute predictor model (the one that takes an image as input and outputs labels)?
Thanks for your excellent work!
I have 2 questions, could you tell me:
A question: when I edit a 1024 image (almost entirely a face, with no other background), the Smiling attribute can be edited, but editing any other attribute raises the error "Sorry, we are unable to edit this attribute. Perhaps we can try something else". Is this a problem with the detector or with some other component? Also, if I resize this image to 128, almost every attribute can be edited.
error msg: upfirdn2d(): incompatible function arguments. The following argument types are supported:
1. (arg0: at::Tensor, arg1: at::Tensor, arg2: int, arg3: int, arg4: int, arg5: int, arg6: int, arg7: int, arg8: int, arg9: int) -> at::Tensor
Hello, and thank you for your excellent work. If I want to train a model from scratch on my own dataset, what do I need to do? 1. Train a StyleGAN2 model; 2. Train a predictor; 3. Train Talk-to-Edit. Is that the right order?
Hi, I am trying to set up this repo on my own local machine, but I am getting this error. I searched on the internet but couldn't find a solution. Any help will be appreciated. Thanks
ImportError: No module named 'fused'
Hi,
Thanks for your wonderful work. However, when I try to run the demo by using your pretrained models and default config parameters, that is:
python editing_wo_dialog.py \
--opt ./configs/editing/editing_wo_dialog.yml \
--attr 'Bangs' \
--target_val 5
I always get the following results:
This attribute is already at the degree that you want. Let's try a different attribute degree or another attribute.
or
Sorry, we are unable to edit this attribute. Perhaps we can try something else.
And I can only find the cropped face image and a simple start_image.png in my results folder.
I have also tried some other attr and target_val combinations and got the same output. I don't know what the problem is, and I am also not sure about the exact meaning of target_val.
BTW, in your README you mentioned we can use the Beard attribute, but I found only a No_Beard attribute in your config files:
attr_to_idx:
Bangs: 0
Eyeglasses: 1
No_Beard: 2
Smiling: 3
Young: 4
Hope you can offer some help; thanks in advance.
Hi,
The link provided in the GitHub README redirects to a webpage titled "CelebA-Dialog Dataset", but the link (zip icon) on that webpage leads back to this GitHub repo. Is the dataset link still in progress, or am I missing something? How can I download the dataset?
Thanks for the great work.