
hairclip's Introduction

HairCLIP: Design Your Hair by Text and Reference Image (CVPR2022)

This repository hosts the official PyTorch implementation of the paper: "HairCLIP: Design Your Hair by Text and Reference Image".

Our single framework supports hairstyle and hair color editing individually or jointly, and conditional inputs can come from either image or text domain.

Tianyi Wei1, Dongdong Chen2, Wenbo Zhou1, Jing Liao3, Zhentao Tan1, Lu Yuan2, Weiming Zhang1, Nenghai Yu1
1University of Science and Technology of China, 2Microsoft Cloud AI, 3City University of Hong Kong

News

2023.10.12: We propose the more performant HairCLIPv2, which supports various interaction modalities and has been accepted by ICCV 2023! 🎉
2022.03.19: Our testing code and pretrained model are released.
2022.03.09: Our training code is released.
2022.03.02: Our paper is accepted by CVPR2022 and the code will be released soon.

Web Demo

Upload your own image and try HairCLIP on Replicate.

Getting Started

Prerequisites

$ conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0
$ pip install ftfy regex tqdm
$ pip install git+https://github.com/openai/CLIP.git
$ pip install tensorflow-io
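
If you want a quick sanity check that the environment is usable before moving on, the short Python snippet below (an illustrative check, not part of the repo) confirms that PyTorch sees the GPU and that the CLIP package installed from GitHub imports and loads a model:

import torch
import clip

# Illustrative environment check: HairCLIP's own code loads the CLIP model it
# needs; ViT-B/32 is used here only to confirm the installation works.
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
print("CLIP loaded on", device)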

Pretrained Model

Please download the pre-trained model from the following link. The HairCLIP model contains the entire architecture, including the mapper and decoder weights.

  • HairCLIP: Our pre-trained HairCLIP model.

If you wish to use the pretrained model for training or inference, you may do so using the flag --checkpoint_path.
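
If you just want to peek inside the downloaded checkpoint before passing it to --checkpoint_path, here is a minimal sketch. It assumes the file is a standard torch.save dictionary (the convention in StyleCLIP-style mapper repositories, which is an assumption here); the path below matches the inference examples later in this README.

import torch

# Load on CPU purely for inspection (assumes a plain torch.save dictionary).
ckpt = torch.load("pretrained_models/hairclip.pt", map_location="cpu")
print(type(ckpt))
print(list(ckpt.keys()))  # e.g. 'state_dict' and 'opts' in StyleCLIP-style checkpoints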

Auxiliary Models and Latent Codes

In addition, we provide various auxiliary models and latent codes inverted by e4e needed for training your own HairCLIP model from scratch.

  • FFHQ StyleGAN: StyleGAN model pretrained on FFHQ, taken from rosinality, with 1024x1024 output resolution.
  • IR-SE50 Model: Pretrained IR-SE50 model taken from TreB1eN, used in our ID loss during HairCLIP training.
  • Train Set: CelebA-HQ train set latent codes inverted by e4e.
  • Test Set: CelebA-HQ test set latent codes inverted by e4e.

By default, we assume that all auxiliary models are downloaded and saved to the directory pretrained_models.
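
As a convenience, the snippet below checks that the auxiliary files are in place before you start training. The file names are illustrative only, not mandated by the repo; use whatever names your downloads have and point the corresponding training flags (e.g. --ir_se50_weights) at them.

import os

pretrained_dir = "pretrained_models"
expected = [
    "hairclip.pt",                 # pre-trained HairCLIP (optional, for inference/fine-tuning)
    "stylegan2-ffhq-config-f.pt",  # FFHQ StyleGAN in rosinality format (name is illustrative)
    "model_ir_se50.pth",           # IR-SE50 weights for the ID loss (name is illustrative)
]
for name in expected:
    path = os.path.join(pretrained_dir, name)
    print(f"{path}: {'found' if os.path.exists(path) else 'MISSING'}")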

Training

Training HairCLIP

The main training script can be found in scripts/train.py.
Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs.
Additionally, if you have tensorboard installed, you can visualize tensorboard logs in opts.exp_dir/logs.

Training the HairCLIP Mapper

cd mapper
python scripts/train.py \
--exp_dir=/path/to/experiment \
--hairstyle_description="hairstyle_list.txt" \
--color_description="purple, red, orange, yellow, green, blue, gray, brown, black, white, blond, pink" \
--latents_train_path=/path/to/train_faces.pt \
--latents_test_path=/path/to/test_faces.pt \
--hairstyle_ref_img_train_path=/path/to/celeba_hq_train \
--hairstyle_ref_img_test_path=/path/to/celeba_hq_val \
--color_ref_img_train_path=/path/to/celeba_hq_train \
--color_ref_img_test_path=/path/to/celeba_hq_val \
--color_ref_img_in_domain_path=/path/to/generated_hair_of_various_colors \
--hairstyle_manipulation_prob=0.5 \
--color_manipulation_prob=0.2 \
--both_manipulation_prob=0.27 \
--hairstyle_text_manipulation_prob=0.5 \
--color_text_manipulation_prob=0.5 \
--color_in_domain_ref_manipulation_prob=0.25 \

Additional Notes

  • This version only supports a batch size and test batch size of 1.
  • See options/train_options.py for all training-specific flags.
  • See options/test_options.py for all test-specific flags.
  • You can customize your own HairCLIP by adjusting the different category probabilities. For example, if you want to train a HairCLIP that only performs hair color editing with text as the interaction mode, you can adjust the different probabilities as follows.
    --hairstyle_manipulation_prob=0 \
    --color_manipulation_prob=1 \
    --both_manipulation_prob=0 \
    --color_text_manipulation_prob=1 \
    
  • --color_ref_img_in_domain_path points to a dataset of images with diverse hair colors, generated by a HairCLIP trained with the probability configuration above. It is optional and serves to enhance the diversity of the reference images used when editing hair color based on a reference image. If you choose to use this augmentation, you first need to pre-train a text-based hair-color HairCLIP with the probability configuration above.
  • The weights of the different losses used in training are set in options/train_options.py, and you can adjust them to balance your needs. Empirically, the larger a loss weight, the stronger its corresponding effect, but this weakens the effect of the other losses to some extent (an illustrative sketch follows this list).
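
As a purely illustrative sketch of that trade-off (the names below are hypothetical placeholders, not the repo's exact flags or loss terms), the weights act as coefficients in a single weighted sum, so increasing one coefficient strengthens that objective at the expense of the others:

def weighted_total_loss(manipulation_loss, id_loss, background_loss, norm_loss,
                        manipulation_lambda=1.0, id_lambda=0.1,
                        background_lambda=1.0, norm_lambda=0.8):
    # Each lambda stands in for a weight flag in options/train_options.py
    # (names here are illustrative). Raising id_lambda preserves identity more
    # strongly but can weaken how far the hairstyle/color edit is pushed.
    return (manipulation_lambda * manipulation_loss
            + id_lambda * id_loss
            + background_lambda * background_loss
            + norm_lambda * norm_loss)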

Testing

Inference

The main inference script can be found in scripts/inference.py. Inference results are saved to test_opts.exp_dir.

Example of Using Text to Edit Hairstyle

cd mapper
python scripts/inference.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=/path/to/test_faces.pt \
--editing_type=hairstyle \
--input_type=text \
--hairstyle_description="hairstyle_list.txt" \

Example of Using Text to Edit Hairstyle and Reference Image to Edit Hair Color

cd mapper
python scripts/inference.py \
--exp_dir=/path/to/experiment \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=/path/to/test_faces.pt \
--editing_type=both \
--input_type=text_image \
--hairstyle_description="hairstyle_list.txt" \
--color_ref_img_test_path=/path/to/celeba_hq_test \

Additional Notes

  • See options/test_options.py for all test-specific flags.
  • --editing_type should be hairstyle, color, or both, indicating whether to edit only the hairstyle, only the hair color, or both the hairstyle and hair color.
  • --input_type indicates the interaction mode: text for text and image for a reference image. When editing both hairstyle and hair color, the two modes are joined by an underscore (e.g. text_image).
  • --start_index and --end_index set the range of test latent codes to edit; --start_index needs to be greater than 0 and --end_index cannot exceed the size of the whole test latent-code dataset (see the sketch below).
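
A minimal sketch of what that range selects, assuming the latents file holds a single tensor of e4e W+ codes with shape [N, 18, 512] (the per-image shape [1, 18, 512] mentioned in the issues below suggests this, but treat it as an assumption):

import torch

# Load the inverted test latents on CPU (assumed shape: [N, 18, 512]).
latents = torch.load("/path/to/test_faces.pt", map_location="cpu")

start_index, end_index = 1, 9            # hypothetical range within the dataset
subset = latents[start_index:end_index]  # the latent codes that will be edited
print(latents.shape, subset.shape)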

Acknowledgements

This code is based on StyleCLIP.

Citation

If you find our work useful for your research, please consider citing the following papers :)

@article{wei2022hairclip,
  title={Hairclip: Design your hair by text and reference image},
  author={Wei, Tianyi and Chen, Dongdong and Zhou, Wenbo and Liao, Jing and Tan, Zhentao and Yuan, Lu and Zhang, Weiming and Yu, Nenghai},
  journal={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}
@article{wei2023hairclipv2,
  title={HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending},
  author={Wei, Tianyi and Chen, Dongdong and Zhou, Wenbo and Liao, Jing and Zhang, Weiming and Hua, Gang and Yu, Nenghai},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}

hairclip's People

Contributors

chenxwh, wty-ustc

hairclip's Issues

code

Can you provide a script file to input a single picture for final prediction?

How to preserve facial details better, like "Barbershop"?

Hi, thanks for your work.
I have found that the HairCLIP algorithm is not very good at preserving facial details, compared with work like "Barbershop: Hair Transfer with GAN-Based Image Compositing Using Segmentation Masks". I tried "HFGI: High-Fidelity GAN Inversion for Image Attribute Editing (CVPR 2022)" as a latent encoder, but the results were not very good.

Could you give some advice on how to better preserve facial details? Looking forward to your answer, thank you!

  • HairCLIP demo (left: input image, right: result)

  • HairCLIP demo using HFGI latents (left: input image, right: result)

  • Barbershop demo

what is ACD?

In your paper you mention using ACD as a measure of color differences. Where does this indicator come from? Is there any code we can use?

Hairstyles can only show so much

First of all, thanks for your excellent work!
There are many hairstyles in hairstyle.txt, but after trying all of them I found only a few distinct styles in the result images; they more or less repeat the following examples.

  • cornrows cut hairstyle

  • crew cut hairstyle (the dots on the left lens of the glasses in the right image are just the mouse cursor)

The following is my command:

python scripts/inference.py 
--exp_dir=../result/test_1/
--checkpoint_path=../pretrained_models/hairclip.pt
--latents_test_path=../inference_data/test_1/latent.pt
--editing_type=hairstyle
--input_type=text
--hairstyle_description="hairstyle_list.txt"

What's the problem? Should I train with my own dataset?

I list some hairstyles that produce the same effect:

    1. the same as cornrows: crown braid hairstyle, dreadlocks hairstyle, finger waves hairstyle, french braid hairstyle, and so on.
    2. the same as crew cut hairstyle: caesar cut hairstyle, dido flip hairstyle, extensions hairstyle, fade hairstyle, fauxhawk hairstyle, frosted tops hairstyle, full crown hairstyle, harvard clip hairstyle, high and tight hairstyle, hime cut hairstyle, hi-top fade hairstyle, and so on.

Ask about train hairstyles

Hi,

When I train my own hairstyle model, do I need to convert the images under --hairstyle_ref_img_train_path=/path/to/celeba_hq_train into latents with the e4e algorithm, i.e. instead of passing --latents_train_path=/path/to/train_faces.pt?

Demo Play ?

Hi. 🤗
This is an awesome work. 👍
Thanks for all of you, the contributors. 🌹
I am wondering whether you have any plan to make a demo public on huggingface/spaces, etc. 🤔

add web demo/models to Huggingface

Hi, would you be interested in adding HairCLIP to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models, datasets, and Spaces (web demos) can be added to a user account or organization, similar to GitHub.

Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

and here are guides for adding spaces/models/datasets to your org

How to add a Space: https://huggingface.co/blog/gradio-spaces
how to add models: https://huggingface.co/docs/hub/adding-a-model
uploading a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.

About training details

Hi,
I am trying to re-implement your paper but cannot get good results on either the image or the text path.
So I would like to verify some implementation details:

  1. Below is my implementation of the Modulation Module inside the Mapper (in PyTorch):
import torch
import torch.nn as nn
import torch.nn.functional as F
from models.stylegan2.model import EqualLinear

class MapperBlock(nn.Module):
    def __init__(self, channels=512):
        super(MapperBlock, self).__init__()
        # Fully connected layer applied to the latent feature x.
        self.fc = EqualLinear(channels, channels)
        # Small networks predicting the modulation scale (gamma) and shift (beta)
        # from the CLIP condition embedding e.
        self.f_gamma = nn.Sequential(
            EqualLinear(channels, channels), nn.LayerNorm(channels), nn.LeakyReLU(0.2),
            EqualLinear(channels, channels)
        )
        self.f_beta = nn.Sequential(
            EqualLinear(channels, channels), nn.LayerNorm(channels), nn.LeakyReLU(0.2),
            EqualLinear(channels, channels)
        )
        self.act = nn.LeakyReLU(0.2)

    def modulation(self, x, e):
        gamma = self.f_gamma(e)
        beta = self.f_beta(e)

        # Normalize x over its last (channel) dimension.
        x = F.layer_norm(x, (x.shape[-1],))

        # Scale-and-shift modulation conditioned on e.
        return (1.0 + gamma) * x + beta

    def forward(self, x, e):
        x = self.fc(x)
        x = self.modulation(x, e)
        return self.act(x)

Is it correct?

  2. According to your paper, the reference condition is randomly set to an image or text. My understanding is that the image/text manipulation loss is only calculated when the image/text reference is used, but the total loss value range varies across conditions. Do the loss weights always stay the same in all conditions, or do they need to be adjusted per condition?

  3. In your paper: "we also generated several edited images using our text-guided hair editing method to augment the diversity of the reference image set." Could you elaborate on the details of this method? Or is there any other reference paper?

Thanks for your help.

About Video Hair Editing

Thank you for your great work! Do you think video hair editing based on HairCLIP is achievable? I have tried it a little, but the hairstyle region is still hard to control, and consistency of the hairstyle across frames is quite difficult to maintain. Can you give me some insights about video hairstyle editing?

F and C

Hello. I noticed that the neural network structure diagram in the paper may be drawn incorrectly: F should stand for fine, meaning high-level semantic information, and C should stand for coarse, meaning low-level semantic information.

about color_ref_img_in_domain_path

Hello, thanks for your excellent work. I have a question about color_ref_img_in_domain_path. I finished pre-training with --hairstyle_manipulation_prob=0 --color_manipulation_prob=1 --both_manipulation_prob=0 --hairstyle_text_manipulation_prob=0.5 --color_text_manipulation_prob=1. How should I set color_ref_img_in_domain_path? Should that path be logs/image_train? I got the error below and I don't know where to find these files. Looking forward to your reply.

The error is:
FileNotFoundError: [Errno 2] No such file or directory: '/home/code/HairCLIP/logs/images_train/red hair/02951.jpg'

Can I use my own image for testing?

Hello, can I use my own image for the test? I found that the input is a test_faces.pt file (the test dataset?), and I did not find where an input image is read in the code. The only thing that feels like an input image is w (w = torch.Size([1, 18, 512])), but that is not the size of a picture.

Getting an error at inference when transferring the reference image's hairstyle to the input image

I am getting an error when I try to run inference. I am using this command:
python scripts/inference.py
--exp_dir=/content/resultss
--editing_type=both
--input_type=image_image
--hairstyle_ref_img_test_path=/content/oriental1.png
--color_ref_img_test_path=/content/oriental1.png
--num_of_ref_img 1
--checkpoint_path=/content/drive/MyDrive/data/hairclip.pt
--latents_test_path=/content/drive/MyDrive/data/latents.pt
What I am trying to do is transfer the hairstyle of the reference image to the input image. I have converted the input image with e4e to get its latent code. Please let me know. Thanks.

Using images to edit hairstyle and color does not work

Using the pre-trained model you provided, I edited the hairstyle with text and the hair color with a reference image, but the hair color editing did not work. Do I have to retrain a new model myself? And how can I obtain the model specified by the test parameter --parsenet_weights?

Error while training the model on my dataset.


This is the command I am using.
%%shell
eval "$(conda shell.bash hook)"
conda activate myenviroment
python scripts/train.py
--exp_dir=/content/outss
--hairstyle_description="hairstyle_list.txt"
--color_description=black,brown,yellow
--checkpoint_path=/content/drive/MyDrive/data/hairclip.pt
--ir_se50_weights=/content/drive/MyDrive/data/model_ir_se50.pth
--latents_train_path=/content/drive/MyDrive/data/trainlatent/latents.pt
--latents_test_path=/content/drive/MyDrive/data/testlatent/latents1.pt
--hairstyle_ref_img_train_path=/content/inversions
--hairstyle_ref_img_test_path=/content/test/inversions
--color_ref_img_train_path=/content/inversions
--color_ref_img_test_path=/content/test/inversions
--color_ref_img_in_domain_path=/content/inversions
--hairstyle_manipulation_prob=0.5
--color_manipulation_prob=0.2
--both_manipulation_prob=0.27
--hairstyle_text_manipulation_prob=0.5
--color_text_manipulation_prob=0
--color_in_domain_ref_manipulation_prob=0.25 \

The generated image is quite different from the reference image

I ran a test and found that the hairstyle of the generated image is quite different from that of the reference image. Here is my test script; the reference image is selected from the CelebAMask-HQ dataset. Is there a problem in my test process?

python scripts/inference.py \
--exp_dir=../outputs/0321/ \
--checkpoint_path=../pretrained_models/hairclip.pt \
--latents_test_path=../pretrained_models/test_faces.pt \
--editing_type=both \
--input_type=image_image \
--color_ref_img_test_path=../input/16 \
--hairstyle_ref_img_test_path=../input/16 \
--num_of_ref_img 1

Error when testing with two images

The command I ran:

E:\Linux\XSpace\papers\HairCLIP\mapper>python scripts/inference.py --exp_dir=E:\Linux\XSpace\papers\HairCLIP\data\exp --checkpoint_path=F:\Dataset\CelebA\Data\hairclip.pt --latents_test_path=F:\Dataset\CelebA\Data\test_faces.pt --editing_type=color --input_type=image --hairstyle_description="hairstyle_list.txt" --color_ref_img_test_path=E:\Linux\XSpace\papers\HairCLIP\data\ref

The error is raised at x = clip_model.encode_image(masked_generated_renormed) in latent_mappers.py. The error message is as follows:

*** RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9591.py", line 19, in encode_image
_0 = self.visual
input = torch.to(image, torch.device("cuda:0"), 5, False, False, None)
return (_0).forward(input, )
~~~~~~~~~~~ <--- HERE
def encode_text(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9591.Multimodal,
input: Tensor) -> Tensor:
File "code/torch/multimodal/model/multimodal_transformer.py", line 34, in forward
x2 = torch.add(x1, torch.to(_4, 5, False, False, None), alpha=1)
x3 = torch.permute((_3).forward(x2, ), [1, 0, 2])
x4 = torch.permute((_2).forward(x3, ), [1, 0, 2])
~~~~~~~~~~~ <--- HERE
_15 = torch.slice(x4, 0, 0, 9223372036854775807, 1)
x5 = torch.slice(torch.select(_15, 1, 0), 1, 0, 9223372036854775807, 1)
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9477.py", line 8, in forward
def forward(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9477.Transformer,
x: Tensor) -> Tensor:
return (self.resblocks).forward(x, )
~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
def forward1(self: torch.multimodal.model.multimodal_transformer.___torch_mangle_9477.Transformer,
x: Tensor) -> Tensor:
File "code/torch/torch/nn/modules/container/___torch_mangle_9476.py", line 29, in forward
_8 = getattr(self, "3")
_9 = getattr(self, "2")
_10 = (getattr(self, "1")).forward((getattr(self, "0")).forward(x, ), )
~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
_11 = (_7).forward((_8).forward((_9).forward(_10, ), ), )
_12 = (_4).forward((_5).forward((_6).forward(_11, ), ), )
File "code/torch/multimodal/model/multimodal_transformer/___torch_mangle_9376.py", line 13, in forward
_0 = self.mlp
_1 = self.ln_2
_2 = (self.attn).forward((self.ln_1).forward(x, ), )
~~~~~~~~~~~~~~~~~~ <--- HERE
x0 = torch.add(x, _2, alpha=1)
x1 = torch.add(x0, (_0).forward((_1).forward(x0, ), ), alpha=1)
File "code/torch/torch/nn/modules/activation/___torch_mangle_9369.py", line 38, in forward
_16 = [-1, int(torch.mul(bsz, CONSTANTS.c0)), _8]
v0 = torch.transpose(torch.view(_15, _16), 0, 1)
attn_output_weights = torch.bmm(q2, torch.transpose(k0, 1, 2))
~~~~~~~~~ <--- HERE
input = torch.softmax(attn_output_weights, -1, None)
attn_output_weights0 = torch.dropout(input, 0., True)

Traceback of TorchScript, original code (most recent call last):
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py(4294): multi_head_attention_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/activation.py(985): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(45): attention
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(48): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py(117): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(63): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(93): forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(709): _slow_forward
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py(725): _call_impl
/root/workspace/multimodal-pytorch/multimodal/model/multimodal_transformer.py(221): visual_forward
/opt/conda/lib/python3.7/site-packages/torch/jit/_trace.py(940): trace_module
(36): export_torchscript_models
(3):
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3418): run_code
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3338): run_ast_nodes
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(3147): run_cell_async
/opt/conda/lib/python3.7/site-packages/IPython/core/async_helpers.py(68): _pseudo_sync_runner
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2923): _run_cell
/opt/conda/lib/python3.7/site-packages/IPython/core/interactiveshell.py(2878): run_cell
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(555): interact
/opt/conda/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py(564): mainloop
/opt/conda/lib/python3.7/site-packages/IPython/terminal/ipapp.py(356): start
/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py(845): launch_instance
/opt/conda/lib/python3.7/site-packages/IPython/init.py(126): start_ipython
/opt/conda/bin/ipython(8):
RuntimeError: cublas runtime error : unknown error at C:/cb/pytorch_1000000000000/work/aten/src/THC/THCBlas.cu:225
(Pdb) img_tensor.shape
torch.Size([1, 3, 1024, 1024])

Is the input tensor size incorrect?

Is this normal speed?

Hello, I want to ask whether the speed of running inference.py for testing is normal. This is my command:

cd mapper
python scripts/inference.py
--exp_dir=/home/ps/HairCLIP/mapper/path/to/experiment
--checkpoint_path=/home/ps/HairCLIP/pretrained_models/hairclip.pt
--latents_test_path=/home/ps/HairCLIP/mapper/path/to/test_faces.pt
--editing_type=hairstyle
--input_type=text
--hairstyle_description="/home/ps/HairCLIP/mapper/hairstyle_list.txt" \

Will stylegan inversion encoder be trained?

Will the StyleGAN inversion encoder be trained? I found that the CLIP image encoder and CLIP text encoder use detach() so that they are not trained. I look forward to your answer. Thank you!

About the training details.

Thank you for your great project!

In the paper, you said: "We train and evaluate our hair mapper on the CelebA-HQ dataset. Since we use e4e [43] as our inversion encoder, we follow its division of the training set and test set." However, I found that e4e used the FFHQ dataset for training and the CelebA-HQ test dataset for evaluation, which confuses me.
My question is: how are the training and test sets split on the CelebA-HQ dataset?

local variable 'shape' referenced before assignment

I tested the feature on Replicate and noticed that some photos result in "local variable 'shape' referenced before assignment". Is there any way to fix this?

File "predict.py", line 168, in run_alignment
aligned_image = align_face(filepath=image_path, predictor=predictor)
File "/src/encoder4editing/utils/alignment.py", line 35, in align_face
lm = get_landmark(filepath, predictor)
File "/src/encoder4editing/utils/alignment.py", line 21, in get_landmark
t = list(shape.parts())
UnboundLocalError: local variable 'shape' referenced before assignment

About pretrained UNet inference

mask_512 = (torch.unsqueeze(torch.max(labels_predict, 1)[1], 1)==13).float()
1. Why does hair equal class 13 while the background does not? (See the sketch below for what this line computes.)
2. The UNet inference results have 19 channels; what do they mean?
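
For reference, a small sketch of what the quoted line computes (the shapes are illustrative; whether index 13 is the hair class in this repo's label ordering is exactly what the question asks and is not asserted here):

import torch

# Dummy parsing output: [batch, 19 classes, H, W] logits (shapes illustrative).
labels_predict = torch.randn(1, 19, 512, 512)

# torch.max(labels_predict, 1)[1] is the per-pixel argmax over the 19 channels,
# i.e. a class-index map of shape [B, H, W]; unsqueeze restores the channel dim.
class_map = torch.unsqueeze(torch.max(labels_predict, 1)[1], 1)  # [B, 1, H, W]

# The comparison keeps only pixels whose predicted class index is 13, giving a
# binary float mask: 1.0 where the predicted class is 13, 0.0 elsewhere.
mask_512 = (class_map == 13).float()
print(mask_512.shape)  # torch.Size([1, 1, 512, 512])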

Question about the dataset split (train.pt and test.pt)

@wty-ustc Thank you for the amazing work!
I tried to split CelebA-HQ using the official list_eval_partition.txt and ended up with 24183/2993/2824 images for the training/validation/test splits, but I found that the length of train.pt is 24176, so I am confused about which data you used.

About modulation module

Hi,
Great work!
But I have a question about the modulation module of the mapper network.
I assume the dimensions of x and e should be 1x1xC.
If so, what are the mean and std of x: a channel-wise average?
And what are the output dimensions of fr(e) and fb(e)?

Thanks.

Expected hair color change only, but the hairstyle of some results changed

I want to change only the hair color on FFHQ data, but the hairstyle of some results also changed.
Did I do something wrong?
The following is my command:

python scripts/inference.py
--exp_dir=./experiment
--checkpoint_path=../pretrained_models/hairclip.pt
--latents_test_path=./latents.pt
--editing_type=color
--input_type=text
--color_description=red


Hosting HairCLIP model

Hi!

First off, thank you for your work!

I'm trying to create a Colab notebook to play with your model, but since the weights are hosted on Google Drive, the download limits seem to prevent me from simply downloading them with gdown or wget.

Could I download the weights and mirror them on another hosting service (e.g. archive.org) to avoid this issue? Of course, I would add references to all the authors and parties involved.

Again, thanks for your work!

How to run predict.py

Hi, thank you for your work.

What does line 11, from cog import BasePredictor, Path, Input, mean? My editor shows a red wavy line under it.
