mchong6 / jojogan
Official PyTorch repo for JoJoGAN: One Shot Face Stylization
License: MIT License
I'm so high right now!
The source code uses the following to preserve color:
if preserve_color:
    id_swap = [9,11,15,16,17]
else:
    id_swap = list(range(7, generator.n_latent))
How were the exact IDs of these layers determined?
Is there a paper that describes this, or were the layers that control color found through trial and error?
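To make the question concrete: with n_latent = 18, the latent code holds 18 per-layer style vectors, and id_swap selects which of them are replaced. The sketch below uses plain Python lists in place of tensors; mix_styles and the toy values are hypothetical and are not the repo's code. It only shows what swapping by index does, not why those particular indices preserve color.

```python
def mix_styles(content, style, id_swap):
    """Return a copy of `content` whose layers listed in `id_swap` come from `style`."""
    if len(content) != len(style):
        raise ValueError("both codes need the same number of layers")
    mixed = list(content)
    for i in id_swap:
        mixed[i] = style[i]
    return mixed


n_latent = 18  # assumed here; matches the value mentioned elsewhere in these issues
content = ["c%d" % i for i in range(n_latent)]
style = ["s%d" % i for i in range(n_latent)]

# preserve_color=True: only a handful of layers are taken from the style code.
color_preserving = mix_styles(content, style, [9, 11, 15, 16, 17])

# preserve_color=False: every layer from index 7 upward is taken from the style code.
full_swap = mix_styles(content, style, list(range(7, n_latent)))
```

In the color-preserving case, layers 10, 12, 13 and 14 keep the content code, which is the only difference from the full swap above index 8.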
I'm interested in this great project, and I've tried training my own model on Colab.
Please let me know how to resume training.
I'm hitting Drive limits when downloading the models. Another solution besides PyDrive: the weights could be attached to a project release.
see torch hub
https://pytorch.org/docs/stable/hub.html
"Pretrained weights can either be stored locally in the github repo, or loadable by torch.hub.load_state_dict_from_url(). If less than 2GB, it’s recommended to attach it to a project release and use the url from the release. In the example above torchvision.models.resnet.resnet18 handles pretrained, alternatively you can put the following logic in the entrypoint definition."
and a similar example from animegan hubconf, although the weights are much smaller in size here
https://github.com/bryandlee/animegan2-pytorch/blob/main/hubconf.py
or the models can be hosted on huggingface, see
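For the torch.hub route quoted above, a minimal sketch of what loading from a release could look like. The release tag and file name below are hypothetical (no such release is confirmed to exist), and release_asset_url / load_release_weights are illustrative helpers; only torch.hub.load_state_dict_from_url is the real PyTorch call.

```python
def release_asset_url(owner, repo, tag, filename):
    """Build the download URL GitHub uses for a file attached to a release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def load_release_weights(url, device="cpu"):
    """Download (and cache) a checkpoint, then load it onto `device`."""
    # Imported lazily so the URL helper works without PyTorch installed.
    import torch

    return torch.hub.load_state_dict_from_url(url, map_location=device, progress=True)


# Hypothetical tag and file name, for illustration only.
url = release_asset_url("mchong6", "JoJoGAN", "v0.1", "jojo.pt")
```

Because torch.hub caches the file locally after the first download, repeated Colab cells would not hit the host again within the same session.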
I took a picture with a face in it and generated an image using the align_face function. Then, using the same picture, I generated another image by manual cropping as mentioned in #18 (comment). When I pass both of these through the e4e_projection function and view the final image, the results are very different although the images which were passed are very similar. Do you have any idea of why this might be?
In the notebook, you choose [9,11,15,16,17] as the layers to swap. What was the reasoning behind this choice?
Thanks in advance.
Hello. Thank you for providing great code.
I have an issue running prediction.
If I predict on cuda:1, inference does not work.
I traced the code step by step and found that "op/fused_act.py" loads fused_bias_act.cpp and fused_bias_act_kernel.cu.
That C++ code does not seem to work on another GPU.
How can I predict on a different GPU?
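One common workaround, not confirmed by the maintainers here, is to hide the other GPUs from the process so the wanted card shows up as cuda:0. The variable has to be set before CUDA is initialized, which in practice means before importing torch. pin_gpu is a hypothetical helper.

```python
import os


def pin_gpu(physical_index):
    """Expose only one physical GPU; inside the process it is then device 0."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return "cuda:0"


device = pin_gpu(1)  # call this before `import torch`
```

The same effect can be had from the shell by prefixing the command with CUDA_VISIBLE_DEVICES=1, which avoids touching the script at all.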
Hi! Very interesting work! Thanks for providing the resources to play with. By the way, could you provide more details on training the jinx style? There is a small difference between the model trained on Colab (based on jinx only) and the pretrained model, as shown below:
My settings:
num_iter=2000
preserve_color: false
Thanks for your help.
Hi!
Thanks for sharing this awesome work :-)
I'm wondering about the difference in perceptual image quality when using the LPIPS model (as stated in the paper) vs. the StyleGAN discriminator (as used in the updated Colab notebook) for the perceptual loss.
In your experience, what kind of difference does using the StyleGAN discriminator have on the image quality, when compared to using LPIPS?
I get the error "AttributeError: module 'IPython.utils.traitlets' has no attribute 'Unicode'" when running the stylize.py script. The error is raised on this line of code: "from google.colab import files".
I have tried several fixes, but the error persists.
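Since the failing line only matters inside Colab, one possible way around it when running locally is to make the import optional. This is a sketch under that assumption, not a fix from the maintainers; download_result is a hypothetical helper, and AttributeError is caught because that is the exception reported above.

```python
try:
    from google.colab import files  # only usable inside a Colab runtime
    IN_COLAB = True
except (ImportError, AttributeError):
    # Either the package is missing or it is broken outside Colab.
    files = None
    IN_COLAB = False


def download_result(path):
    """Offer a browser download in Colab; elsewhere just hand back the path."""
    if IN_COLAB:
        files.download(path)
        return None
    return path
```

Outside Colab the helper simply returns the path, so the rest of a script can keep calling it unconditionally.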
Hello, I would like to ask: the paper says you use ReStyle for GAN inversion, but the encoder used in the code appears to be e4e, and the Colab has not been updated. I want to reproduce the results from the paper (using ReStyle) in the Colab. What should I do? Thanks.
During finetuning, is StyleGAN's style mapping network (the FC layers) trained?
For the finetuning stage in Colab, wandb is useful for tracking metrics; see https://github.com/danielroich/PTI#weights-and-biases for an example. This would be helpful when running multiple experiments in Colab. If you have time, great; otherwise I can look into this as a PR. Thanks
When the device is set to cpu in Colab, e4e_projection gives the error "input must be a CUDA tensor".
It would be interesting to see how it performs on video input.
Hello, I tried to run "Train with your own style images" but got "CUDA out of memory" (2080Ti, 12G GPU memory). Can you tell me how much GPU memory the training process needs?
Hello and thank you so much for the amazing project.
My problem is that the setup process described in the repo seems to not work for Ampere GPUs (in my case RTX 3080 Ti).
First I use the e4e/environment/e4e_env.yaml to create the Conda env. Then I follow the commands in the first cell of stylize.ipynb. However, I get ValueError: Unknown CUDA arch (8.6) or GPU not supported.
I think this may be because the default CUDA installed is 10.x, but my attempts to fix this by setting up the env differently have so far been unsuccessful. Would it be possible to add a fix for Ampere GPUs? Thanks in advance!
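A quick way to check the suspicion about the CUDA version: Ampere cards report compute capability 8.x, which needs CUDA 11.0 or newer (8.6 needs 11.1 or newer). The sketch below encodes just that rule. MIN_CUDA and toolkit_is_new_enough are hypothetical names, and the table deliberately covers only the two Ampere capabilities relevant here.

```python
# Minimum CUDA toolkit (major, minor) per compute capability; Ampere only.
MIN_CUDA = {
    (8, 0): (11, 0),
    (8, 6): (11, 1),
}


def toolkit_is_new_enough(cuda_version, capability):
    """True/False for capabilities listed in MIN_CUDA, None when not covered."""
    needed = MIN_CUDA.get(capability)
    if needed is None:
        return None
    return cuda_version >= needed


# With PyTorch installed, the real inputs could be read like this:
#   import torch
#   capability = torch.cuda.get_device_capability(0)   # e.g. (8, 6)
#   cuda_version = tuple(int(p) for p in torch.version.cuda.split("."))
```

If the check comes back False for the environment created from e4e_env.yaml, that would be consistent with the "Unknown CUDA arch (8.6)" error and would point towards installing a PyTorch build made for CUDA 11.1 or later.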
Hi, thank you for sharing the source code.
I can't find how many paired images dataset C contains. In your experiments, what is the minimum number of pairs (wi, y) needed for a good result?
(jojo) (jojo) PS C:\Users\Admin\Documents\JoJoGAN> python train_custom_style.py --model_name sophie --alpha 0.0 --preserve_color False --num_iter 300 --device cuda
0%| | 0/300 [00:02<?, ?it/s]
Traceback (most recent call last):
File "train_custom_style.py", line 103, in <module>
fake_feat = discriminator(img)
File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 665, in forward
out = block(out)
File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 621, in forward
out = self.conv2(out)
File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
input = module(input)
File "C:\Users\Admin\MiniConda3\envs\jojo\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Admin\Documents\JoJoGAN\model.py", line 126, in forward
padding=self.padding,
File "C:\Users\Admin\Documents\JoJoGAN\op\conv2d_gradfix.py", line 32, in conv2d
).apply(input, weight, bias)
File "C:\Users\Admin\Documents\JoJoGAN\op\conv2d_gradfix.py", line 138, in forward
out = F.conv2d(input=input, weight=weight, bias=bias, **common_kwargs)
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 8.00 GiB total capacity; 4.95 GiB already allocated; 0 bytes free; 5.49 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Is there any way to lower the batch size, and if so, how? I don't know whether that would work, but could you please help fix this problem?
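The last line of the error message points at the allocator setting max_split_size_mb. A minimal sketch of setting it from Python follows. The value 128 is an arbitrary assumption and configure_allocator is a hypothetical helper; this only helps with fragmentation, not when the model and activations simply do not fit in 8 GiB.

```python
import os


def configure_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF; must run before the first CUDA allocation."""
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


configure_allocator()  # place this at the top of the script, before `import torch`
```

On Windows PowerShell the same variable could instead be set in the session before launching train_custom_style.py.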
dlib is very slow to build. Would it be possible to use an alternative such as https://github.com/happy-jihye/FFHQ-Alignment, for example https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/ffhq-align.py
More example usage here: https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/FFHQ-Alignment.ipynb
When the device is set to cpu in Colab and the hardware accelerator is None, the import fails:
IndexError Traceback (most recent call last)
in ()
24 from tqdm import tqdm
25 import lpips
---> 26 from model import *
27 from e4e_projection import projection as e4e_projection
28 from restyle_projection import projection as restyle_projection
7 frames
/usr/local/lib/python3.7/dist-packages/torch/utils/cpp_extension.py in _get_cuda_arch_flags(cflags)
1604 arch_list.append(arch)
1605 arch_list = sorted(arch_list)
-> 1606 arch_list[-1] += '+PTX'
1607 else:
1608 # Deal with lists that are ' ' separated (only deal with ';' after)
IndexError: list index out of range
In the Colab, the cartoon data is generated from pretrained models that are downloaded from Google Drive.
Could you share how you obtained the cartoon pretrained models?
I read your paper and tried to reproduce steps 1 and 2, but I could not.
Hi, I have some questions about the number of finetuning data pairs. In the "Finetune StyleGAN" part of stylize.ipynb, the variable "random_alpha" is not used. If I use only one reference style image, do I then have only one pair to finetune StyleGAN with? Could you please tell me what I am doing wrong? Thanks a lot.
Amazing work. Can you please guide me on how to convert your model to CoreML?
Hi,
I'm playing with your notebook (awesome work btw!) and I try to give it another pretrained GAN from Awesome Pretrained StyleGAN2.
I used the anime one (PyTorch implementation from here) but I get
TypeError Traceback (most recent call last)
[<ipython-input-10-2395bdddca96>](https://localhost:8080/#) in <module>()
22
23 #print(ckpt)
---> 24 generator.load_state_dict(ckpt["g"], strict=False)
25
26 #@title Generate results
TypeError: 'Generator' object is not subscriptable
Do I need further operation on the model to make it compatible?
Thank you
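The TypeError suggests that this checkpoint file holds a whole pickled Generator object rather than a dict with a "g" entry, so ckpt["g"] cannot be indexed. A sketch of handling both layouts follows. extract_state_dict and FakeGenerator are hypothetical, and this does not guarantee that parameter names from another StyleGAN2 implementation match this repo's Generator.

```python
def extract_state_dict(ckpt, key="g"):
    """Return a state dict from either a {"g": ...} checkpoint or a pickled module."""
    if isinstance(ckpt, dict):
        # Some checkpoints nest the generator weights under a key, others are flat.
        return ckpt.get(key, ckpt)
    if hasattr(ckpt, "state_dict"):
        # A whole model object was saved; ask it for its parameters.
        return ckpt.state_dict()
    raise TypeError(f"unsupported checkpoint type: {type(ckpt).__name__}")


class FakeGenerator:
    """Stand-in for a pickled model object, used only to exercise the helper."""

    def state_dict(self):
        return {"conv1.weight": 0}


nested = extract_state_dict({"g": {"conv1.weight": 1}})
from_object = extract_state_dict(FakeGenerator())
```

With strict=False, mismatched keys are skipped silently, so it is worth inspecting the missing_keys and unexpected_keys returned by load_state_dict to confirm that the weights were actually loaded.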
Could you please add a requirements.txt file?
I would like to run the code locally in a Docker container.
Currently, I keep getting a CUDA out of memory error if I use more than 4 reference images with a 16GB GPU. Is there a way to train with more images? I'd like the model to be more general.
Hey, great work!
I had a couple of queries:
n_latent = 18.
Are the finetuned models currently saved in the Colab?
Hi all, did anyone try to apply JoJoGAN on car images (or other kinds of images)? I tried to replace both the e4e pretrained weight file and StyleGAN2 pretrained weight file with the car-specific one and then finetuned the StyleGAN generator. But the result was not good. It seems that the generator was not finetuned at all...
Hello, thank you for your awesome work!
Instead of using a pretrained model, I tried to convert an image by uploading a style image through the Replicate API.
But an error occurred.
Traceback (most recent call last):
File "C:\\Users\\SSAFY\\PycharmProjects\\JoJoGAN\\jojogan_api.py", line 45, in <module>
output = version.predict(**inputs)
File "C:\\Users\\SSAFY\\anaconda3\\envs\\reboot_JoJoGAN\\lib\\site-packages\\replicate\\version.py", line 31, in predict
raise ModelError(prediction.error)
replicate.exceptions.ModelError: name 'latent_dim' is not defined
The same error occurs on the demo page.
How can I run on something other than faces? When using another image it tells me face is not detected.
Tried using 3-4 different images, but getting the same error in all cases.
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-16-59b843d3fc1f> in <module>()
13 try:
---> 14 aligned_face = align_face(filepath)
15 except:
2 frames
/content/JoJoGAN/util.py in align_face(filepath, output_size, transform_size, enable_padding)
131 predictor = dlib.shape_predictor("models/dlibshape_predictor_68_face_landmarks.dat")
--> 132 lm = get_landmark(filepath, predictor)
133
/content/JoJoGAN/util.py in get_landmark(filepath, predictor)
109
--> 110 img = dlib.load_rgb_image(filepath)
111 dets = detector(img, 1)
RuntimeError: Unable to open file: test_input/content/JoJoGAN/pic.jpeg
During handling of the above exception, another exception occurred:
Exception Traceback (most recent call last)
<ipython-input-16-59b843d3fc1f> in <module>()
14 aligned_face = align_face(filepath)
15 except:
---> 16 raise Exception('Face not detected. Try a different image.')
17
18 # my_w = restyle_projection(aligned_face, name, device, n_iters=1).unsqueeze(0)
Exception: Face not detected. Try a different image.
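The first traceback shows that the real failure is "Unable to open file: test_input/content/JoJoGAN/pic.jpeg", a path problem, which the bare except then reports as a missing face. A sketch of separating the two cases follows. check_input is a hypothetical helper, and test_input is taken from the path in the error message.

```python
import os


def check_input(filename, input_dir="test_input"):
    """Return the path to an input image, failing loudly if the file is missing."""
    # Use only the file name so a pasted full path is not nested under input_dir.
    path = os.path.join(input_dir, os.path.basename(filename))
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no such input image: {path}")
    return path
```

Calling this before align_face would surface a missing file as FileNotFoundError, leaving "Face not detected" for images that load but contain no detectable face.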
I have tried two different images, one with a white background and one with a brown background, but both give the same error below. Are there any recommendations for the input photo?
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-14-81c009706d54> in <module>()
11
12 # aligns and crops face
---> 13 aligned_face = align_face(filepath)
14
15 # my_w = restyle_projection(aligned_face, name, device, n_iters=1).unsqueeze(0)
1 frames
/content/JoJoGAN/util.py in get_landmark(filepath, predictor)
110 img = dlib.load_rgb_image(filepath)
111 dets = detector(img, 1)
--> 112 assert len(dets) > 0, "Face not detected, try another face image"
113
114 for k, d in enumerate(dets):
AssertionError: Face not detected, try another face image
Context: running setup on Colab
Error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
imbalanced-learn 0.8.1 requires scikit-learn>=0.24, but you have scikit-learn 0.22 which is incompatible.
Thanks for the amazing work.
I am reading the paper and saw that 26 style modulation layers are used to map the features into style space.
Does the number of style modulation layers refer to the number of MLP layers in mapping_network?
Also, if I reduce this number from 26 to 8, how much will the quality drop?
I appreciate your reply in advance.
replicate.exceptions.ModelError: stack expects a non-empty TensorList
I am confused by this error. I have uploaded my own style images and used the supplied iu.jpeg file to transform. I get this error in the last cell. I have verified that the style images are 3 channel images.
How many style images are needed? Does the format of the images matter?
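"stack expects a non-empty TensorList" is the message torch.stack raises when it is given an empty list, so one possible cause is that no style image was actually collected before stacking. A sketch of a guard follows. collect_style_images, the style_images directory name and the extension list are assumptions, not the notebook's code.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def collect_style_images(names, style_dir="style_images"):
    """Keep only names that look like images and fail early if none remain."""
    paths = [
        os.path.join(style_dir, name)
        for name in names
        if name.lower().endswith(IMAGE_EXTENSIONS)
    ]
    if not paths:
        raise ValueError("no style images found; check the names and extensions")
    return paths
```

Failing at this point gives a clearer message than the later stack call, and it makes it obvious when a name or extension does not match the uploaded files.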