systemerrorwang / white-box-cartoonization
Official TensorFlow implementation for the CVPR 2020 paper "Learning to Cartoonize Using White-box Cartoon Representations"
Never mind, resolved.
Hi @SystemErrorWang.
Thanks for this work, truly amazing!
@margaretmz and I have been working on an end-to-end tutorial covering the following:
We just wanted to give you a heads-up, and also to let you know that the TFLite models we converted are available on TensorFlow Hub: https://tfhub.dev/sayakpaul/lite-model/cartoongan/dr/1.
Here's the GitHub repository that will accompany the tutorial: https://github.com/margaretmz/CartoonGAN-e2e-tflite-tutorial. We (obviously) cite your work in all three places:
Hi,
An interesting topic and a great model! Thanks for sharing.
Here is a question about the surface representation after reading the paper: how do you define the surface representation F_dgf? I couldn't find the definition in Section 3.1 or afterwards.
I'm not an expert with TF v1, but if my understanding is correct, L_total in Section 3.4 of the paper corresponds to "g_loss_total = 1e4*tv_loss + 1e-1*g_loss_blur + g_loss_gray + 2e2*recon_loss" (line 77, train.py), right? How do the four terms map to Section 3.4?
One more general question: I think this model's contribution is a combination of loss functions. Why did you design these five terms, or three parts (structure, texture, and surface)? Is there any supporting reference? I am always confused by these terminologies.
Thanks.
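For reference, the train.py line quoted above can be sketched as a weighted sum. The term-to-paper mapping in the comments is my own guess, not confirmed by the authors, and the loss values are placeholders just to make the sum runnable:

```python
# Placeholder loss values purely for illustration; the mapping comments
# are assumptions, not the authors' statement.
tv_loss, g_loss_blur, g_loss_gray, recon_loss = 1e-4, 0.5, 0.5, 0.01

g_loss_total = (1e4 * tv_loss         # total-variation smoothness term
                + 1e-1 * g_loss_blur  # surface loss (guided-filtered GAN term)
                + g_loss_gray         # texture loss (grayscale GAN term)
                + 2e2 * recon_loss)   # structure/content reconstruction term
print(round(g_loss_total, 2))  # 3.55
```

Note how the weights differ by five orders of magnitude, which is why the four raw losses cannot be compared directly on TensorBoard.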
Traceback (most recent call last):
File "train.py", line 11, in
import utils
File "E:\White-box-Cartoonization\train_code\utils.py", line 11, in
from selective_search.util import switch_color_space
File "E:\White-box-Cartoonization\train_code\selective_search\__init__.py", line 1, in
from .core import selective_search, box_filter
File "E:\White-box-Cartoonization\train_code\selective_search\core.py", line 3, in
from util import oversegmentation, switch_color_space, load_strategy
ModuleNotFoundError: No module named 'util'
Windows10 2004,i7-9750H+1660TI
python3.6.8
Both TensorFlow 1.12.0 and 1.13.0 report this error.
However, inference with the pretrained model works fine.
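The "No module named 'util'" error above is the classic absolute-vs-relative import problem: core.py does "from util import ...", which only works when the selective_search directory itself is on sys.path. A minimal, self-contained reproduction of the likely fix (switching to "from .util import ..."); the file contents here are hypothetical stand-ins, not the repo's actual code:

```python
import os
import sys
import tempfile

# Build a throwaway package that mirrors the repo layout, with core.py
# using the relative import "from .util import ..." (the likely fix).
pkg_root = tempfile.mkdtemp()
pkg = os.path.join(pkg_root, "selective_search")
os.makedirs(pkg)

with open(os.path.join(pkg, "util.py"), "w") as f:
    f.write("VALUE = 42\n")
with open(os.path.join(pkg, "core.py"), "w") as f:
    # "from util import VALUE" would raise ModuleNotFoundError here;
    # the relative form resolves inside the package.
    f.write("from .util import VALUE\nanswer = VALUE\n")
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from .core import answer\n")

sys.path.insert(0, pkg_root)
import selective_search
print(selective_search.answer)  # 42
```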
Since I don't have an NVIDIA GPU, I tried running on CPU on a Mac. Is a GPU required?
Environment:
python27 64bit
tensorflow-cpu==1.15.0
Is there a way to emphasize the subject contours? When comparing some inference results on my own dataset, the overall style was successfully extracted, but the contours were a little off. I've tried increasing the weight on the texture loss, but the results weren't pleasing. An example follows.
Hi.
Thank you for this super impressive work.
I was wondering if you have the separate pre-trained weights for the UNET-based generator network. If so, could you please help me find it?
Nice job! I trained the model with my own dataset, but I find that the results of the pretrained model have a lot of noise, and the same happens in my final results. I can't find the reason.
Hi, thanks for your great work!
I wanted to use the pretrained model to run the demo, but found that the URL (https://drive.google.com/open?id=1JfJzJbNjAWBIHGm9mc_R9dXv7DAw3tZc) is no longer available. Could you release a new URL for the pretrained model?
Hoping for your reply, thank you very much!
Hi, I am wondering how many iterations it takes to achieve the shown results? I am using exactly the same training settings along with the dataset provided by the author. Thanks!
A pilgrimage website that maps a small number of real-life anime locations
A Bilibili video made specifically after the layout (basic semantic layout) stage
CutMix Lucky Star pilgrimage
These pilgrimage photos correspond to anime scenes, but the semantics don't line up
There is also a Semantic Correspondence task that might be applicable
Photo compositing / image blending: merging anime and reality (Image Harmonization)
@msinsanity link
@bilibili-go海盗
It also provides white-background segmentation assets (post id: 22632857, see the comments)
anime related style / content parody collection
For example, the special tags [parody, meme, alternate] used on Danbooru
Recommended memes:
Sailor Moon Redraw | #sailormoonredraw
#STAYHOMEwithTRIGGER
Tokyo snowfall: Japanese TV interviewed a couple | #Special Feeling/Special Mood 特別な気分
shiba inu&fox | #Right-Hook Dog 右フック犬
drink resting on breasts | #tapioca Challenge #タピオカチャレンジ
Future Diary (未来日記) Yuno Yandere Pose | Yuno Face #Yandere Trance
Shinbo 45° Shaft angle (シャフト角度) | #Shaft Head Tilt
Variants, setting backgrounds, and so on
Might as well use them (though there could be serious copyright issues)
In Bilibili's demo video, it doesn't seem to perform well at night. Is that because the training data rarely involves night scenes?
I see that the comparison results given in the paper are relatively bright pictures with a certain composition, and none involve night scenes.
Hi, I've been following this project for a few days; it's great. I've done some tests, focusing on cartoonizing people, and I found that after cartoonization, faces tend to show many wrinkle-like lines and look older. If I retrain to improve faces, what approaches should I consider for data preparation? Thanks.
The python cartoonize.py command is where I hit a wall getting this environment to work. It produces this:
(FO5) C:\Users\janne\wbc\White-box-Cartoonization\test_code>python cartoonize.py
Traceback (most recent call last):
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\Users\janne\anaconda3\envs\FO5\lib\imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "C:\Users\janne\anaconda3\envs\FO5\lib\imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: DLL load failed: Määritettyä osaa ei löydy. [translated: The specified module could not be found.]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "cartoonize.py", line 4, in <module>
import tensorflow as tf
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "C:\Users\janne\anaconda3\envs\FO5\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "C:\Users\janne\anaconda3\envs\FO5\lib\imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "C:\Users\janne\anaconda3\envs\FO5\lib\imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: DLL load failed: Määritettyä osaa ei löydy. [translated: The specified module could not be found.]
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
I've installed these dependencies:
pip install opencv-python
pip install tensorflow-gpu==1.12.0
pip install scikit-image==0.14.5
pip install ffmpeg
pip install tqdm
Thanks for any help beforehand :)
Hello, I am trying to convert the checkpoints into a frozen TF graph, but it's missing the .meta
file. Could someone tell me how to get it?
Hi,
Thank you for all your work, this is great! No bug here, but I was asking myself: why did you choose tf_slim instead of plain tf for the convolution layers?
Thanks
Hello, will a PyTorch version of the code be released?
Hi author, according to the structure loss and texture loss described in the paper, the generated image fed into them should not have been passed through the guided filter. But in train.py, the generator's output is replaced with the filtered image, so everywhere G(I_p) is used afterward, it actually becomes F_dgf(G(I_p)). Is this intended?
output = network.unet_generator(input_photo)
output = guided_filter(input_photo, output, r=1)
Hello, thanks for your fascinating work. I want to know whether there is any PyTorch implementation; I'd like to re-implement it in PyTorch if no PyTorch version exists.
One of the demos implements cartoonization of videos, but when I try to run the code on a video, it fails. Can you please advise how I can run cartoonization on videos?
train_code/utils.py:
def color_ss_map(image, seg_num=200, power=1, color_space='Lab', k=10, sim_strategy='CTSF'):
    ...
    return image
A Chinese online demo: Cartoonize
Where is the texture representation extracted?
Do you have plans to release the video version, or is it already out?
WARNING:tensorflow:From White-box-Cartoonization/test_code/cartoonize.py:38: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.
Traceback (most recent call last):
File "White-box-Cartoonization/test_code/cartoonize.py", line 65, in
cartoonize(load_folder, save_folder, model_path)
File "White-box-Cartoonization/test_code/cartoonize.py", line 39, in cartoonize
saver.restore(sess, tf.train.latest_checkpoint(model_path))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/saver.py", line 1277, in restore
raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
I'm using tf-gpu==1.15
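For context on the error above: tf.train.latest_checkpoint returns None when the directory passed as model_path contains no checkpoint index file, and passing None to saver.restore raises exactly this ValueError. A quick pre-flight check you could add before restoring (my addition, not part of the repo; the path is only an example):

```python
import os

def checkpoint_available(model_path):
    # tf.train.latest_checkpoint reads the "checkpoint" index file that
    # tf.train.Saver writes alongside the weights; if it's absent,
    # restore receives None and raises the ValueError above.
    return os.path.isfile(os.path.join(model_path, "checkpoint"))

print(checkpoint_available("test_code/saved_models"))
```

In practice this usually means the pretrained weights were not downloaded into model_path, or the path is wrong relative to the working directory.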
It's great work. I would like to ask: if we train our own model and then provide cartoonization services to our users on a server, is that allowed? If not, how can we do this?
The comparison between my result and author's:
mine_1
demo_1
mine_2
demo_2
The texture is very obvious.
I tried different methods:
First, I increased the weights of the different losses respectively, but the result didn't change much, or even got worse. Then I trained only on scenery, and the result became much worse. After that, I changed the LSGAN loss according to "Least Squares Generative Adversarial Networks":
d_loss = 0.5*(tf.reduce_mean((real_logit - 1)**2) + tf.reduce_mean(fake_logit**2))
# some parameters need to be changed according to Least Squares Generative Adversarial Networks:
# d_loss = 0.5*(tf.reduce_mean((real_logit - 1)**2) + 0.5*tf.reduce_mean(fake_logit**2))
But the result doesn't change.
What's more, I checked TensorBoard and found that both g_loss_blur and g_loss_gray don't change much. I know the loss function has a lot to do with that. I read some papers, such as Martin Arjovsky's "Wasserstein GAN", "Improved Training of Wasserstein GANs", and "Spectral Normalization for Generative Adversarial Networks".
But I still don't know how to decrease g_loss_blur and g_loss_gray during training, and I can't find the reason. Would anyone doing the training do me a favor? Thanks very much.
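For reference, the LSGAN objective from "Least Squares Generative Adversarial Networks" pairs the discriminator loss quoted above with a matching generator loss. This NumPy sketch (np.mean standing in for tf.reduce_mean) is illustrative only, not the repo's code:

```python
import numpy as np

def lsgan_losses(real_logit, fake_logit):
    # Discriminator pushes real logits toward 1 and fake logits toward 0;
    # the generator pushes fake logits toward 1.
    d_loss = 0.5 * (np.mean((real_logit - 1) ** 2) + np.mean(fake_logit ** 2))
    g_loss = 0.5 * np.mean((fake_logit - 1) ** 2)
    return d_loss, g_loss

# A perfect discriminator (real=1, fake=0) gives d_loss 0 and g_loss 0.5.
d, g = lsgan_losses(np.array([1.0, 1.0]), np.array([0.0, 0.0]))
print(d, g)  # 0.0 0.5
```

One consequence worth noting: GAN losses like g_loss_blur and g_loss_gray are adversarial, so they are not expected to decrease monotonically the way a reconstruction loss does; a flat curve does not by itself indicate failure.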
Hi, I've been following this repository for several days. I found that test_code cannot generate the expected images: the console output is normal, but the pixel values of the output images are all NaN. Checking the code, I see that you don't load the graph via a .meta file, but instead assign values to the generator's variables, which should work in theory, yet it still failed for me. I hope you can download the repository code and try running it when you have time; maybe it works for you, in which case I'll carefully re-check my code.
May I ask: after running training for 20,000 iterations, the generated images all have irregularly distributed red, blue, or green spots and patches. What could be the cause?
Hello, Wang. I'm very interested in your work. I have two questions. Could you help me?
Q1:
I tried training with λ1=1, λ2=10, λ3=λ4=2000, λ5=10000 as your paper suggests, but the result is not that good. I'm not sure whether it's a problem with the dataset or with the hyper-parameters.
Q2:
What's more, the hyper-parameters for the selective search are not specified.
I tried seg_num=200, power=1.2, γ1=20, γ2=40, and the output images were very dark, as follows:
The result of the selective search:
After that, I tried seg_num=1500, power=0.35, γ1=20, γ2=40, and the output images were not that bad:
I suspect the pixel values become too large with power=1.2, so I just want to confirm which parameters are suitable.
Thanks very much.
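A possible explanation for the dark output in Q2 (assuming pixel values are normalized to [0, 1] before the power is applied; I haven't verified this against the repo's color_ss_map, and with 8-bit 0-255 values the cause could instead be overflow): a power above 1 pushes every value toward 0, while a power below 1 brightens:

```python
import numpy as np

img = np.array([0.2, 0.5, 0.8])  # normalized pixel values (assumed)
darker = img ** 1.2              # power > 1: every value in (0, 1) shrinks
brighter = img ** 0.35           # power < 1: every value in (0, 1) grows
print(darker < img, brighter > img)
```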
How much GPU memory does this model occupy during inference?
On my V100 GPU, it takes about 7 GB of GPU memory. Is this normal? Do you have any suggestions for reducing GPU memory usage?
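One thing to keep in mind: by default, TF1 sessions reserve nearly all available GPU memory up front, so 7 GB on a V100 may reflect the allocator rather than the model's true footprint. A standard TF1 session-config fragment (I haven't measured this model specifically) that allocates memory on demand instead:

```python
import tensorflow as tf

# Grow GPU memory usage as needed instead of reserving the whole card.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Or cap the fraction of GPU memory the process may use:
# config.gpu_options.per_process_gpu_memory_fraction = 0.3
sess = tf.Session(config=config)
```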
Hi, experts,
This is nice work. I finished rewriting your code in PyTorch; the training results are in the attached file.
I found that the network tends to turn trees into overly "smooth areas", especially fluff.
Could you give me some suggestions on whether to decrease the surface weight or the superpixel weight?
I emailed you last week; please check it when you are free, thanks.
Without any modifications, I found during training that the colors change at every iteration. There is a cartoon effect, but the expressiveness and smoothness are not as good as the model the author released. Are there any other tricks?
In train_code/utils.py line 67, "color = image[mask].median(axis=0)" triggers an error, since numpy ndarray does not have an attribute 'median'.
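Indeed, NumPy arrays have no .median() method (that is a pandas idiom); the module-level np.median is the likely intended call. A sketch with hypothetical image/mask stand-ins, not the repo's actual data:

```python
import numpy as np

# Hypothetical stand-ins for the image and segment mask in utils.py.
image = np.arange(48, dtype=float).reshape(4, 4, 3) / 47.0
mask = image[..., 0] > 0.5  # boolean mask selecting some pixels

# image[mask].median(axis=0) raises AttributeError; the function form works:
color = np.median(image[mask], axis=0)  # median color of the masked pixels
print(color.shape)  # (3,)
```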
Hi, when I run your test code directly, the following happens. Is it a model problem or something else?
File "test_code/cartoonize.py", line 68, in <module>
cartoonize(load_folder, save_folder, model_path)
File "test_code/cartoonize.py", line 41, in cartoonize
saver.restore(sess, tf.train.latest_checkpoint(model_path))
File "/root/cartoon/lib/python3.6/site-packages/tensorflow_core/python/training/saver.py", line 1277, in restore
raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.
White-box-Cartoonization/train_code/utils.py
Lines 22 to 35 in aed441c
If we use np.random.uniform
to generate the b, g, r weights, they would be constant values that never change during training. That shouldn't be the original intent of the paper.
Is that correct? Thanks.
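The concern above can be illustrated without TensorFlow: a weight drawn once at definition time (as np.random.uniform in TF1 graph-construction code would be) stays fixed for every step, whereas re-sampling inside the step gives fresh weights each iteration. Names here are illustrative, not the repo's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Drawn once, like np.random.uniform evaluated at graph-construction time:
w_fixed = rng.uniform(size=3)

def gray_fixed(pixel):
    return pixel @ w_fixed / w_fixed.sum()  # identical weights every call

def gray_resampled(pixel, rng):
    w = rng.uniform(size=3)                 # fresh weights per call
    return pixel @ w / w.sum()

pixel = np.array([0.2, 0.5, 0.8])
a, b = gray_fixed(pixel), gray_fixed(pixel)
print(a == b)  # True: the "random" grayscale projection never changes
```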
Hello, I'm trying to run cartoonize.py or pretrain.py, and every time I get the same error:
python cartoonize.py
2020-07-30 13:32:57.785610: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
File "cartoonize.py", line 5, in
import network
File "C:\Users\Admin\Desktop\dfl\White-box-Cartoonization-master\White-box-Cartoonization-master\test_code\network.py", line 3, in
import tensorflow.contrib.slim as slim
File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\__init__.py", line 40, in
from tensorflow.contrib import coder
File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\__init__.py", line 22, in
from tensorflow.contrib.coder.python.ops.coder_ops import *
File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\python\ops\coder_ops.py", line 22, in
from tensorflow.contrib.coder.python.ops import gen_coder_ops
File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\python\ops\gen_coder_ops.py", line 99, in
_ops.RegisterShape("PmfToQuantizedCdf")(None)
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'RegisterShape'
more info:
cuda is 10.1
scikit-image==0.14.5
tensorflow-gpu==1.12.0
Also, how can I use it with an AMD RX 570?
Hi, thanks for the excellent work. I tried to use the default settings and the datasets provided by the author, but the model doesn't converge (it diverges after about 10 iterations). (To solve the pixel overflow issue, I added a Tanh layer at the end of the generator.) I'd appreciate your help!
When I test your model on a large image containing many people, whose faces occupy only a few pixels, the generated faces don't show any detail, such as noses, ears, or mouths. Is this a shortcoming of your method?
I am trying to extend train_code/train.py with save/restore functionality. However, it does not seem to be enough to restore only the model that is saved in saved_models/ (instead of the pretrained model).
Which other variables have to be saved and loaded to allow temporally distributed training (e.g. on Colab)?
Very interesting work and results. I have a couple of questions. From the paper:
For cartoon images, we collect 10000 images from animations for the human face and 10000 images for landscape. Producers of collected animations include Kyoto animation, P.A.Works, Shinkai Makoto, Hosoda Mamoru, and Miyazaki Hayao.
Can you share more details about how these images were collected? 10000 images of what size? Any particular algorithm for which images are used and which are discarded from the animations? Any kind of balance in the dataset (buildings, nature, etc.)?
Another question: in different parts of the paper, the code, and the readme, VGG19 and VGG16 appear to be used interchangeably, but they are not the same. Which one was used, VGG19 or VGG16? Was it fine-tuned in any way, or was only the stock pretrained model used to extract the high-level features?
Lastly, where in the code is the style interpolation used? Or is it only used at inference to interpolate between models trained with different loss weights?
In train.py, I found that the structure loss is not built with the adaptive coloring algorithm; it is the same as the content loss. Why wasn't the adaptive coloring algorithm used?
Hi,
Thank you for uploading such a great work.
I am training the model on my custom dataset, which contains only portraits.
I followed the steps mentioned: pretrain.py runs properly and saves the model. But when running train.py, I get NaN for both the discriminator and generator losses. The reconstruction loss does have some value; I'm not sure why this happens.
Your help would mean a lot.
One more question: what size should the dataset be in order to get decent results? For instance, the paper mentions using 10000 cartoon faces; is it possible to get great results with a smaller dataset?