
FastPhotoStyle's Introduction


FastPhotoStyle

License

Copyright (C) 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).

What's new

2018-07-25  Migrated to PyTorch 0.4.0. For PyTorch 0.3.0 users, check out FastPhotoStyle for PyTorch 0.3.0. Added a tutorial showing three ways of using the FastPhotoStyle algorithm.
2018-07-10  Our paper was accepted to the ECCV 2018 conference!

About

Given a content photo and a style photo, the code transfers the style of the style photo to the content photo. The details of the algorithm behind the code are documented in our arXiv paper. Please cite the paper if this code repository is used in your publications.

A Closed-form Solution to Photorealistic Image Stylization
Yijun Li (UC Merced), Ming-Yu Liu (NVIDIA), Xueting Li (UC Merced), Ming-Hsuan Yang (NVIDIA, UC Merced), Jan Kautz (NVIDIA)
European Conference on Computer Vision (ECCV), 2018

Tutorial

Please check out the tutorial.

FastPhotoStyle's People

Contributors

mingyuliutw, selvamarul, suquark, wesleyw72, yijunmaverick


FastPhotoStyle's Issues

Move trained models out of Google Drive

Downloading from Google Drive is a pain on the command line with wget/curl. This is where most people are probably trying to download the models from. Could you host the models somewhere else that allows simple downloading via URL?

ValueError: unknown file extension

fixed the "cannot reshape array..." following this: #36

But now, when I run
python demo.py --content_image_path ./images/content1.png --content_seg_path ./images/labelc/label.png --style_image_path ./images/style1.png --style_seg_path ./images/labels/label.png --output_image_path ./results

I get
ValueError: unknown file extension:

RuntimeError: cuda runtime error (2) : out of memory

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1501969512886/work/pytorch-0.1.12/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "demo.py", line 68, in
stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
File "/home/boss/FastPhotoStyle-master/photo_wct.py", line 36, in transform
sF4,sF3,sF2,sF1 = self.e4.forward_multiple(styl_img)
File "/home/boss/FastPhotoStyle-master/models.py", line 393, in forward_multiple
out1 = self.conv3(out1)
File "/home/boss/anaconda2/envs/NVIDIA/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/home/boss/anaconda2/envs/NVIDIA/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 237, in forward
self.padding, self.dilation, self.groups)
File "/home/boss/anaconda2/envs/NVIDIA/lib/python3.5/site-packages/torch/nn/functional.py", line 40, in conv2d
return f(input, weight, bias)
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1501969512886/work/pytorch-0.1.12/torch/lib/THC/generic/THCStorage.cu:66

ValueError: cannot reshape array

Hi,

I got the following error when running style transfer with segmentation masks.

Elapsed time in stylization: 0.003414
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
...
...
/home/yueqi/Dropbox/lib/FastPhotoStyle/process_stylization.pyc in stylization(p_wct, content_image_path, style_image_path, content_seg_path, style_seg_path, output_image_path, cuda)
     60 
     61     with Timer("Elapsed time in stylization: %f"):
---> 62         stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
     63     utils.save_image(stylized_img.data.cpu().float(), output_image_path, nrow=1)
     64 

/home/yueqi/Dropbox/lib/FastPhotoStyle/photo_wct.py in transform(self, cont_img, styl_img, cont_seg, styl_seg)
     33         sF4 = sF4.data.squeeze(0)
     34         cF4 = cF4.data.squeeze(0)
---> 35         csF4 = self.__feature_wct(cF4, sF4, cont_seg, styl_seg)
     36         Im4 = self.d4(csF4, cpool_idx, cpool1, cpool_idx2, cpool2, cpool_idx3, cpool3)
     37 

/home/yueqi/Dropbox/lib/FastPhotoStyle/photo_wct.py in __feature_wct(self, cont_feat, styl_feat, cont_seg, styl_seg)
     86                 if self.label_indicator[l] == 0:
     87                     continue
---> 88                 cont_mask = np.where(t_cont_seg.reshape(t_cont_seg.shape[0] * t_cont_seg.shape[1]) == l)
     89                 styl_mask = np.where(t_styl_seg.reshape(t_styl_seg.shape[0] * t_styl_seg.shape[1]) == l)
     90                 if cont_mask[0].size <= 0 or styl_mask[0].size <= 0:

ValueError: cannot reshape array of size 4128 into shape (1376,)

Changing these two lines

t_cont_seg = np.asarray(Image.fromarray(cont_seg, mode='RGB').resize((cont_w, cont_h), Image.NEAREST))
t_styl_seg = np.asarray(Image.fromarray(styl_seg, mode='RGB').resize((styl_w, styl_h), Image.NEAREST))

to the following resolved the issue.

t_cont_seg = np.asarray(Image.fromarray(cont_seg).resize((cont_w, cont_h), Image.NEAREST))
t_styl_seg = np.asarray(Image.fromarray(styl_seg).resize((styl_w, styl_h), Image.NEAREST))

ValueError: not enough values to unpack (expected 2, got 1)

I'm running the third example like this in a Google Colab notebook:

!rm -rf /content/FastPhotoStyle/segmentation
!git clone https://github.com/mingyuliutw/semantic-segmentation-pytorch /content/FastPhotoStyle/segmentation
  
!cd  /content/FastPhotoStyle/segmentation/ && sh ./demo_test.sh

if not ':/content/FastPhotoStyle/segmentation' in os.environ['PYTHONPATH']:
  os.environ['PYTHONPATH'] += ':/content/FastPhotoStyle/segmentation'

!mkdir /content/FastPhotoStyle/images -p && mkdir /content/FastPhotoStyle/results -p;
!rm /content/FastPhotoStyle/images/content3.png -rf;
!rm /content/FastPhotoStyle/images/style3.png -rf;
!rm /content/FastPhotoStyle/results/ -rf

!curl -o /content/FastPhotoStyle/images/content3.png '{content}'
!curl -o /content/FastPhotoStyle/images/style3.png '{style}'

!cd /content/FastPhotoStyle/ && python ./demo_with_ade20k_ssn.py --output_visualization

And this is my output:

Cloning into '/content/FastPhotoStyle/segmentation'...
remote: Counting objects: 521, done.
remote: Total 521 (delta 0), reused 0 (delta 0), pack-reused 521
Receiving objects: 100% (521/521), 3.92 MiB | 3.10 MiB/s, done.
Resolving deltas: 100% (293/293), done.
wget: /usr/local/lib/libcrypto.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)

Redirecting output to ‘wget-log’.
wget: /usr/local/lib/libcrypto.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)

Redirecting output to ‘wget-log.1’.
wget: /usr/local/lib/libcrypto.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)
wget: /usr/local/lib/libssl.so.1.0.0: no version information available (required by wget)

Redirecting output to ‘wget-log.2’.
Namespace(arch_decoder='ppm_bilinear_deepsup', arch_encoder='resnet50_dilated8', batch_size=1, fc_dim=2048, gpu_id=0, imgMaxSize=1000, imgSize=[300, 400, 500, 600], model_path='baseline-resnet50_dilated8-ppm_bilinear_deepsup', num_class=150, num_val=-1, padding_constant=8, result='./', segm_downsampling_rate=8, suffix='_epoch_20.pth', test_img='ADE_val_00001519.jpg')
Loading weights for net_encoder
Loading weights for net_decoder
# samples: 1
/usr/local/lib/python3.6/site-packages/torch/nn/functional.py:1890: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
[2018-09-18 16:17:30] iter 0
Inference done!
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  180k    0  180k    0     0   637k      0 --:--:-- --:--:-- --:--:--  635k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  442k  100  442k    0     0  27.0M      0 --:--:-- --:--:-- --:--:-- 27.0M
Loading weights for net_encoder
Loading weights for net_decoder
/usr/local/lib/python3.6/site-packages/torch/nn/functional.py:1890: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
Resize image: (266,177)->(266,177)
Resize image: (800,452)->(800,452)
Traceback (most recent call last):
  File "./demo_with_ade20k_ssn.py", line 133, in <module>
    output_visualization=args.output_visualization
  File "/content/FastPhotoStyle/process_stylization_ade20k_ssn.py", line 162, in stylization
    cont_seg = label_remapping.self_remapping(cont_seg)
  File "/content/FastPhotoStyle/process_stylization_ade20k_ssn.py", line 98, in self_remapping
    [h,w] = new_seg.shape
ValueError: not enough values to unpack (expected 2, got 1)

I don't think this is a mistake on my side. Any suggestions?

Not able to run my Docker container with the NVIDIA runtime

docker run -v /home:/home --runtime=nvidia -i -t your-docker-image:v1.0 /bin/bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused "process_linux.go:385: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig --device=all --compute --utility --require=cuda>=9.0 --pid=8878 /var/lib/docker/devicemapper/mnt/aae876e5f0e27dc382e77bbc397f6bf5ced6649656681151fe6e397c2c52ebee/rootfs]\\nnvidia-container-cli: initialization error: driver error: failed to process request\\n\""": unknown.

Running FastPhotoStyle on MacOS

Hi, I'm a bit new to Python and have trouble understanding the messages I get when running "converter.py" in Terminal:

usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvXc] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvXc] source_file ... target_directory

What am I supposed to do next? These don't seem to be standard Python messages, and I couldn't find a user guide for this script. Forgive me if I'm missing something obvious.

Much Slower Than the Reported Time

Hi,
I tested your code by running demo.sh on a K40m GPU, but my CUDA version is 8.0 (not 9.1). The total time is about 145 s, more than 10 times slower than the time reported in the paper (11.39 s for a 1K image size). Besides a better GPU (Titan Xp), I wonder whether the newer CUDA version is the key to the higher performance. Thanks.

Whitening

I am slightly confused by the whitening step. Your reference [14] seems to use a slightly different equation for whitening.

It amounts to removing the {E_s} term from your equation for {P_s}. I don't understand why you have the {E_s} at the start; could you explain? It looks as if you are scaling by the orthonormal eigenvector matrix.

There is a similar observation in the coloring step.
Thanks!
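
For reference, here is a hedged sketch of the two standard whitening conventions in the usual WCT notation (textbook linear algebra, not a statement about which form the paper intends):

% PCA whitening: no leading eigenvector matrix
P^{\mathrm{PCA}} = D^{-1/2} E^{\top}
% ZCA (symmetric) whitening: eigenvector matrix on both sides
P^{\mathrm{ZCA}} = E \, D^{-1/2} E^{\top}
% Both satisfy P \Sigma P^{\top} = I with \Sigma = E D E^{\top}, so both whiten the
% mean-subtracted features; they differ only by the orthonormal rotation E. The
% coloring matrix has the same freedom, which may be the source of the apparent
% discrepancy with reference [14].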

How to run the demo in CPU mode?

I noticed that the new version supports running WCT in CPU mode, but how? When I comment out the line "p_wct.cuda(0)" or change is_cuda to always return False, I get the errors below. Any idea what to do?

P.S. The demo runs fine in GPU mode.

Elapsed time in stylization: 0.656382
Traceback (most recent call last):
File "demo.py", line 40, in
output_image_path=args.output_image_path,
File "/home/wusai/PycharmProjects/FastPhotoStyle/process_stylization.py", line 55, in stylization
stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
File "/home/wusai/PycharmProjects/FastPhotoStyle/photo_wct.py", line 35, in transform
csF4 = self.__feature_wct(cF4, sF4, cont_seg, styl_seg)
File "/home/wusai/PycharmProjects/FastPhotoStyle/photo_wct.py", line 78, in __feature_wct
target_feature = self.__wct_core(cont_feat_view, styl_feat_view)
File "/home/wusai/PycharmProjects/FastPhotoStyle/photo_wct.py", line 118, in __wct_core
contentConv = torch.mm(cont_feat, cont_feat.t()).div(cFSize[1] - 1) + iden
File "/usr/local/lib/python2.7/dist-packages/torch/tensor.py", line 305, in add
return self.add(other)
TypeError: add received an invalid combination of arguments - got (torch.FloatTensor), but expected one of:

  • (float value)
    didn't match because some of the arguments have invalid types: (torch.FloatTensor)
  • (torch.cuda.FloatTensor other)
    didn't match because some of the arguments have invalid types: (torch.FloatTensor)
  • (torch.cuda.sparse.FloatTensor other)
    didn't match because some of the arguments have invalid types: (torch.FloatTensor)
  • (float value, torch.cuda.FloatTensor other)
  • (float value, torch.cuda.sparse.FloatTensor other)
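
The TypeError above is a device mismatch: a CUDA tensor (the model's features) is being added to a CPU tensor (the identity matrix). A minimal, self-contained sketch of the failure mode, independent of the project's code:

import torch

# Works: both operands live on the CPU.
feat = torch.rand(4, 4)                         # stands in for cont_feat
iden = torch.eye(4)
cov = torch.mm(feat, feat.t()).div(3) + iden    # fine, all CPU tensors

# Mixing devices reproduces the same class of error as in the traceback:
# cov = torch.mm(feat.cuda(), feat.cuda().t()).div(3) + iden

In other words, for a CPU run every tensor that reaches __wct_core has to be a CPU tensor, not just the module parameters; commenting out p_wct.cuda(0) alone is not enough if the inputs or intermediate buffers are still created on the GPU.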

Add Dockerfile

Please add a Dockerfile that creates an image with the program inside, ready to execute.

Windows 7

How can I get this working on Windows 7 or 10? Please help, since I don't have a Linux machine.

How to run on a Google Colab Jupyter Notebook?

This is how I've tried to run the demo in a Jupyter notebook on Google Colab:

Download the models

# Download the file we just uploaded.
#
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz
file_id = '1ENgQm9TgabE1R99zhNf5q6meBvX6WFuq'

import io
import sys
from googleapiclient.http import MediaIoBaseDownload
from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

filename = 'data.zip'

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.FileIO(filename, 'wb')
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
  status, done = downloader.next_chunk()
  sys.stdout.write("\r{0:.0f}%".format(status.progress() * 100))
  sys.stdout.flush()

print('\rDownloaded file: {}'.format(filename))

Install dependencies

# Anaconda
! wget https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-x86_64.sh
! bash Anaconda2-5.1.0-Linux-x86_64.sh -b -p $HOME/anaconda

# PyTorch
! PATH=$HOME/anaconda/bin:$PATH && conda install pytorch=0.3.0 torchvision cuda90 -y -c pytorch
  
# ImageMagick
! apt-get install -y axel imagemagick

# Cupy
! PATH=$HOME/anaconda/bin:$PATH && pip install scikit-umfpack
! PATH=$HOME/anaconda/bin:$PATH && pip install -U setuptools
! PATH=$HOME/anaconda/bin:$PATH && pip install pynvrtc

! apt -y install libcusparse8.0 libnvrtc8.0 libnvtoolsext1
! ln -snf /usr/lib/x86_64-linux-gnu/libnvrtc-builtins.so.8.0 /usr/lib/x86_64-linux-gnu/libnvrtc-builtins.so
! PATH=$HOME/anaconda/bin:$PATH && pip install 'cupy-cuda90==4.0.0b4' 'chainer==4.0.0b4'

Copy the repository and the models

! git clone https://github.com/NVIDIA/FastPhotoStyle
! unzip data.zip -d FastPhotoStyle
! mkdir -p FastPhotoStyle/images && rm -fr FastPhotoStyle/images/* && mkdir -p FastPhotoStyle/results && rm -fr FastPhotoStyle/results/*
! axel -n 1 http://freebigpictures.com/wp-content/uploads/shady-forest.jpg --output=FastPhotoStyle/images/content1.png  
! axel -n 1 https://vignette.wikia.nocookie.net/strangerthings8338/images/e/e0/Wiki-background.jpeg/revision/latest?cb=20170522192233 --output=FastPhotoStyle/images/style1.png
! cd FastPhotoStyle/images && convert -resize 25% content1.png content1.png && convert -resize 50% style1.png style1.png

Run demo

! cd FastPhotoStyle && python demo.py

Output

Traceback (most recent call last):
  File "demo.py", line 8, in <module>
    import process_stylization
  File "/content/FastPhotoStyle/process_stylization.py", line 17, in <module>
    from smooth_filter import smooth_filter
  File "/content/FastPhotoStyle/smooth_filter.py", line 326, in <module>
    from cupy.cuda import function
ImportError: No module named cupy.cuda

I'm not sure which dependencies are missing or whether CUDA is already installed in the VM.

Label Maps Question

Could you provide an example of using label maps? I have a 3-channel JPEG as content, a 3-channel JPEG as style, and 3-channel PNGs as label maps, but I get the following error:

Elapsed time in stylization: 0.020673
Traceback (most recent call last):
  File "demo.py", line 43, in <module>
    cuda=args.cuda,
  File "/home/ubuntu/repos/FastPhotoStyle/process_stylization.py", line 62, in stylization
    stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
  File "/home/ubuntu/repos/FastPhotoStyle/photo_wct.py", line 28, in transform
    self.__compute_label_info(cont_seg, styl_seg)
  File "/home/ubuntu/repos/FastPhotoStyle/photo_wct.py", line 67, in __compute_label_info
    o_cont_mask = np.where(cont_seg.reshape(cont_seg.shape[0] * cont_seg.shape[1]) == l)
ValueError: cannot reshape array of size 1555200 into shape (518400,)

Since there are no examples of using label maps, it's hard for me to figure out whether my masks should be 1-channel PNGs or some other format. Perhaps I should also mention that my PNGs have only two labels (background and not background), stored as 0 or 255, i.e. black-and-white PNG masks.

An example of using the masks with the appropriate inputs would clear things up, and I could quickly solve this problem on my own.
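
The "cannot reshape array of size 1555200 into shape (518400,)" factor of three suggests the mask is being loaded with three channels while the code expects a single-channel label map. A minimal sketch of one plausible conversion, with hypothetical file names, assuming the masks are 0/255 black-and-white images:

from PIL import Image
import numpy as np

# Collapse a black/white RGB mask into a single-channel label map with
# small integer labels (0 = background, 1 = foreground).
rgb_mask = np.asarray(Image.open("content_mask.png").convert("RGB"))
labels = (rgb_mask[:, :, 0] > 127).astype(np.uint8)          # shape (H, W), values {0, 1}
Image.fromarray(labels, mode="L").save("content_mask_1ch.png")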

AttributeError: 'Program' object has no attribute '_program'

Running on Ubuntu 16.04 with the setup as outlined, I receive this error:
Elapsed time in stylization: 3.351004
Elapsed time in propagation: 15.440228
Traceback (most recent call last):
File "demo.py", line 75, in
out_img = smooth_filter(output_image_path, content_image_path, f_radius=15, f_edge=1e-1)
File "/home/ubuntu/FastPhotoStyle-master/smooth_filter.py", line 392, in smooth_filter
best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, f_radius, f_edge)
File "/home/ubuntu/FastPhotoStyle-master/smooth_filter.py", line 333, in smooth_local_affine
program = Program(src, 'best_local_affine_kernel.cu')
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/pynvrtc/compiler.py", line 52, in init
include_names)
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/pynvrtc/interface.py", line 185, in nvrtcCreateProgram
c_char_p(src), c_char_p(name),
TypeError: bytes or integer address expected instead of str instance
Exception ignored in: <bound method Program.del of <pynvrtc.compiler.Program object at 0x7f3e7eed4f28>>
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/pynvrtc/compiler.py", line 56, in del
self._interface.nvrtcDestroyProgram(self._program)
AttributeError: 'Program' object has no attribute '_program'

I applied the code to stylize my photos, but the results are not as good as those shown in the README. What could be going wrong?

Hi,
First of all, thank you very much for sharing the code and for the good documentation. I was able to run the dockerised app easily on Ubuntu 17 without any issues.

Regarding the results: my photos do not come out as shown on your README page. I got quite different results. Please see the examples below:

Content: forest-small (attached image)

Style: redforest-small (attached image)

Output: forest (attached image)

Please note that I haven't used the label maps, just the general demo.

Is this the expected result, or is there something I am missing?

Kind Regards,
Oras

THCudaCheck FAIL / RuntimeError: cuda runtime error (30) : unknown error

I used exactly the setup the code usage suggests (I even did a clean install of Ubuntu). I am using a GeForce GTX 950M with an up-to-date driver.

I receive the following CUDA runtime error (30): unknown error.

Any suggestions?

ubuntu:~/FastPhoto$ python demo.py
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1512378422383/work/torch/lib/THC/THCGeneral.c line=70 error=30 : unknown error
Traceback (most recent call last):
File "demo.py", line 34, in
p_wct.cuda(0)
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 216, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 146, in _apply
module._apply(fn)
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 152, in _apply
param.data = fn(param.data)
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 216, in
return self._apply(lambda t: t.cuda(device))
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/_utils.py", line 61, in _cuda
with torch.cuda.device(device):
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/cuda/init.py", line 186, in enter
_lazy_init()
File "/home/deeperubuntu/anaconda2/lib/python2.7/site-packages/torch/cuda/init.py", line 121, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1512378422383/work/torch/lib/THC/THCGeneral.c:70

Smoothing twice

Within the paper, I can only see smoothing mentioned once. However, in the implementation smoothing is performed twice, in photo_smooth.py and in smooth_filter.py.

Am I misunderstanding the paper/implementation regarding the second smoothing step, or is this an addition? If so, can you explain why it was added?

RuntimeError: MAGMA gesdd : the updating process of SBDSDC did not converge

Environment: built from the Docker image

sudo docker run --runtime=nvidia -v /GITHUB/FastPhotoStyle:/home -i -t fastphotostyle:v1.0 /bin/bash

PyTorch found the GPU:
torch.cuda.get_device_name(0) => 'GeForce .....'

Running python demo.py gives the error below. Does that mean the code did not actually use CUDA?

#######################################################
Intel MKL ERROR: Parameter 4 was incorrect on entry to SLASCL.

Intel MKL ERROR: Parameter 4 was incorrect on entry to SLASCL.
Traceback (most recent call last):
File "demo.py", line 68, in
stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
File "/home/photo_wct.py", line 47, in transform
csF3 = self.__feature_wct(cF3, sF3, cont_seg, styl_seg)
File "/home/photo_wct.py", line 119, in __feature_wct
tmp_target_feature = self.__wct_core(cont_feat_view, styl_feat_view)
File "/home/photo_wct.py", line 154, in __wct_core
c_u, c_e, c_v = torch.svd(contentConv, some=False)
RuntimeError: MAGMA gesdd : the updating process of SBDSDC did not converge (error: 14) at /opt/conda/conda-bld/pytorch_1518238581238/work/torch/lib/THC/generic/THCTensorMathMagma.cu:325

GPU Memory Usage

I tested this demo on two GPUs, an NVIDIA GeForce GTX 1080 and a GTX 750 Ti, and encountered the following issues (see the sketch after this list):

  1. The maximum photo size the GTX 1080 could handle was 800x600; the GTX 750 Ti could handle at most 600x480 when the display is not attached to it, and 320x240 when it is. Larger sizes result in a CUDA out-of-memory error, and the Python process takes all available GPU memory, as shown in nvidia-smi.

  2. If this demo is run as a function inside another program, it does not release GPU memory after stylizing the image.
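
A minimal sketch of one common workaround for point 2, assuming PyTorch's caching allocator is what keeps the memory reserved (the callable and variable names are hypothetical, not the project's API):

import gc
import torch

def stylize_and_release(run_stylization, *args):
    """Run one stylization pass, then drop references and return cached
    GPU memory so other processes see it freed in nvidia-smi."""
    result = run_stylization(*args)      # hypothetical callable doing one WCT pass
    out = result.detach().cpu()          # keep only a CPU copy of the output
    del result
    gc.collect()                         # drop any lingering references
    torch.cuda.empty_cache()             # release blocks cached by PyTorch
    return out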

FP16 option ?

Did you try running it in FP16 or mixed precision?

Failed with UMFPACK_ERROR_out_of_memory

Thanks for the great code. When I run the algorithm on my own high-resolution images (655 x 1280), scipy.sparse.linalg.spsolve with scikit-umfpack as the solver requires too much memory (more than 128 GB).
After some investigation, I thought the problem might be OS-dependent, but I followed the instructions exactly: my OS is Ubuntu 16.04, with the same CUDA and Python versions.

I wonder if anyone else is struggling with the same issue, and whether there is another solver. Thanks.
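
A minimal sketch of one lower-memory alternative, assuming the bottleneck is a large sparse linear solve like the one in the smoothing step (the matrix below is a stand-in, not the project's actual Laplacian):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# An iterative solver such as conjugate gradient avoids the large fill-in of a
# direct (UMFPACK) factorization, at the cost of an approximate solution.
n = 10000
A = sp.eye(n, format="csr") * 4 - sp.diags([1.0, 1.0], [-1, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = cg(A, b, maxiter=2000)    # info == 0 means the solver converged
assert info == 0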

ValueError: total size of new array must be unchanged

What am I doing wrong? The simple demo with global style works, but when I try label maps I get an error.

Picture of the images and the visualized label maps: (attached image)

I run this command:

python demo.py \
--content_image_path images/custom2/content1.png \
--content_seg_path images/custom2/content1.label/label.png \
--style_image_path images/custom2/style1.png \
--style_seg_path images/custom2/style1.label/label.png \
--output_image_path results/example2.png

Output and error:

Elapsed time in stylization: 0.417996
Traceback (most recent call last):
  File "demo.py", line 43, in <module>
    cuda=args.cuda,
  File "/home/ubuntu/.fast-photo-style/process_stylization.py", line 62, in stylization
    stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
  File "/home/ubuntu/.fast-photo-style/photo_wct.py", line 35, in transform
    csF4 = self.__feature_wct(cF4, sF4, cont_seg, styl_seg)
  File "/home/ubuntu/.fast-photo-style/photo_wct.py", line 88, in __feature_wct
    cont_mask = np.where(t_cont_seg.reshape(t_cont_seg.shape[0] * t_cont_seg.shape[1]) == l)
ValueError: total size of new array must be unchanged

CUDA_ERROR_INVALID_PTX: a PTX JIT compilation failed

I am running with CUDA 9.1 and a setup similar to the one described in the user manual. I get the following error:

Elapsed time in stylization: 2.802349
Elapsed time in propagation: 18.942090
Elapsed time in post processing: 0.243305
Traceback (most recent call last):
File "demo.py", line 39, in
output_image_path=args.output_image_path,
File "/scratch0/Projects/style_transfer/FastPhotoStyle/process_stylization.py", line 63, in stylization
out_img = smooth_filter(output_image_path, content_image_path, f_radius=15, f_edge=1e-1)
File "/scratch0/Projects/style_transfer/FastPhotoStyle/smooth_filter.py", line 391, in smooth_filter
best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, f_radius, f_edge)
File "/scratch0/Projects/style_transfer/FastPhotoStyle/smooth_filter.py", line 335, in smooth_local_affine
m.load(bytes(ptx.encode()))
File "cupy/cuda/function.pyx", line 175, in cupy.cuda.function.Module.load
File "cupy/cuda/function.pyx", line 176, in cupy.cuda.function.Module.load
File "cupy/cuda/driver.pyx", line 141, in cupy.cuda.driver.moduleLoadData
File "cupy/cuda/driver.pyx", line 72, in cupy.cuda.driver.check_status
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_INVALID_PTX: a PTX JIT compilation failed

Training the model failed

Hi,
I tried to train my model on the COCO training set, which contains 87k images.
I can now get good decoders for relu1 and relu2, but I can't get good decoders for relu3 and relu4. Can you share the parameter settings for the trained models?

Thanks!

Train PhotoWCT on a new dataset

Could you please provide the training code for PhotoWCT? It is sometimes necessary to train on a new dataset. Thanks!

Couldn't Reproduce the Results

Hi, I found the same images (content and style) that you used in the paper and tried to reproduce your results.

However, when I ran your code, I couldn't get results similar to yours.

Is there anything I missed?

(attached comparison image)

RuntimeError

I am receiving this error: RuntimeError: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at /opt/conda/conda-bld/pytorch_1501972792122/work/pytorch-0.1.12/torch/lib/THC/generic/THCTensor.c:299

Look at the memory consumption: it should be able to process HD-size pictures

Hi, I found that the memory requirements of this model are very high.
I suggest reviewing possible optimizations such as in-place operations and buffer reuse.
Also, two other optimization steps would bring the memory requirements down (see the sketch after this list):

  • move to FP16 by default
  • look into splitting the net across a few GPUs.
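
A minimal sketch of the FP16 idea on a generic PyTorch module, not the project's networks (it needs a CUDA device, and whether the WCT/SVD steps stay numerically stable in half precision would still have to be verified):

import torch
import torch.nn as nn

# Halving parameter and activation precision roughly halves GPU memory.
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()).cuda().half()
x = torch.randn(1, 3, 512, 512, device="cuda").half()

with torch.no_grad():
    y = model(x)          # all intermediate activations are float16
print(y.dtype)            # torch.float16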

FileNotFoundError: [Errno 2] No such file or directory: './models/vgg_normalised_conv1_1_mask.t7'

Hello,
Sorry, I'm new to machine learning and image processing. While following the tutorial I ran into this issue:

Traceback (most recent call last):
  File "./demo.py", line 30, in <module>
    p_wct = PhotoWCT(args)
  File "/media/<User>/Data/Knowledeg/SourceCodes/Nvidia/Python/FastPhotoStyle/photo_wct.py", line 18, in __init__
    vgg1 = load_lua(args.vgg1)
  File "/home/<User>/anaconda3/lib/python3.6/site-packages/torch/utils/serialization/read_lua_file.py", line 606, in load_lua
    with open(filename, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: './models/vgg_normalised_conv1_1_mask.t7'

Note that I'm using CUDA 9.2, PyTorch 0.4.0, and Anaconda3 on Ubuntu 18.04.1 LTS.

Runtime error running demo.py

CUDA 9.1.85-1
Ubuntu 16.04.3
Python 2.7.14

paperspace@abcd:~/style/FastPhotoStyle$ python demo.py
Traceback (most recent call last):
File "demo.py", line 63, in
stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
File "/home/paperspace/style/FastPhotoStyle/photo_wct.py", line 41, in transform
csF4 = self.__feature_wct(cF4, sF4, cont_seg, styl_seg)
File "/home/paperspace/style/FastPhotoStyle/photo_wct.py", line 119, in __feature_wct
tmp_target_feature = self.__wct_core(cont_feat_view, styl_feat_view)
File "/home/paperspace/style/FastPhotoStyle/photo_wct.py", line 148, in __wct_core
c_mean = c_mean.unsqueeze(1).expand_as(cont_feat)
File "/home/paperspace/anaconda3/envs/style/lib/python2.7/site-packages/torch/tensor.py", line 215, in expand_as
return self.expand(tensor.size())
RuntimeError: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at /opt/conda/conda-bld/pytorch_1501972792122/work/pytorch-0.1.12/torch/lib/THC/generic/THCTensor.c:299

ValueError: cannot reshape array of size 67500 into shape (22500,)

Traceback (most recent call last):
File "demo.py", line 43, in
cuda=args.cuda,
File "/home/zhuangwei.zw/FastPhotoStyle/process_stylization.py", line 62, in stylization
stylized_img = p_wct.transform(cont_img, styl_img, cont_seg, styl_seg)
File "/home/zhuangwei.zw/FastPhotoStyle/photo_wct.py", line 35, in transform
csF4 = self.__feature_wct(cF4, sF4, cont_seg, styl_seg)
File "/home/zhuangwei.zw/FastPhotoStyle/photo_wct.py", line 88, in __feature_wct
cont_mask = np.where(t_cont_seg.reshape(t_cont_seg.shape[0] * t_cont_seg.shape[1]) == l)
ValueError: cannot reshape array of size 67500 into shape (22500,)

I followed the Example 2 instructions, but got this error.

Requires a physical GPU?

Hi,

As my issue title says, does running this demo require a physical GPU?

Relicense

It is very unusual for a software project to be licensed under a Creative Commons license, which is intended for cultural works. Normally, software is released under a free-software license such as the Apache License, a BSD license, or the GNU General Public License.

For instance, Creative Commons themselves advise against using CC licenses for software.

Using a CC license will hinder the adoption of this project.

Implicit promotion to float64 in _compute_laplacian

Line 77 of photo_smooth.py (https://github.com/NVIDIA/FastPhotoStyle/blob/master/photo_smooth.py#L77) reads:

inv = np.linalg.inv(win_var + (eps/win_size)*np.eye(3))

Adding np.eye(3) promotes inv and X to float64. That may be fine in this codebase, since _compute_laplacian is only called with float64 inputs, but for porting to other data types (like float32 or fp16) it is rather non-obvious.

If the promotion to float64 is necessary, it would be better to make it explicit.
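
A minimal sketch of a dtype-preserving variant of that line, for illustration only (it assumes win_var, eps, and win_size as in the linked code and is not a proposed patch):

import numpy as np

def inv_regularized(win_var, eps, win_size):
    """Invert win_var + (eps/win_size)*I without silently promoting to float64."""
    eye = np.eye(3, dtype=win_var.dtype)            # match the input dtype explicitly
    return np.linalg.inv(win_var + (eps / win_size) * eye)

# A float32 input stays float32.
win_var = np.eye(3, dtype=np.float32) * 2.0
print(inv_regularized(win_var, eps=1e-7, win_size=9).dtype)    # float32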

P.S. Thanks for open-sourcing this package :)

Does the code work with CUDA 8.0?

Traceback (most recent call last):
File "demo.py", line 75, in
out_img = smooth_filter(output_image_path, content_image_path, f_radius=15, f_edge=1e-1)
File "/home/whx/ml/FastPhotoStyle/smooth_filter.py", line 392, in smooth_filter
best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, f_radius, f_edge)
File "/home/whx/ml/FastPhotoStyle/smooth_filter.py", line 336, in smooth_local_affine
m.load(bytes(ptx.encode()))
File "cupy/cuda/function.pyx", line 175, in cupy.cuda.function.Module.load
File "cupy/cuda/function.pyx", line 176, in cupy.cuda.function.Module.load
File "cupy/cuda/driver.pyx", line 141, in cupy.cuda.driver.moduleLoadData
File "cupy/cuda/driver.pyx", line 72, in cupy.cuda.driver.check_status
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_CONTEXT_IS_DESTROYED: context is destroyed

cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version

Traceback (most recent call last):
File "demo.py", line 47, in
no_post=args.no_post
File "/home/key/workspace/FastPhotoStyle/process_stylization.py", line 135, in stylization
out_img = smooth_filter(out_img, cont_pilimg, f_radius=15, f_edge=1e-1)
File "/home/key/workspace/FastPhotoStyle/smooth_filter.py", line 402, in smooth_filter
best_ = smooth_local_affine(output_, input_, 1e-7, 3, H, W, f_radius, f_edge)
File "/home/key/workspace/FastPhotoStyle/smooth_filter.py", line 338, in smooth_local_affine
m.load(bytes(ptx.encode()))
File "cupy/cuda/function.pyx", line 181, in cupy.cuda.function.Module.load
File "cupy/cuda/function.pyx", line 182, in cupy.cuda.function.Module.load
File "cupy/cuda/runtime.pyx", line 435, in cupy.cuda.runtime._ensure_context
File "cupy/cuda/runtime.pyx", line 250, in cupy.cuda.runtime.memGetInfo
File "cupy/cuda/runtime.pyx", line 137, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorInsufficientDriver: CUDA driver version is insufficient for CUDA runtime version

I got this error with both CUDA 9.0 and CUDA 9.1 (cuDNN 7.1.2 in both cases); by the way, my NVIDIA driver version is 390.77.
