
chainer-fast-neuralstyle's Introduction

Chainer implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

Fast artistic style transfer using a feed-forward network.

  • input image size: 1024x768
  • process time (CPU): 17.78 sec (Core i7-5930K)
  • process time (GPU): 0.994 sec (TitanX)

Differences from original

Training

  • default --image_size is set to 512 (the original uses 256). It's slower, but time is the price you have to pay for quality
  • ability to switch off dataset cropping with the --fullsize option. Cropping is on by default to preserve aspect ratio
  • the cropping implementation uses ImageOps.fit, which always scales and crops, whereas the original uses a custom solution that upscales the image if it is smaller than --image_size and otherwise just crops without scaling (see the sketch after this list)
  • bicubic and Lanczos resampling are used when scaling dataset and input style images respectively, giving sharper downscaling than the nearest-neighbour resampling in the original
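
A minimal sketch of the default dataset preprocessing described above, assuming PIL is used; the function name and the --fullsize behaviour shown here are illustrative, not the repository's exact code:

# Illustrative sketch only, not the repository's actual preprocessing code.
from PIL import Image, ImageOps

def load_training_image(path, image_size=512, fullsize=False):
    img = Image.open(path).convert('RGB')
    if fullsize:
        # --fullsize: skip cropping and resize directly to the target square
        return img.resize((image_size, image_size), Image.BICUBIC)
    # Default: ImageOps.fit always scales and center-crops,
    # preserving aspect ratio, with bicubic resampling
    return ImageOps.fit(img, (image_size, image_size), Image.BICUBIC)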

Generating

  • Ability to specify multiple input files, so the model isn't reloaded on every iteration. The format follows standard Unix path-expansion rules, e.g. file* or file?.png. Don't forget to quote the pattern, otherwise the shell will expand it first. This saves about 0.5 sec per image.
  • Output specifies a path prefix if multiple files are used for input, otherwise an explicit filename
  • Option -x sets the content-image scaling factor applied before transformation
  • Preserve the original content colors with the --original_colors flag. More info: Transfer style but not the colors. A combined example follows this list.
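
Putting the options above together, a hypothetical invocation might look like this (the file names, output prefix and scaling value are made up for illustration):

python generate.py 'frames/image*.png' -m models/composition.model -o out/styled_ -x 0.5 --original_colors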

Video Processing

The repo includes a bash script to transform your videos. It depends on ffmpeg (see its compilation instructions).

./genvid.sh input_video output_video model start_time duration

The first three arguments are mandatory and should contain paths to the files.
The last two are optional and indicate the starting position and duration in seconds.
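
For example, a hypothetical invocation that styles 30 seconds of a clip starting at the 10-second mark (file names are illustrative):

./genvid.sh fox.mp4 fox-styled.mp4 models/composition.model 10 30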

I integrated the Optical Flow implementation by @larspars to provide more consistent output for a sequence of images by smoothing out the differences between frames. It requires opencv-python. Separate thanks to @genekogan for providing a thorough explanation of the remarkably simple, yet efficient, steps to put this together.

To use it, append the -flow option followed by the amount of alpha blending, like so:

python generate.py 'frames/image*.png' -m models/any.model -o dir/prefix_ -flow 0.02

I find that values between 0.02 and 0.05 work best. It calculates the motion vectors between the previous and current source frames, applies the resulting distortion to the previously transformed frame, overlays it on top of the current source frame with -flow opacity, and finally transforms the result. This helps the network reveal the same features in the current frame as were discovered in the previous one. It only affects sequences of images, so if there's a single image in the list you won't see any difference.
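
Conceptually, the per-frame blending step works roughly like the sketch below, using OpenCV's Farneback optical flow; the function and variable names are illustrative, and this is not the actual generate.py code:

# Illustrative sketch of the optical-flow blending step; not the actual generate.py code.
import cv2
import numpy as np

def blend_with_previous(prev_src, cur_src, prev_styled, alpha=0.02):
    g_prev = cv2.cvtColor(prev_src, cv2.COLOR_BGR2GRAY)
    g_cur = cv2.cvtColor(cur_src, cv2.COLOR_BGR2GRAY)
    # Dense backward flow: for each pixel of the current frame, where it was in the previous frame
    flow = cv2.calcOpticalFlowFarneback(g_cur, g_prev, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g_cur.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # Apply the distortion to the previously transformed frame
    warped = cv2.remap(prev_styled, map_x, map_y, cv2.INTER_LINEAR)
    # Overlay it on the current source frame with -flow opacity; the result is then stylized
    return cv2.addWeighted(cur_src, 1.0 - alpha, warped, alpha, 0)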

Requirement

$ pip install chainer

Prerequisite

Download the VGG16 model and convert it into a smaller file so that we use only the convolutional layers, which are about 10% of the entire model.

sh setup_model.sh
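
Under the hood the conversion roughly amounts to the sketch below, assuming Chainer's CaffeFunction is used to parse the caffemodel; the layer selection and output file name are illustrative, not the repository's exact create_chainer_model.py:

# Rough illustrative sketch; not the repository's actual create_chainer_model.py.
import pickle
from chainer.links.caffe import CaffeFunction

ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')  # slow: parses the full caffemodel

conv_weights = {}
for name, link in ref.namedlinks(skipself=True):
    if 'conv' in name:
        # keep only the convolutional layers (roughly 10% of the model)
        conv_weights[name] = (link.W.data, link.b.data)

with open('vgg16_conv.pkl', 'wb') as f:
    pickle.dump(conv_weights, f)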

Train

You need to train one image transformation network model per style target. According to the paper, the models are trained on the Microsoft COCO dataset.

python train.py -s <style_image_path> -d <training_dataset_path> -g 0
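
For context, the style term of the perceptual loss from the paper compares Gram matrices of VGG16 feature maps. A minimal NumPy sketch of that loss (not the actual code in train.py):

# Minimal NumPy sketch of the Gram-matrix style loss from the paper; not the code in train.py.
import numpy as np

def gram_matrix(feats):
    # feats: feature maps of shape (batch, channels, height, width)
    b, c, h, w = feats.shape
    f = feats.reshape(b, c, h * w)
    # channel-to-channel correlations, normalized by the feature map size
    return np.matmul(f, f.transpose(0, 2, 1)) / (c * h * w)

def style_loss(generated_feats, style_feats):
    # squared Frobenius distance between the Gram matrices
    return np.sum((gram_matrix(generated_feats) - gram_matrix(style_feats)) ** 2)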

Generate

python generate.py <input_image_path> -m <model_path> -o <output_image_path>

This repo includes pretrained models as examples.

  • example:
python generate.py sample_images/tubingen.jpg -m models/composition.model -o sample_images/output.jpg

or

python generate.py sample_images/tubingen.jpg -m models/seurat.model -o sample_images/output.jpg

Difference from paper

  • Convolution kernel size 4 instead of 3.
  • Training with batch size n >= 2 causes unstable results.

No Backward Compatibility

Jul. 19, 2016

This version is not compatible with the previous versions. You can't use models trained by the previous implementation. Sorry for the inconvenience!

License

MIT

Reference

The code in this repository is based on the following nice works; thanks to the authors.

  • chainer-gogh: Chainer implementation of neural-style. I referenced it heavily.
  • chainer-cifar10: referenced for the residual block implementation.

chainer-fast-neuralstyle's People

Contributors

6o6o, hiyorimi, yusuketomoto

chainer-fast-neuralstyle's Issues

error in sh setup_model.sh

macdeMacBook-Pro:chainer-fast-neuralstyle-master yep$ sh setup_model.sh
load VGG16 caffemodel
Traceback (most recent call last):
  File "create_chainer_model.py", line 34, in <module>
    ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')
  File "/usr/local/lib/python3.6/site-packages/chainer/links/caffe/caffe_function.py", line 139, in __init__
    net.MergeFromString(model_file.read())
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1063, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1089, in _InternalParse
    new_pos = local_SkipField(buffer, new_pos, end, tag_bytes)
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 850, in SkipField
    return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 820, in _RaiseInvalidWireType
    raise _DecodeError('Tag had invalid wire type.')
google.protobuf.message.DecodeError: Tag had invalid wire type.

Error: while run genvid.sh

When I run the following script:

./genvid.sh fox.mp4 fox-udnie.mp4 models/udnie_1.model

Error msg:

Could find no file with path 'frames/trans_fox_%d.png' and index in the range 0-4
frames/trans_fox_%d.png: No such file or directory

Style weights?

Hey @6o6o -- awesome improvements here to the code written at yusuketomoto/chainer-fast-neuralstyle. The model training is slow, but I've already seen marked improvements after the first epoch.

Say I wanted my image to have only a "50%" style transfer. How would you modify the generate.py code to apply a weighted percentage of the model?

I'm happy to make a contribution if you can point me in the right direction.

Cheers,
@timeemit

transfer video error: flow...: integer argument expected, got float

Linux version: Linux subuntu 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
GPU: NVIDIA Corporation Device [10de:10f0] (rev a1)

In Video Processing, after transforming my video to .png images,
I got the following error:

$ python generate.py 'frames/*.png' -m models/seurat.model -o dir/prefix_ -flow 0.02
frames/VID_20170413_212021_1.png 18.1061830521 sec
Calculating flow
Traceback (most recent call last):
  File "generate.py", line 57, in <module>
    flow = cv2.calcOpticalFlowFarneback(img1, img2, 0.5, 3, 15, 3, 5, 1.2, 0)
TypeError: integer argument expected, got float

What's wrong?

.State and .Model

Hi, when training I set the checkpoint to save every 1000 iterations, and it writes two files:

.state and .model

Can I use these to test? Which file do I use?

Also, if training crashes, can I restart training from one of these checkpoints?

Converting VGG model

When executing the command:
sh setup_model.sh
the VGG model consumes most of my memory (~3.3 GB) and it gets stuck here:
ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')

How long does it take for you to load this model? I waited a long time and the script didn't even start copying the weights.

Artifacts after training a new model

Hey guys!

I'm having some issues with the Chainer implementation of neural style. Any help is super appreciated. I've tried this twice now and I still get these artifacts in the image.

This is the style:
[style image: scream-remix-crop]

I ran it on about 50k images and it gives this

[output image: 9_scream-remix_output]

As you can see in the image above, there seems to be some artifact where it looks super noisy. What could cause this? Have you seen this before? Any advice on fixing it?

Thank you for your help!

Dramatically slow after upgrading Debian

Hi!
Could anyone help? After running apt-get upgrade on my Debian system,
I found that generate.py now takes 1800 sec instead of the 20 sec it took earlier!
I am very sad and understand nothing!
What is the reason for this hell?
Please help me!

Larger style features

Any advice on training settings to achieve larger or more abstract features?
For example, seeing more detail in large brush strokes, etc.?

TypeError: integer argument expected, got float

When I run the following commands, I get the following error message:

Calculating flow
Traceback (most recent call last):
  File "generate.py", line 57, in <module>
    flow = cv2.calcOpticalFlowFarneback(img1, img2, 0.5, 3, 15, 3, 5, 1.2, 0)
TypeError: integer argument expected, got float
