
chainer-fast-neuralstyle's Introduction

Chainer implementation of "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

Fast artistic style transfer using a feed-forward network.

  • input image size: 1024x768
  • process time (CPU): 17.78 sec (Core i7-5930K)
  • process time (GPU): 0.994 sec (TitanX)

Differences from original

Training

  • default --image_size is set to 512 (the original uses 256). It's slower, but time is the price you have to pay for quality
  • ability to switch off dataset cropping with the --fullsize option. Cropping is on by default to preserve aspect ratio
  • the cropping implementation uses ImageOps.fit, which always scales and crops, whereas the original uses a custom solution that upscales the image if it is smaller than --image_size and otherwise just crops without scaling (see the sketch after this list)
  • bicubic and Lanczos resampling are used when scaling dataset and input style images respectively, giving sharper downscaling than the nearest-neighbour resampling in the original
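
A minimal sketch of the default dataset preprocessing described above, assuming PIL is used; the function name and the --fullsize behaviour shown here are illustrative, not the repository's exact code:

# Illustrative sketch only, not the repository's actual preprocessing code.
from PIL import Image, ImageOps

def load_training_image(path, image_size=512, fullsize=False):
    img = Image.open(path).convert('RGB')
    if fullsize:
        # --fullsize: skip cropping and resize directly to the target square
        return img.resize((image_size, image_size), Image.BICUBIC)
    # Default: ImageOps.fit always scales and center-crops,
    # preserving aspect ratio, with bicubic resampling
    return ImageOps.fit(img, (image_size, image_size), Image.BICUBIC)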

Generating

  • Ability to specify multiple input files, so the model isn't reloaded on every iteration. The format follows standard Unix path-expansion rules, e.g. file* or file?.png. Don't forget to quote the pattern, otherwise the shell will expand it first. This saves about 0.5 sec per image.
  • Output specifies a path prefix if multiple files are used for input, otherwise an explicit filename
  • Option -x sets the content-image scaling factor applied before transformation
  • Preserve the original content colors with the --original_colors flag. More info: Transfer style but not the colors. A combined example follows this list.
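
Putting the options above together, a hypothetical invocation might look like this (the file names, output prefix and scaling value are made up for illustration):

python generate.py 'frames/image*.png' -m models/composition.model -o out/styled_ -x 0.5 --original_colors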

Video Processing

The repo includes a bash script to transform your videos. It depends on ffmpeg (see its compilation instructions).

./genvid.sh input_video output_video model start_time duration

The first three arguments are mandatory and should contain paths to the files.
The last two are optional and indicate the starting position and duration in seconds.
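
For example, a hypothetical invocation that styles 30 seconds of a clip starting at the 10-second mark (file names are illustrative):

./genvid.sh fox.mp4 fox-styled.mp4 models/composition.model 10 30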

I integrated the Optical Flow implementation by @larspars to provide more consistent output for a sequence of images by smoothing out the differences between frames. It requires opencv-python. Separate thanks to @genekogan for providing a thorough explanation of the remarkably simple, yet efficient, steps to put this together.

To use it, append the -flow option followed by the amount of alpha blending, like so:

python generate.py 'frames/image*.png' -m models/any.model -o dir/prefix_ -flow 0.02

I find that values between 0.02 and 0.05 work best. It calculates the motion vectors between the previous and current source frames, applies the resulting distortion to the previously transformed frame, overlays it on top of the current source frame with -flow opacity, and finally transforms the result. This helps the network reveal the same features in the current frame as were discovered in the previous one. It only affects sequences of images, so if there's a single image in the list you won't see any difference.
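
Conceptually, the per-frame blending step works roughly like the sketch below, using OpenCV's Farneback optical flow; the function and variable names are illustrative, and this is not the actual generate.py code:

# Illustrative sketch of the optical-flow blending step; not the actual generate.py code.
import cv2
import numpy as np

def blend_with_previous(prev_src, cur_src, prev_styled, alpha=0.02):
    g_prev = cv2.cvtColor(prev_src, cv2.COLOR_BGR2GRAY)
    g_cur = cv2.cvtColor(cur_src, cv2.COLOR_BGR2GRAY)
    # Dense backward flow: for each pixel of the current frame, where it was in the previous frame
    flow = cv2.calcOpticalFlowFarneback(g_cur, g_prev, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g_cur.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # Apply the distortion to the previously transformed frame
    warped = cv2.remap(prev_styled, map_x, map_y, cv2.INTER_LINEAR)
    # Overlay it on the current source frame with -flow opacity; the result is then stylized
    return cv2.addWeighted(cur_src, 1.0 - alpha, warped, alpha, 0)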

Requirement

$ pip install chainer

Prerequisite

Download the VGG16 model and convert it into a smaller file so that we use only the convolutional layers, which are about 10% of the entire model.

sh setup_model.sh
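
Under the hood the conversion roughly amounts to the sketch below, assuming Chainer's CaffeFunction is used to parse the caffemodel; the layer selection and output file name are illustrative, not the repository's exact create_chainer_model.py:

# Rough illustrative sketch; not the repository's actual create_chainer_model.py.
import pickle
from chainer.links.caffe import CaffeFunction

ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')  # slow: parses the full caffemodel

conv_weights = {}
for name, link in ref.namedlinks(skipself=True):
    if 'conv' in name:
        # keep only the convolutional layers (roughly 10% of the model)
        conv_weights[name] = (link.W.data, link.b.data)

with open('vgg16_conv.pkl', 'wb') as f:
    pickle.dump(conv_weights, f)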

Train

You need to train one image transformation network model per style target. According to the paper, the models are trained on the Microsoft COCO dataset.

python train.py -s <style_image_path> -d <training_dataset_path> -g 0
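
For context, the style term of the perceptual loss from the paper compares Gram matrices of VGG16 feature maps. A minimal NumPy sketch of that loss (not the actual code in train.py):

# Minimal NumPy sketch of the Gram-matrix style loss from the paper; not the code in train.py.
import numpy as np

def gram_matrix(feats):
    # feats: feature maps of shape (batch, channels, height, width)
    b, c, h, w = feats.shape
    f = feats.reshape(b, c, h * w)
    # channel-to-channel correlations, normalized by the feature map size
    return np.matmul(f, f.transpose(0, 2, 1)) / (c * h * w)

def style_loss(generated_feats, style_feats):
    # squared Frobenius distance between the Gram matrices
    return np.sum((gram_matrix(generated_feats) - gram_matrix(style_feats)) ** 2)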

Generate

python generate.py <input_image_path> -m <model_path> -o <output_image_path>

This repo includes pretrained models as examples.

  • example:
python generate.py sample_images/tubingen.jpg -m models/composition.model -o sample_images/output.jpg

or

python generate.py sample_images/tubingen.jpg -m models/seurat.model -o sample_images/output.jpg

Difference from paper

  • Convolution kernel size 4 instead of 3.
  • Training with batch size n >= 2 causes unstable results.

No Backward Compatibility

Jul. 19, 2016

This version is not compatible with the previous versions. You can't use models trained by the previous implementation. Sorry for the inconvenience!

License

MIT

Reference

The code in this repository is based on the following nice works; thanks to the authors.

  • chainer-gogh: Chainer implementation of neural-style. I referenced it heavily.
  • chainer-cifar10: referenced for the residual block implementation.

chainer-fast-neuralstyle's People

Contributors

6o6o, hiyorimi, yusuketomoto

chainer-fast-neuralstyle's Issues

error in sh setup_model.sh

macdeMacBook-Pro:chainer-fast-neuralstyle-master yep$ sh setup_model.sh
load VGG16 caffemodel
Traceback (most recent call last):
  File "create_chainer_model.py", line 34, in <module>
    ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')
  File "/usr/local/lib/python3.6/site-packages/chainer/links/caffe/caffe_function.py", line 139, in __init__
    net.MergeFromString(model_file.read())
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1063, in MergeFromString
    if self._InternalParse(serialized, 0, length) != length:
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1089, in _InternalParse
    new_pos = local_SkipField(buffer, new_pos, end, tag_bytes)
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 850, in SkipField
    return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
  File "/usr/local/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 820, in _RaiseInvalidWireType
    raise _DecodeError('Tag had invalid wire type.')
google.protobuf.message.DecodeError: Tag had invalid wire type.

Error: while run genvid.sh

When I run the following script:

./genvid.sh fox.mp4 fox-udnie.mp4 models/udnie_1.model

Error msg:

Could find no file with path 'frames/trans_fox_%d.png' and index in the range 0-4
frames/trans_fox_%d.png: No such file or directory

Style weights?

Hey @6o6o -- awesome improvements here to the code written at yusuketomoto/chainer-fast-neuralstyle. The model training is slow, but I've already seen marked improvements after the first epoch.

Say I wanted my image to have only a "50%" style transfer. How would you modify the generate.py code to apply a weighted percentage of the model?

I'm happy to make a contribution if you can point me in the right direction.

Cheers,
@timeemit

transfer video error: flow...: integer argument expected, got float

Linux version: Linux subuntu 4.4.0-31-generic #50~14.04.1-Ubuntu SMP Wed Jul 13 01:07:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
GPU: NVIDIA Corporation Device [10de:10f0] (rev a1)

In Video Processing, after transforming my video to .png images,
I got the following error:

$ python generate.py 'frames/*.png' -m models/seurat.model -o dir/prefix_ -flow 0.02
frames/VID_20170413_212021_1.png 18.1061830521 sec
Calculating flow
Traceback (most recent call last):
  File "generate.py", line 57, in <module>
    flow = cv2.calcOpticalFlowFarneback(img1, img2, 0.5, 3, 15, 3, 5, 1.2, 0)
TypeError: integer argument expected, got float

What's wrong?

.State and .Model

Hi, when training I set the checkpoint to save every 1000 iterations, and it writes two files:

.state and .model

Can I use these to test? Which file do I use?

Also, if training crashes, can I restart training from one of these checkpoints?

Converting VGG model

When executing the command:
sh setup_model.sh
the VGG model consumes most of my memory (~3.3 GB) and it gets stuck here:
ref = CaffeFunction('VGG_ILSVRC_16_layers.caffemodel')

How long does it take for you to load this model? I waited a long time and the script didn't even start copying the weights.

Artifacts after training a new model

Hey guys!

I'm having some issues with the Chainer implementation of neural style. Any help is super appreciated. I've tried this twice now and I still get these artifacts in the image.

This is the style:
[style image: scream-remix-crop]

I ran it on about 50k images and it gives this

[output image: 9_scream-remix_output]

As you can see in the image above, there seems to be some artifact where it looks super noisy. What could cause this? Have you seen this before? Any advice on fixing it?

Thank you for your help!

Dramatically slow after upgrading Debian

Hi!
Could anyone help? After running apt-get upgrade on my Debian system,
I found that generate.py now takes 1800 sec instead of the 20 sec it took earlier!
I am very sad and understand nothing!
What is the reason for this hell?
Please help me!

Larger style features

Any advice on training settings to achieve larger or more abstract features?
For example, seeing more detail in large brush strokes, etc.?

TypeError: integer argument expected, got float

When I run the following commands, I get the following error message:

Calculating flow
Traceback (most recent call last):
  File "generate.py", line 57, in <module>
    flow = cv2.calcOpticalFlowFarneback(img1, img2, 0.5, 3, 15, 3, 5, 1.2, 0)
TypeError: integer argument expected, got float
