vanhuyz / CycleGAN-TensorFlow
An implementation of CycleGAN using TensorFlow
License: MIT License
When I train the model, the lines below show up:
2017-10-25 21:03:30.854048: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854067: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854071: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854074: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-10-25 21:03:30.854077: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
But I don't want the CPU to do the computations; how can I change this?
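For what it's worth, these lines are only warnings about CPU SIMD instruction sets (SSE/AVX/FMA); they do not mean the computation runs on the CPU. If a GPU build of TensorFlow is installed, training still uses the GPU, and the messages can be silenced by raising TensorFlow's C++ log level. A small sketch:

```python
# Set this before importing tensorflow; '2' hides INFO and WARNING messages from the C++ runtime.
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
# Quick check that a GPU is actually visible; prints e.g. '/device:GPU:0', or '' if there is none.
print(tf.test.gpu_device_name())
```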
Hi, first of all, thanks for this great work!!
But I'm curious: why do you set the true label to 0.9?
I can't find any description of this in the CycleGAN paper.
Is there any problem with setting the true label to 1.0?
Thanks for your explanation :)
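For reference, 0.9 for the real label is one-sided label smoothing (Salimans et al., "Improved Techniques for Training GANs"): 1.0 also works, but softening the real target slightly tends to keep the discriminator from becoming overconfident. A minimal sketch of the idea with the least-squares loss this repository uses; the variable names below are illustrative, not quotes from model.py:

```python
REAL_LABEL = 0.9  # one-sided label smoothing; 1.0 is also valid but can make D overconfident

# Least-squares GAN losses (illustrative names):
d_loss_real = tf.reduce_mean(tf.squared_difference(d_real, REAL_LABEL))  # real samples -> 0.9
d_loss_fake = tf.reduce_mean(tf.square(d_fake))                          # fake samples -> 0.0
g_loss = tf.reduce_mean(tf.squared_difference(d_fake, REAL_LABEL))       # generator wants D(fake) -> 0.9
```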
Which parameters in the code should I change? I also have .bmp files.
Thanks.
If your `image_size` is different from 256, calling `export_graph` won't work. E.g. if `image_size` < 256, errors are thrown saying that certain weights and biases could not be found in the checkpoint.
Solution: `image_size` needs to be passed to the `CycleGAN` constructor, otherwise the network will have the wrong structure. Adding `image_size=FLAGS.image_size` to the parameters fixed this for me.
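A sketch of what that change might look like in export_graph.py, assuming the CycleGAN constructor accepts the same image_size keyword that train.py passes; the other keyword arguments are placeholders for whatever export_graph.py already sets:

```python
# Only image_size=FLAGS.image_size is the point of this sketch; the other arguments are placeholders.
cycle_gan = CycleGAN(ngf=FLAGS.ngf, norm=FLAGS.norm, image_size=FLAGS.image_size)
```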
Here, `tf.gfile.FastGFile` should use binary ('b') mode; otherwise, with Python 3.4 and Python 3.5 it returns an error like: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte.
Using 'rb' mode works for me.
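For reference, the change is simply opening the file in binary mode so the raw JPEG bytes are read without any text decoding; the path variable below is illustrative:

```python
# 'rb' (read binary) avoids the UTF-8 decode error on Python 3; the file contains encoded image bytes.
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
```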
Hi, shouldn't the default be ngf=32, since you're trying to get 32 feature maps after the 1st convolution?
The implementation of the residual block, function `Rk` in ops.py, returns:
```python
...........
output = input + normalized2
return output
```
but in the original paper the addition should be followed by a ReLU activation. I think it should look like this:
```python
...........
output = input + normalized2
return tf.nn.relu(output)
```
I used monet2photo downloaded from https://github.com/junyanz/CycleGAN.
The training step is 9100, the other parameters are all defaults, and the result looks like this.
What are the possible reasons?
As is known, in the classic GAN the data used to update the G and D nets comes from two different batches (here I only consider one of the two GANs). This is my first time trying TFRecord, so I have one question: is the data from the same batch here? If not, I guess a new batch is drawn each time, whether it is G or D being optimized?
Hi, I'm trying to generate smiling faces, using ordinary faces as X and genki4k as Y.
X includes 600 pics, Y includes 500 pics, all resized to 96*96.
After running about 41k steps, I tried to generate Y from X, but the faces are very hard to recognize.
Please advise.
INFO:root:-----------Step 41300:-------------
INFO:root: G_loss : 2.0845248699188232
INFO:root: D_Y_loss : 0.11265528202056885
INFO:root: F_loss : 2.6741137504577637
INFO:root: D_X_loss : 0.1405256986618042
This repository processes fixed-size images; however, I need to input and output images of different sizes and make sure the outputs keep the same size as the original images.
Can it be modified to support this? And how would I modify it?
Thank you very much.
What if I change:
`with tf.variable_scope(self.name, reuse=self.reuse)`
to:
`with tf.variable_scope(self.name, reuse=tf.AUTO_REUSE)`
Will it give the same result?
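In case a concrete check helps: tf.AUTO_REUSE creates the variables on the first call to a scope and silently reuses them afterwards, which is what the manual self.reuse flag accomplishes, so the results should match as long as the variable names stay the same. A tiny standalone TF 1.x example (not the repo's code):

```python
import tensorflow as tf

def layer(x):
    # AUTO_REUSE: 'G/w' is created on the first call and reused on every later call.
    with tf.variable_scope('G', reuse=tf.AUTO_REUSE):
        w = tf.get_variable('w', shape=[], initializer=tf.ones_initializer())
    return x * w

x = tf.placeholder(tf.float32, [None])
y1 = layer(x)
y2 = layer(x)  # no "variable already exists" error; both ops share the same weight
```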
Hi,
thank you for sharing this implementation.
Do you plan on adding support for the identity mapping loss described in the original paper?
thanks
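Not the author, but for anyone who wants to try it: the identity mapping loss from the paper is an extra L1 term that asks each generator to leave real images from its target domain unchanged. A minimal sketch with assumed names (G: X->Y, F: Y->X; the 0.5 factor relative to the cycle weight follows the authors' reference implementation):

```python
# Hedged sketch of the identity mapping loss; not part of this repository.
def identity_loss(G, F, x, y, lambda_idt):
    loss_idt_y = tf.reduce_mean(tf.abs(G(y) - y))  # G should leave real Y images unchanged
    loss_idt_x = tf.reduce_mean(tf.abs(F(x) - x))  # F should leave real X images unchanged
    return lambda_idt * (loss_idt_y + loss_idt_x)  # e.g. lambda_idt = 0.5 * cycle-loss weight
```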
How can I handle multiple images at a time in inference.py?
Hello,
What is the effect of using the image pool? Does it make the training of the discriminator more stable?
(apple2orange) I used the default values for all parameters. Do I need to make any changes?
Hello. I have a question: how can I use your code to train on aligned data?
Can CycleGAN deal with input images of different sizes?
Should I change the image size?
Thank You for your attention.
Hello, when I run your project it raises an error: ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,64,64,256]
[[Node: G_6/R256_2/layer2/instance_norm/moments/sufficient_statistics/Sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](G_6/R256_2/layer2/Conv2D, G_6/R256_2/layer2/instance_norm/moments/StopGradient)]]
[[Node: add_1/_497 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_84861_add_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
can you help me?
Thanks for your good code! May I ask why padding=2 is used in your resnet block (the padding that is removed again later)? Does it produce better results?
Will it work with grayscale images? What do I need to change?
Thanks.
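Not a definitive answer, but one low-effort route is to expand grayscale inputs to three channels at decode time, so the rest of the 3-channel pipeline stays untouched. A sketch under the assumption that you hook it in wherever the images are decoded (variable names are illustrative):

```python
# Decode as single-channel, then tile to 3 channels so downstream code sees RGB-shaped tensors.
image = tf.image.decode_jpeg(encoded_jpeg, channels=1)   # shape [H, W, 1]
image = tf.image.grayscale_to_rgb(image)                 # shape [H, W, 3]
```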
I cannot figure out how to set the maximum number of iterations in the code. Please help me understand, thanks.
Thanks for sharing your code, excellent job! I found a small typo in your code.
https://github.com/vanhuyz/CycleGAN-TensorFlow/blob/master/ops.py#L84
conv2 = tf.nn.conv2d(padded2, weights1, strides=[1, 1, 1, 1], padding='VALID')
It should be weights2 instead of weights1, right?
Thanks
After training the model for 10000 steps, the training process kept going. How should I stop it? Or what should I change in the code to change this?
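Ctrl+C is the simplest way, since checkpoints are written periodically. If you want an automatic cut-off instead, a hedged sketch of a step budget inside the training loop could look like this (the step counter and coordinator names are assumptions about train.py's loop, not quotes from it):

```python
# Assumed to sit at the end of the training loop body.
if step >= 10000:
    logging.info('Reached step %d, stopping training.', step)
    coord.request_stop()  # the while-loop condition checks coord.should_stop()
```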
Hello,
I get an error after ~1900 steps:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Nan in summary histogram for: D_X/fake
[[Node: D_X/fake = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](D_X/fake/tag, D_X_4/output/add/_1361)]]
INFO:root:-----------Step 1400:-------------
INFO:root: G_loss : 4.85528945923
INFO:root: D_Y_loss : 0.0743376016617
INFO:root: F_loss : 4.97825098038
INFO:root: D_X_loss : 0.0959008038044
INFO:root:-----------Step 1900:-------------
INFO:root: G_loss : 4.92536830902
INFO:root: D_Y_loss : 0.184448152781
INFO:root: F_loss : 4.78839635849
INFO:root: D_X_loss : 0.231521636248
INFO:root:-----------Step 2000:-------------
INFO:root: G_loss : 5.98963975906
INFO:root: D_Y_loss : 0.09952506423
INFO:root: F_loss : 5.93069791794
INFO:root: D_X_loss : 0.215163201094
Any advice would be much appreciated!!!
Hi, I think the utility function convert2float, called in the reader's _preprocess function, is incorrect. I believe the desired behavior is to convert the image from [0,255] int format to [-1,1] float format.
The code is:
```python
def convert2float(image):
  """ Transfrom from int image ([0,255]) to float tensor ([-1.,1.]) """
  image = tf.image.convert_image_dtype(image, dtype=tf.float32)
  return (image/127.5) - 1.0
```
The issue is in dividing the image by 127.5 after scaling. The tf.image.convert_image_dtype function scales images to [0,1] floats already, so you actually need to multiply by 2 and then subtract 1, not divide by 127.5.
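In other words, the corrected helper proposed here would be:

```python
def convert2float(image):
  """ Transform a uint8 image ([0, 255]) into a float tensor in [-1, 1]. """
  # convert_image_dtype rescales uint8 input to [0, 1], so map [0, 1] -> [-1, 1]
  # with *2 - 1 rather than dividing by 127.5.
  image = tf.image.convert_image_dtype(image, dtype=tf.float32)
  return (image * 2) - 1.0
```

Note the later comment in this thread about call order in the reader: resize_images converts the image to float first, so this fix only behaves as intended if the conversion happens while the image is still uint8.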
I see that the cycle_loss function already computes both the forward and backward loss. Was there any performance boost from including it twice in the GAN loss, for both x and y? The original paper and implementation seem to include it only once.
Whatever value you pass to the inference script, the output always gets written to output_sample.jpg.
I trained a model on a small dataset.
However, I have a much bigger and better dataset now.
Does anyone know how to reuse the pretrained model with the new dataset?
I replaced the former .tfrecord file with the new one, and used
$ tensorboard --logdir checkpoints/20180410-1445
to reload the model, but it didn't work.
It still shows the previous images in TensorBoard.
Does anyone know how to reuse the pretrained model with a different dataset?
Thank you!
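Note that tensorboard only visualizes the event logs; it never loads weights into a new training run. If train.py exposes a --load_model flag that takes the checkpoint folder's date-time name (I believe it does, but please check the flags defined at the top of train.py), continuing from the old checkpoint with the new .tfrecord files would look roughly like:
$ python train.py --load_model 20180410-1445 --X new_datasetA.tfrecords --Y new_datasetB.tfrecords
where the --X/--Y flag names for the new training records are my assumption and should also be checked against train.py.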
Great project! One feature I think would add a lot to it: being able to load a previously saved model and continue training.
Hi there,
Thanks for your great work! As you mentioned, when
"high contrast background colors between input and generated images are observed (e.g. black becomes white), you should restart your training!"
I have actually observed this problem through TensorBoard at around the 15th epoch (see images below). Is it due to insufficient training, or has the model already collapsed? The cycle loss still seems to have a very slow decreasing trend.
Thanks,
What should I do to train on my own database? Just replace the pictures in the data folder, or should I do something else?
It may be my environment's problem, but I really have no idea how to handle it. I don't even understand what this means.
$a@a >python inference.py --model pretrained/man2woman.pb --input data/test.jpg --output data/output.jpg --image_size 256
2017-09-06 16:39:42.586351: I C:\tf_jenkins\home\workspace\nightly-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX
Is it caused by the model or the platform?
Hi,
I'm trying to train on my own dataset (images that are not square), but when TF tries to calculate the discriminator's error_fake, I get an error saying the input and the weights belong to different graphs:
"%s must be from the same graph as %s." % (item, original_item))
ValueError: Tensor("Placeholder_1:0", shape=(1, 300, 640, 3), dtype=float32) must be from the same graph as Tensor("D_Y/C64/weights:0", shape=(4, 4, 3, 64), dtype=float32_ref).
It's weird to me, since this is very similar to calculating error_real (same D), only here D operates on fake_y instead of y.
Any idea where this is coming from?
thanks
Thank you for your wonderful code! Should the `updates_collections` argument in batch normalization be set to `None`, as suggested here?
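For context, that suggestion corresponds to passing updates_collections=None to tf.contrib.layers.batch_norm, so the moving mean/variance are updated in place rather than through the UPDATE_OPS collection. Whether it matters here depends on how ops.py wraps batch norm (and the default norm in this repository is instance norm, if I remember correctly). A hedged sketch, not the repo's code:

```python
# In-place moving-average updates: no need to run tf.GraphKeys.UPDATE_OPS separately.
normalized = tf.contrib.layers.batch_norm(
    x,
    decay=0.9,
    updates_collections=None,
    is_training=is_training,
    scope='batch_norm')
```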
If you have other pretrained models, please upload them as well. They would be of great use for those who are doing artistic style transfer comparisons between CNN-based and GAN-based models.
great work man 💯
Thanks!
ValueError: Attempted to map inputs that were not found in graph_def: [input_image:0]
Hi there, thanks for putting this repo together! I'm wondering what kind of throughput people are seeing for training? I'm getting about 1 iteration every 3 seconds with a batch size of 1. Seems a bit slow to me. What are other people getting with this implementation?
My own data includes 20 photos in trainA and 20 photos in trainB. The photos in trainA are two-channel images, and the photos in trainB are three-channel images. How can I change the code so it processes them successfully? Thank you!
Why isn't a sigmoid used in the discriminator's last layer for LSGAN? I don't understand.
When I run `python train.py`, it warns "E tensorflow/core/grappler/optimizers/dependency_optimizer.cc:484] The graph couldn't be sorted in topological order." (x2). The message doesn't stop the program from running; I just want to confirm whether this is okay.
I ran apple -> orange, but when transferring apple to orange the result is very poor, using the default image size. I also trained the original PyTorch version, and it generates quite nice pictures. Am I just not training for enough epochs?
The model ran smoothly when I used the test images you provided. But when I used my own images for training, after typing 'python train.py' the model gave no response, nothing but a flashing cursor. Why did that happen, and how can I fix it?
Hi, in continuation of my earlier point about the conversion to [-1,1] not being correct in utils, I have uncovered a bit of odd TensorFlow behavior that is relevant.
The tf.image.resize_images function takes an image and resizes it using some interpolation function (bilinear by default). By default this converts the image to float, since it needs to interpolate between values.
The tf.image.convert_image_dtype function converts a uint8 image in range [0,255] to the floating range [0,1], but only if it actually is uint8. If you pass an already floating-point image to tf.image.convert_image_dtype(image, dtype=tf.float32), the function does nothing.
So the current code is (in reader.py):
```python
def _preprocess(self, image):
    image = tf.image.resize_images(image, size=(self.image_size, self.image_size))
    image = utils.convert2float(image)
    image.set_shape([self.image_size, self.image_size, 3])
    return image
```
The first part implicitly converts the image to float, which means the call to convert_image_dtype inside utils.convert2float does nothing. This is why the current code sort of works, even though it is wrong. When I changed the /127.5 part in utils to *2, I got very large values, and the order of the calls in _preprocess is why.
In summary, the /127.5 - 1.0 scaling only yields the right [-1, 1] range because resize_images has already turned the image into floats that are still in [0, 255]; if convert_image_dtype ever actually rescaled the values to [0, 1], dividing by 127.5 would be wrong.
```python
import tensorflow as tf
import os.path
import matplotlib.image as mpimg
from PIL import Image

# Use a raw string so backslashes in the Windows path are not treated as escape sequences.
SAVE_PATH = r"C:\CycleGAN-TensorFlow-master\datasetB_new.tfrecords"

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def load_data(datafile, width, high, method=0, save=False):
    train_list = open(datafile, 'r')
    writer = tf.python_io.TFRecordWriter(SAVE_PATH)
    with tf.Session() as sess:
        label = 0
        for line in train_list:
            tmp = line.strip().split(' ')
            img_path = tmp[0]
            # Read the raw JPEG bytes, decode, and convert to float before serializing.
            image = tf.gfile.FastGFile(img_path, 'rb').read()
            image = tf.image.decode_jpeg(image)
            image = tf.image.convert_image_dtype(image, dtype=tf.float32)
            image = sess.run(image)
            image_raw = image.tostring()
            example = tf.train.Example(features=tf.train.Features(feature={
                'image_raw': _bytes_feature(image_raw),
                'label': _int64_feature(label),
            }))
            label = label + 1
            writer.write(example.SerializeToString())
        writer.close()

# Raw string again: otherwise "\t" in "\trainB.txt" would be interpreted as a tab character.
load_data(r'C:\CycleGAN-TensorFlow-master\samples\monet2photo\monet2photo\trainB.txt', 256, 256)
```
For example, I have a lot of unpaired images from two domains, A and B.
After training, what should I do to transfer A to B with the model?
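Roughly: freeze the trained checkpoint into a .pb graph with export_graph.py, then run inference.py on each image from domain A. The inference.py flags below match the command quoted earlier in this thread; the export_graph.py flags are my best recollection and should be checked against that script's flag definitions:
$ python export_graph.py --checkpoint_dir checkpoints/20180410-1445 --XtoY_model AtoB.pb --YtoX_model BtoA.pb --image_size 256
$ python inference.py --model AtoB.pb --input data/some_A_image.jpg --output data/output_B.jpg --image_size 256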