
simulated-unsupervised-tensorflow's Introduction

Simulated+Unsupervised (S+U) Learning in TensorFlow

TensorFlow implementation of Learning from Simulated and Unsupervised Images through Adversarial Training.

[model architecture figure]

Requirements

  • Python 2.7
  • TensorFlow 0.12.1 (the version referenced in the issue reports below)

Usage

To generate synthetic dataset:

  1. Run UnityEyes with the resolution set to 640x480 and the camera parameters set to [0, 0, 20, 40].
  2. Move the generated images and .json files into data/gaze/UnityEyes.

The data directory should look like:

data
├── gaze
│   ├── MPIIGaze
│   │   └── Data
│   │       └── Normalized
│   │           ├── p00
│   │           ├── p01
│   │           └── ...
│   └── UnityEyes # contains images of UnityEyes
│       ├── 1.jpg
│       ├── 1.json
│       ├── 2.jpg
│       ├── 2.json
│       └── ...
├── __init__.py
├── gaze_data.py
├── hand_data.py
└── utils.py

To train a model (samples will be generated in the samples directory):

$ python main.py
$ tensorboard --logdir=logs --host=0.0.0.0

To refine all synthetic images with a pretrained model:

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/"

Training results

Differences with the paper

  • Used the Adam and Stochastic Gradient Descent optimizers.
  • Used only 83K synthetic images from UnityEyes (14% of the 1.2M used in the paper).
  • Manually chose hyperparameters for B and lambda because they are not specified in the paper (lambda enters the refiner loss as sketched below).
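For reference, a minimal sketch (ours, not the repo's exact code) of how reg_scale (lambda) combines the two terms of the SimGAN refiner loss; the tensor names and label convention are assumptions:

import tensorflow as tf

def refiner_loss(d_logits_on_refined, refined, synthetic, reg_scale):
  # Adversarial term: push the discriminator to label refined images
  # as "real" (class 1 here; the label convention is our assumption).
  real_labels = tf.ones(tf.shape(d_logits_on_refined)[:-1], dtype=tf.int32)
  adv_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
      logits=d_logits_on_refined, labels=real_labels))
  # Self-regularization: an L1 penalty keeping R(x) close to x, so the
  # synthetic annotations remain valid for the refined image.
  reg_loss = tf.reduce_mean(tf.abs(refined - synthetic))
  return adv_loss + reg_scale * reg_loss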

Experiments #1

For these synthetic images,

[UnityEyes samples]

Result of lambda=1.0 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=sgd

[refined samples, lambda=1.0]

Result of lambda=0.5 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=sgd

[refined samples, lambda=0.5]

Training loss of discriminator and refiner when lambda is 1.0 (green) and 0.5 (yellow).

[training loss curves]

Experiments #2

For these synthetic images,

[UnityEyes samples]

Result of lambda=1.0 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=adam

[refined samples, lambda=1.0]

Result of lambda=0.5 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=adam

[refined samples, lambda=0.5]

Result of lambda=0.1 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.1 --optimizer=adam

[refined samples, lambda=0.1]

Training loss of discriminator and refiner when lambda is 1.0 (blue), 0.5 (purple) and 0.1 (green).

[training loss curves]

Author

Taehoon Kim / @carpedm20

simulated-unsupervised-tensorflow's People

Contributors

alex-mocanu, carpedm20, kmyi, soledad89, yunjey

simulated-unsupervised-tensorflow's Issues

Better explanation on testing

Hi, first of all, thanks for the great work! It was really easy to use.
I had a problem with refining my images with a pretrained model though.

In the readme you say, "To refine all synthetic images with a pretrained model":

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/"

You are missing the load_path argument. Apparently, it is the path to the directory containing the model, and it is relative to the logs dir. For example:

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/" --load_path generative_2017-03-07_01-40-07

There is no indication whatsoever that the model is loaded rather than initialized during testing. And of course, if it is initialized, it will simply write garbage to the refined images, leaving you wondering.

I don't know, though, how to make it clear whether the model was loaded or initialized when using tf.train.Supervisor; maybe specify wait_for_checkpoint=True when calling prepare_or_wait_for_session.
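A minimal sketch of that suggestion (ours; the logdir value is just the example above):

import tensorflow as tf

sv = tf.train.Supervisor(logdir="logs/generative_2017-03-07_01-40-07")
# wait_for_checkpoint=True blocks until a checkpoint appears instead of
# silently falling back to fresh initialization, so refinement can never
# run on untrained weights.
sess = sv.prepare_or_wait_for_session(wait_for_checkpoint=True)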

activation_fn in refiner and discriminator defaults to None

In layers.py

def conv2d(inputs, num_outputs, kernel_size, stride,
           layer_dict={}, activation_fn=None,
           #weights_initializer=tf.random_normal_initializer(0, 0.001),
           weights_initializer=tf.contrib.layers.xavier_initializer(),
           scope=None, name="", **kargv):
  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size,
      stride, activation_fn=activation_fn, 
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer(dtype=tf.float32), scope=scope, **kargv)
  if name:
    scope = "{}/{}".format(name, scope)
  _update_dict(layer_dict, scope, outputs)
  return outputs

and in model.py

  def _build_refiner(self, layer):
    with tf.variable_scope("refiner") as sc:
      layer = conv2d(layer, 64, 3, 1, scope="conv_1")
      layer = repeat(layer, 4, resnet_block, scope="resnet")
      layer = conv2d(layer, 1, 1, 1, 
                     activation_fn=None, scope="conv_2")
      output = tanh(layer, name="tanh")
      self.refiner_vars = tf.contrib.framework.get_variables(sc)
    return output 

  def _build_discrim(self, layer, name, reuse=False):
    with tf.variable_scope("discriminator", reuse=reuse) as sc:
      layer = conv2d(layer, 96, 3, 2, scope="conv_1", name=name)
      layer = conv2d(layer, 64, 3, 2, scope="conv_2", name=name)
      layer = max_pool2d(layer, 3, 1, scope="max_1", name=name)
      layer = conv2d(layer, 32, 3, 1, scope="conv_3", name=name)
      layer = conv2d(layer, 32, 1, 1, scope="conv_4", name=name)
      logits = conv2d(layer, 2, 1, 1, scope="conv_5", name=name)
      output = tf.nn.softmax(logits, name="softmax")
      self.discrim_vars = tf.contrib.framework.get_variables(sc)
    return output, logits

Activation is None in most convolution layers.
Is this OK? I think the gradients may not propagate properly.
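For comparison, the same wrapper does accept a nonlinearity per call; a hypothetical variant of the first discriminator layer (ours, not the repo's code):

# conv2d is the wrapper from layers.py quoted above; it forwards
# activation_fn straight to slim.conv2d.
layer = conv2d(layer, 96, 3, 2, activation_fn=tf.nn.relu,
               scope="conv_1", name=name)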

zeros_initializer() error

Hi,
I'm using TensorFlow for the first time, so I don't know why this happens.
Anyway, when trying your code with a freshly installed TensorFlow 0.12.1, I get this error inside layers.py:
TypeError: zeros_initializer() takes at least 1 argument (1 given)
I fixed the error by substituting biases_initializer=tf.zeros_initializer(dtype=tf.float32) with
biases_initializer=tf.zeros_initializer
as suggested by a user for the similar TensorFlow issue #5742.
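Applied to the conv2d wrapper quoted in the previous issue, the one-line change looks like this (assuming TF 0.12.x, where tf.zeros_initializer is a plain function rather than an initializer factory):

  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size,
      stride, activation_fn=activation_fn,
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer,  # pass the function, uncalled
      scope=scope, **kargv)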

Multi-GPU

Hi,
can this code run on multiple GPUs?
Thanks

error

I am getting this error and am unable to correct it; please help.

Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/Users/neutrino/apple/apple/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 17, in main
    trainer = Trainer(config, rng)
  File "/Users/neutrino/apple/simulated-unsupervised-tensorflow/trainer.py", line 35, in __init__
    self.data_loader = DataLoader(config, rng=self.rng)
  File "/Users/neutrino/apple/simulated-unsupervised-tensorflow/data/gaze_data.py", line 167, in __init__
    self.synthetic_data_dims = list(imread(self.synthetic_data_paths[0]).shape) + [1]
IndexError: index 0 is out of bounds for axis 0 with size 0
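The IndexError fires because synthetic_data_paths came back empty, i.e. no UnityEyes images were found. A quick sanity check (ours; the glob pattern is an assumption based on the README layout):

from glob import glob

# If this list is empty, synthetic_data_paths[0] will raise the
# IndexError above; populate data/gaze/UnityEyes with .jpg files first.
paths = glob("./data/gaze/UnityEyes/*.jpg")
print(len(paths))  # must be > 0 before running main.py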

ValueError: all input arrays must have the same shape

Hello, when I prepared the dataset and tried to run your code, this problem happened. I don't know what's wrong; could you please help me out? Thank you!

[*] MODEL dir: logs/generative_2017-09-18_16-23-09
[*] PARAM path: logs/generative_2017-09-18_16-23-09/params.json
[*] Training starts...
Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 21, in main
    trainer.train()
  File "/home/zhangboshen/src/tensorflow/simulated-unsupervised-tensorflow/trainer.py", line 83, in train
    self.data_loader.synthetic_data_paths[idxs]]
  File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/numpy/core/shape_base.py", line 347, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

The coordinate values in the .json files

I used 640×480 UnityEyes image data and checked that the size of the output data is 55×35.
Next, I want to correct the coordinate values in the .json files.
I read gaze_data.py, but I only found the coordinate correction in the cropping step.
Will the .json files be adjusted to 55×35 coordinates when the output image is refined?

Cropping of Unity Eye Dataset

I just had a small doubt regarding the Unity data generation.
If we run the Unity exe file, we get an image against a 50% grey area, but I see that you use properly cropped eye images. Can you please let me know how this automated cropping is possible?

Batch Norm missing?

Hey, just wondering if there was a reason that batch norm was left out of the resnet blocks. Thanks for the repo!

TypeError: zeros_initializer() takes at least 1 argument

I got the error below when running the main.py script. I am using TensorFlow 0.12.1. Can somebody help me out? Thanks a lot.

[!] Found images in data/gaze/UnityEyes.
[*] # of synthetic data: 7373, # of cropped_data: 7373
[*] Finished preprocessing synthetic gaze data.
[*] Save samples images in data/gaze
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2699: VisibleDeprecationWarning: rank is deprecated; use the ndim attribute or function instead. To find the rank of a matrix see numpy.linalg.matrix_rank.
  VisibleDeprecationWarning)
Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 17, in main
    trainer = Trainer(config, rng)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/trainer.py", line 37, in __init__
    self.model = Model(config, self.data_loader)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 27, in __init__
    self._build_model()
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 111, in _build_model
    self.R_x = self._build_refiner(self.normalized_x)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 294, in _build_refiner
    layer = repeat(layer, 4, resnet_block, scope="resnet")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 47, in repeat
    outputs = slim.repeat(inputs, repetitions, layer, **kargv)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1670, in repeat
    outputs = layer(outputs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 37, in resnet_block
    padding=padding, activation_fn=tf.nn.relu, scope="conv1")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 61, in conv2d
    biases_initializer=tf.zeros_initializer(dtype=tf.float32), scope=scope, **kargv)
TypeError: zeros_initializer() takes at least 1 argument (1 given)

I found that tf.zeros_initializer() should take another argument named shape, but I have no idea what the shape argument should be set to. Besides, when I removed the dtype=tf.float32 argument, errors appeared in other parts.

Semantic Image Segmentation

Hello,

Can this method be used for semantic image segmentation? I have a dataset of unlabeled real images and a dataset of labeled synthetic images. Will the large resolution of the images be a problem?

Curious about the constant used in the normalize function

In layers.py, there is a normalize function that has a constant of 127.5:

def normalize(layer):
  return layer/127.5 - 1.

I'm a little confused as to where the 127.5 comes from. It's a very specific question, of course, but I'm interested in extending the regularization loss function with other types of transforms beyond the identity mapping used in the paper. If you have any tips or pointers on modifying that, I'd love to hear them. Great work and thanks for doing this!
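For what it's worth, 255/2 = 127.5, so the function maps 8-bit pixel values in [0, 255] linearly onto [-1, 1], matching the tanh output range of the refiner quoted earlier; a quick check:

import numpy as np

# 0 -> -1, 127.5 -> 0, 255 -> 1: the usual GAN input convention.
pixels = np.array([0., 127.5, 255.])
print(pixels / 127.5 - 1.)  # [-1.  0.  1.]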

Memory footprint

What is the memory footprint of the training? How much VRAM is required as-is?

Motivation behind the denormalize function

The denormalize function in layers.py is defined as (layer + 1.)/2. If the aim is to revert the earlier normalization, shouldn't we have (layer + 1.)*127.5?

Asking because I'm facing a problem where the refined images are extremely dark (low intensity), since the pixel values are very low after denormalization.
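One possible reading (our assumption, not confirmed by the author): (layer + 1.)/2. targets a float image range of [0, 1], which only displays correctly if the image writer rescales floats; for 8-bit output, the (layer + 1.)*127.5 form suggested above would indeed be needed:

import numpy as np

refined = np.array([-1., 0., 1.])                 # tanh output range
print((refined + 1.) / 2.)                        # [0.  0.5 1. ]  float range
print(((refined + 1.) * 127.5).astype(np.uint8))  # [  0 127 255]  uint8 range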

Does it support GPU?

The training procedure is a little slow. How can I add support for GPU training?

Where is output folder?

I tried to refine all the images:
python2 main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/" --log_dir="./logs/*******/"

But the output folder didn't contain any refined images.
Where are the refined files? Do I have to designate an output folder?

Hyperparameter choice, cropping decisions

First off -- great code; a joy to read. Thank you for sharing it.

You mention regarding a few of the hyperparams: "Manually choose hyperparameters for B and lambda because those are not specified in the paper." (Have you attempted to ask the authors, rather than guessing?) Looks like you ran a few lambda experiments. For the buffer size B, it seems that a buffer size of 25600 vs. a batch size of 512 results in a rather long history (with random replacement). I wonder how much history is too much.

Regarding cropping gaze data:
synth renders start as 640x480, cropped to 140x84, resized to 55x35
real data start as 60x36, resized to 55x35 (specified by paper)
Are the crop parameters just based on a best-guess visual alignment of synth to real?

More generally, what is your level of confidence in the fidelity of this implementation to the original? Your refined eyeballs don't look quite as compelling as those in the S+U paper, but I don't know if that's a function of fewer synthetic images, fewer training iterations, or something else.

Thanks again!

Local Adversarial loss

Hi,

In the paper the authors mention a local adversarial loss and the fact that the discriminator is fully convolutional; however, I couldn't find that part in your code. Could you point me to it?

Thanks
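For context, a minimal sketch of a local adversarial loss (ours, following the paper's description; note that the _build_discrim quoted in an earlier issue is already fully convolutional, so its conv_5 logits form exactly such a per-patch map):

import tensorflow as tf

def local_adversarial_loss(logit_map, is_real):
  # logit_map: [batch, h, w, 2] from a fully convolutional discriminator;
  # every spatial position is one local "patch" classified real vs. refined.
  labels = tf.fill(tf.shape(logit_map)[:3], 1 if is_real else 0)
  per_patch = tf.nn.sparse_softmax_cross_entropy_with_logits(
      logits=logit_map, labels=labels)
  # Sum the cross-entropy over all local patches, average over the batch.
  return tf.reduce_mean(tf.reduce_sum(per_patch, axis=[1, 2]))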

SimGAN input

@carpedm20 Hello, I'm a beginner in image processing.
Could you please tell me whether the input of SimGAN is a patch of a synthetic image or a whole synthetic image?
If the input is a patch, what is the patch size?
Thank you very much!

Eye images from an infrared camera

Hello, can I refine the UnityEyes eye images against real infrared images and keep the labels?
