
simulated-unsupervised-tensorflow's Introduction

Simulated+Unsupervised (S+U) Learning in TensorFlow

TensorFlow implementation of Learning from Simulated and Unsupervised Images through Adversarial Training.

[model architecture figure]

Requirements

  • Python 2.7
  • TensorFlow 0.12.1 (the version referenced in the issue reports below)

Usage

To generate synthetic dataset:

  1. Run UnityEyes with the resolution set to 640x480 and the camera parameters set to [0, 0, 20, 40].
  2. Move the generated images and .json files into data/gaze/UnityEyes.

The data directory should look like:

data
├── gaze
│   ├── MPIIGaze
│   │   └── Data
│   │       └── Normalized
│   │           ├── p00
│   │           ├── p01
│   │           └── ...
│   └── UnityEyes # contains images of UnityEyes
│       ├── 1.jpg
│       ├── 1.json
│       ├── 2.jpg
│       ├── 2.json
│       └── ...
├── __init__.py
├── gaze_data.py
├── hand_data.py
└── utils.py

To train a model (samples will be generated in the samples directory):

$ python main.py
$ tensorboard --logdir=logs --host=0.0.0.0

To refine all synthetic images with a pretrained model:

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/"

Training results

Differences with the paper

  • Used the Adam and Stochastic Gradient Descent optimizers.
  • Used only 83K synthetic images from UnityEyes (14% of the 1.2M used in the paper).
  • Manually chose hyperparameters for B and lambda because they are not specified in the paper (lambda enters the refiner loss as sketched below).
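For reference, a minimal sketch (ours, not the repo's exact code) of how reg_scale (lambda) combines the two terms of the SimGAN refiner loss; the tensor names and label convention are assumptions:

import tensorflow as tf

def refiner_loss(d_logits_on_refined, refined, synthetic, reg_scale):
  # Adversarial term: push the discriminator to label refined images
  # as "real" (class 1 here; the label convention is our assumption).
  real_labels = tf.ones(tf.shape(d_logits_on_refined)[:-1], dtype=tf.int32)
  adv_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
      logits=d_logits_on_refined, labels=real_labels))
  # Self-regularization: an L1 penalty keeping R(x) close to x, so the
  # synthetic annotations remain valid for the refined image.
  reg_loss = tf.reduce_mean(tf.abs(refined - synthetic))
  return adv_loss + reg_scale * reg_loss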

Experiments #1

For these synthetic images,

[UnityEyes samples]

Result of lambda=1.0 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=sgd

[refined samples, lambda=1.0]

Result of lambda=0.5 with optimizer=sgd after 8,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=sgd

[refined samples, lambda=0.5]

Training loss of discriminator and refiner when lambda is 1.0 (green) and 0.5 (yellow).

[training loss curves]

Experiments #2

For these synthetic images,

[UnityEyes samples]

Result of lambda=1.0 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=1.0 --optimizer=adam

[refined samples, lambda=1.0]

Result of lambda=0.5 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.5 --optimizer=adam

[refined samples, lambda=0.5]

Result of lambda=0.1 with optimizer=adam after 4,000 steps.

$ python main.py --reg_scale=0.1 --optimizer=adam

[refined samples, lambda=0.1]

Training loss of discriminator and refiner when lambda is 1.0 (blue), 0.5 (purple) and 0.1 (green).

[training loss curves]

Author

Taehoon Kim / @carpedm20

simulated-unsupervised-tensorflow's People

Contributors

alex-mocanu, carpedm20, kmyi, soledad89, yunjey

simulated-unsupervised-tensorflow's Issues

Better explanation on testing

Hi, first of all, thanks for the great work! It was really easy to use.
I had a problem with refining my images with a pretrained model though.

In the readme you say, "To refine all synthetic images with a pretrained model":

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/"

You are missing the load_path argument. Apparently, it is the path to the directory containing the model, and it is relative to the logs dir. For example:

$ python main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/" --load_path generative_2017-03-07_01-40-07

There is no indication whatsoever that the model is loaded rather than initialized during testing. And of course, if it is initialized, it will simply write garbage to the refined images, leaving you wondering.

I don't know, though, how to make it clear whether the model was loaded or initialized when using tf.train.Supervisor; maybe specify wait_for_checkpoint=True when calling prepare_or_wait_for_session.
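A minimal sketch of that suggestion (ours; the logdir value is just the example above):

import tensorflow as tf

sv = tf.train.Supervisor(logdir="logs/generative_2017-03-07_01-40-07")
# wait_for_checkpoint=True blocks until a checkpoint appears instead of
# silently falling back to fresh initialization, so refinement can never
# run on untrained weights.
sess = sv.prepare_or_wait_for_session(wait_for_checkpoint=True)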

activation_fn in refiner and discriminator defaults to None

In layers.py

def conv2d(inputs, num_outputs, kernel_size, stride,
           layer_dict={}, activation_fn=None,
           #weights_initializer=tf.random_normal_initializer(0, 0.001),
           weights_initializer=tf.contrib.layers.xavier_initializer(),
           scope=None, name="", **kargv):
  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size,
      stride, activation_fn=activation_fn, 
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer(dtype=tf.float32), scope=scope, **kargv)
  if name:
    scope = "{}/{}".format(name, scope)
  _update_dict(layer_dict, scope, outputs)
  return outputs

and in model.py

  def _build_refiner(self, layer):
    with tf.variable_scope("refiner") as sc:
      layer = conv2d(layer, 64, 3, 1, scope="conv_1")
      layer = repeat(layer, 4, resnet_block, scope="resnet")
      layer = conv2d(layer, 1, 1, 1, 
                     activation_fn=None, scope="conv_2")
      output = tanh(layer, name="tanh")
      self.refiner_vars = tf.contrib.framework.get_variables(sc)
    return output 

  def _build_discrim(self, layer, name, reuse=False):
    with tf.variable_scope("discriminator", reuse=reuse) as sc:
      layer = conv2d(layer, 96, 3, 2, scope="conv_1", name=name)
      layer = conv2d(layer, 64, 3, 2, scope="conv_2", name=name)
      layer = max_pool2d(layer, 3, 1, scope="max_1", name=name)
      layer = conv2d(layer, 32, 3, 1, scope="conv_3", name=name)
      layer = conv2d(layer, 32, 1, 1, scope="conv_4", name=name)
      logits = conv2d(layer, 2, 1, 1, scope="conv_5", name=name)
      output = tf.nn.softmax(logits, name="softmax")
      self.discrim_vars = tf.contrib.framework.get_variables(sc)
    return output, logits

Activation is None in most convolution layers.
Is this OK? I think the gradients may not propagate properly.
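For comparison, the same wrapper does accept a nonlinearity per call; a hypothetical variant of the first discriminator layer (ours, not the repo's code):

# conv2d is the wrapper from layers.py quoted above; it forwards
# activation_fn straight to slim.conv2d.
layer = conv2d(layer, 96, 3, 2, activation_fn=tf.nn.relu,
               scope="conv_1", name=name)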

zeros_initializer() error

Hi,
I'm using TensorFlow for the first time, so I don't know why this happens.
Anyway, when trying your code with a freshly installed TensorFlow 0.12.1, I get this error inside layers.py:
TypeError: zeros_initializer() takes at least 1 argument (1 given)
I fixed the error by substituting biases_initializer=tf.zeros_initializer(dtype=tf.float32) with
biases_initializer=tf.zeros_initializer
as suggested by a user for the similar TensorFlow issue #5742.
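Applied to the conv2d wrapper quoted in the previous issue, the one-line change looks like this (assuming TF 0.12.x, where tf.zeros_initializer is a plain function rather than an initializer factory):

  outputs = slim.conv2d(
      inputs, num_outputs, kernel_size,
      stride, activation_fn=activation_fn,
      weights_initializer=weights_initializer,
      biases_initializer=tf.zeros_initializer,  # pass the function, uncalled
      scope=scope, **kargv)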

Multi-GPU

Hi,
can this code run on multiple GPUs?
Thanks

error

I am getting this error and am unable to correct it; please help.

Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/Users/neutrino/apple/apple/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 17, in main
    trainer = Trainer(config, rng)
  File "/Users/neutrino/apple/simulated-unsupervised-tensorflow/trainer.py", line 35, in __init__
    self.data_loader = DataLoader(config, rng=self.rng)
  File "/Users/neutrino/apple/simulated-unsupervised-tensorflow/data/gaze_data.py", line 167, in __init__
    self.synthetic_data_dims = list(imread(self.synthetic_data_paths[0]).shape) + [1]
IndexError: index 0 is out of bounds for axis 0 with size 0
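The IndexError fires because synthetic_data_paths came back empty, i.e. no UnityEyes images were found. A quick sanity check (ours; the glob pattern is an assumption based on the README layout):

from glob import glob

# If this list is empty, synthetic_data_paths[0] will raise the
# IndexError above; populate data/gaze/UnityEyes with .jpg files first.
paths = glob("./data/gaze/UnityEyes/*.jpg")
print(len(paths))  # must be > 0 before running main.py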

ValueError: all input arrays must have the same shape

Hello, when I prepared the dataset and tried to run your code, this problem happened. I don't know what's wrong; could you please help me out? Thank you!

[*] MODEL dir: logs/generative_2017-09-18_16-23-09
[*] PARAM path: logs/generative_2017-09-18_16-23-09/params.json
[*] Training starts...
Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 21, in main
    trainer.train()
  File "/home/zhangboshen/src/tensorflow/simulated-unsupervised-tensorflow/trainer.py", line 83, in train
    self.data_loader.synthetic_data_paths[idxs]]
  File "/home/zhangboshen/anaconda2/lib/python2.7/site-packages/numpy/core/shape_base.py", line 347, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

The coordinate values in the .json files

I used 640×480 UnityEyes image data and checked that the size of the output data is 55×35.
Next, I want to correct the coordinate values in the .json files.
I read gaze_data.py, but I only found the coordinate correction in the cropping step.
Will the .json files be adjusted to 55×35 coordinates when the output image is refined?

Cropping of Unity Eye Dataset

I just had a small doubt regarding the Unity data generation.
If we run the Unity exe file, we get an image against a 50% grey area, but I see that you use properly cropped eye images. Can you please let me know how this automated cropping is possible?

Batch Norm missing?

Hey, just wondering if there was a reason that batch norm was left out of the resnet blocks. Thanks for the repo!

TypeError: zeros_initializer() takes at least 1 argument

I got the error below when running the main.py script. I am using TensorFlow 0.12.1. Can somebody help me out? Thanks a lot.

[!] Found images in data/gaze/UnityEyes.
[*] # of synthetic data: 7373, # of cropped_data: 7373
[*] Finished preprocessing synthetic gaze data.
[*] Save samples images in data/gaze
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2699: VisibleDeprecationWarning: rank is deprecated; use the ndim attribute or function instead. To find the rank of a matrix see numpy.linalg.matrix_rank.
  VisibleDeprecationWarning)
Traceback (most recent call last):
  File "main.py", line 29, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 17, in main
    trainer = Trainer(config, rng)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/trainer.py", line 37, in __init__
    self.model = Model(config, self.data_loader)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 27, in __init__
    self._build_model()
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 111, in _build_model
    self.R_x = self._build_refiner(self.normalized_x)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/model.py", line 294, in _build_refiner
    layer = repeat(layer, 4, resnet_block, scope="resnet")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 47, in repeat
    outputs = slim.repeat(inputs, repetitions, layer, **kargv)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1670, in repeat
    outputs = layer(outputs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 37, in resnet_block
    padding=padding, activation_fn=tf.nn.relu, scope="conv1")
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 177, in func_with_args
    return func(*args, **current_args)
  File "/home/w00378682/Documents/Detection/simulated-unsupervised-tensorflow/layers.py", line 61, in conv2d
    biases_initializer=tf.zeros_initializer(dtype=tf.float32), scope=scope, **kargv)
TypeError: zeros_initializer() takes at least 1 argument (1 given)

I found that tf.zeros_initializer() should take another argument named shape, but I have no idea what the shape argument should be set to. Besides, when I removed the dtype=tf.float32 argument, errors appeared in other parts.

Semantic Image Segmentation

Hello,

Can this method be used for semantic image segmentation? I have a dataset of unlabeled real images and a dataset of labeled synthetic images. Will the large resolution of the images be a problem?

Curious about the constant used in the normalize function

In layers.py, there is a normalize function that has a constant of 127.5:

def normalize(layer):
  return layer/127.5 - 1.

I'm a little confused as to where the 127.5 comes from. It's a very specific question, of course, but I'm interested in extending the regularization loss function with other types of transforms beyond the identity mapping used in the paper. If you have any tips or pointers on modifying that, I'd love to hear them. Great work and thanks for doing this!
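For what it's worth, 255/2 = 127.5, so the function maps 8-bit pixel values in [0, 255] linearly onto [-1, 1], matching the tanh output range of the refiner quoted earlier; a quick check:

import numpy as np

# 0 -> -1, 127.5 -> 0, 255 -> 1: the usual GAN input convention.
pixels = np.array([0., 127.5, 255.])
print(pixels / 127.5 - 1.)  # [-1.  0.  1.]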

Memory footprint

What is the memory footprint of the training? How much VRAM is required as-is?

Motivation behind the denormalize function

The denormalize function in layers.py is defined as (layer + 1.)/2. If the aim is to revert the earlier normalization, shouldn't we have (layer + 1.)*127.5?

Asking because I'm facing a problem where the refined images are extremely dark (low intensity), since the pixel values are very low after denormalization.
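One possible reading (our assumption, not confirmed by the author): (layer + 1.)/2. targets a float image range of [0, 1], which only displays correctly if the image writer rescales floats; for 8-bit output, the (layer + 1.)*127.5 form suggested above would indeed be needed:

import numpy as np

refined = np.array([-1., 0., 1.])                 # tanh output range
print((refined + 1.) / 2.)                        # [0.  0.5 1. ]  float range
print(((refined + 1.) * 127.5).astype(np.uint8))  # [  0 127 255]  uint8 range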

Does it support GPU?

The training procedure is a little slow. How can I add support for GPU training?

Where is output folder?

I tried to refine all the images:
python2 main.py --is_train=False --synthetic_image_dir="./data/gaze/UnityEyes/" --log_dir="./logs/*******/"

But the output folder didn't contain any refined images.
Where are the refined files? Do I have to designate an output folder?

Hyperparameter choice, cropping decisions

First off -- great code; a joy to read. Thank you for sharing it.

You mention regarding a few of the hyperparams: "Manually choose hyperparameters for B and lambda because those are not specified in the paper." (Have you attempted to ask the authors, rather than guessing?) Looks like you ran a few lambda experiments. For the buffer size B, it seems that a buffer size of 25600 vs. a batch size of 512 results in a rather long history (with random replacement). I wonder how much history is too much.

Regarding cropping gaze data:
synth renders start as 640x480, cropped to 140x84, resized to 55x35
real data start as 60x36, resized to 55x35 (specified by paper)
Are the crop parameters just based on a best-guess visual alignment of synth to real?

More generally, what is your level of confidence in the fidelity of this implementation to the original? Your refined eyeballs don't look quite as compelling as those in the S+U paper, but I don't know if that's a function of fewer synthetic images, fewer training iterations, or something else.

Thanks again!

Local Adversarial loss

Hi,

In the paper the authors mention a local adversarial loss and the fact that the discriminator is fully convolutional; however, I couldn't find that part in your code. Could you point me to it?

Thanks
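For context, a minimal sketch of a local adversarial loss (ours, following the paper's description; note that the _build_discrim quoted in an earlier issue is already fully convolutional, so its conv_5 logits form exactly such a per-patch map):

import tensorflow as tf

def local_adversarial_loss(logit_map, is_real):
  # logit_map: [batch, h, w, 2] from a fully convolutional discriminator;
  # every spatial position is one local "patch" classified real vs. refined.
  labels = tf.fill(tf.shape(logit_map)[:3], 1 if is_real else 0)
  per_patch = tf.nn.sparse_softmax_cross_entropy_with_logits(
      logits=logit_map, labels=labels)
  # Sum the cross-entropy over all local patches, average over the batch.
  return tf.reduce_mean(tf.reduce_sum(per_patch, axis=[1, 2]))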

SimGAN input

@carpedm20 Hello, I'm a beginner in image processing.
Could you please tell me whether the input of SimGAN is a patch of a synthetic image or a whole synthetic image?
If the input is a patch, what is the patch size?
Thank you very much!

Eye images from an infrared camera

Hello, can I refine the UnityEyes eye images against real infrared images and keep the labels?
