CoordConv

This repository contains source code necessary to reproduce the results presented in the paper An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution (NeurIPS 2018):

@inproceedings{liu2018coordconv,
  title={An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution},
  author={Liu, Rosanne and Lehman, Joel and Molino, Piero and Petroski Such, Felipe and Frank, Eric and Sergeev, Alex and Yosinski, Jason},
  booktitle={Advances in Neural Information Processing Systems},
  year={2018}
}

For more on this project, including an 8-minute video explanation, see the Uber AI Labs blog post.

CoordConv layer, as a drop-in replacement for convolution

The standalone CoordConv layer, wrapped as a tf.layers object, can be found in CoordConv.py. The models built in model_builders.py show how to use it.
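The idea behind the layer can be illustrated without TensorFlow. The following is a minimal NumPy sketch, not the code in CoordConv.py: it appends two extra channels holding each pixel's row and column index, scaled to [-1, 1], to an NHWC batch. The function name `add_coord_channels` and the input shape are assumptions made for illustration.

```python
import numpy as np


def add_coord_channels(batch):
    """Append normalized row and column index channels to an NHWC batch.

    Illustrative only: this is not the implementation in CoordConv.py.
    """
    n, h, w, _ = batch.shape
    # Row index i and column index j for every pixel, each of shape (h, w).
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Scale each index to [-1, 1]; max(..., 1) avoids dividing by zero on size-1 axes.
    rows = 2.0 * rows / max(h - 1, 1) - 1.0
    cols = 2.0 * cols / max(w - 1, 1) - 1.0
    coords = np.stack([rows, cols], axis=-1).astype(batch.dtype)  # (h, w, 2)
    coords = np.broadcast_to(coords, (n, h, w, 2))
    return np.concatenate([batch, coords], axis=-1)


images = np.zeros((4, 64, 32, 3), dtype=np.float32)  # hypothetical input batch
with_coords = add_coord_channels(images)
print(with_coords.shape)
```

An ordinary convolution applied to `with_coords` then sees where each pixel sits on the canvas, which is what lets the layer act as a drop-in replacement for convolution.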

Data

To generate the Not-so-Clevr dataset, which consists of squares randomly positioned on a canvas, with uniform and quadrant splits:

python ./data/not_so_clevr_generator.py
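The difference between the two splits can be sketched in a few lines of NumPy. This is not the generator script. The canvas size (64), square size (9), 80/20 ratio, and the choice of held-out quadrant are assumptions for illustration, and not_so_clevr_generator.py remains the authoritative source.

```python
import numpy as np

CANVAS = 64   # assumed canvas side length
SQUARE = 9    # assumed square side length
HALF = SQUARE // 2

# All center positions for which the square fits entirely on the canvas.
centers = np.array([(r, c)
                    for r in range(HALF, CANVAS - HALF)
                    for c in range(HALF, CANVAS - HALF)])


def render(center):
    """Draw one filled square centered at (row, col) on an empty canvas."""
    r, c = center
    img = np.zeros((CANVAS, CANVAS), dtype=np.float32)
    img[r - HALF:r + HALF + 1, c - HALF:c + HALF + 1] = 1.0
    return img


# Uniform split: shuffle all centers and hold out a random 20 percent.
rng = np.random.default_rng(0)
perm = rng.permutation(len(centers))
n_train = int(0.8 * len(centers))
uniform_train, uniform_test = centers[perm[:n_train]], centers[perm[n_train:]]

# Quadrant split: hold out every center in one quadrant (here the bottom-right one).
held_out = (centers[:, 0] >= CANVAS // 2) & (centers[:, 1] >= CANVAS // 2)
quadrant_train, quadrant_test = centers[~held_out], centers[held_out]

print(len(centers), len(uniform_train), len(quadrant_test))
```

In the uniform split the test positions are interleaved with the training positions. In the quadrant split the model must extrapolate to a region of the canvas it never saw during training.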

To generate two-object Sort-of-Clevr images, run a modification of the Sort-of-Clevr source code:

python ./data/sort_of_clevr_generator.py

Supervised Coordinate Tasks

The train.py script runs the training for all supervised coordinate tasks described in the paper. Use --arch to switch between tasks.

The file experiment_logs.sh records the full series of experiments, enumerating the hyperparameters for each task exactly as used to produce the results in the paper. Note that the random experiment ids were generated for job tracking on the Uber internal cluster and can be ignored. We also use resman to keep results organized, which is highly recommended!

Examples to run Supervised Coordinate Classification:

# coordconv version
python train.py --arch coordconv_classification -mb 16 -E 100 -L 0.005 --opt adam --l2 0.001 -mul 1  
# deconv version
python train.py --arch deconv_classification -mb 16 -E 2000 -L 0.01 --opt adam --l2 0.001 -mul 2 -fs 3

Use --data_h5 data/rectangle_4_uniform.h5 and --data_h5 data/rectangle_4_quadrant.h5 to compare performance on the two types of splits.
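To run the same configuration on both splits, the command lines can be assembled programmatically. The sketch below only builds and prints the commands, reusing the flags from the coordconv classification example above. The driver itself is hypothetical and not part of the repository.

```python
import shlex

# Flags copied from the coordconv classification example.
BASE = ["python", "train.py", "--arch", "coordconv_classification",
        "-mb", "16", "-E", "100", "-L", "0.005",
        "--opt", "adam", "--l2", "0.001", "-mul", "1"]

SPLITS = ["uniform", "quadrant"]

# One command per split, differing only in the --data_h5 argument.
commands = [BASE + ["--data_h5", f"data/rectangle_4_{split}.h5"] for split in SPLITS]

for cmd in commands:
    # Print only; pass cmd to subprocess.run(cmd, check=True) to launch the run.
    print(shlex.join(cmd))
```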

Examples to run Supervised Rendering:

# coordconv version
python train.py --arch coordconv_rendering -mb 16 -E 100 -L 0.005 --opt adam --l2 0.001 -mul 1
# deconv version
python train.py --arch deconv_rendering -mb 16 -E 2000 -L 0.01 --opt adam --l2 0.001 -mul 2 -fs 3

Use --data_h5 data/rectangle_4_uniform.h5 and --data_h5 data/rectangle_4_quadrant.h5 to compare performance on the two types of splits.

Examples to run Supervised Coordinate Regression:

# conv version
python train.py --arch conv_regressor -E 100 --lr 0.01 --opt adam --l2 0.00001
# coordconv version
python train.py --arch coordconv_regressor -E 100 --lr 0.01 --opt adam --l2 0.00001

Use --data_h5 data/rectangle_4_uniform.h5 and --data_h5 data/rectangle_4_quadrant.h5 to compare performance on the two types of splits.

Generative Tasks

# coordconv GAN
python train_gan.py --arch clevr_coordconv_in_gd -mb 16 -E 50 -L 0.0001 --lr2 .0005 --opt adam --z_dim 256 --snapshot-every 1
# deconv GAN
python train_gan.py --arch clevr_gan -mb 16 -E 50 -L 0.0001 --lr2 .0005 --opt adam --z_dim 256 --snapshot-every 1

TODO

Add RL, VAE, and LSUN GAN models

coordconv's Issues

Python3 Compatibility

Hi,
This is a very minor issue. Nevertheless, since the rest of the repository works, I would really like to have it fixed.
The data generation code in this repository is not Python 3 compatible.

Wrong denominators when normalizing xx_channel and yy_channel?

Hi, in your code you normalize xx_channel and yy_channel as follows:

xx_channel = tf.cast(xx_channel, 'float32') / (self.x_dim - 1)

yy_channel = tf.cast(yy_channel, 'float32') / (self.y_dim - 1)

I think it should be:
xx_channel = tf.cast(xx_channel, 'float32') / (self.y_dim - 1)
yy_channel = tf.cast(yy_channel, 'float32') / (self.x_dim - 1)
That is, are the denominators wrong?
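Which denominator is right depends on which dimension each channel's values range over. The NumPy sketch below is not the repository code. It assumes a hypothetical channel whose values run over range(y_dim) and compares the two candidate denominators on a deliberately non-square input.

```python
import numpy as np

x_dim, y_dim = 4, 7  # deliberately non-square so a swapped denominator is visible

# Hypothetical channel whose values run over range(y_dim), repeated x_dim times.
channel = np.tile(np.arange(y_dim), (x_dim, 1)).astype(np.float32)

by_x = channel / (x_dim - 1)  # overshoots 1 when y_dim > x_dim
by_y = channel / (y_dim - 1)  # maps the channel onto [0, 1]

print(by_x.max(), by_y.max())
```

For square inputs (x_dim == y_dim) the two denominators coincide, so any difference only shows up on non-square inputs.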

CoordConv position for a2c

Hi,

I'm wondering where exactly you put the CoordConv layer in the A2C algorithm. Is it right at the input?
The paper states 'Adding a CoordConv layer to an actor network within A2C'. Does this mean the critic network does not use CoordConv? The actor and critic networks share parameters, though.

Thanks

Cannot reproduce results of "coordconv_rendering" task

I get zero accuracy when running the documented command:

# coordconv version
python train.py --arch coordconv_rendering -mb 16 -E 100 -L 0.005 --opt adam --l2 0.001 -mul 1 --use_mse_loss

I am using the specified TensorFlow 1.14 environment as well.
Since there are obvious runtime errors (e.g., undeclared variables in two places), I suspect the code was released without testing whether it even runs (not to mention whether it reproduces the numbers in the paper).

Can the authors clarify the status of this repo?
