jannerm / intrinsics-network


Code for the paper "Self-Supervised Intrinsic Image Decomposition"

Home Page: http://rin.csail.mit.edu/

Python 43.01% Shell 8.51% Lua 48.48%

intrinsics-network's Introduction

Rendered Intrinsics Network

Code and data to reproduce the experiments in Self-Supervised Intrinsic Image Decomposition.

This repo contains the original Lua Torch implementation of this work in the lua folder.

Installation

Get PyTorch and pip install -r requirements

You will also need Blender (2.76+) and the ShapeNet repository. In config.py, replace these lines:

blender = '/om/user/janner/blender-2.76b/blender'
shapenet = '/om/data/public/ShapeNetCore.v1'

with the absolute paths to the Blender app and the ShapeNet library on your machine. The Blender-supplied Python might not come with numpy and scipy. You can either fulfill the same requirements with the Blender Python or replace include with a directory containing those libraries.

Note: A recent PyTorch release introduced breaking changes that affect this repo. For now, the repo is compatible with v0.1.12 (check your installed version with torch.__version__).
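
The version check mentioned in the note can be scripted. The following is a minimal sketch, not part of the repo: the helper names parse_version and is_supported are hypothetical, and the only fact taken from the README is the supported version, v0.1.12. It takes the version as a string so it can be tried without PyTorch installed.

```python
import re

# The only version the README states is compatible.
SUPPORTED = (0, 1, 12)


def parse_version(version):
    """Return the leading (major, minor, patch) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def is_supported(version):
    """True if the version's numeric prefix equals the supported release."""
    return parse_version(version) == SUPPORTED


for candidate in ("0.1.12", "0.3.0", "1.1.0"):
    status = "ok" if is_supported(candidate) else "not the supported release"
    print(candidate, "->", status)
```

In practice you would pass str(torch.__version__) after importing torch. Suffixes such as build tags are ignored because only the numeric prefix is compared.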

Data

All of the code to render the training images is in dataset.

Make an array of lighting conditions with python make_array.py --size 20000. See the parser arguments for lighting options. An array with the defaults is already at dataset/arrays/shader.npy.
python run.py --category motorbike --output output/motorbike --low 0 --high 100 --repeat 5 will render 100 composite motorbike images (numbered 0 through 99) along with the corresponding intrinsic images. It will reuse a given motorbike model 5 times before loading a new one. (Images of the same model will differ in orientation and lighting.)
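
To make the interaction of --low, --high, and --repeat concrete, here is a small sketch of how image indices might map onto models. It is an illustration only: render_plan is a hypothetical helper, and it assumes that --high is exclusive and that a new model is loaded after every --repeat images, which the README implies but does not spell out.

```python
def render_plan(low, high, repeat):
    """Map each image index to the ordinal of the model it would reuse.

    Assumptions (not confirmed by the repo): `high` is exclusive and a
    new model is loaded after every `repeat` images.
    """
    return {i: (i - low) // repeat for i in range(low, high)}


plan = render_plan(low=0, high=100, repeat=5)
print(len(plan), "images,", len(set(plan.values())), "distinct models")
```

Under these assumptions the example command would draw on 20 different motorbike models.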

The saved images in dataset/output/motorbike/ should look something like this:

A motorbike with its reflectance, shading, and normals map. The lighting conditions are visualized on a sphere.

The available ShapeNet categories are given in config.py. There are also a few geometric primitives (cube, sphere, cylinder, cone, torus) and standard test shapes (Utah teapot, Stanford bunny, Blender's suzanne). If you want to render other categories from ShapeNet, just add its name and ID to the dictionary in config.py and put the location, orientation, and size parameters in dataset/utils.py.

Batching

Since rendering can be slow, you might want to render many images in parallel. If you are on a system with SLURM, you can use divide.py, which works like run.py but also has a --divide argument to launch a large rendering job as many smaller jobs running concurrently.
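
If you are not on SLURM, the same idea can be approximated by hand: split the index range into chunks and launch one run.py per chunk. The sketch below is not how divide.py works internally (that is not described here); split_range is a hypothetical helper, and treating the split as a number of jobs is an assumption. It only prints the commands, using the run.py flags shown above.

```python
def split_range(low, high, jobs):
    """Split [low, high) into up to `jobs` contiguous chunks of near-equal size."""
    base, extra = divmod(high - low, jobs)
    chunks = []
    start = low
    for j in range(jobs):
        size = base + (1 if j < extra else 0)
        if size == 0:
            continue  # more jobs than images: skip empty chunks
        chunks.append((start, start + size))
        start += size
    return chunks


for lo, hi in split_range(0, 100, 4):
    # Each printed line could be run in its own terminal or job script.
    print("python run.py --category motorbike --output output/motorbike "
          "--low %d --high %d --repeat 5" % (lo, hi))
```

Keeping chunk boundaries at multiples of --repeat may be preferable if you want each job to reuse models the same way a single run would.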

Download

We also provide a few of the datasets for download if you do not have Blender or ShapeNet.

./download_data.sh { motorbike | airplane | bottle | car | suzanne | teapot | bunny }

will download train, val, and test sets for the specified category into dataset/output/. There is about 2 GB of data for each of the ShapeNet categories and 600 MB for the test shapes.

Shader

Example input shapes and lighting conditions alongside the model's predicted shading image. After training only on synthetic cars like those on the left, the model can generalize to images like the real Beethoven bust on the right.

To train a differentiable shader:

python shader.py --data_path dataset/output --save_path saved/shader --num_train 10000 --num_val 20 \
		 --train_sets motorbike_train,airplane_train,bottle_train \
		 --val_set motorbike_val,airplane_val,bottle_val

where the train and val sets are located in --data_path and were rendered in the previous step. Every epoch, the script will save visualizations of the model's predictions on the validation images to --save_path, along with the model itself. Note that --num_train and --num_val denote the number of images per dataset, so in the above example there will be 30000 total training images and 60 validation images.
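
The per-dataset counting can be checked with a few lines of arithmetic. This is just a sketch of the calculation described above; total_images is a hypothetical helper, not a function from the repo.

```python
def total_images(sets, per_set):
    """Total image count for a comma-separated dataset list."""
    names = [name for name in sets.split(",") if name]
    return len(names) * per_set


train_sets = "motorbike_train,airplane_train,bottle_train"
val_sets = "motorbike_val,airplane_val,bottle_val"
print(total_images(train_sets, 10000), "training images")
print(total_images(val_sets, 20), "validation images")
```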

Intrinsic image prediction

python decomposer.py --data_path dataset/output --save_path saved/decomposer --array shader --num_train 20000 \
		     --num_val 20 --train_sets motorbike_train --val_set motorbike_val

will train a model on just motorbikes, although you can specify more datasets with a comma-separated list (as shown for the shader.py command). The remaining options are analogous, except for --array, which names the lighting parameter array used to generate the data. The script will save the model, visualizations, and error plots to --save_path.

Transfer

After training a decomposer and shader network, you can compose them to improve the representations of the decomposer using unlabeled data. If you have trained a decomposer on only the geometric shape primitives, and now wanted to transfer it to the test shapes, you could use:

python composer.py --decomposer saved/decomposer/state.t7 --shader saved/shader.t7 --save_path saved/composer \
		   --unlabeled suzanne_train,teapot_train,bunny_train \
		   --labeled cube_train,sphere_train,cylinder_train,cone_train,torus_train \
		   --val_sets suzanne_val,teapot_val,bunny_val,cube_val,sphere_val \
		   --unlabeled_array unlab_shader --labeled_array lab_shader \
		   --transfer 300_normals --num_epochs 300 --save_model True

where --labeled contains the labeled datasets and --unlabeled the unlabeled datasets. The --val_sets are used to make visualizations after every epoch. (It is useful to include some of the labeled datasets in the visualization as a sanity check.) The --unlabeled_array and --labeled_array flags are the names of the arrays with lighting parameters. Using the above rendering examples, these would be shader. --decomposer and --shader point to the saved networks trained in the previous steps.

--transfer is the most important flag. It specifies a training schedule for the network, of the form <iters>_<params>,<iters>_<params>,.... For example, 10_shader,10_normals,reflectance,20_lights will train only the shading parameters for 10 epochs, then only the parameters of the reflectance and normals decoders for 10 epochs, and then the lighting decoder for 20 epochs. This 40-epoch cycle will repeat for --num_epochs epochs.
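
One way to read the schedule string is sketched below. This is not the parser used by composer.py; parse_schedule and active_params are hypothetical names. The sketch assumes that a token without an <iters>_ prefix (such as reflectance in the example) joins the preceding stage, and that epochs are counted from 0.

```python
def parse_schedule(spec):
    """Parse '<iters>_<params>,...' into a list of (epochs, [params]) stages.

    Assumption: a token without an '<iters>_' prefix belongs to the
    preceding stage, as in '10_normals,reflectance'.
    """
    stages = []
    for token in spec.split(","):
        head, sep, tail = token.partition("_")
        if sep and head.isdigit():
            stages.append((int(head), [tail]))
        elif stages:
            stages[-1][1].append(token)
        else:
            raise ValueError("schedule must start with <iters>_<params>")
    return stages


def active_params(stages, epoch):
    """Return the parameter groups trained at a 0-indexed epoch."""
    cycle = sum(length for length, _ in stages)
    offset = epoch % cycle  # position within the repeating cycle
    for length, params in stages:
        if offset < length:
            return params
        offset -= length


stages = parse_schedule("10_shader,10_normals,reflectance,20_lights")
for epoch in (0, 15, 39, 45):
    print(epoch, active_params(stages, epoch))
```

With this reading, the example schedule has a 40-epoch cycle, so epoch 45 falls back into the first stage.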

intrinsics-network's Issues

RuntimeError, tensor size mismatch

I get the following error when running the decomposer:

Traceback (most recent call last):
  File "decomposer.py", line 62, in <module>
    train_losses = trainer.train()
  File "/home/nietog/Projects/intrinsics-network/pipeline/DecomposerTrainer.py", line 42, in train
    err = self.__epoch()
  File "/home/nietog/Projects/intrinsics-network/pipeline/DecomposerTrainer.py", line 30, in __epoch
    loss.backward()
  File "/home/nietog/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/nietog/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
  File "/home/nietog/anaconda3/lib/python3.6/site-packages/torch/autograd/function.py", line 91, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/nietog/anaconda3/lib/python3.6/site-packages/torch/autograd/functions/tensor.py", line 454, in backward
    print(grad_output.clone().masked_fill(mask, 0))
RuntimeError: The expanded size of the tensor (1) must match the existing size (32) at non-singleton dimension 1. at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/THC/generic/THCTensor.c:323

Any idea where it comes from? Thanks in advance.

Performance on natural scenes

How does this model perform on natural scenes?
For example on the IIW dataset or Sintel

Thank you

Confusing concept of "self-supervised learning"

This paper is titled "Self-Supervised Intrinsic Image Decomposition". However, what it actually does is self-supervised transfer from a fully supervised pre-trained model. I found this misleading, as I wasn't sure about it until I checked the code.

broken subdivision

I was trying to reproduce your sample set with:

http://shapenet.cs.stanford.edu/shapenet/obj-zip/ShapeNetCore.v1/03790512.zip

The objects don't subdivide well with Blender 2.79 (2.79, 2017-09-11) on OSX.

I could preprocess the models in MeshLab, but I was wondering what your solution was.

For normal .obj files rendering

Hi Michael,

I am using your rendering code to render intrinsic images from an ordinary .obj dataset, such as faces. The raw files (.obj, .mtl, and .png) are in the attached zip file. I found that ordinary .obj meshes do not have as many "Mesh" groups as the ShapeNet models. For ShapeNet I can render intrinsic images without problems, but for ordinary .obj files I cannot load the color/texture in the composite/albedo images with your code. The rendered result is here:
Composite:

Albedo:

Depth:

Could you please take a look, or suggest how to modify your code so that it also works for ordinary .obj files? Thanks so much!

0365489.zip

Error downloading datasets

The links in download_data.sh seem to be broken. I get this error

Can you please update the links? I am downloading just one dataset, so I am using this command

./download_data.sh {car}

Thank you

RuntimeError: $ Torch: not enough memory

When running the shader, I get the error shown in the following pictures.

I do not have a GPU right now, so I am running on the CPU (I just changed cuda to tensor). To make the model run quickly, I only use one category, motorbike, and set num_train=1000; num_val is the same as yours. I get the error in the first epoch! My RAM is 8 GB, which I think should be enough, so this seems strange.
Any ideas where it comes from? Thank you very much!

AssertionError: MaskedSelect can't differentiate the mask

For PyTorch 0.3.0 users: there is a bug related to autograd. Lines 63-64 of pipeline/utils.py, img[mask] /= 2. and img[mask] += .5, trigger the error "AssertionError: MaskedSelect can't differentiate the mask".
As suggested in https://github.com/yunjey/StarGAN/issues/12, this is a bug in version 0.3.0 that can be fixed by replacing these lines with
img = img * (1. - (mask == 1).float() / 2.)
img = img + (mask == 1).float() * .5

It is also suggested in https://discuss.pytorch.org/t/get-error-message-maskedfill-cant-differentiate-the-mask/9129 to use the .detach() method.
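
As a sanity check that the replacement computes the same values as the original in-place lines, the arithmetic can be reproduced on plain Python lists. This is only a sketch of the element-wise math with hypothetical function names; it does not use PyTorch and says nothing about autograd behaviour.

```python
import math


def inplace_masked(img, mask):
    """Original logic: halve, then shift by 0.5, only where mask is set."""
    out = list(img)
    for i, m in enumerate(mask):
        if m == 1:
            out[i] /= 2.0
            out[i] += 0.5
    return out


def out_of_place(img, mask):
    """Suggested replacement, written element-wise on floats."""
    m = [1.0 if v == 1 else 0.0 for v in mask]
    img = [x * (1.0 - mi / 2.0) for x, mi in zip(img, m)]
    img = [x + mi * 0.5 for x, mi in zip(img, m)]
    return img


pixels = [0.2, -1.0, 0.8, 0.0]
mask = [1, 0, 1, 1]
a = inplace_masked(pixels, mask)
b = out_of_place(pixels, mask)
print(all(math.isclose(x, y) for x, y in zip(a, b)))
```

Masked entries become x / 2 + 0.5 in both versions, and unmasked entries are left unchanged.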

When I render a model, the color picture is blank. What can I do?

Albedo and Shading incorrect in Composer output

Hello,

While I'm trying to run the composer with the same code, I'm getting a good reconstruction but we can see that the albedo image has shading artifacts and the shading image is mostly white and flat.
I trained it for 108 epochs with each epoch having the schedule - 10_shape,10_shading,reflectance,20_lighting

https://drive.google.com/file/d/1CpjuBO7jbZvlkDE3zOcoBVoXfQC5j2ft/view?usp=sharing

I could see overfitting in the albedo, shading, and reconstruction losses, so I added dropout to the decomposer and trained it for 500 epochs. Then I trained the composer network with dropout added to the decomposer and shader, and set the lights multiplier to 1.5. This used the same schedule as above, for 300 epochs. The overfitting was reduced, but the network learned the albedo poorly while still doing relatively well on shape and shading. The albedo output from the decomposer is good, but the albedo from the composer is not, as we can see below:

https://drive.google.com/file/d/1bYfbIxrFp4zPSS02feMNfdzh0Xtrmxsj/view?usp=sharing

So now I'm training the albedo decoder of the composer independently. Kindly help me with what is going wrong. In the end, I aim to be performing self-supervised training with images of my own.

tensor sizes don't match

Hi!
Thank you for the hard work.
I'm trying to run the decomposer, but I get a runtime error at "normed = normals / (magnitude + 1e-6)" in the models/primitives.py file: one tensor is of size 3 and the other is of size 60.
Wouldn't it be better to permute the dimensions first:
"magnitude = magnitude.repeat(3,1,1,1).permute(1, 0, 2, 3)"
I use Python 3.6 from Anaconda and PyTorch 0.3.0.

Argument mismatch

Traceback (most recent call last):
  File "shader.py", line 50, in <module>
    pipeline.visualize_shader(shader, val_loader, save_path )
  File "/home/xuwh/xu/intrinsics-network-master/pipeline/visualization.py", line 52, in visualize_shader
    grid = torchvision.utils.make_grid(images, nrow=3, padding=0).cpu().numpy().transpose(1,2,0)
  File "/home/xuwh/.local/lib/python2.7/site-packages/torchvision/utils.py", line 35, in make_grid
    tensor = torch.stack(tensor, dim=0)
  File "/home/xuwh/.local/lib/python2.7/site-packages/torch/functional.py", line 58, in stack
    return torch.cat(inputs, dim)
TypeError: cat received an invalid combination of arguments - got (list, int), but expected one of:
  (sequence[torch.cuda.FloatTensor] seq)
  (sequence[torch.cuda.FloatTensor] seq, int dim)
    didn't match because some of the arguments have invalid types: (list, int)

Any solution? Thanks in advance. @jannerm @FangYang970206 @Nighteye

Expected output of shader.py

Hi,
I was able to run shader.py for 500 epochs. Looking at visualization.py, I think the output should have 3 columns: the first contains the shape images, the second contains the network's shading predictions, and the third contains the ground-truth shading images.

Is my understanding right? The image shows the output after running for 500 epochs, but the second column does not seem to be a reasonable shading output. It also does not look like the lighting sphere, as I can see light from the same direction.

Kindly help me interpret the image and tell me where my understanding is going wrong.
Also, my PyTorch version is 1.1.0.

Pre trained models

Hi,

Could you share the trained model so that we could test directly on new images?