
3dcnn.torch

Volumetric CNN (Convolutional Neural Networks) for Object Classification on 3D Data, with Torch implementation.

[Figure: prediction example]

Introduction

This work is based on our arXiv tech report. Our paper will also appear as a CVPR 2016 spotlight (please refer to the arXiv version for the most up-to-date results). In this repository, we release code and data for training Volumetric CNNs for object classification on 3D data (binary volumes).

Citation

If you find our work useful in your research, please consider citing:

@article{qi2016volumetric,
    title={Volumetric and Multi-View CNNs for Object Classification on 3D Data},
    author={Qi, Charles R and Su, Hao and Niessner, Matthias and Dai, Angela and Yan, Mengyuan and Guibas, Leonidas J},
    journal={arXiv preprint arXiv:1604.03265},
    year={2016}
}

Installation

Install Torch7.

Note that a GPU and cuDNN are required for the VolumetricBatchNormalization layer. You also need to install a few Torch packages (if you haven't done so already), including cudnn.torch, cunn, hdf5 and xlua.

Usage

To train a model for 3D object classification:

th train.lua

Voxelizations of ModelNet40 models in HDF5 files will be automatically downloaded (633MB). modelnet40_60x includes occupancy grids augmented with both azimuth and elevation rotations. modelnet40_12x contains azimuth-rotation augmentation only. modelnet40_20x_stack is used for multi-orientation training. Each folder also contains text files specifying the sequence of CAD models in the h5 files.
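As a rough illustration of what the dataset files contain (a sketch, not the actual dataset-generation code), binary voxelization maps a 3D point set onto an occupancy grid. The grid resolution below is illustrative; the actual resolution used by the dataset files may differ.

```python
# Sketch of binary voxelization: map 3D points in [-1, 1]^3 onto a
# 30x30x30 occupancy grid (resolution is illustrative only).

def voxelize(points, resolution=30):
    """Return a nested-list occupancy grid: 1 where a point falls, else 0."""
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for x, y, z in points:
        # Map each coordinate from [-1, 1] to a cell index in [0, resolution-1].
        i = min(int((x + 1) / 2 * resolution), resolution - 1)
        j = min(int((y + 1) / 2 * resolution), resolution - 1)
        k = min(int((z + 1) / 2 * resolution), resolution - 1)
        grid[i][j][k] = 1
    return grid

grid = voxelize([(0.0, 0.0, 0.0), (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)])
occupied = sum(v for plane in grid for row in plane for v in row)
print(occupied)  # 3
```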

To see HELP for training script:

th train.lua -h

After the above training, which produces a 3D CNN that classifies an object from a single orientation input, we can train a multi-orientation 3D CNN by initializing it with the pretrained single-orientation network:

th train_mo.lua --model <network_name> --model_param_file <model_param_filepath>

You need to specify the layer at which to max-pool features across orientations. This can either be set on the command line with --pool_layer_idx <layer_idx> or set interactively after the script starts (the network layers and their indices will be printed). For 3dnin_fc, one option is --pool_layer_idx 27, which max-pools the outputs of the last convolutional layer across multiple orientations.
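Conceptually, the orientation pooling takes the element-wise maximum over per-orientation feature vectors at the chosen layer. A minimal sketch (not the repository's Torch code):

```python
# Element-wise max pooling of feature vectors across orientations
# (illustrative sketch of multi-orientation pooling).

def max_pool_orientations(features):
    """features: list of equal-length feature vectors, one per orientation.

    Returns one vector where each element is the max over orientations.
    """
    return [max(column) for column in zip(*features)]

# Three orientations, four feature dimensions each (made-up values).
feats = [
    [0.1, 0.9, 0.3, 0.0],
    [0.5, 0.2, 0.8, 0.1],
    [0.4, 0.4, 0.2, 0.6],
]
print(max_pool_orientations(feats))  # [0.5, 0.9, 0.8, 0.6]
```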

Results

Below are the classification accuracies we obtained on the ModelNet40 test data.

| Model       | Single-Orientation   | Multi-Orientation |
|-------------|----------------------|-------------------|
| voxnet      | 86.2% (on 12x data)  | -                 |
| 3dnin fc    | 88.8%                | 90.3%             |
| subvol sup  | 88.8%                | 90.1%             |
| ani probing | 87.5%                | 90.0%             |

Note 1: Compared with the CVPR paper, we have added batch normalization and dropout layers after all convolutional layers (in the Caffe prototxt there are only dropouts after the fc layers, and no batch normalization is used). We also add small translation jitter to augment the data on the fly (following what voxnet has done). Besides the two models proposed in the paper (the subvolume supervised network and the anisotropic probing network), we have also found a variation of the base network used in subvolume_sup, which we call 3dnin_fc here. 3dnin_fc has a relatively simple architecture (3 mlpconv layers + 2 fc layers) and also performs very well, so we have set it as the default architecture in this repository.

Note 2: Numbers reported in the table above are average instance accuracy on the whole ModelNet40 test set, containing 2468 CAD models from 40 categories. This differs from what is reported on the ModelNet website, which is average class accuracy on either a subset of the test set or the full test set. For a direct comparison under the average class accuracy metric, please refer to our paper.
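The difference between the two metrics can be made concrete with a small sketch (hypothetical labels, not actual results): instance accuracy weights every test model equally, while class accuracy averages the per-class accuracies so that each category counts equally.

```python
# Average instance accuracy vs. average class accuracy (illustrative).

def instance_accuracy(y_true, y_pred):
    """Fraction of all test instances classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def class_accuracy(y_true, y_pred):
    """Mean of per-class accuracies; each class counts equally."""
    per_class = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)

# Imbalanced toy set: 8 chairs (all correct), 2 lamps (1 correct).
y_true = ["chair"] * 8 + ["lamp"] * 2
y_pred = ["chair"] * 8 + ["lamp", "chair"]
print(instance_accuracy(y_true, y_pred))  # 0.9
print(class_accuracy(y_true, y_pred))     # 0.75
```

With an imbalanced test set like ModelNet40's, the two numbers can differ noticeably, which is why the table above is not directly comparable to class-accuracy leaderboards.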

You can directly get the trained model for the 3dnin_fc architecture through this download link.

Caffe Models and Reference Results

Caffe prototxt files of models reported in the paper have been included in the caffe_models directory.

License

Our code and models are released under MIT License (see LICENSE file for details).

Acknowledgement

Torch implementation in this repository is based on the code from cifar.torch, which is a clean and nice GitHub repo on CIFAR image classification using Torch.

TODO

Add a MATLAB interface to extract 3D features (an in-progress effort is here).

Contributors

charlesq34

Issues

Requirement for the trained parameters

Hi, Charles,

Thank you for sharing the code. May I ask for the trained models with parameters for these three algorithms using single orientation and multi-orientation? The training takes too long on my computer, so I am wondering if you could share the resulting model.net files directly. Thank you so much for the help.

Best,
Cindy Guo

Mean Accuracy or total Accuracy?

Hi,
Are you calculating instance accuracy or average class accuracy? If you're reporting average class accuracy, shouldn't it be confusion.averageValid? I think confusion.totalValid gives the overall (instance) accuracy. In train.lua, line 68 mentions mean class accuracy, but lines 167 and 172 compute confusion.totalValid.

At test time, trying to average the predictions using voting approaches

Hi, in your code, here and here in train.lua (and similarly in train_mo.lua), you assume the output of the model can be a table and take the last element if it is. How could that be possible? My understanding is that a voting approach would combine the outputs for voxelizations of the same 3D model, but the voting approach is not implemented in this code, right? Looking forward to your reply.

Which version of hdf5 is expected?

Hi

I am getting the following error while executing 'th train.lua'

/home/siddharth/.luarocks/share/lua/5.1/hdf5/init.lua:15 Unable to find the HDF5 lib we were built against - trying to find it elsewhere
/home/siddharth/torch/install/bin/luajit: /home/siddharth/torch/install/share/lua/5.1/trepl/init.lua:389: /home/siddharth/.luarocks/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory
stack traceback:
[C]: in function 'error'
/home/siddharth/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
./provider.lua:2: in main chunk
[C]: in function 'dofile'
train.lua:6: in main chunk
[C]: in function 'dofile'
...arth/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

I have installed the hdf5 package available at deepmind torch-hdf5. I see that another one is available at colberg hdf5.

Can you suggest which version to use (or is the error due to another reason)?

Training stops at epoch 3

Hi,

I was trying to run the first-stage training as described in your instructions with the command 'th train.lua'. But the training quits prematurely (without any warning or error) after the 3rd epoch finishes. I have tried setting different max epoch numbers, but the same situation occurs. I'm pretty new to Torch, so I have little clue what's causing this problem.

Here's the spec on the server I'm using:

OS: Ubuntu 14.04 LTS
Torch version: torch 7
CUDA: 7.0
cudnn: V4

Please advise.

Hi, can you give the default pool_layer_idx for the other 4 models?

Hi, I have some problems with the code.
1) Can you give the default pool_layer_idx for the other 4 models?
2) For training, I used the command line "th train.lua -s 3dninlog --model 3dnin", and for fine-tuning, "th train_mo.lua -s 3dninmo_log --model 3dnin --model_param_file ..../3dninlog/model.net". Are these command lines correct? Can I get the results for all 5 models using these two lines?
Thank you so much!

ps:
in train_mo.lua, line 133, " testLogger.showPlot = 'false' " should be changed to " testLogger.showPlot = false ", as in train.lua.
