
ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics

International Conference on Computer Vision (ICCV) 2019 (Oral)

Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung.

Introduction

This is the code release of our paper on building a convolution and a neural network for point cloud learning such that training is fast and accurate. We address this problem by learning point features in concentric regions called 'shells', which resolves the point ordering ambiguity and produces local features at the same time. Please find the details of the technique on our project page.

If you found this paper useful in your research, please cite:

@inproceedings{zhang-shellnet-iccv19,
    title = {ShellNet: Efficient Point Cloud Convolutional Neural Networks using Concentric Shells Statistics},
    author = {Zhiyuan Zhang and Binh-Son Hua and Sai-Kit Yeung},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = {2019}
}

Installation

The code is based on PointCNN. Please install TensorFlow, and follow the instructions in PointNet++ to compile the customized TF operators in the tf_ops folder.

The code has been tested with Python 3.6, TensorFlow 1.13.2, CUDA 10.0 and cuDNN 7.3 on Ubuntu 14.04.

Code Explanation

The core convolution, ShellConv, and the neural network, ShellNet, are defined in shellconv.py.

Convolution Parameters

Let us take the sconv_params from s3dis.py as an example:

ss = 8 
sconv_param_name = ('K', 'D', 'P', 'C')
sconv_params = [dict(zip(sconv_param_name, sconv_param)) for sconv_param in
                [
                 (ss*4, 4, 512, 128),
                 (ss*2, 2, 128, 256),
                 (ss*1, 1, 32, 512)]]

ss indicates the shell size, which is defined as the number of points contained in each shell. Each element in sconv_params is a tuple (K, D, P, C), where K is the neighborhood size, D is the number of shells, P is the number of representative points in the output, and C is the number of output channels. Each tuple specifies the parameters of one ShellConv layer, and the layers are stacked to create a deep network.
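
To make the roles of these parameters concrete, here is a small illustrative loop (ours, not part of the repository) that expands the sconv_params list above into per-layer descriptions:

for i, layer in enumerate(sconv_params):
    # K // D recovers ss, the number of points per shell
    print('ShellConv layer %d: K=%d neighbors in D=%d shells of %d points -> '
          'P=%d representative points with C=%d channels'
          % (i, layer['K'], layer['D'], layer['K'] // layer['D'],
             layer['P'], layer['C']))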

Deconvolution Parameters

Similarly, for deconvolution, let us look at sdconv_params from s3dis.py:

sdconv_param_name = ('K', 'D', 'pts_layer_idx', 'qrs_layer_idx')
sdconv_params = [dict(zip(sdconv_param_name, sdconv_param)) for sdconv_param in
                [
                (ss*1,  1, 2, 1),
                (ss*2,  2, 1, 0),
                (ss*4,  4, 0, -1)]]

Each element in sdconv_params is a tuple (K, D, pts_layer_idx, qrs_layer_idx), where K and D have the same meaning as in sconv_params. pts_layer_idx specifies which ShellConv layer's output (from sconv_params) is the input of this ShellDeConv layer, and qrs_layer_idx specifies which ShellConv layer's output is forwarded and fused with the output of this ShellDeConv layer. The P and C parameters of this ShellDeConv layer are also determined by qrs_layer_idx. As before, each tuple specifies the parameters of one ShellDeConv layer, and the layers are stacked to create a deep network.
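
As an illustrative sketch (ours, not repository code), this wiring can be resolved against sconv_params as follows; we assume a qrs_layer_idx of -1 refers back to the original input point cloud:

# Reuses the sconv_params and sdconv_params lists defined above.
for i, p in enumerate(sdconv_params):
    pts_idx, qrs_idx = p['pts_layer_idx'], p['qrs_layer_idx']
    in_P, in_C = sconv_params[pts_idx]['P'], sconv_params[pts_idx]['C']
    if qrs_idx >= 0:
        out_P, out_C = sconv_params[qrs_idx]['P'], sconv_params[qrs_idx]['C']
    else:
        out_P, out_C = 'N (input points)', 'input channels'  # assumption for -1
    print('ShellDeConv layer %d: input from sconv[%d] (%s points, %s channels) '
          '-> output %s points, %s channels (fused with sconv[%d])'
          % (i, pts_idx, in_P, in_C, out_P, out_C, qrs_idx))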

Usage

Classification

To train a ShellNet model to classify shapes in the ModelNet40 dataset:

cd data_conversions
python3 ./download_datasets.py -d modelnet
cd ..
python3 train_val_cls.py
python3 test_cls_modelnet40.py -l log/cls/xxxx

Our pretrained model can be downloaded here. Please put it in the log/cls/modelnet_pretrained folder for testing.

Segmentation

We perform segmentation with various datasets, as follows.

ShapeNet

cd data_conversions
python3 ./download_datasets.py -d shapenet_partseg
python3 ./prepare_partseg_data.py -f ../../data/shapenet_partseg
cd ..
python3 train_val_seg.py -x seg_shapenet
python3 test_seg_shapenet.py -l log/seg/shellconv_seg_shapenet_xxxx/ckpts/epoch-xxx
cd evaluation
python3 eval_shapenet_seg.py -g ../../data/shapenet_partseg/test_label -p ../../data/shapenet_partseg/test_pred_shellnet_1 -a

ScanNet

Please refer to the ScanNet homepage and the PointNet++ preprocessed data to download ScanNet. After that, the following commands can be used for training and testing:

cd data_conversions
python3 prepare_scannet_seg_data.py
python3 prepare_scannet_seg_filelists.py
cd ..
python3 train_val_seg.py -x seg_scannet
python3 test_seg_scannet.py -l log/seg/shellconv_seg_scannet_xxxx/ckpts/epoch-xxx
cd evaluation
python3 eval_scannet.py -d <path to *_pred.h5> -p <path to scannet_test.pickle>

S3DIS

Please download the S3DIS dataset. The following commands perform training and testing:

cd data_conversions
python3 prepare_s3dis_label.py
python3 prepare_s3dis_data.py
python3 prepare_s3dis_filelists.py
cd ..
python3 train_val_seg.py -x seg_s3dis
python3 test_seg_s3dis.py -l log/seg/shellconv_seg_s3dis_xxxx/ckpts/epoch-xxx
cd evaluation
python3 s3dis_merge.py -d <path to *_pred.h5>
python3 eval_s3dis.py

Please note that these commands are only for Area 1 validation. Results on other areas can be computed by modifying filelist and filelist_val in s3dis.py.

Semantic3D

You can download our preprocessed hdf5 files and labels here. Then:

python3 train_val_seg.py -x seg_semantic3d
python3 test_seg_semantic3d.py -l log/seg/shellconv_seg_semantic3d_xxxx/ckpts/epoch-xxx
cd evaluation
python3 semantic3d_merge.py -d <path to *_pred.h5> -v <reduced or full>

If you prefer to process the data yourself, here are the steps we used. In general, the preprocessing of this dataset is more involved. First, please download the original Semantic3D dataset. We then downsample the data using this script. Finally, we follow PointCNN's script to split the data into training and validation sets and prepare the .h5 files.
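
For reference, the idea of grid downsampling can be sketched in a few lines of numpy (an illustration only; the actual pipeline uses the linked script, and the voxel size here is a made-up default):

import numpy as np

def grid_downsample(points, voxel_size=0.05):
    # Map each point to a voxel and keep the first point seen in each voxel.
    voxels = np.floor(points / voxel_size).astype(np.int64)
    _, keep = np.unique(voxels, axis=0, return_index=True)
    return points[np.sort(keep)]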

License

This repository is released under MIT License (see LICENSE file for details).


Issues

Trained model for the S3DIS dataset

Hello, thank you very much for your work. Do you have a trained model on the S3DIS dataset? Would it be convenient to share it? Thank you!

FLOPs Problems

Thanks for sharing your great work on efficient point cloud processing. I am curious about the way FLOPs are computed, especially for inference. Looking forward to your reply.

Cannot run the file "train_val_seg.py"

Hello, thanks for sharing the code. When I tried the outdoor dataset and ran "train_val_seg.py", I encountered this error:
ModuleNotFoundError: No module named 'transforms3d'
Can you help me solve the problem or give me some tips? Thanks very much!
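
Since transforms3d is a package available on PyPI, a likely fix for this error is simply to install it:

pip3 install transforms3d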

Issues with the function "knn_indices_general" (in ./shellnet/utils/pointfly.py, line 159)

I appreciate your nice work. I just read through the code and found some problems in your function knn_indices_general (in ./shellnet/utils/pointfly.py, line 159):

def knn_indices_general(queries, points, k, sort=True, unique=True):
    queries_shape = tf.shape(queries)
    batch_size = queries_shape[0]
    point_num = queries_shape[1]
    tmp_k = 0
    D = batch_distance_matrix_general(queries, points)
    if unique:
        prepare_for_unique_top_k(D, points)
    _, point_indices = tf.nn.top_k(-D, k=k+tmp_k, sorted=sort)  # (N, P, K)
    # point_indices = tf.contrib.framework.argsort(D)
    # point_indices = point_indices[:,:,:k]
    batch_indices = tf.tile(tf.reshape(tf.range(batch_size), (-1, 1, 1, 1)), (1, point_num, k, 1))
    indices = tf.concat([batch_indices, tf.expand_dims(point_indices[:,:,tmp_k:], axis=3)], axis=3)
    return indices

Here, the function prepare_for_unique_top_k does not actually change the value of D. Here is its code:

# A shape is (N, P, C)
def find_duplicate_columns(A):
    N = A.shape[0]
    P = A.shape[1]
    indices_duplicated = np.fill((N, 1, P), 1, dtype=np.int32)
    for idx in range(N):
        _, indices = np.unique(A[idx], return_index=True, axis=0)
        indices_duplicated[idx, :, indices] = 0
    return indices_duplicated

def prepare_for_unique_top_k(D, A):
    indices_duplicated = tf.py_function(find_duplicate_columns, [A], tf.int32)
    D += tf.reduce_max(D)*tf.cast(indices_duplicated, tf.float32)

Firstly, the numpy module has no attribute 'fill'; I guess you meant np.full here. Secondly, just passing D into the function will not change its value outside the function: D += ... rebinds the local name only, and since the augmented D is never used, the graph will never even execute prepare_for_unique_top_k. Maybe you could write the code as follows:

def find_duplicate_columns(A):
    N = A.shape[0]
    P = A.shape[1]
    indices_duplicated = np.full((N, 1, P), 1, dtype=np.int32)
    for idx in range(N):
        _, indices = np.unique(A[idx], return_index=True, axis=0)
        indices_duplicated[idx, :, indices] = 0
    return indices_duplicated

def prepare_for_unique_top_k(D, A):
    indices_duplicated = tf.py_function(find_duplicate_columns, [A], tf.int32)
    D += tf.reduce_max(D)*tf.cast(indices_duplicated, tf.float32)
    return D

def knn_indices_general(queries, points, k, sort=True, unique=True):
    queries_shape = tf.shape(queries)
    batch_size = queries_shape[0]
    point_num = queries_shape[1]
    tmp_k = 0
    D = batch_distance_matrix_general(queries, points)
    if unique:
        D = prepare_for_unique_top_k(D, points)
    _, point_indices = tf.nn.top_k(-D, k=k+tmp_k, sorted=sort)  # (N, P, K)
    batch_indices = tf.tile(tf.reshape(tf.range(batch_size), (-1, 1, 1, 1)), (1, point_num, k, 1))
    indices = tf.concat([batch_indices, tf.expand_dims(point_indices[:,:,tmp_k:], axis=3)], axis=3)
    return indices

FileNotFoundError: [Errno 2] No such file or directory: '../data/modelnet/train_files.txt'

2020-01-08 16:18:11.355997-Preparing datasets...
Traceback (most recent call last):
  File "train_val_cls.py", line 232, in <module>
    main()
  File "train_val_cls.py", line 52, in main
    [data_train, label_train] = provider.load_cls_files(setting.filelist)
  File "/home/gnss/桌面/shellnet-master/utils/provider.py", line 76, in load_cls_files
    for line in open(filelist):
FileNotFoundError: [Errno 2] No such file or directory: '../data/modelnet/train_files.txt'

The performance of OA on ModelNet40

I'm running a comparative experiment on different point cloud frameworks recently. I only get an OA of 92.1% with ShellNet, but it is 93.1% in the paper.

eval acc (oa): 0.920989 ---- eval acc (mean class): 0.879494 ---- time cost: 9.074277

I used the default settings with ss=16, batch_size=32, and multi=True.

So I wonder which result I should report. Looking forward to your reply.

The variable names qrs and pts are misleading!

In the paper, the authors use q to denote query points and p to denote representative points, but in the code, qrs is used for the representative points and pts for the query points. This mix-up makes the code very hard to comprehend for a newcomer like me. Hope you can change the names or at least add a note in the README. Thanks, on behalf of other possibly dazed people; it took me a whole day to work this out.

About the online test on Semantic3D

What is the file format of the prediction results that need to be uploaded when testing on the Semantic3D dataset online? Can you describe it in detail? The website description is very unclear. Looking forward to your reply.

Problem with semantic3d merge

I found that half of the .label file is label '1', and the total number of labels does not match the point number in the h5 file.
I don't know if I understand correctly: you downsampled with Intel's script but PointCNN didn't, therefore in the merge script the point numbers should not be the same.

Can your code visualize the whole-room segmentation results?

Hello, I found a problem in PointCNN: after running segmentation on the S3DIS and ScanNet datasets, we only obtain .ply files for part of the whole room, so we cannot visualize the whole-room segmentation results. Can your code visualize the whole-room segmentation results? Looking forward to your reply.

Issues with semantic3d_merge.py

I am trying to visualize the labels returned in results/*.txt from semantic3d_merge.py. However, it seems that the labels do not correspond to the original order of the txt files used as input to data_conversions/prepare_semantic3d_data.py. Can you give me some information about how to pair the final labels with their original x, y, z coordinates?

Code optimization for dropout operation

Thanks for your nice work. I just read through the code and found a small problem in the dropout operation:

layer_fts = tf.layers.dropout(layer_fts, rate=dropout_rate, name='fc{:d}_dropout'.format(layer_idx))

Actually, this dropout operation has no effect if you don't pass training=is_training to it. The default value of the training parameter is False, so it returns the input untouched.

But it's OK; it doesn't influence the experiments. Maybe you can fix it in your spare time, or just remove the dropout to make the code clearer.
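
A minimal sketch of the suggested fix, passing the training flag through so dropout is active only while training:

layer_fts = tf.layers.dropout(layer_fts, rate=dropout_rate, training=is_training,
                              name='fc{:d}_dropout'.format(layer_idx))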

Question about preprocessing the dataset

Hello, respected author, may I ask a question about this step: "We then downsample the data using this script. Finally, we follow PointCNN's script to split the data into training and validation set." Is there corresponding code for those two steps in the ShellNet scripts? Thanks in advance.

cudnn PoolForward launch failed

When I try to train the network, an error occurs: cudnn PoolForward launch failed, in tf.layers.max_pooling2d in shellconv.py. Is it related to my TensorFlow version? Can you please tell me how to resolve it?
My environment information is:
tensorflow-gpu 1.10
CUDA 9.0
