uber-research / sbnet

Sparse Blocks Networks

License: Other

Python 65.85% Shell 0.26% Makefile 0.77% C++ 26.16% C 0.66% Cuda 6.30%

uber neuralnetworks neuralnetwork tensorflow python sbnet

sbnet's Introduction

Sparse Blocks Network (SBNet)

This repository releases code for our paper SBNet: Sparse Blocks Network for Fast Inference. Please refer to our blog post for more context. Note that benchmarking in the paper was performed with an older version of this repo using TensorFlow 1.2, cuDNN 6.1 and commit cf8ea06.

This repository contains

a TensorFlow custom operations library that implements SBNet,
a Python implementation of sparse ResNet blocks, and
a benchmark for performance comparison with Submanifold Sparse Convolutional Networks.

Prerequisites

Installation was tested under Ubuntu 14.04 and 16.04 with TensorFlow 1.8, CUDA 9.0 and cuDNN 7.1.

Hardware requirements

Code was tested on and compiled for NVIDIA CUDA compute capabilities 6.1, 6.0, 5.2 and 7.0 (Titan XP, GTX 1080Ti, GTX 1080, P100, V100, TitanV, and most Maxwell cards). To compile for an older architecture, modify the Makefile and add the corresponding line, such as -gencode arch=compute_50,code=sm_50 for older cards such as laptop Maxwell GPUs. Please refer to the CUDA Wikipedia page to look up the architecture code for your graphics card.

Setup

To build a release version of the library, run

cd sbnet_tensorflow/sbnet_ops && make

To run tests:

cd sbnet_tensorflow/sbnet_ops && make test

The library will be built in sbnet_tensorflow/sbnet_ops/build/libsbnet.so and symlinked to sbnet_tensorflow/sbnet_ops/libsbnet.so. To import the library into your TensorFlow Python code use the following command:

sbnet_module = tf.load_op_library('path_to_library/libsbnet.so')

The following TensorFlow ops are implemented in the op library:

sbnet_module.reduce_mask

sbnet_module.sparse_gather

sbnet_module.sparse_scatter

reduce_mask op converts a dense mask to a list of active block indices.

In the following snippet the mask is expected to be a tensor of dimensions [N,H,W,1]:

    indices = sbnet_module.reduce_mask(
        mask, tf.constant([BCH, BCW], dtype=tf.int32),
        bsize=[BSZH, BSZW],
        boffset=[BOFFSH, BOFFSW],
        bstride=[BSTRH, BSTRW],
        tol=0.5, # pooling threshold to consider a block as active
        avgpool=True) # max pooling by default

[BCH, BCW] are the block counts in the height and width dimensions. [BSZH, BSZW], [BOFFSH, BOFFSW] and [BSTRH, BSTRW] are the block sizes, offsets and strides in the H and W dimensions. reduce_mask performs a max pooling (or average pooling) operation localized to each block, then generates a list of index triples [(ni, hi, wi)] for the blocks whose pooled value exceeds the specified tolerance tol. In numpy terms each block is defined as a slice of the input mask of dimensions [N,H,W,1], with the following bounds: [ni, BOFFSH+BSTRH*hi : BOFFSH+BSTRH*hi+BSZH, BOFFSW+BSTRW*wi : BOFFSW+BSTRW*wi+BSZW, :].
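
As a rough illustration of the semantics described above, the following NumPy sketch enumerates the active blocks on the CPU. It is not the CUDA implementation: the function name reduce_mask_reference is hypothetical, treating out-of-range cells as zeros and comparing with a strict > are assumptions, and the order of the returned triples need not match the real op.

```python
import numpy as np


def reduce_mask_reference(mask, bcount, bsize, boffset, bstride,
                          tol=0.5, avgpool=False):
    """CPU sketch of block selection (hypothetical helper, not the SBNet op).

    mask is [N, H, W, 1]; returns an int16 array of (ni, hi, wi) triples.
    """
    n = mask.shape[0]
    active = []
    for ni in range(n):
        for hi in range(bcount[0]):
            for wi in range(bcount[1]):
                h0 = boffset[0] + bstride[0] * hi
                w0 = boffset[1] + bstride[1] * wi
                # Clip negative bounds; NumPy clips the upper bounds itself.
                # Assumption: cells outside the mask contribute zeros.
                block = mask[ni,
                             max(h0, 0):max(h0 + bsize[0], 0),
                             max(w0, 0):max(w0 + bsize[1], 0),
                             0]
                if avgpool:
                    pooled = block.sum() / float(bsize[0] * bsize[1])
                else:
                    pooled = block.max() if block.size else 0.0
                if pooled > tol:
                    active.append((ni, hi, wi))
    return np.array(active, dtype=np.int16).reshape(-1, 3)


# A 4x4 mask with a single active pixel, split into four 2x2 blocks.
mask = np.zeros((1, 4, 4, 1), dtype=np.float32)
mask[0, 0, 0, 0] = 1.0
idx_max = reduce_mask_reference(mask, [2, 2], [2, 2], [0, 0], [2, 2], tol=0.5)
idx_avg = reduce_mask_reference(mask, [2, 2], [2, 2], [0, 0], [2, 2],
                                tol=0.5, avgpool=True)
```

With max pooling the top-left block is selected because its maximum is 1.0. With average pooling the same block averages 0.25, which does not exceed tol=0.5, so no block is returned.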

The resulting list of indices can then be passed to two other operations: sbnet_module.sparse_scatter and sbnet_module.sparse_gather.

The following snippets illustrate the use of these operations:

    blockStack = sbnet_module.sparse_gather(
        x,
        indices.bin_counts,
        indices.active_block_indices,
        bsize=[BSZH, BSZW], # block size
        boffset=[BOFFSH, BOFFSW], # block offset
        bstride=[BSTRH, BSTRW], # block stride
        transpose=do_transpose)

This operation will use the indices generated by reduce_mask and slice out tensors of channel depth C out of input tensor x of dimensions [N,H,W,C] as illustrated in the following pseudo-code snippet:

    for bi, (ni, hi, wi) in enumerate(indices.active_block_indices):
        channel_slice = x[ni, BOFFSH+BSTRH*hi : BOFFSH+BSTRH*hi+BSZH, BOFFSW+BSTRW*wi : BOFFSW+BSTRW*wi+BSZW, :]
        blockStack[bi, :, :, :] = channel_slice  # bi is the position of the block in the stack

If do_transpose is true, a fused transpose operation will also be performed and the resulting tensor will have dimensions [nBlocks, C, BSZH, BSZW]. Any out-of-range values will be padded with zeroes.
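
The gather semantics, including zero padding and the optional transpose, can be sketched in NumPy as follows. This is an illustrative CPU reference under stated assumptions, not the library's kernel: sparse_gather_reference is a hypothetical name, and the sketch assumes that the i-th active block lands at position i of the output stack.

```python
import numpy as np


def sparse_gather_reference(x, active_block_indices, bsize, boffset, bstride,
                            transpose=False):
    """CPU sketch of the gather step (hypothetical helper, not the SBNet op).

    x is [N, H, W, C]; returns [nBlocks, BSZH, BSZW, C],
    or [nBlocks, C, BSZH, BSZW] when transpose is True.
    """
    _, h, w, c = x.shape
    stack = np.zeros((len(active_block_indices), bsize[0], bsize[1], c),
                     dtype=x.dtype)
    for bi, (ni, hi, wi) in enumerate(active_block_indices):
        h0 = boffset[0] + bstride[0] * hi
        w0 = boffset[1] + bstride[1] * wi
        # Intersect the block window with the valid tensor area.
        hs, he = max(h0, 0), min(h0 + bsize[0], h)
        ws, we = max(w0, 0), min(w0 + bsize[1], w)
        if hs < he and ws < we:
            # Cells outside the tensor keep their zero padding.
            stack[bi, hs - h0:he - h0, ws - w0:we - w0, :] = x[ni, hs:he, ws:we, :]
    if transpose:
        stack = stack.transpose(0, 3, 1, 2)
    return stack


x = np.arange(1 * 4 * 4 * 2, dtype=np.float32).reshape(1, 4, 4, 2)
indices = [(0, 0, 0), (0, 1, 1)]
# 3x3 blocks with stride 2 and offset -1: the first block overhangs the border.
stack = sparse_gather_reference(x, indices, [3, 3], [-1, -1], [2, 2])
stack_t = sparse_gather_reference(x, indices, [3, 3], [-1, -1], [2, 2],
                                  transpose=True)
```

Here the first block starts at row and column -1, so its first row and column stay zero, while the second block is copied entirely from x[0, 1:4, 1:4, :].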

The inverse operation is sbnet_module.sparse_scatter. The following snippet illustrates its use:

    y = sbnet_module.sparse_scatter(
        blockStack,
        indices.bin_counts,
        indices.active_block_indices,
        x, # base tensor to copy to output and overwrite on top of
        bsize=[BSZH, BSZW],
        boffset=[BOFFSH, BOFFSW],
        bstride=[BSTRH, BSTRW],
        add=do_add,
        atomic=False, # use atomic or regular adds
        transpose=do_transpose)

Note that due to a limitation of TensorFlow API an intermediate tensor cannot be modified in place unless it's specified to be a tf.Variable. This necessitates creating an intermediate tensor inside the op and performing a copy which has negative implications for performance. So we created a second version of the op sbnet_module.sparse_scatter_var that expects x to be a tf.Variable and modifies it in place. Using sparse_scatter_var is strongly recommended for maximum performance.

The effect of this operation is the opposite of sparse_gather: the input blocks will be written on top of the base tensor x, or added to its contents if do_add is True. The following pseudo-code snippet illustrates the semantics of sparse_scatter:

    # bi is the position of the block in blockStack
    for bi, (ni, hi, wi) in enumerate(indices.active_block_indices):
        if do_add:
            x[ni, BOFFSH+BSTRH*hi : BOFFSH+BSTRH*hi+BSZH, BOFFSW+BSTRW*wi : BOFFSW+BSTRW*wi+BSZW, :]\
                += blockStack[bi, :, :, :]
        else:
            x[ni, BOFFSH+BSTRH*hi : BOFFSH+BSTRH*hi+BSZH, BOFFSW+BSTRW*wi : BOFFSW+BSTRW*wi+BSZW, :]\
                = blockStack[bi, :, :, :]

So the blocks are 'put back in place'; however, the sizes and strides can be different from those passed to sparse_gather. This enables implementation of sparse ResNet blocks where output resolution is reduced after a 'VALID' convolution. Similar to sparse_gather, if do_transpose is true, a fused transpose operation will also be performed by sparse_scatter, permuting the input [N,C,H,W] dimensions to [N,H,W,C] in the output. Typically the block size for a 'VALID' convolution is reduced by 2 in each spatial dimension for each 3x3 convolution, thus creating non-overlapping outputs. Note that although we currently support atomic adds in scatter with add=True, the gradient is not implemented at this time if overlapping scatters are used in the forward pass.
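
The scatter semantics can be sketched in NumPy in the same spirit. This is a CPU illustration under stated assumptions, not the library's kernel: sparse_scatter_reference is a hypothetical name, dropping the parts of a block that fall outside the base tensor is an assumption, and the base tensor is copied, much as the non-variable op is described as doing above.

```python
import numpy as np


def sparse_scatter_reference(block_stack, active_block_indices, ybase,
                             bsize, boffset, bstride,
                             add=False, transpose=False):
    """CPU sketch of the scatter step (hypothetical helper, not the SBNet op).

    block_stack is [nBlocks, BSZH, BSZW, C], or [nBlocks, C, BSZH, BSZW]
    when transpose is True; ybase is [N, H, W, C].
    """
    y = ybase.copy()
    if transpose:
        # Undo the channels-first layout before writing into NHWC.
        block_stack = block_stack.transpose(0, 2, 3, 1)
    _, h, w, _ = y.shape
    for bi, (ni, hi, wi) in enumerate(active_block_indices):
        h0 = boffset[0] + bstride[0] * hi
        w0 = boffset[1] + bstride[1] * wi
        hs, he = max(h0, 0), min(h0 + bsize[0], h)
        ws, we = max(w0, 0), min(w0 + bsize[1], w)
        if hs >= he or ws >= we:
            continue  # the block lies entirely outside the base tensor
        src = block_stack[bi, hs - h0:he - h0, ws - w0:we - w0, :]
        if add:
            y[ni, hs:he, ws:we, :] += src
        else:
            y[ni, hs:he, ws:we, :] = src
    return y


# Two 3x3 output blocks, e.g. 5x5 gathered blocks after one 3x3 'VALID' conv.
blocks = np.stack([np.full((3, 3, 1), 1.0, dtype=np.float32),
                   np.full((3, 3, 1), 2.0, dtype=np.float32)])
indices = [(0, 0, 0), (0, 1, 1)]
base = np.zeros((1, 6, 6, 1), dtype=np.float32)
y = sparse_scatter_reference(blocks, indices, base, [3, 3], [0, 0], [3, 3])
y_add = sparse_scatter_reference(blocks, indices, np.ones_like(base),
                                 [3, 3], [0, 0], [3, 3], add=True)
```

Because the scatter stride equals the scatter block size, the two 3x3 tiles do not overlap, which is the case where the distinction between regular and atomic adds does not matter.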

Benchmarks and tests

Benchmarks for SBNet are located in the sbnet_tensorflow/benchmarks/ subdirectory.

To run benchmarks execute:

cd sbnet_tensorflow/benchmarks && ./run_all_behchmarks.bash

Note that we average over a number of runs and test many permutations of parameters so this may take about 20 minutes (on a Titan XP) and will produce a number of .csv files in your /home/user/ directory. We benchmark individual sparse convolutions and entire sparse ResNet blocks on a synthetic mask with variable sparsity.

To run unit tests execute:

cd sbnet_tensorflow/sbnet_ops && make tests

Submanifold Sparse Convolutional Networks Benchmark

For comparison we implemented benchmarking code for Submanifold Sparse Convolutional Networks. Running this benchmark requires the Submanifold Sparse Convolutions Python package to be installed:

git clone https://github.com/facebookresearch/SparseConvNet.git

Follow the setup instructions in the SparseConvNet repo.

Code integration with Submanifold Sparse Convolutions was tested with git sha 609224df3c0e42b8a1dd4073aaa56fab805096c6. To reset the repo to this sha use the following sequence of commands:

cd SparseConvNet
git checkout 609224df3c0e42b8a1dd4073aaa56fab805096c6

The benchmark code is located in the sbnet_tensorflow/benchmark_submanifold directory.

Other notes

The current code is not tuned for performance with non-square block sizes and has specialized implementations only for a specific list of block sizes. This list includes square blocks of sizes 1 to 34 and a few others. To achieve maximum performance for a block size outside this list, you would need to add your own template instantiations by modifying the SIZE_TEMPLATES macro in sparse_gather.cu.

Contributing to this repository

For now, we do not accept pull requests to this repo, as we are currently setting up automated CI. If you would like to contribute to this repository, feel free to create a GitHub issue.

Citation

If you use our code, please consider citing the following: M. Ren, A. Pokrovsky, B. Yang, and R. Urtasun. SBNet: Sparse Blocks Network for Fast Inference. CoRR, abs/1801.02108, 2018.

@article{ren18sbnet,
  author    = {Mengye Ren and 
               Andrei Pokrovsky and
               Bin Yang and
               Raquel Urtasun},
  title     = {SBNet: Sparse Blocks Network for Fast Inference},
  journal   = {CoRR},
  volume    = {abs/1801.02108},
  year      = {2018},
}

sbnet's Issues

How can I prevent the graph from growing when using sbnet_module.reduce_mask in a loop?

The main code creates randomly changing mask indices on every loop iteration using sbnet_module.

batch_size = 50
block_params_conv1 = calc_block_params([batch_size, 28, 28, 1],
                                       [1, 5, 5, 1],
                                       [5, 5, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
t_check = Timer()

print ("Starting 1st session...")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        t_check.tic()

        mask_conv1 = generate_random_mask([batch_size, 28, 28, 1], 0.90)
        a_conv1 = tf.constant(mask_conv1, dtype=tf.float32)
        b_conv1 = sbnet_module.reduce_mask(a_conv1,
                      tf.constant(block_params_conv1.bcount, dtype=tf.int32),
                      bsize=block_params_conv1.bsize,
                      boffset=block_params_conv1.boffset,
                      bstride=block_params_conv1.bstrides,
                      tol=0.0,
                      avgpool=True)
        ind_val_conv1, bin_val_conv1 = sess.run([b_conv1.active_block_indices,
                                                 b_conv1.bin_counts])

        if i % 100 == 0:
            time_check = t_check.toc()
            print('step %d  \t= %f (sec)' % (i, time_check))

Each iteration got slower as the session was run more times.

Starting 1st session...
step    0  = 0.006941 (sec)
step  100  = 0.008982 (sec)
step  200  = 0.012545 (sec)
step  300  = 0.016614 (sec)
step  400  = 0.018152 (sec)
step  500  = 0.023373 (sec)
step  600  = 0.026576 (sec)
step  700  = 0.028291 (sec)
step  800  = 0.031587 (sec)
step  900  = 0.037221 (sec)
step 1000  = 0.043062 (sec)
step 1100  = 0.048337 (sec)
step 1200  = 0.055366 (sec)
step 1300  = 0.060677 (sec)
step 1400  = 0.058936 (sec)
step 1500  = 0.072439 (sec)
step 1600  = 0.068025 (sec)
step 1700  = 0.073672 (sec)
step 1800  = 0.077006 (sec)
step 1900  = 0.083827 (sec)

So, my question is:

I want to run sbnet_module with a mask that changes randomly at every detection step, as in the SBNet + Predicted Mask experiment, which as far as I can tell runs sbnet_module.reduce_mask on every detection.
However, my results suggest that each call to sbnet_module.reduce_mask grows the TensorFlow graph, so detection slows down over time.
How can I use sbnet_module.reduce_mask in this situation without the slowdown?

cuda_helpers.h:23:28: fatal error: cuda_jetbrains.h: No such file or directory

Hi, I have a compilation error. It complains about cuda_jetbrains.h, I couldn't find anything about such a file online. How to resolve this?

Performance of sparse_scatter_var is not always better than sparse_scatter

The description says that using mutable tensors (tf.Variable) would be more efficient because of the in-place copy, but in some scenarios, depending on the input size to the SBNet block, performance degrades with sparse_scatter_var compared to sparse_scatter. What is the tradeoff of using it?

Huge loss in accuracy using SBNet

I was able to see the claimed speedup in inference time, but there is a huge loss in accuracy. Please see a simple MNIST example where I artificially created the images and masks. I am seeing a precision of about 11%, which is no better than random guessing on a 10-class problem. I also tried training for longer, but the precision saturates at about 11%. Without SBNet you can reach a precision of ~80% after a few epochs.

MNIST_sbnet_repr.py.tar.gz

Unable to cast to tf.Variable after sparse_scatter operation

I am trying to use sparse_scatter_var with multiple CONV layers chained together. After the first sparse scatter operation, I am unable to cast a tf.Tensor to a Variable, which throws me a ValueError
ValueError: Input 'ybase' passed float expected ref type while building NodeDef 'Variable_4/SparseScatterVar_Variable_4_0' using Op<name=SparseScatterVar; signature=x:T, bin_counts:int32, active_block_indices:int16, ybase:Ref(T), dynamic_bsize:int32, dynamic_bstride:int32, dynamic_boffset:int32 -> y:Ref(T); attr=T:type,allowed=[DT_FLOAT]; attr=add:bool; attr=atomic:bool,default=false; attr=transpose:bool,default=false>

Is the detection model used in the SBNet paper open?

I want to use the model on KITTI LiDAR dataset.
But I only find the test and implementation codes for operators.
I cannot find the model definition. So I'd like to know whether the model is open-sourced.

Tensorflow 1.11.0 Error

You have an error with TF 1.11.0. It's impossible to build the library.

I managed to build it with TF 1.8.0 / 1.9.0 / 1.10.0, but with TF 1.11.0 it's broken.

CXX11_ABI Flag

Why was ABI flag set to 0 in Makefile?
ABI=-D_GLIBCXX_USE_CXX11_ABI=0

is there a pytorch version?

Hi, I just want to know whether there is a PyTorch version of the sbnet operations.

Is there any way to reduce computation time of sbnet_module.reduce_mask?

I have built the yolov2 object detector using sbnet, but it takes too long to compute the result of sbnet_module.reduce_mask.
I need to compute sbnet_module.reduce_mask every frame, because the mask changes every frame.

# yolov2 with dense convnet
Forwarding 1 inputs ...
Forwarding time = 0.0349409580231 sec

# yolov2 with sbnet (sparsity = 0.92)
Forwarding 1 inputs ...
Fowarding time = 0.0198512077332 sec
 + time(sbnet_module.reduce_mask) = 0.0325801372528 sec

When I applied SBNet to the yolov2 (darknet) model, forwarding was about 1.7 times faster. However, computing the reduce_mask results, which are needed to perform sparse_gather and sparse_scatter, took longer than expected.

Below is my code to compute reduce_mask for conv1s ~ conv5.
Currently this code takes 0.03 sec to execute, which is too slow given that the detector's forwarding time is itself almost 0.03 sec.
Is there a faster way to compute sbnet_module.reduce_mask?

# compute block_params for all different size in conv1s ~ conv5
block_params_p0_k3 = calc_block_params([1, 416, 864, None],
                                       [1, 34, 34, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p1_k3 = calc_block_params([1, 208, 432, None],
                                       [1, 18, 18, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p2_k3 = calc_block_params([1, 104, 216, None],
                                       [1, 10, 10, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p2_k1 = calc_block_params([1, 104, 216, None],
                                       [1, 8, 8, 1],
                                       [1, 1, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p3_k3 = calc_block_params([1, 52, 108, None],
                                       [1, 6, 6, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p3_k1 = calc_block_params([1, 52, 108, None],
                                       [1, 4, 4, 1],
                                       [1, 1, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p4_k3 = calc_block_params([1, 26, 54, None],
                                       [1, 4, 4, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p4_k1 = calc_block_params([1, 26, 54, None],
                                       [1, 2, 2, 1],
                                       [1, 1, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p5_k3 = calc_block_params([1, 13, 27, None],
                                       [1, 3, 3, 1],
                                       [3, 3, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')
block_params_p5_k1 = calc_block_params([1, 13, 27, None],
                                       [1, 1, 1, 1],
                                       [1, 1, 1, 1],
                                       [1, 1, 1, 1],
                                       padding='VALID')

# compute random binary mask depending on bndbox for conv1s ~ conv5
mask_p0 = np.zeros([1, 416, 864], dtype=np.float32)
if bndbox:
    for bbox in bndbox:
        xmin = max(0, int(round(bbox[0] * 416)))
        ymin = max(0, int(round(bbox[1] * 864)))
        xmax = min(int(round(bbox[2] * 416)), 415)
        ymax = min(int(round(bbox[3] * 864)), 863)
        mask_p0[:, xmin:xmax, ymin:ymax] = 1.0

mask_p5 = block_reduce(mask_p0, (1, 32, 32), np.max)
mask_p4 = mask_p5.repeat(2, axis=1).repeat(2, axis=2)
mask_p3 = mask_p4.repeat(2, axis=1).repeat(2, axis=2)
mask_p2 = mask_p3.repeat(2, axis=1).repeat(2, axis=2)
mask_p1 = mask_p2.repeat(2, axis=1).repeat(2, axis=2)
mask_p0 = mask_p1.repeat(2, axis=1).repeat(2, axis=2)

with tf.Graph().as_default():
    mask_p0_tf = tf.constant(mask_p0, dtype=tf.float32)
    mask_p1_tf = tf.constant(mask_p1, dtype=tf.float32)
    mask_p2_tf = tf.constant(mask_p2, dtype=tf.float32)
    mask_p3_tf = tf.constant(mask_p3, dtype=tf.float32)
    mask_p4_tf = tf.constant(mask_p4, dtype=tf.float32)
    mask_p5_tf = tf.constant(mask_p5, dtype=tf.float32)

    # compute sbnet for conv1s ~ conv5
    sbnet_p0_k3 = sbnet_module.reduce_mask(mask_p0_tf,
        tf.constant(block_params_p0_k3.bcount, dtype=tf.int32),
         bsize=block_params_p0_k3.bsize,
         boffset=block_params_p0_k3.boffset,
         bstride=block_params_p0_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p1_k3 = sbnet_module.reduce_mask(mask_p1_tf,
         tf.constant(block_params_p1_k3.bcount, dtype=tf.int32),
         bsize=block_params_p1_k3.bsize,
         boffset=block_params_p1_k3.boffset,
         bstride=block_params_p1_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p2_k3 = sbnet_module.reduce_mask(mask_p2_tf,
         tf.constant(block_params_p2_k3.bcount, dtype=tf.int32),
         bsize=block_params_p2_k3.bsize,
         boffset=block_params_p2_k3.boffset,
         bstride=block_params_p2_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p2_k1 = sbnet_module.reduce_mask(mask_p2_tf,
         tf.constant(block_params_p2_k1.bcount, dtype=tf.int32),
         bsize=block_params_p2_k1.bsize,
         boffset=block_params_p2_k1.boffset,
         bstride=block_params_p2_k1.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p3_k3 = sbnet_module.reduce_mask(mask_p3_tf,
         tf.constant(block_params_p3_k3.bcount, dtype=tf.int32),
         bsize=block_params_p3_k3.bsize,
         boffset=block_params_p3_k3.boffset,
         bstride=block_params_p3_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p3_k1 = sbnet_module.reduce_mask(mask_p3_tf,
         tf.constant(block_params_p3_k1.bcount, dtype=tf.int32),
         bsize=block_params_p3_k1.bsize,
         boffset=block_params_p3_k1.boffset,
         bstride=block_params_p3_k1.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p4_k3 = sbnet_module.reduce_mask(mask_p4_tf,
         tf.constant(block_params_p4_k3.bcount, dtype=tf.int32),
         bsize=block_params_p4_k3.bsize,
         boffset=block_params_p4_k3.boffset,
         bstride=block_params_p4_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p4_k1 = sbnet_module.reduce_mask(mask_p4_tf,
         tf.constant(block_params_p4_k1.bcount, dtype=tf.int32),
         bsize=block_params_p4_k1.bsize,
         boffset=block_params_p4_k1.boffset,
         bstride=block_params_p4_k1.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p5_k3 = sbnet_module.reduce_mask(mask_p5_tf,
         tf.constant(block_params_p5_k3.bcount, dtype=tf.int32),
         bsize=block_params_p5_k3.bsize,
         boffset=block_params_p5_k3.boffset,
         bstride=block_params_p5_k3.bstrides,
         tol=0.0,
         avgpool=True)
    sbnet_p5_k1 = sbnet_module.reduce_mask(mask_p5_tf,
         tf.constant(block_params_p5_k1.bcount, dtype=tf.int32),
         bsize=block_params_p5_k1.bsize,
         boffset=block_params_p5_k1.boffset,
         bstride=block_params_p5_k1.bstrides,
         tol=0.0,
         avgpool=True)

    with tf.Session() as sess:
        ind_val_p0_k3, ind_val_p1_k3, \
        ind_val_p2_k3, ind_val_p2_k1, \
        ind_val_p3_k3, ind_val_p3_k1, \
        ind_val_p4_k3, ind_val_p4_k1, \
        ind_val_p5_k3, ind_val_p5_k1, \
        bin_val_p0_k3, bin_val_p1_k3, \
        bin_val_p2_k3, bin_val_p2_k1, \
        bin_val_p3_k3, bin_val_p3_k1, \
        bin_val_p4_k3, bin_val_p4_k1, \
        bin_val_p5_k3, bin_val_p5_k1 = \
            sess.run([sbnet_p0_k3.active_block_indices,
                      sbnet_p1_k3.active_block_indices,
                      sbnet_p2_k3.active_block_indices,
                      sbnet_p2_k1.active_block_indices,
                      sbnet_p3_k3.active_block_indices,
                      sbnet_p3_k1.active_block_indices,
                      sbnet_p4_k3.active_block_indices,
                      sbnet_p4_k1.active_block_indices,
                      sbnet_p5_k3.active_block_indices,
                      sbnet_p5_k1.active_block_indices,
                      sbnet_p0_k3.bin_counts,
                      sbnet_p1_k3.bin_counts,
                      sbnet_p2_k3.bin_counts,
                      sbnet_p2_k1.bin_counts,
                      sbnet_p3_k3.bin_counts,
                      sbnet_p3_k1.bin_counts,
                      sbnet_p4_k3.bin_counts,
                      sbnet_p4_k1.bin_counts,
                      sbnet_p5_k3.bin_counts,
                      sbnet_p5_k1.bin_counts])

# After that, these values go into feed_dict.

AttributeError: module 'f21c708d1ddc75dcce283dd13fe531f7' has no attribute 'sparse_gather'

File "/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 123, in _sparse_scatter_grad
dout_dx = sbnet_module.sparse_gather(
AttributeError: module 'f21c708d1ddc75dcce283dd13fe531f7' has no attribute 'sparse_gather'

Time cost increases

Hi. Thanks for the codes and the detailed instruction.

I implemented sparse convolution into my encoder:

with tf.variable_scope('featureEncoder'):
	auxiShape = (self.inputShape[0], self.inputShape[1], self.inputShape[2], 7)
	featureShape = (self.inputShape[0], self.inputShape[1], self.inputShape[2], 32)
	blockSize = 8
	blockStride = (8,8)
	blockOffset = (0,0)
	blockCount = (self.divup(self.inputShape[1], blockStride[0]), self.divup(self.inputShape[2], blockStride[1]))
	inBlockParams = { "dynamic_bsize": (blockSize, blockSize), "dynamic_boffset": blockOffset, "dynamic_bstride": blockStride }
	outBlockParams = { "dynamic_bsize": (blockSize, blockSize), "dynamic_boffset": blockOffset, "dynamic_bstride": blockStride }
	
	if not self.training:
		indices = sbnet_module.reduce_mask(self.mask, blockCount, tol=0.1, **inBlockParams)
	
		# stack active overlapping tiles to batch dimension
		stack = sbnet_module.sparse_gather(
			auxi, indices.bin_counts, indices.active_block_indices, transpose=False, **inBlockParams)
	else:
		stack = auxi
	# perform dense convolution on a sparse stack of tiles
	stack = self.conv_layer2(stack, 7, 32, name='1')
	stack = tf.nn.leaky_relu(stack)
	stack = self.conv_layer2(stack, 32,32, name='2')
	stack = tf.nn.leaky_relu(stack)
	stack = self.conv_layer2(stack, 32,32, name='3')
	stack = tf.nn.leaky_relu(stack)
	stack = self.conv_layer2(stack, 32,32, name='4')
	stack = tf.nn.leaky_relu(stack)
	stack = self.conv_layer2(stack, 32,32, name='5')
	stack = tf.nn.leaky_relu(stack)

	# write/scatter the tiles back on top of original tensor
	# note that the output tensor is reduced by 1 on each side due to 'VALID' convolution
	if not self.training:
		feature=sbnet_module.sparse_scatter(
			stack, indices.bin_counts, indices.active_block_indices,
			self.lastFeature, transpose=False, add=False, atomic=False, **outBlockParams)
		feature.set_shape(featureShape)
	else:
		feature=stack

self.training is set False when training and True when testing. The mask is generated outside the network and fed in via tf.placeholder, and so is self.lastFeature.

I tried to measure the inference time with timeline:

feed_dict = {model.source: src, model.target: tgt, model.batch_size:src_hdr.shape[0], model.mask:Mask, model.feature:Feature}
denoised_1_bd, Feature = sess.run([model.fake_image, model.feature], feed_dict, options=run_options, run_metadata=run_metadata)
tl = timeline.Timeline(run_metadata.step_stats)
ctf = tl.generate_chrome_trace_format(show_memory=True)
with open(os.path.join(errorlog_dir, 'timeline.json'),'w') as wd:
	wd.write(ctf)

However, I can't find time records for the layers under 'featureEncoder'. There are also two bars captioned unknown, the second of which is strangely long. The times of some Pooling and LeakyRelu ops are also strange, at nearly 2 ms each.

I wonder how I can get the proper time measurement. Thanks.

My Environment
TensorFlow Version: 1.15.0
Operating System: Ubuntu 16.04
Python Version: 3.6.13
CUDA Version: 10.0
CUDNN Version: 7.6.4
GPU Type: RTX 2080ti
Nvidia Driver Version: 460.67

sparse_conv_lib doesn't do sparse_conv2d_custom correctly

import sys
import numpy as np
import tensorflow as tf
sys.path.insert(0, 'sbnet/sbnet_tensorflow/benchmark')
from sparse_conv_lib import (convert_mask_to_indices_custom, sparse_conv2d_custom,
                             calc_block_params, sparse_conv2d, convert_mask_to_indices)

size = [1, 704, 800, 6]
grid = (np.random.rand(*size) > 0.95).astype(np.float32)

block_params = calc_block_params(in_size=size,
                                 bsize=[1, 3, 3, 1],
                                 ksize=[3, 3, 1, 1],
                                 strides=[1, 1, 1, 1],
                                 padding='SAME')
with tf.Session() as sess:

    x = tf.placeholder(tf.float32, size)
    mask = x

    w = tf.constant(np.ones(shape=[3, 3, 6, 64]), dtype=tf.float32)

    indices = convert_mask_to_indices_custom(mask, block_params, tol=0.1)

    y_dense = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
    y_sparse_custom = sparse_conv2d_custom(x, w,
                                           indices,
                                           block_params,
                                           [1, 1, 1, 1], transpose=True)
    # import pdb;pdb.set_trace();
    print("Dense:")
    a = sess.run([y_dense], feed_dict={x: grid})
    print(np.array(a).shape)

    print("Sparse (custom):")
    b = sess.run([y_sparse_custom], feed_dict={x: grid})
    print(np.array(b).shape)

Outputs:
Dense:
(1, 1, 704, 800, 64)
Sparse (custom):
(1, 1, 704, 800, 6)

Running with variable input size

Hi,
Can I use the code with variable input size?
As we can see, the functions below use in_size as the first input parameter:
def calc_block_params(in_size, bsize, ksize, strides, padding, static=True):
def calc_block_params_res_block(in_size, bsize, ksize_list, strides, padding):

I'm using an FCN, so my network can handle variable input sizes, but the functions above require an explicit input size.

Thanks for your help,
Rafael.

Invalid configuration argument reduce mask

I am getting the following error when using a very small block size (1-7); for other block sizes everything works fine.
GPUassert: invalid configuration argument reduce_mask.cu 87

sample.py not working

When I run sample.py, I get the following error:

Traceback (most recent call last):
File "sample.py", line 61, in
indices = sbnet_module.reduce_mask(mask, blockCount, tol=0.5, **inBlockParams)
TypeError: reduce_mask() got an unexpected keyword argument 'bsize'

This gets fixed when I rename bsize, boffset and bstride to dynamic_bsize, dynamic_boffset and dynamic_bstride in the inBlockParams and outBlockParams.

auto testing error

I got the following error when I ran make test. I am using a GP100 and TensorFlow 1.2.

======================================================================
ERROR: test_sparse_resblock_gradients (sparse_res_block_tests.SparseResBlockGradientTests)

Traceback (most recent call last):
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 267, in test_sparse_resblock_gradients
xval, mask, bsize, strides, padding, data_format='NHWC')
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 229, in _test_sparse_resblock_gradients
yval = y.eval()
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 606, in eval
return _eval_using_default_session(self, feed_dict, self.graph, session)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3928, in _eval_using_default_session
return session.run(tensors, feed_dict)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
ResourceExhaustedError: OOM when allocating tensor with shape[1008726972,5,5,4]
[[Node: SparseGather = SparseGather[T=DT_FLOAT, boffset=[-1, -1], bsize=[5, 5], bstride=[3, 3], transpose=false, _device="/job:localhost/replica:0/task:0/gpu:0"](Const, Variable_1/read/_69, Variable/read/_71)]]
[[Node: SparseScatter/_73 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_64_SparseScatter", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

Caused by op u'SparseGather', defined at:
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/h/14/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/main.py", line 12, in
main(module=None)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/main.py", line 95, in init
self.runTests()
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/main.py", line 232, in runTests
self.result = testRunner.run(self.test)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/runner.py", line 151, in run
test(result)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 70, in call
return self.run(*args, **kwds)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 108, in run
test(result)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 70, in call
return self.run(*args, **kwds)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 108, in run
test(result)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 70, in call
return self.run(*args, **kwds)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/suite.py", line 108, in run
test(result)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/case.py", line 393, in call
return self.run(*args, **kwds)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 267, in test_sparse_resblock_gradients
xval, mask, bsize, strides, padding, data_format='NHWC')
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 221, in _test_sparse_resblock_gradients
use_var=False)
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 770, in sparse_res_block_bottleneck
transpose=transpose)
File "", line 39, in sparse_gather
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1008726972,5,5,4]
[[Node: SparseGather = SparseGather[T=DT_FLOAT, boffset=[-1, -1], bsize=[5, 5], bstride=[3, 3], transpose=false, _device="/job:localhost/replica:0/task:0/gpu:0"](Const, Variable_1/read/_69, Variable/read/_71)]]
[[Node: SparseScatter/_73 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_64_SparseScatter", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

======================================================================
FAIL: test_sparse_conv2d_with_mask_same (sparse_conv_tests.SparseConv2DCustomTests)

Traceback (most recent call last):
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_conv_tests.py", line 460, in test_sparse_conv2d_with_mask_same
self._test_sparse_conv2d_custom_with_mask(mask, bsize, ksize, strides, padding, y_exp)
File "/ais/gobi5/linghuan/sbnet/sbnet_tensorflow/benchmark/sparse_conv_tests.py", line 435, in _test_sparse_conv2d_custom_with_mask
np.testing.assert_array_equal(y_act.reshape(y_exp.shape), y_exp)
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 855, in assert_array_equal
verbose=verbose, header='Arrays are not equal')
File "/u/linghuan/anaconda/envs/tf1.2/lib/python2.7/site-packages/numpy/testing/nose_tools/utils.py", line 779, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Arrays are not equal

(mismatch 52.0%)
x: array([[[[1.],
[1.],
[1.],...
y: array([[[[1.],
[6.],
[6.],...

Ran 40 tests in 186.952s

FAILED (failures=1, errors=1)
Makefile:14: recipe for target 'test' failed
make: *** [test] Error 1
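
For scale, the tensor shape in the OOM message above can be converted into a memory footprint. The short calculation below uses only the shape and dtype printed in the error (shape[1008726972,5,5,4], DT_FLOAT); it is a back-of-the-envelope sketch, not part of the SBNet code. The result is several hundred gigabytes, which suggests the block count reaching SparseGather is not a meaningful value, although the report itself does not confirm the cause.

```python
# Shape and dtype copied from the OOM message above (T=DT_FLOAT, 4 bytes each).
shape = (1008726972, 5, 5, 4)
bytes_per_float32 = 4

# Multiply the dimensions together to get the element count.
num_elements = 1
for dim in shape:
    num_elements *= dim

total_bytes = num_elements * bytes_per_float32
print("elements: %d" % num_elements)
print("size: %.1f GB" % (total_bytes / 1e9))
```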

TypeError when running the benchmarks

Both make and make test passed.

However, when I run this command:
cd sbnet_tensorflow/benchmarks && ./run_all_behchmarks.bash

Error:
TypeError: Input 'active_block_indices' of 'SparseGather' Op has type int64 that does not match expected type of int32.
TypeError: Input 'active_block_indices' of 'SparseGather' Op has type int64 that does not match expected type of int32.
File "/home/znjs/sbnet-master/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 542, in sparse_conv2d_custom
transpose=transpose)
File "", line 39, in sparse_gather
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 513, in apply_op
(prefix, dtypes.as_dtype(input_arg.type).name))
TypeError: Input 'active_block_indices' of 'SparseGather' Op has type int64 that does not match expected type of int32.
TypeError: Input 'active_block_indices' of 'SparseGather' Op has type int64 that does not match expected type of int32.

Ubuntu 14.04
GTX 1050Ti
tensorflow 1.2.1
Python 2.7.6
CUDA 8.0
cuDNN 5.1

No speed-up despite sparsity

Hello,

Nice paper! Unfortunately, I am unable to reproduce your speed-ups.

This is what I do:

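# Imports needed by the snippet below (np and tf are used but were not imported).
import numpy as np
import tensorflow as tf

from sparse_conv_lib import convert_mask_to_indices_custom, sparse_conv2d_custom, \
    calc_block_params, sparse_conv2d, convert_mask_to_indices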

size = [1, 1024, 1024, 1]
grid = (np.random.rand(*size) > 0.95).astype(np.float32)

block_params = calc_block_params(in_size=size,
                                 bsize=[1, 3, 3, 1],
                                 ksize=[3, 3, 1, 1],
                                 strides=[1, 1, 1, 1],
                                 padding='SAME')
with tf.Session() as sess:

    x = tf.placeholder(tf.float32, size)
    mask = x
    
    w = tf.constant(np.ones(shape=[3, 3, 1, 64]), dtype=tf.float32)

    indices = convert_mask_to_indices_custom(mask, block_params, tol=0.1)
    
    y_dense = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
    y_sparse_custom = sparse_conv2d_custom(x, w,
                                           indices, 
                                           block_params,
                                           [1, 1, 1, 1], transpose=True)
    
    print("Dense:")
    %timeit -n10 sess.run([y_dense], feed_dict={x: grid})
    
    print("Sparse (custom):")
    %timeit -n10 sess.run([y_sparse_custom], feed_dict={x: grid})

and it seems the sparse version, despite ~95% sparsity, is roughly 9x slower!?

Dense:
46.5 ms ± 19.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Sparse (custom):
400 ms ± 29.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

I am pretty sure there is something I am misunderstanding in your code. Could you clarify?
I tried looking into your benchmark scripts, but I would like to isolate a single layer first.

Thanks!
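
One thing worth checking in a setup like the one above is how sparse the mask is at block granularity rather than pixel granularity. The NumPy sketch below is not from the SBNet authors and does not call the library. It assumes non-overlapping 3x3 tiles and treats a tile as active if any pixel in it is set, which only approximates how the real ops tile the input. The helper name active_block_fraction is made up for illustration.

```python
import numpy as np

def active_block_fraction(mask, block_h, block_w):
    """Fraction of non-overlapping block_h x block_w tiles with any active pixel."""
    h, w = mask.shape
    # Crop so the mask divides evenly into tiles (illustrative simplification).
    h_crop, w_crop = h - h % block_h, w - w % block_w
    tiles = mask[:h_crop, :w_crop].reshape(
        h_crop // block_h, block_h, w_crop // block_w, block_w)
    # A tile counts as active when its maximum value is positive.
    return float((tiles.max(axis=(1, 3)) > 0).mean())

rng = np.random.RandomState(0)
# Same mask construction as in the report: about 5% of pixels are active.
mask = (rng.rand(1024, 1024) > 0.95).astype(np.float32)

pixel_fraction = float(mask.mean())
block_fraction = active_block_fraction(mask, 3, 3)
print("active pixels: %.3f" % pixel_fraction)
print("active 3x3 blocks: %.3f" % block_fraction)
```

With independently random pixels, a 3x3 tile is inactive only if all nine pixels are inactive, so the expected active-block fraction is 1 - 0.95^9, about 0.37. Under these assumptions the mask is far less sparse at block level than the 95% pixel sparsity suggests.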

3D LiDAR Dataset

If you could provide the 3D LiDAR dataset, I would greatly appreciate it.

Usage within C++ inference pipeline

How do I call these ops from a C++ inference pipeline? Do I still need to load the .so using TF_LoadLibrary? Any usage examples would be helpful.

How can I train the model with sbnet_module?

I am trying to train a model on the MNIST dataset using sbnet_module, but I get:

LookupError: No gradient defined for operation 'conv2/SparseScatterVar' (op type: SparseScatterVar)

How can I compute and apply gradients when using sbnet_module?
I don't know how to use @ops.RegisterGradient("SparseGather") and @ops.RegisterGradient("SparseScatter").
Below is my sbnet_module-based conv2d function for training.

import tensorflow as tf
from sparse_conv_lib import calc_block_params, convert_mask_to_indices_custom, sbnet_module

def sparse_conv2d(x, W, hw):
    xsize_ = [batch, hw, hw, 1]

    mask = generate_top_left_mask(xsize_, 0.90)
    block_params = calc_block_params(xsize_,
                                     bsize_,
                                     ksize_,
                                     strides,
                                     padding='VALID')
    ind = convert_mask_to_indices_custom(mask, block_params, 0.0, True)
    x_ = tf.Variable(x)
    p = sbnet_module.sparse_gather(
        x_, 
        ind.bin_counts,
        ind.active_block_indices,
        bsize=block_params.bsize,
        boffset=block_params.boffset,
        bstride=block_params.bstrides,
        transpose=True)
    q = tf.nn.conv2d(p, W, strides, 'VALID', data_format='NCHW', use_cudnn_on_gpu=True)
    y = sbnet_module.sparse_scatter_var(
        q,
        ind.bin_counts,
        ind.active_block_indices,
        x_,
        bsize=block_params.bsize_out,
        boffset=[0, 0],
        bstride=block_params.bstrides,
        add=False,
        transpose=True,
        atomic=False)
    return y

No gradient defined for SparseScatter_Var

I am trying to reproduce the paper's results with the recommended sparse_scatter_var method, but I get an error saying the gradient is not registered:
LookupError: No gradient defined for operation 'network/SparseScatterVar' (op type: SparseScatterVar)
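
Neither of the two reports above includes a resolution. For anyone writing a gradient by hand, the relationship a gather/scatter pair has to satisfy can be checked outside TensorFlow: a block gather is a linear operation, so its backward pass is a scatter-add of the upstream gradient into a zero tensor. The NumPy sketch below illustrates only that relationship. The function names gather_blocks and scatter_add_blocks are hypothetical, and none of this is the library's actual kernel or its registered gradient.

```python
import numpy as np

def gather_blocks(x, corners, bsize):
    """Stack bsize x bsize blocks of a 2-D array x, one per (row, col) corner."""
    return np.stack([x[r:r + bsize, c:c + bsize] for r, c in corners])

def scatter_add_blocks(blocks, corners, shape):
    """Accumulate blocks into a zero array of the given shape (adjoint of gather_blocks)."""
    out = np.zeros(shape, dtype=blocks.dtype)
    bsize = blocks.shape[1]
    for block, (r, c) in zip(blocks, corners):
        out[r:r + bsize, c:c + bsize] += block
    return out

rng = np.random.RandomState(0)
x = rng.randn(8, 8)
corners = [(0, 0), (2, 2), (5, 4)]   # the first two blocks overlap on purpose
g = rng.randn(len(corners), 3, 3)    # upstream gradient w.r.t. the gathered blocks

# Adjoint check: <gather(x), g> must equal <x, scatter_add(g)>.
lhs = np.sum(gather_blocks(x, corners, 3) * g)
rhs = np.sum(x * scatter_add_blocks(g, corners, x.shape))
print(np.isclose(lhs, rhs))
```

Overlapping blocks are why the backward pass has to accumulate rather than overwrite: a pixel read by two blocks receives gradient from both.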

make test fails

======================================================================
ERROR: test_sparse_resblock_gradients (sparse_res_block_tests.SparseResBlockGradientTests)

Traceback (most recent call last):
File "/home/dhingratul/OneDrive/Projects/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 273, in test_sparse_resblock_gradients
xval, mask, bsize, strides, padding, data_format='NHWC', dynamic_size=dynamic_size)
File "/home/dhingratul/OneDrive/Projects/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 211, in _test_sparse_resblock_gradients
py_inds = sess.run([tf_ind])
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1120, in _run
self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 427, in init
self._fetch_mapper = _FetchMapper.for_fetch(fetches)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 245, in for_fetch
return _ListFetchMapper(fetch)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 352, in init
self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 245, in for_fetch
return _ListFetchMapper(fetch)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 352, in init
self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 253, in for_fetch
return _ElementFetchMapper(fetches, contraction_fn)
File "/home/dhingratul/.virtualenvs/sbnet/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 289, in init
'Tensor. (%s)' % (fetch, str(e)))
ValueError: Fetch argument <tf.Tensor 'ReduceMask_1:0' shape= dtype=int16> cannot be interpreted as a Tensor. (Tensor Tensor("ReduceMask_1:0", dtype=int16) is not an element of this graph.)

Ran 40 tests in 389.501s

FAILED (errors=1)
Makefile:14: recipe for target 'test' failed
make: *** [test] Error 1

make test error 1

TensorFlow installed from (source or binary): pip (conda env)
TensorFlow version (use command below): 1.8.0
Python version: 2.7
GCC/Compiler version (if compiling from source):5.4.0
CUDA/cuDNN version:9/7.1.2
GPU model and memory: GTX1060 6G
Exact command to reproduce: make test

cd ../benchmark && bash run_all_unittests.bash # unit tests
sparse_res_block_tests (unittest.loader.ModuleImportFailure) ... ERROR
reduce_mask_tests (unittest.loader.ModuleImportFailure) ... ERROR
sparse_conv_tests (unittest.loader.ModuleImportFailure) ... ERROR
sparse_scatter_tests (unittest.loader.ModuleImportFailure) ... ERROR
sparse_gather_tests (unittest.loader.ModuleImportFailure) ... ERROR
test_calc_out_size (tf_conv_dims_tests.CalcOutSizeDeconvTests) ... ok
test_session (tf_conv_dims_tests.CalcOutSizeDeconvTests)
Returns a TensorFlow Session for use in executing tests. ... ok
test_calc_out_size (tf_conv_dims_tests.CalcOutSizeTests) ... ok
test_session (tf_conv_dims_tests.CalcOutSizeTests)
Returns a TensorFlow Session for use in executing tests. ... ok
test_calc_padding (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_ksize_list (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_strides_list (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_err_strides_tensor (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_stride (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_calc_padding_valid (tf_conv_dims_tests.CalcPaddingTests) ... ok
test_session (tf_conv_dims_tests.CalcPaddingTests)
Returns a TensorFlow Session for use in executing tests. ... ok

======================================================================
ERROR: sparse_res_block_tests (unittest.loader.ModuleImportFailure)

ImportError: Failed to import test module: sparse_res_block_tests
Traceback (most recent call last):
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
import(name)
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_res_block_tests.py", line 32, in
from sparse_conv_lib import _get_offset_array
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 74, in
sbnet_module = tf.load_op_library('../sbnet_ops/libsbnet.so')
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
NotFoundError: ../sbnet_ops/libsbnet.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

======================================================================
ERROR: reduce_mask_tests (unittest.loader.ModuleImportFailure)

ImportError: Failed to import test module: reduce_mask_tests
Traceback (most recent call last):
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
import(name)
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/reduce_mask_tests.py", line 25, in
from sparse_conv_lib import convert_mask_to_indices, convert_mask_to_indices_custom
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 74, in
sbnet_module = tf.load_op_library('../sbnet_ops/libsbnet.so')
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
NotFoundError: ../sbnet_ops/libsbnet.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

======================================================================
ERROR: sparse_conv_tests (unittest.loader.ModuleImportFailure)

ImportError: Failed to import test module: sparse_conv_tests
Traceback (most recent call last):
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
import(name)
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_tests.py", line 29, in
from sparse_conv_lib import _get_offset_array
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 74, in
sbnet_module = tf.load_op_library('../sbnet_ops/libsbnet.so')
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
NotFoundError: ../sbnet_ops/libsbnet.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

======================================================================
ERROR: sparse_scatter_tests (unittest.loader.ModuleImportFailure)

ImportError: Failed to import test module: sparse_scatter_tests
Traceback (most recent call last):
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
import(name)
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_scatter_tests.py", line 26, in
from sparse_conv_lib import sbnet_module
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 74, in
sbnet_module = tf.load_op_library('../sbnet_ops/libsbnet.so')
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
NotFoundError: ../sbnet_ops/libsbnet.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

======================================================================
ERROR: sparse_gather_tests (unittest.loader.ModuleImportFailure)

ImportError: Failed to import test module: sparse_gather_tests
Traceback (most recent call last):
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 254, in _find_tests
module = self._get_module_from_name(name)
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/unittest/loader.py", line 232, in _get_module_from_name
import(name)
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_gather_tests.py", line 26, in
from sparse_conv_lib import convert_mask_to_block_indices, convert_mask_to_indices_custom
File "/home/jnghhk/PycharmProjects/py2/sbnet/sbnet_tensorflow/benchmark/sparse_conv_lib.py", line 74, in
sbnet_module = tf.load_op_library('../sbnet_ops/libsbnet.so')
File "/home/jnghhk/anaconda3/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
NotFoundError: ../sbnet_ops/libsbnet.so: undefined symbol: _ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv

Ran 16 tests in 4.775s

FAILED (errors=5)
Makefile:14: recipe for target 'test' failed
make: *** [test] Error 1
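
The undefined symbol in the traceback above is a mangled C++ name. Decoding it shows which TensorFlow function libsbnet.so expects to find at load time. The sketch below is a deliberately minimal decoder for this one pattern (length-prefixed nested names with an empty parameter list). The helper demangle_simple is written for illustration and is not a general demangler; c++filt handles the full scheme.

```python
import re

def demangle_simple(symbol):
    """Decode an Itanium-ABI nested name of the form _ZN<len><name>...E<params>.

    Handles only plain length-prefixed components and a `v` (empty) parameter
    list; anything more complex raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("not a nested mangled name")
    pos, parts = 3, []
    while pos < len(symbol) and symbol[pos] != "E":
        match = re.match(r"\d+", symbol[pos:])
        if match is None:
            raise ValueError("unsupported component at offset %d" % pos)
        length = int(match.group())
        start = pos + len(match.group())
        parts.append(symbol[start:start + length])
        pos = start + length
    if symbol[pos + 1:] != "v":
        raise ValueError("only empty parameter lists are supported")
    return "::".join(parts) + "()"

# Symbol copied verbatim from the NotFoundError above.
missing = "_ZN10tensorflow8internal21CheckOpMessageBuilder9NewStringEv"
print(demangle_simple(missing))
```

The missing function lives inside TensorFlow itself, not in SBNet. Errors of this kind often mean the op library was compiled against a different TensorFlow build or C++ ABI setting than the one being imported, though the report does not confirm that this is the cause here.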

