postech-cvlab / fastpointtransformer Goto Github PK

View Code? Open in Web Editor NEW

264.0 6.0 41.0 579 KB

Official source code of Fast Point Transformer, CVPR 2022

License: MIT License

Python 88.61% Shell 2.01% C++ 4.71% Cuda 4.67%

computer-vision 3d-vision transformer point-cloud cvpr2022

fastpointtransformer's Introduction

Fast Point Transformer

Project Page | Paper

This repository contains the official source code and data for our paper:

Fast Point Transformer
Chunghyun Park, Yoonwoo Jeong, Minsu Cho, and Jaesik Park
POSTECH GSAI & CSE
CVPR, New Orleans, 2022.

Overview

This work introduces Fast Point Transformer that consists of a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and the voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated with 3D semantic segmentation and 3D detection. The accuracy of our approach is competitive to the best voxel based method, and our network achieves 129 times faster inference time than the state-of-the-art, Point Transformer, with a reasonable accuracy trade-off in 3D semantic segmentation on S3DIS dataset.

Citation

If you find our code or paper useful, please consider citing our paper:

@inproceedings{park2022fast,
 title={Fast Point Transformer},
 author={Park, Chunghyun and Jeong, Yoonwoo and Cho, Minsu and Park, Jaesik},
 booktitle={Proceedings of the {IEEE/CVF} Conference on Computer Vision and Pattern Recognition (CVPR)},
 month={June},
 year={2022},
 pages={16949-16958}
}

Experiments

1. S3DIS Area 5 test

We denote MinkowskiNet42 trained with this repository as MinkowskiNet42^†. We use voxel size 4cm for both MinkowskiNet42^† and our Fast Point Transformer.

Model	Latency (sec)	mAcc (%)	mIoU (%)	Reference
PointTransformer	18.07	76.5	70.4	Codes from the authors
MinkowskiNet42^†	0.08	74.1	67.2	Checkpoint
+ rotation average	0.66	75.1	69.0	-
FastPointTransformer	0.14	76.6	69.2	Checkpoint
+ rotation average	1.13	77.6	71.0	-

2. ScanNetV2 validation

Model	Voxel Size	mAcc (%)	mIoU (%)	Reference
MinkowskiNet42	2cm	80.4	72.2	Official GitHub
MinkowskiNet42^†	2cm	81.4	72.1	Checkpoint
FastPointTransformer	2cm	81.2	72.5	Checkpoint
MinkowskiNet42^†	5cm	76.3	67.0	Checkpoint
FastPointTransformer	5cm	78.9	70.0	Checkpoint
MinkowskiNet42^†	10cm	70.8	60.7	Checkpoint
FastPointTransformer	10cm	76.1	66.5	Checkpoint

Installation

This repository is developed and tested on

Ubuntu 18.04 and 20.04
Conda 4.11.0
CUDA 11.1 and 11.3
Python 3.8.13
PyTorch 1.7.1, 1.10.0, and 1.12.1
MinkowskiEngine 0.5.4

Environment Setup

You can install the environment by using the provided shell script:

~$ git clone --recursive [email protected]:POSTECH-CVLab/FastPointTransformer.git
~$ cd FastPointTransformer
~/FastPointTransformer$ bash setup.sh fpt
~/FastPointTransformer$ conda activate fpt

Training & Evaluation

First of all, you need to download the datasets (ScanNetV2 and S3DIS), and preprocess them as:

(fpt) ~/FastPointTransformer$ python src/data/preprocess_scannet.py # you need to modify the data path
(fpt) ~/FastPointTransformer$ python src/data/preprocess_s3dis.py # you need to modify the data path

And then, locate the provided meta data of each dataset (src/data/meta_data) with the preprocessed dataset following the structure below:

${data_dir}
├── scannetv2
│   ├── meta_data
│   │   ├── scannetv2_train.txt
│   │   ├── scannetv2_val.txt
│   │   └── ...
│   └── scannet_processed
│       ├── train
│       │   ├── scene0000_00.ply
│       │   ├── scene0000_01.ply
│       │   └── ...
│       └── test
└── s3dis
    ├── meta_data
    │   ├── area1.txt
    │   ├── area2.txt
    │   └── ...
    └── s3dis_processed
        ├── Area_1
        │   ├── conferenceRoom_1.ply
        │   ├── conferenceRoom_2.ply
        │   └── ...
        ├── Area_2
        └── ...

After then, you can train and evalaute a model by using the provided python scripts (train.py and eval.py) with configuration files in the config directory. For example, you can train and evaluate Fast Point Transformer with voxel size 4cm on S3DIS dataset via the following commands:

(fpt) ~/FastPointTransformer$ python train.py config/s3dis/train_fpt.gin
(fpt) ~/FastPointTransformer$ python eval.py config/s3dis/eval_fpt.gin {checkpoint_file} # use -r option for rotation averaging.

Consistency Score

You need to generate predictions via the following command:

(fpt) ~/FastPointTransformer$ python -m src.cscore.prepare {checkpoint_file} -m {model_name} -v {voxel_size} # This takes hours.

Then, you can calculate the consistency score (CScore) with:

(fpt) ~/FastPointTransformer$ python -m src.cscore.calculate {prediction_dir} # This takes seconds.

3D Object Detection using VoteNet

Please refer this repository.

Acknowledgement

Our code is based on the MinkowskiEngine. We also thank Hengshuang Zhao for providing the code of Point Transformer. If you use our model, please consider citing them as well.

fastpointtransformer's People

Contributors

Stargazers

Watchers

fastpointtransformer's Issues

How to get the access to wandb?

An error is reported when training the dataset, saying "Error while calling W&B API: permission denied (<Response [403]>)". I checked the documentation of wandb and it says I don't have the project permissions. I want to know how to get the permission, thanks for your help!

import cuda_sparse_ops

Thanks to the author for such quality code. I have some questions for the author,
in the sparse_ops.py file.
import cuda_sparse_ops
keeps reporting errors.
How to solve it? ? ?
And how to run setup.py in src.cuda_ops？？？？？？

How to get access to wandb?

It occurs when I start to train on S3DIS datset via the command "python train.py config/s3dis/train_fpt.gin"

I tried installing the module using pip but it said "No matching distribution found for cuda_sparse_ops", and I didn't find any solution on the Internet. Is it because there was something wrong with my installation?

Questions about model runs memory and FLOPs

Hello, I would like to ask you how much memory and FLOPs your model runs on S3DIS, I would like to quote your paper, but there is something wrong with my computer, so I thought I would ask you directly!

[Question] hardware used and training time

The paper mentions an inference time extremely reduced compared to the original PointTransformer, but I was also curious about the time it took to train the model. What GPUs did you use and how much time did it take to train compared to PointTransformer ?

Thx a lot

Why δ(vi-vj) is O(KD)?

Hi!
When I read your paper at reducing space complexity section, it says that the space complexity of δrel(vi-vj) is O(KD),I can't understand it.
I think there are K neighbors for each voxel,why not O(IKD)? I hope you can help me ,Thanks!

The stride parameter in LSA layer

Hello, I 'd like to know about whether in the implementation, the stride shoud be kept as 1. What other things I may need to do if I want
to expand it to an arbitrary number.

Could it utilize the inter-frame information?

Hi,Thanks for your great work first,but I have some questions about the attention block.
For example, I set batch size = 2, how can i find the query voxels around both two frames.

about pytorch_lightning version

What version of pytorch_lightning are you using? Thank you

Questions about fast point transformer

Thanks for your amazing work, and I have few questions about the implemention of this architecture. Thank you for any answers.

In the code of LightweightSelfAttentionLayer, why the inter postion embeding is initialized as a learnable random variable ?

self.inter_pos_enc = nn.Parameter(torch.FloatTensor(self.kernel_volume, self.num_heads, self.attn_channels))
nn.init.normal_(self.inter_pos_enc, 0, 1)

According to Fig 3 in the paper, shouldn't it be obtained from the coordinate difference between the current voxel and neighboring voxels ?

How many specific neighboring voxels are indexed in LightweightSelfAttentionLayer ？Is the number of neighboring voxels determined by kernel_size in the input parameter? Is the neighboring voxels the valid voxels contained in the kernel ?

how to download the dataset?

Before calling the script preprocess_s3dis.py,I need to download the data manually, right? Or automatic download in the script preprocess_s3dis.py? If I need to download it automatically,How to download the dataset?

DDP/DP training - multigpu

Hi @chrockey, great work!

Can you guide me on how to set up multigpu training? I have only 20GB gpus available, and when using batch size of 2 I obtain poor performance (~6% lower mIoU and mAcc; probably due to the batch norm and batch size).

If I add multigpu support (DDP) according to the example from the ME repository the learning is blocked, i.e. it never starts.

Any help will be appreciated. You commented "multi-GPU training is currently not supported" in the code. Have you had similar issues as I mentioned?

Thanks!

Code for 3d object detection

Thanks for the great work!
I would like to learn if the code for 3d object detection is released?

Training Log

Hellow! Thank you for your awesome work!
I find that it takes 20 hours on A100 to train your model. Could I have a look at your training log on S3DIS?

questions

Thanks for the great work! I have some questions.

I wonder where the GPU memory budget cost in FastTrans. Because I only have GTX1080, 12G. Do you test the model for small-scale task like ShapeNet DataSet? And can I apply the lightweight self-attention to get a better feature embedding for ShapeNet classification in my 1080 machine? Maybe I can use a smaller FastTrans?
I notice that the voxel size is used for data augmention. If i use the model for ShapeNet classification which normalized coordinates lie between -1 and 1. Can I remove the data augmention and just use original coordinates? Or how to get a feasible voxel size ?

RuntimeError

result, COO, vals = MEB.coo_spmm_average_int32(
RuntimeError: at /FastPointTransformer/thirdparty/MinkowskiEngine/src/gpu.cu:100
I encountered this issue during runtime.I don't know how to solve it. Can you help me take a look

About the environment

Could you share some environmental information? I can successfully install pytorch according to setup.py, but there is a conflict when I install openblas-devel.