Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder

Created by Junsheng Zhou, Xin Wen, Baorui Ma, Yu-Shen Liu, Yue Gao, Yi Fang, Zhizhong Han

[arXiv] [Project Page] [Models]

This repository contains official PyTorch implementation for Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder.

We present a novel self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE). Our key idea is to randomly occlude some local patches of the input point cloud and establish supervision by recovering the occluded patches from the remaining visible ones. Specifically, we design an encoder that learns features of the visible local patches, and a decoder that leverages these features to predict the occluded patches. In contrast to previous methods, 3D-OAE can remove a large proportion of patches and predict them from only a small number of visible ones, which enables us to significantly accelerate training and yields a nontrivial self-supervisory signal. The trained encoder can be further transferred to various downstream tasks. We demonstrate superior performance over state-of-the-art methods in different discriminative and generative applications on widely used benchmarks.
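The sketch below only illustrates this asymmetric design (the encoder operates on visible patch embeddings only; learnable occlusion tokens are appended before a lightweight decoder). All module names, sizes and the feature pooling are illustrative assumptions, not the configuration used in this repo.

import torch
import torch.nn as nn


class TinyOcclusionAutoEncoder(nn.Module):
    """Illustrative asymmetric auto-encoder: encode visible patches only,
    then decode visible features together with learnable occlusion tokens.
    Positional information (e.g. seed coordinates) is omitted for brevity."""

    def __init__(self, dim=256, n_heads=4, depth_enc=4, depth_dec=2, patch_pts=32):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(3, dim), nn.GELU(), nn.Linear(dim, dim))
        enc_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        dec_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, depth_enc)
        self.decoder = nn.TransformerEncoder(dec_layer, depth_dec)
        self.occ_token = nn.Parameter(torch.zeros(1, 1, dim))   # learnable occlusion token
        self.head = nn.Linear(dim, patch_pts * 3)               # predicts one occluded patch

    def forward(self, vis_patches, n_occ):
        # vis_patches: (B, V, K, 3) visible patches, already centered on their seeds
        B, V, K, _ = vis_patches.shape
        tokens = self.embed(vis_patches).max(dim=2).values       # (B, V, dim) per-patch feature
        latent = self.encoder(tokens)                            # encoder sees visible patches only
        occ = self.occ_token.expand(B, n_occ, -1)                # one token per occluded patch
        dec_out = self.decoder(torch.cat([latent, occ], dim=1))  # (B, V + n_occ, dim)
        pred = self.head(dec_out[:, V:])                         # keep the occluded slots only
        return pred.reshape(B, n_occ, -1, 3)                     # predicted occluded patches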

We first extract seed points from the input point cloud using FPS, and then separate the input into point patches by grouping the local points around each seed point with KNN. After that, we randomly occlude a high ratio of the patches and subtract its corresponding seed point from each visible patch to detach the patch from its spatial location. The encoder operates only on the embeddings of the visible patches, and learnable occlusion tokens are combined with the latent feature before the decoder. Finally, we add the predicted patches back to their corresponding seed points to regain their spatial locations and merge the local patches into a complete shape, on which we compute a loss against the ground truth.
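A minimal sketch of the patch grouping and occlusion step described above, using naive FPS and brute-force kNN in plain PyTorch. The numbers of seeds and neighbors and the occlusion ratio are illustrative; the repository uses its own (CUDA-accelerated) ops.

import torch


def farthest_point_sample(xyz, n_seeds):
    # Naive O(S*N) farthest point sampling; xyz: (B, N, 3) -> indices (B, S).
    B, N, _ = xyz.shape
    device = xyz.device
    idx = torch.zeros(B, n_seeds, dtype=torch.long, device=device)
    dist = torch.full((B, N), float("inf"), device=device)
    farthest = torch.zeros(B, dtype=torch.long, device=device)
    for i in range(n_seeds):
        idx[:, i] = farthest
        centroid = xyz[torch.arange(B, device=device), farthest].unsqueeze(1)  # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)
    return idx


def group_and_occlude(xyz, n_seeds=64, k=32, occlusion_ratio=0.75):
    # Split a cloud into local patches and randomly occlude a high ratio of them.
    B, _, _ = xyz.shape
    seed_idx = farthest_point_sample(xyz, n_seeds)                          # (B, S)
    seeds = torch.gather(xyz, 1, seed_idx.unsqueeze(-1).expand(-1, -1, 3))  # (B, S, 3)
    knn_idx = torch.cdist(seeds, xyz).topk(k, largest=False).indices        # (B, S, k)
    patches = torch.gather(
        xyz.unsqueeze(1).expand(-1, n_seeds, -1, -1), 2,
        knn_idx.unsqueeze(-1).expand(-1, -1, -1, 3))                        # (B, S, k, 3)
    patches = patches - seeds.unsqueeze(2)        # detach each patch from its spatial location
    n_vis = int(n_seeds * (1 - occlusion_ratio))  # the encoder only sees these visible patches
    perm = torch.rand(B, n_seeds, device=xyz.device).argsort(-1)
    vis_idx, occ_idx = perm[:, :n_vis], perm[:, n_vis:]
    return patches, seeds, vis_idx, occ_idx


if __name__ == "__main__":
    patches, seeds, vis, occ = group_and_occlude(torch.rand(2, 2048, 3))
    print(patches.shape, seeds.shape, vis.shape, occ.shape)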

Pretrained Models

| Model | Dataset | Task | Performance | Config | URL |
| --- | --- | --- | --- | --- | --- |
| 3D-OAE (SSL) | ShapeNet | Linear-SVM | 92.3 (Acc.) | config | Google Drive |
| Transformer/PoinTr | PCN | Point Cloud Completion | 6.97 (CD) | config | Google Drive |
| Transformer | ModelNet | Classification | 93.4 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 89.16 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 88.64 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 83.17 (Acc.) | config | Google Drive |
| Transformer | ShapeNetPart | Part Segmentation | 85.7 (Acc.) | config | Google Drive |

Usage

Requirements

  • PyTorch >= 1.7.0
  • python >= 3.7
  • CUDA >= 9.0
  • GCC >= 4.9
  • torchvision
  • timm
  • open3d
  • tensorboardX
pip install -r requirements.txt

Building PyTorch Extensions for Chamfer Distance, PointNet++ and kNN

NOTE: PyTorch >= 1.7 and GCC >= 4.9 are required.

# Chamfer Distance
bash install.sh
# PointNet++
pip install "git+git://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
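The compiled Chamfer Distance extension serves as the reconstruction/completion loss. For a quick sanity check without building the CUDA op, a brute-force squared-L2 Chamfer distance can be written in plain PyTorch as below; the normalization used here (mean over points and batch) is an assumption and may differ from the extension built by install.sh.

import torch


def chamfer_distance(x, y):
    # x: (B, N, 3), y: (B, M, 3) -> scalar squared-L2 Chamfer distance.
    # O(N*M) memory, so this is only practical for small point clouds.
    d = torch.cdist(x, y) ** 2               # (B, N, M) pairwise squared distances
    d_xy = d.min(dim=2).values.mean(dim=1)   # for each x point, its nearest y point
    d_yx = d.min(dim=1).values.mean(dim=1)   # for each y point, its nearest x point
    return (d_xy + d_yx).mean()


if __name__ == "__main__":
    a, b = torch.rand(4, 1024, 3), torch.rand(4, 2048, 3)
    print(chamfer_distance(a, b).item())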

Dataset

We use ShapeNet for the self-supervised learning of 3D-OAE models, and finetune the pre-trained models on ModelNet, ScanObjectNN, PCN and ShapeNetPart.

The details of used datasets can be found in DATASET.md.

Self-supervised learning

For self-supervised learning of 3D-OAE models on ShapeNet, simply run:

bash ./scripts/run_OAE.sh <NUM_GPU> \
    --config cfgs/SSL_models/Point-OAE_2k.yaml \
    --exp_name <name> \
    --val_freq 1

val_freq controls how often the Transformer is evaluated on ModelNet40 with a linear SVM.
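A rough sketch of that evaluation protocol: freeze the pre-trained encoder, extract one global feature per shape, and fit a linear SVM on the ModelNet40 training split. The encoder interface, feature pooling and the C value here are illustrative assumptions, not the repo's exact implementation.

import numpy as np
import torch
from sklearn.svm import LinearSVC


@torch.no_grad()
def extract_features(encoder, loader, device="cuda"):
    # Assumes `encoder` maps a point cloud (B, N, 3) to a global feature (B, feat_dim).
    feats, labels = [], []
    encoder.eval()
    for points, label in loader:
        f = encoder(points.to(device))
        feats.append(f.cpu().numpy())
        labels.append(label.numpy())
    return np.concatenate(feats), np.concatenate(labels)


def linear_svm_accuracy(encoder, train_loader, test_loader):
    x_tr, y_tr = extract_features(encoder, train_loader)
    x_te, y_te = extract_features(encoder, test_loader)
    clf = LinearSVC(C=0.01)                  # C value is illustrative
    clf.fit(x_tr, y_tr)
    return (clf.predict(x_te) == y_te).mean()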

Fine-tuning on downstream tasks

We finetune our 3D-OAE on 6 downstream tasks: linear SVM on ModelNet40, classification on ModelNet40, few-shot learning on ModelNet40, point cloud completion on the PCN dataset, transfer learning on ScanObjectNN, and part segmentation on ShapeNetPart.

ModelNet40

To finetune a pre-trained 3D-OAE model on ModelNet40, simply run:

bash ./scripts/run_OAE.sh <GPU_IDS> \
    --config cfgs/ModelNet_models/Transformer_1k.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Few-shot Learning on ModelNet40

First, prepare the few-shot learning split and dataset (see DATASET.md). Then run the command below; the --way, --shot and --fold flags are explained after it:

bash ./scripts/run_OAE.sh <GPU_IDS> \
    --config cfgs/Fewshot_models/Transformer_1k.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name> \
    --way <int> \
    --shot <int> \
    --fold <int>
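Here --way and --shot describe an N-way, K-shot episode (N classes with K labeled support samples each), and --fold selects one of the pre-generated splits from DATASET.md. The sketch below only illustrates how such an episode is typically sampled; the repo itself uses fixed, pre-generated splits.

import random
from collections import defaultdict


def sample_episode(samples, way, shot, query_per_class=20, seed=0):
    # samples: list of (data, label) pairs. Returns (support, query) lists.
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)
    classes = rng.sample(sorted(by_class), way)            # pick `way` classes
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], shot + query_per_class)
        support += picked[:shot]                           # `shot` labeled examples per class
        query += picked[shot:]                             # held-out query samples
    return support, query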

ScanObjectNN

To finetune a pre-trained 3D-OAE model on ScanObjectNN, simply run:

bash ./scripts/run_OAE.sh <GPU_IDS>  \
    --config cfgs/ScanObjectNN_models/Transformer_hardest.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Point Cloud Completion

To finetune a pre-trained 3D-OAE model on PCN, simply run:

bash ./scripts/run_OAE_pcn.sh <GPU_IDS>  \
    --config cfgs/PCN_models/Transformer_pcn.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Part Segmentation

To finetune a pre-trained 3D-OAE model on ShapeNetPart, simply run:

bash ./scripts/run_OAE_seg.sh <GPU_IDS>  \
    --config cfgs/ShapeNetPart_models/Transformer_seg.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Visualization

Point cloud self-reconstruction results using our 3D-OAE model trained on ShapeNet.

Point cloud completion results using our 3D-OAE model trained on the PCN dataset.

License

MIT License

Acknowledgements

Some of the code in this repo is borrowed from Point-BERT. We thank the authors for their great work!

Citation

If you find our work useful in your research, please consider citing:

@article{zhou2022-3DOAE,
      title={Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder},
      author={Zhou, Junsheng and Wen, Xin and Ma, Baorui and Liu, Yu-Shen and Gao, Yue and Fang, Yi and Han, Zhizhong},
      journal={arXiv preprint arXiv:2203.14084},
      year={2022}
}
