Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder

Created by Junsheng Zhou, Xin Wen, Baorui Ma, Yu-Shen Liu, Yue Gao, Yi Fang, Zhizhong Han

[arXiv] [Project Page] [Models]

This repository contains official PyTorch implementation for Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder.

We present a novel self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE). Our key idea is to randomly occlude some local patches of the input point cloud and establish supervision by recovering the occluded patches from the remaining visible ones. Specifically, we design an encoder that learns features of the visible local patches, and a decoder that leverages these features to predict the occluded patches. In contrast to previous methods, 3D-OAE can remove a large proportion of patches and predict them from only a small number of visible ones, which enables us to significantly accelerate training and yields a nontrivial self-supervisory signal. The trained encoder can be further transferred to various downstream tasks. We demonstrate superior performance over state-of-the-art methods in different discriminative and generative applications on widely used benchmarks.
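The sketch below only illustrates this asymmetric design (the encoder operates on visible patch embeddings only; learnable occlusion tokens are appended before a lightweight decoder). All module names, sizes and the feature pooling are illustrative assumptions, not the configuration used in this repo.

import torch
import torch.nn as nn


class TinyOcclusionAutoEncoder(nn.Module):
    """Illustrative asymmetric auto-encoder: encode visible patches only,
    then decode visible features together with learnable occlusion tokens.
    Positional information (e.g. seed coordinates) is omitted for brevity."""

    def __init__(self, dim=256, n_heads=4, depth_enc=4, depth_dec=2, patch_pts=32):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(3, dim), nn.GELU(), nn.Linear(dim, dim))
        enc_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        dec_layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, depth_enc)
        self.decoder = nn.TransformerEncoder(dec_layer, depth_dec)
        self.occ_token = nn.Parameter(torch.zeros(1, 1, dim))   # learnable occlusion token
        self.head = nn.Linear(dim, patch_pts * 3)               # predicts one occluded patch

    def forward(self, vis_patches, n_occ):
        # vis_patches: (B, V, K, 3) visible patches, already centered on their seeds
        B, V, K, _ = vis_patches.shape
        tokens = self.embed(vis_patches).max(dim=2).values       # (B, V, dim) per-patch feature
        latent = self.encoder(tokens)                            # encoder sees visible patches only
        occ = self.occ_token.expand(B, n_occ, -1)                # one token per occluded patch
        dec_out = self.decoder(torch.cat([latent, occ], dim=1))  # (B, V + n_occ, dim)
        pred = self.head(dec_out[:, V:])                         # keep the occluded slots only
        return pred.reshape(B, n_occ, -1, 3)                     # predicted occluded patches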

We first extract seed points from the input point cloud using FPS, and then separate the input into point patches by grouping the local points around each seed point with KNN. After that, we randomly occlude a high ratio of the patches and subtract its corresponding seed point from each visible patch to detach the patch from its spatial location. The encoder operates only on the embeddings of the visible patches, and learnable occlusion tokens are combined with the latent feature before the decoder. Finally, we add the predicted patches back to their corresponding seed points to regain their spatial locations and merge the local patches into a complete shape, on which we compute a loss against the ground truth.
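A minimal sketch of the patch grouping and occlusion step described above, using naive FPS and brute-force kNN in plain PyTorch. The numbers of seeds and neighbors and the occlusion ratio are illustrative; the repository uses its own (CUDA-accelerated) ops.

import torch


def farthest_point_sample(xyz, n_seeds):
    # Naive O(S*N) farthest point sampling; xyz: (B, N, 3) -> indices (B, S).
    B, N, _ = xyz.shape
    device = xyz.device
    idx = torch.zeros(B, n_seeds, dtype=torch.long, device=device)
    dist = torch.full((B, N), float("inf"), device=device)
    farthest = torch.zeros(B, dtype=torch.long, device=device)
    for i in range(n_seeds):
        idx[:, i] = farthest
        centroid = xyz[torch.arange(B, device=device), farthest].unsqueeze(1)  # (B, 1, 3)
        dist = torch.minimum(dist, ((xyz - centroid) ** 2).sum(-1))
        farthest = dist.argmax(-1)
    return idx


def group_and_occlude(xyz, n_seeds=64, k=32, occlusion_ratio=0.75):
    # Split a cloud into local patches and randomly occlude a high ratio of them.
    B, _, _ = xyz.shape
    seed_idx = farthest_point_sample(xyz, n_seeds)                          # (B, S)
    seeds = torch.gather(xyz, 1, seed_idx.unsqueeze(-1).expand(-1, -1, 3))  # (B, S, 3)
    knn_idx = torch.cdist(seeds, xyz).topk(k, largest=False).indices        # (B, S, k)
    patches = torch.gather(
        xyz.unsqueeze(1).expand(-1, n_seeds, -1, -1), 2,
        knn_idx.unsqueeze(-1).expand(-1, -1, -1, 3))                        # (B, S, k, 3)
    patches = patches - seeds.unsqueeze(2)        # detach each patch from its spatial location
    n_vis = int(n_seeds * (1 - occlusion_ratio))  # the encoder only sees these visible patches
    perm = torch.rand(B, n_seeds, device=xyz.device).argsort(-1)
    vis_idx, occ_idx = perm[:, :n_vis], perm[:, n_vis:]
    return patches, seeds, vis_idx, occ_idx


if __name__ == "__main__":
    patches, seeds, vis, occ = group_and_occlude(torch.rand(2, 2048, 3))
    print(patches.shape, seeds.shape, vis.shape, occ.shape)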

Pretrained Models

| Model | Dataset | Task | Performance | Config | URL |
| --- | --- | --- | --- | --- | --- |
| 3D-OAE (SSL) | ShapeNet | Linear-SVM | 92.3 (Acc.) | config | Google Drive |
| Transformer/PoinTr | PCN | Point Cloud Completion | 6.97 (CD) | config | Google Drive |
| Transformer | ModelNet | Classification | 93.4 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 89.16 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 88.64 (Acc.) | config | Google Drive |
| Transformer | ScanObjectNN | Classification | 83.17 (Acc.) | config | Google Drive |
| Transformer | ShapeNetPart | Part Segmentation | 85.7 (Acc.) | config | Google Drive |

Usage

Requirements

  • PyTorch >= 1.7.0
  • python >= 3.7
  • CUDA >= 9.0
  • GCC >= 4.9
  • torchvision
  • timm
  • open3d
  • tensorboardX
pip install -r requirements.txt

Building PyTorch Extensions for Chamfer Distance, PointNet++ and kNN

NOTE: PyTorch >= 1.7 and GCC >= 4.9 are required.

# Chamfer Distance
bash install.sh
# PointNet++
pip install "git+git://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
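The compiled Chamfer Distance extension serves as the reconstruction/completion loss. For a quick sanity check without building the CUDA op, a brute-force squared-L2 Chamfer distance can be written in plain PyTorch as below; the normalization used here (mean over points and batch) is an assumption and may differ from the extension built by install.sh.

import torch


def chamfer_distance(x, y):
    # x: (B, N, 3), y: (B, M, 3) -> scalar squared-L2 Chamfer distance.
    # O(N*M) memory, so this is only practical for small point clouds.
    d = torch.cdist(x, y) ** 2               # (B, N, M) pairwise squared distances
    d_xy = d.min(dim=2).values.mean(dim=1)   # for each x point, its nearest y point
    d_yx = d.min(dim=1).values.mean(dim=1)   # for each y point, its nearest x point
    return (d_xy + d_yx).mean()


if __name__ == "__main__":
    a, b = torch.rand(4, 1024, 3), torch.rand(4, 2048, 3)
    print(chamfer_distance(a, b).item())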

Dataset

We use ShapeNet for the self-supervised learning of 3D-OAE models, and finetune the pre-trained models on ModelNet, ScanObjectNN, PCN and ShapeNetPart.

The details of used datasets can be found in DATASET.md.

Self-supervised learning

For self-supervised learning of 3D-OAE models on ShapeNet, simply run:

bash ./scripts/run_OAE.sh <NUM_GPU> \
    --config cfgs/SSL_models/Point-OAE_2k.yaml \
    --exp_name <name> \
    --val_freq 1

val_freq controls how often the Transformer is evaluated on ModelNet40 with a linear SVM.
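A rough sketch of that evaluation protocol: freeze the pre-trained encoder, extract one global feature per shape, and fit a linear SVM on the ModelNet40 training split. The encoder interface, feature pooling and the C value here are illustrative assumptions, not the repo's exact implementation.

import numpy as np
import torch
from sklearn.svm import LinearSVC


@torch.no_grad()
def extract_features(encoder, loader, device="cuda"):
    # Assumes `encoder` maps a point cloud (B, N, 3) to a global feature (B, feat_dim).
    feats, labels = [], []
    encoder.eval()
    for points, label in loader:
        f = encoder(points.to(device))
        feats.append(f.cpu().numpy())
        labels.append(label.numpy())
    return np.concatenate(feats), np.concatenate(labels)


def linear_svm_accuracy(encoder, train_loader, test_loader):
    x_tr, y_tr = extract_features(encoder, train_loader)
    x_te, y_te = extract_features(encoder, test_loader)
    clf = LinearSVC(C=0.01)                  # C value is illustrative
    clf.fit(x_tr, y_tr)
    return (clf.predict(x_te) == y_te).mean()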

Fine-tuning on downstream tasks

We finetune our 3D-OAE on 6 downstream tasks: linear SVM on ModelNet40, classification on ModelNet40, few-shot learning on ModelNet40, point cloud completion on the PCN dataset, transfer learning on ScanObjectNN, and part segmentation on ShapeNetPart.

ModelNet40

To finetune a pre-trained 3D-OAE model on ModelNet40, simply run:

bash ./scripts/run_OAE.sh <GPU_IDS> \
    --config cfgs/ModelNet_models/Transformer_1k.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Few-shot Learning on ModelNet40

First, prepare the few-shot learning split and dataset (see DATASET.md). Then run the command below; the --way, --shot and --fold flags are explained after it:

bash ./scripts/run_OAE.sh <GPU_IDS> \
    --config cfgs/Fewshot_models/Transformer_1k.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name> \
    --way <int> \
    --shot <int> \
    --fold <int>
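Here --way and --shot describe an N-way, K-shot episode (N classes with K labeled support samples each), and --fold selects one of the pre-generated splits from DATASET.md. The sketch below only illustrates how such an episode is typically sampled; the repo itself uses fixed, pre-generated splits.

import random
from collections import defaultdict


def sample_episode(samples, way, shot, query_per_class=20, seed=0):
    # samples: list of (data, label) pairs. Returns (support, query) lists.
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)
    classes = rng.sample(sorted(by_class), way)            # pick `way` classes
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], shot + query_per_class)
        support += picked[:shot]                           # `shot` labeled examples per class
        query += picked[shot:]                             # held-out query samples
    return support, query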

ScanObjectNN

To finetune a pre-trained 3D-OAE model on ScanObjectNN, simply run:

bash ./scripts/run_OAE.sh <GPU_IDS>  \
    --config cfgs/ScanObjectNN_models/Transformer_hardest.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Point Cloud Completion

To finetune a pre-trained 3D-OAE model on PCN, simply run:

bash ./scripts/run_OAE_pcn.sh <GPU_IDS>  \
    --config cfgs/PCN_models/Transformer_pcn.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Part Segmentation

To finetune a pre-trained 3D-OAE model on ShapeNetPart, simply run:

bash ./scripts/run_OAE_seg.sh <GPU_IDS>  \
    --config cfgs/ShapeNetPart_models/Transformer_seg.yaml \
    --finetune_model \
    --ckpts <path> \
    --exp_name <name>

Visualization

Point cloud self-reconstruction results using our 3D-OAE model trained on ShapeNet.

Point cloud completion results using our 3D-OAE model trained on the PCN dataset.

License

MIT License

Acknowledgements

Some of the code in this repo is borrowed from Point-BERT. We thank the authors for their great work!

Citation

If you find our work useful in your research, please consider citing:

@article{zhou2022-3DOAE,
      title={Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder},
      author={Zhou, Junsheng and Wen, Xin and Ma, Baorui and Liu, Yu-Shen and Gao, Yue and Fang, Yi and Han, Zhizhong},
      journal={arXiv preprint arXiv:2203.14084},
      year={2022}
}
