PTT: PointTrackTransformer

Overview

Introduction

This is the official code release of "PTT: Point-Track-Transformer Module for 3D Single Object Tracking in Point Clouds" (accepted as a contributed paper at IROS 2021). 🌟 🌟 🌟

conference paper | video(youtube) | video(bilibili)

This work targets the point-based 3D SOT (Single Object Tracking) task and addresses several challenges caused by the natural sparsity of point clouds, such as error accumulation, sparsity sensitivity, and feature ambiguity.

To this end, we propose PTT, a framework that combines a transformer with a tracking pipeline. The main pipeline of PTT is shown below. Experiments show that the tracker achieves robust tracking in sparse point cloud scenes (fewer than 50 foreground points) by using the Transformer's self-attention to re-weight sparse features.

main-pipeline
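
To make the idea of re-weighting sparse features with self-attention concrete, here is a minimal sketch in plain Python. It is not the PTT module itself: it assumes the point features serve directly as queries, keys, and values, with no learned projections and no position encoding, and the names `softmax` and `self_attention` are illustrative only.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Re-weight each point feature by its similarity to all points.

    features: list of N feature vectors (lists of floats), one per point.
    Assumption: queries, keys and values are the raw features themselves.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    output = []
    for query in features:
        # Scaled dot-product score between this point and every point.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        # Weighted sum of the value vectors gives the re-weighted feature.
        output.append([
            sum(w * value[d] for w, value in zip(weights, features))
            for d in range(dim)
        ])
    return output

# Three toy points with 2-D features; the first two are similar.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended = self_attention(points)
```

Each output vector is a convex combination of the input features, so points with similar features pull each other's representation closer, which is the intuition behind using attention on sparse point sets.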

Performance

Here we report the latest performance of PTT. To make the open-source release cleaner, we refactored the code and tuned some parameters relative to the version used in the paper. We also unified the environment and parameter settings for the final version, so the model performance differs slightly from the paper. The results after refactoring are as follows:

KITTI dataset

             Car    Ped    Cyclist  Van
Success      69.0   47.7   41.0     55.3
Precision    82.1   72.2   49.4     64.0

nuScenes dataset

             Car    Truck  Bus    Trailer
Success      40.2   46.5   39.4   51.7
Precision    45.8   46.7   36.7   46.5

For nuScenes, we follow the settings of BAT to retrain and test our model. All of these results were obtained by training with a batch size of 48 on a single Nvidia RTX 3090, whereas the results in the extended journal paper were obtained by training on 8 x 2080Ti GPUs.

Setup

installation

  1. install system dependencies

    apt update && apt-get install git libgl1 -y
    
  2. create conda env and install python 3.8

    conda create -n ptt python=3.8 -y
    conda activate ptt
    git clone https://github.com/shanjiayao/PTT
    cd PTT/

  3. install torch

    pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

    Note that we tested our code with different CUDA versions and found that performance varies with the CUDA version. Please use CUDA 11.0 or later and install torch with the command above.

  4. install others

    pip install -r requirements.txt
    conda install protobuf -y

  5. [optional] install visualization tools

    pip install vtk==9.0.1
    pip install mayavi==4.7.4 pyqt5==5.15.6

  6. set up the ptt package

    python setup.py develop   # run this from the repository root
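
Since step 3 asks for CUDA 11.0 or later, a quick check after installation can catch a mismatched build early. The sketch below is an assumption-laden helper, not part of the repository: `parse_version` and `cuda_version_ok` are hypothetical names, and the only torch attributes used are `torch.__version__` and `torch.version.cuda`.

```python
def parse_version(text):
    """Turn a version string such as '11.0' into a tuple of ints: (11, 0)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)

def cuda_version_ok(cuda_version, minimum="11.0"):
    """Return True when cuda_version is at least the given minimum."""
    if cuda_version is None:  # CPU-only torch builds report no CUDA version
        return False
    return parse_version(cuda_version) >= parse_version(minimum)

try:
    import torch  # only importable after step 3 has been completed
    print("torch", torch.__version__, "CUDA", torch.version.cuda,
          "ok:", cuda_version_ok(torch.version.cuda))
except ImportError:
    print("torch is not installed in this environment")
```

This only compares the CUDA version that the installed torch wheel was built against; it does not verify the driver or GPU.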

dataset configuration

  1. KITTI

    Download the dataset from KITTI Tracking and organize the downloaded files as follows:

    PTT
    ├── data
    │   ├── kitti
    │   │   └── training
    │   │       ├── calib
    │   │       ├── label_02
    │   │       └── velodyne
    
  2. nuScenes

    Download the dataset from nuScenes and organize the downloaded files as follows:

    PTT
    ├── data
    │   └── nuScenes
    │       ├── maps
    │       ├── samples
    │       ├── sweeps
    │       └── v1.0-trainval
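
Before training, it can help to confirm that the folders are where the layouts above expect them. The following is a small sketch using only the standard library; the helper name `missing_dirs` and the `EXPECTED` table are illustrative and simply mirror the two directory trees listed above, and the script assumes it is run from the repository root so that `data` resolves correctly.

```python
from pathlib import Path

# Expected sub-directories, copied from the layouts listed above.
EXPECTED = {
    "kitti": ["training/calib", "training/label_02", "training/velodyne"],
    "nuScenes": ["maps", "samples", "sweeps", "v1.0-trainval"],
}

def missing_dirs(data_root, dataset):
    """Return expected sub-directories absent under data_root/dataset."""
    base = Path(data_root) / dataset
    return [rel for rel in EXPECTED[dataset] if not (base / rel).is_dir()]

if __name__ == "__main__":
    # Assumption: the current working directory is the PTT repository root.
    for name in EXPECTED:
        gaps = missing_dirs("data", name)
        print(name, "OK" if not gaps else f"missing: {gaps}")
```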

QuickStart

configs

The model configs for the different datasets are located in tools/cfgs. Please refer to ptt.yaml for details on the available model settings.

pretrained models

We provide pretrained models for both the KITTI and nuScenes datasets. You can download these models from Google Drive, then organize the downloaded files as follows:

PTT
├── output
│   ├── kitti_models
│   └── nuscenes_models

train

You can customize training by modifying the parameters in the yaml file of the corresponding model, such as 'CLASS_NAMES', 'OPTIMIZATION', 'TRAIN' and 'TEST'.

After configuring the yaml file, run the following command, passing the path of the config file and a tag for the training run.

cd PTT/tools
# python train_tracking.py --cfg_file cfgs/kitti_models/ptt.yaml --extra_tag car
python train_tracking.py --cfg_file $model_config_path --extra_tag $your_train_tag

By default, we use a single Nvidia RTX 3090 for training.
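
If you train several categories in a row, composing the command in Python avoids retyping it. This is only a sketch: `train_command` is a hypothetical helper, the tags are examples, and it assumes 'CLASS_NAMES' in the yaml file has already been edited to match each tag. Only the `--cfg_file` and `--extra_tag` flags shown above are used.

```python
import shlex

def train_command(cfg_file, extra_tag):
    """Build the argv list for the single-GPU training command shown above."""
    return ["python", "train_tracking.py",
            "--cfg_file", cfg_file,
            "--extra_tag", extra_tag]

# Hypothetical tags; adjust them to the categories you actually train.
cfg = "cfgs/kitti_models/ptt.yaml"
for tag in ["car", "pedestrian"]:
    print(shlex.join(train_command(cfg, tag)))
```

To actually launch a run, the list can be passed to `subprocess.run(cmd, cwd="PTT/tools", check=True)`.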

For training with DDP, run the following command (make sure you are in the root directory):

# bash scripts/train_ddp.sh 2 --cfg_file cfgs/kitti_models/ptt.yaml --extra_tag car
bash scripts/train_ddp.sh $NUM_GPUs --cfg_file $model_config_path --extra_tag $your_train_tag

eval

As with training, first configure parameters such as 'CLASS_NAMES' in the yaml file, then run the following commands to test a single checkpoint.

cd PTT/tools
# python test_tracking.py --cfg_file cfgs/kitti_models/ptt.yaml --extra_tag car --ckpt ../output/kitti_models/ptt/car/ckpt/best_model.pth
python test_tracking.py --cfg_file $model_config_path --extra_tag $your_train_tag --ckpt $your_saved_ckpt

If you need to test all models, change the default value of 'eval_all' before running the command above.

After evaluation, the results are saved to the same path as the model, such as 'output/kitti_models/ptt/car/'.
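
The example paths above suggest that the output folder follows the pattern output/<config folder>/<config name>/<tag>. The sketch below encodes that inferred pattern; it is a guess based on the examples in this README rather than on the repository's code, and `guess_output_dir` is a hypothetical helper.

```python
from pathlib import Path

def guess_output_dir(cfg_file, extra_tag, output_root="output"):
    """Guess the results folder from the config path and the training tag.

    'cfgs/kitti_models/ptt.yaml' with tag 'car' -> 'output/kitti_models/ptt/car'
    """
    cfg = Path(cfg_file)
    return Path(output_root) / cfg.parent.name / cfg.stem / extra_tag

out_dir = guess_output_dir("cfgs/kitti_models/ptt.yaml", "car")
print(out_dir.as_posix())
print((out_dir / "ckpt" / "best_model.pth").as_posix())
```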

Acknowledgment

Citation

If you find this project useful for your research, please consider citing:

@INPROCEEDINGS{ptt,
  author={Shan, Jiayao and Zhou, Sifan and Fang, Zheng and Cui, Yubo},
  booktitle={2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)}, 
  title={PTT: Point-Track-Transformer Module for 3D Single Object Tracking in Point Clouds}, 
  year={2021},
  volume={},
  number={},
  pages={1310-1316},
  doi={10.1109/IROS51168.2021.9636821}}
@ARTICLE{ptt-journal,
  author={Shan, Jiayao and Zhou, Sifan and Cui, Yubo and Fang, Zheng},
  journal={IEEE Transactions on Multimedia}, 
  title={Real-time 3D Single Object Tracking with Transformer}, 
  year={2022},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TMM.2022.3146714}}

ptt's Issues

Paper and Code.

Hi, this is really nice work, but I wonder where I can get the related paper and code?

Nuscenes dataset

Hello author, thank you for your excellent work. I would like to know how to download the nuScenes dataset. Do I need to download the entire Trainval dataset, or is the KeyFrame blobs part enough?

ubuntu qt.qpa.xcb: could not connect to display

I got an error when connecting to an Ubuntu machine for training using FinalShell on my Mac:

ubuntu qt.qpa.xcb: could not connect to display

Can you tell me how to solve this problem? I was able to reproduce the P2B code successfully.

Questions about PTT module

Hi, I'm trying to reproduce your work, but I can't achieve the same results as you. I have several questions:
1. Could you please tell me the difference between the PTT module in your work and the transformer block in the work "Point transformer"? When I tried to reproduce your work, I used the transformer block from "Point transformer" directly.
2. How do you set the parameter "k" when you use K-nearest-neighbor (KNN) to get the position encoding?

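How do I train on my own dataset?

Hello, I currently have a PCD dataset of maritime vessels collected by a fixed-position LiDAR. If I want to run object detection on the vessel point clouds, how should I go about it?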

Model Reasoning Questions

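Hi, sorry to bother you.
When your algorithm runs inference on a sequence, does it always use the first frame as the template for the whole sequence?
Or does it use the frame before each frame as the template?
When debugging, how can I visualize exactly what the model computes during a single inference pass?
For some reason, at every debug breakpoint, batch_dict shows that the 3D box has already been computed. I would also like to understand multi-object tracking; where should I start? Thank you for your guidance ^V^.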

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.

Hello author, when I tried to train the PTT model by following the README file, I ran into the following problem:

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl, xcb.

Aborted (core dumped)

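Environment: Python 3.8
Ubuntu 16.04.6
1080 Ti GPU
CUDA 10.2

Also, I am connecting to the server remotely through MobaXterm, so there is no display window. Does this affect running the code?
I would like to know how to solve this problem.
Thank you for your excellent work.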

Question about Demo_tracking.py

Hi, I've been learning about your work recently. This interface appears when I run demo_tracking.py, and I'm not sure what to enter next.
Can you provide usage instructions for demo_tracking.py?
Also, what does the abbreviation pts stand for?

image

Thank you for your response.

Visualization tracking

Thank you for your open-source code. Could you please tell me how to visualize the tracking process? I find you pass this stage in your code.
image
