VPSNet for Video Panoptic Segmentation

Official implementation for "Video Panoptic Segmentation" (CVPR 2020 Oral)
[Paper] [Dataset] [Project] [Slides] [Codalab]

Dahun Kim, Sanghyun Woo, Joon-Young Lee, and In So Kweon.

Update

2020.08.23. Cityscapes-VPS test set evaluation is now available on the Codalab server (see the [Codalab] link above).



Image-level baseline (left) / VPSNet result (right)

Disclaimer

This repo is tested under Python 3.7, PyTorch 1.4, CUDA 10.0, and mmcv==0.2.14.

Installation

a. This repo is built on mmdetection commit hash 4357697. Our modifications for the VPSNet implementation are listed here. Please refer to INSTALL.md to install the library. You can use the following commands to create a conda environment with the required dependencies.

conda create -n vps python=3.7 -y
conda activate vps
conda install pytorch=1.4 torchvision cudatoolkit=10.0 -c pytorch -y
pip install -r requirements.txt
pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
pip install "git+https://github.com/cocodataset/panopticapi.git"
pip install -v -e . 

b. You also need to install dependencies for Flownet2 and UPSNet modules.

bash ./init_flownet.sh
bash ./init_upsnet.sh

c. You may also need to download some pretrained weights.

pip install gdown
bash ./download_weights.sh

Dataset

You can download Cityscapes-VPS here. It provides 2,500 frames of panoptic labels that temporally extend the 500 Cityscapes image-panoptic labels. In total there are 3,000 labeled frames, corresponding to the 5th, 10th, 15th, 20th, 25th, and 30th frames of each of the 500 videos, with all instance ids associated over time.

It not only supports the video panoptic segmentation (VPS) task, but also provides superset annotations for the video semantic segmentation (VSS) and video instance segmentation (VIS) tasks.
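
The frame counts quoted above follow from simple arithmetic. The snippet below is only an illustrative check, not part of the repo; the names `LABELED_FRAMES`, `NUM_VIDEOS`, and `count_labels` are hypothetical, and it assumes that exactly one labeled frame per video comes from the original Cityscapes annotations.

```python
# Frame positions labeled within each video, as listed in the dataset description.
LABELED_FRAMES = [5, 10, 15, 20, 25, 30]
NUM_VIDEOS = 500


def count_labels(num_videos=NUM_VIDEOS, frames=LABELED_FRAMES):
    """Return (total labeled frames, frames newly added by Cityscapes-VPS)."""
    total = num_videos * len(frames)
    # Assumption: one frame per video already has an original Cityscapes label.
    newly_added = total - num_videos
    return total, newly_added


total, newly_added = count_labels()
print(f"total labeled frames: {total}, newly added: {newly_added}")
```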

Necessary data for Cityscapes-VPS training, testing, and evaluation are as follows. Please refer to DATASET.md for dataset preparation.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── cityscapes_vps
│   │   ├── panoptic_im_train_city_vps.json
│   │   ├── panoptic_im_val_city_vps.json
│   │   ├── panoptic_im_test_city_vps.json  
│   │   ├── instances_train_city_vps_rle.json (for training)
│   │   ├── instances_val_city_vps_rle.json 
│   │   ├── im_all_info_val_city_vps.json (for inference)
│   │   ├── im_all_info_test_city_vps.json (for inference)
│   │   ├── panoptic_gt_val_city_vps.json (for VPQ eval)
│   │   ├── train 
│   │   │   ├── img
│   │   │   ├── labelmap
│   │   ├── val
│   │   │   ├── img
│   │   │   ├── img_all
│   │   │   ├── panoptic_video
│   │   ├── test
│   │   │   ├── img_all
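
Before training or testing, it can help to confirm that the layout above is in place. The following is a minimal sketch, not a script shipped with this repo; `EXPECTED_FILES`, `EXPECTED_DIRS`, and `missing_entries` are hypothetical names, and the lists simply mirror the tree shown above.

```python
from pathlib import Path

# Annotation files expected directly under data/cityscapes_vps (from the tree above).
EXPECTED_FILES = [
    "panoptic_im_train_city_vps.json",
    "panoptic_im_val_city_vps.json",
    "panoptic_im_test_city_vps.json",
    "instances_train_city_vps_rle.json",
    "instances_val_city_vps_rle.json",
    "im_all_info_val_city_vps.json",
    "im_all_info_test_city_vps.json",
    "panoptic_gt_val_city_vps.json",
]

# Sub-directories expected under data/cityscapes_vps (from the tree above).
EXPECTED_DIRS = [
    "train/img",
    "train/labelmap",
    "val/img",
    "val/img_all",
    "val/panoptic_video",
    "test/img_all",
]


def missing_entries(root):
    """Return the relative paths from the expected layout that are absent under root."""
    root = Path(root)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
    missing += [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
    return missing


# Run from the mmdetection root; prints nothing if the layout is complete.
for rel in missing_entries("data/cityscapes_vps"):
    print(f"missing: data/cityscapes_vps/{rel}")
```

The check only tests for presence; it does not validate file contents, so DATASET.md remains the reference for preparation.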

Evaluation Metric

Testing

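Our trained models are available for download here. Rename the downloaded checkpoint to latest.pth, place it under work_dirs/cityscapes_vps/fusetrack_vpct/ (the path used in the commands below), and run the following commands to test the model on Cityscapes-VPS.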

  • FuseTrack model for Video Panoptic Quality (VPQ) on Cityscapes-VPS val set (vpq-λ.txt will be saved.)
python tools/test_vpq.py configs/cityscapes/fusetrack.py \
  work_dirs/cityscapes_vps/fusetrack_vpct/latest.pth \
  --out work_dirs/cityscapes_vps/fusetrack_vpct/val.pkl \
  --pan_im_json_file data/cityscapes_vps/panoptic_im_val_city_vps.json \
  --n_video 50 --mode val
python tools/eval_vpq.py \
  --submit_dir work_dirs/cityscapes_vps/fusetrack_vpct/val_pans_unified/ \
  --truth_dir data/cityscapes_vps/val/panoptic_video/ \
  --pan_gt_json_file data/cityscapes_vps/panoptic_gt_val_city_vps.json
  • FuseTrack model VPS inference on Cityscapes-VPS test set
python tools/test_vpq.py configs/cityscapes/fusetrack.py \
  work_dirs/cityscapes_vps/fusetrack_vpct/latest.pth \
  --out work_dirs/cityscapes_vps/fusetrack_vpct/test.pkl \
  --pan_im_json_file data/cityscapes_vps/panoptic_im_test_city_vps.json \
  --n_video 50 --mode test

Files containing the predicted results will be generated as pred.json and pan_pred/*.png at work_dirs/cityscapes_vps/fusetrack_vpct/test_pans_unified/.

The Cityscapes-VPS test split can currently be evaluated only on the Codalab server. Please upload submission.zip to the Codalab server to see the actual performance.

submission.zip
├── pred.json
├── pan_pred.zip
│   ├── 0005_0025_frankfurt_000000_001736.png
│   ├── 0005_0026_frankfurt_000000_001741.png
│   ├── ...
│   ├── 0500_3000_munster_000173_000029.png
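
One way to assemble this archive from the generated results is sketched below. This is not a tool provided by the repo; `build_submission` is a hypothetical helper. It assumes, reading the tree above literally, that the PNGs sit at the top level of pan_pred.zip and that pan_pred.zip is stored next to pred.json inside submission.zip. Please confirm the exact layout against the Codalab instructions before uploading.

```python
import zipfile
from pathlib import Path


def build_submission(result_dir, out_path="submission.zip"):
    """Pack pred.json and pan_pred/*.png from result_dir into a nested submission archive."""
    result_dir = Path(result_dir)
    pred_json = result_dir / "pred.json"
    png_dir = result_dir / "pan_pred"
    if not pred_json.is_file():
        raise FileNotFoundError(pred_json)

    # Inner archive: PNGs stored flat, without a leading folder (assumed layout).
    inner_zip = result_dir / "pan_pred.zip"
    with zipfile.ZipFile(inner_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for png in sorted(png_dir.glob("*.png")):
            zf.write(png, arcname=png.name)

    # Outer archive: pred.json plus the inner archive.
    out_path = Path(out_path)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(pred_json, arcname="pred.json")
        zf.write(inner_zip, arcname="pan_pred.zip")
    return out_path


# Example call, using the output directory named in the text above:
# build_submission("work_dirs/cityscapes_vps/fusetrack_vpct/test_pans_unified")
```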

Training

  • Train the FuseTrack model on video-level Cityscapes-VPS. We start from the weights of an image panoptic segmentation (IPS) model pretrained on the original Cityscapes. Place the checkpoint at work_dirs/cityscapes/fuse_vpct/, rename it to latest.pth, and run the following command.
# Multi-GPU distributed training
bash ./tools/dist_train.sh configs/cityscapes/fusetrack.py ${GPU_NUM}
# OR
python ./tools/train.py configs/cityscapes/fusetrack.py --gpus ${GPU_NUM}

Citation

If you use this toolbox or benchmark in your research, please cite this project.

@inproceedings{kim2020vps,
  title={Video Panoptic Segmentation},
  author={Dahun Kim and Sanghyun Woo and Joon-Young Lee and In So Kweon},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Terms of Use

This software is for non-commercial use only. The source code is released under the Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) Licence (see this for details).

Acknowledgements

This project has used utility functions from other wonderful open-sourced libraries. We would especially thank the authors of:

Contact

If you have any questions regarding the repo, please contact Dahun Kim ([email protected]) or create an issue.
