ParticleSfM

Code release for our ECCV 2022 paper "ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild" by Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang and Yong-Jin Liu.

[Introduction] ParticleSfM is an offline structure-from-motion system for videos (image sequences). Inspired by Particle Video, our method connects pairwise optical flows and optimizes dense point trajectories as long-range video correspondences, which are then used in a customized global structure-from-motion framework with similarity averaging and global bundle adjustment. For dynamic scenes in particular, the dense point trajectories can be fed into a specially designed trajectory-based motion segmentation module that selects static point tracks, enabling the system to produce reliable camera trajectories on in-the-wild sequences with complex foreground motion.
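As a rough illustration of what "connecting pairwise optical flows" means, the toy sketch below follows a single point through a list of dense flow fields by repeatedly adding the flow vector found at its current (rounded) position. It is not the optimizer shipped in this repository: the names `chain_flows` and `constant_flow` are hypothetical, the flows are plain nested lists rather than RAFT outputs, and the trajectory optimization, occlusion handling and motion segmentation described above are all left out.

```python
from typing import List, Tuple

# flow[y][x] = (dx, dy): displacement of pixel (x, y) from frame t to frame t+1.
Flow = List[List[Tuple[float, float]]]
Point = Tuple[float, float]


def chain_flows(start: Point, flows: List[Flow]) -> List[Point]:
    """Follow one point through consecutive flow fields (toy version).

    Uses a nearest-pixel lookup and stops as soon as the point leaves the image.
    """
    x, y = start
    track = [(x, y)]
    for flow in flows:
        height, width = len(flow), len(flow[0])
        xi, yi = int(round(x)), int(round(y))
        if not (0 <= xi < width and 0 <= yi < height):
            break  # the point left the image, so the trajectory ends here
        dx, dy = flow[yi][xi]
        x, y = x + dx, y + dy
        track.append((x, y))
    return track


def constant_flow(height: int, width: int, dx: float, dy: float) -> Flow:
    """Hypothetical helper: a flow field where every pixel moves by (dx, dy)."""
    return [[(dx, dy) for _ in range(width)] for _ in range(height)]


# Three frame pairs in which everything shifts one pixel to the right.
flows = [constant_flow(4, 6, 1.0, 0.0) for _ in range(3)]
track = chain_flows((0.0, 2.0), flows)
print(track)
```

A real pipeline would interpolate the flow at sub-pixel positions and reject inconsistent links; the nearest-pixel lookup is only there to keep the sketch dependency-free.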

Contact Wang Zhao ([email protected]), Shaohui Liu ([email protected]) and Hengkai Guo ([email protected]) for questions, comments and bug reports.

If you are interested in potential collaboration or internship at ByteDance, please feel free to contact Hengkai Guo ([email protected]).

Installation

  1. Install dependencies:
    • Ceres 2.0.0 [Guide]
    • COLMAP [Guide]
    • Theia SfM (customized version) [Guide]
  2. Set up the Python environment with Conda:
    conda env create -f particlesfm_env.yaml
    conda activate particlesfm
  3. Build our point trajectory optimizer and global structure-from-motion module:
    • The path to your customized Python executable should be set here.
    • (Optional) Add another gcc search path (e.g. gcc 9) here to compile gmapper correctly.
    git submodule update --init --recursive
    sudo apt-get install libhdf5-dev
    bash scripts/build_all.sh
  4. Download pretrained models for MiDaS, RAFT and our motion segmentation module (download script):
    bash scripts/download_all_models.sh

Quickstart Demo

  1. Download two example in-the-wild sequences [Google Drive] from DAVIS, snowboard and train:
    bash ./scripts/download_examples.sh
  2. Run the reconstruction (e.g. on snowboard):
    python run_particlesfm.py --image_dir ./example/snowboard/images --output_dir ./outputs/snowboard/

    Alternatively, pass a workspace directory that contains the images folder. With this option, all outputs are written into the same workspace:

    python run_particlesfm.py --workspace_dir ./example/snowboard/
  3. Visualize the outputs with either the COLMAP GUI or your own visualizer. We also provide a visualization script:
    python -m pip install open3d pycolmap
    python visualize.py --input_dir ./outputs/snowboard/sfm/model --visualize

The expected results are shown in the example figure (left: snowboard; right: train).
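To process both example sequences in one go, a small wrapper along the following lines may be convenient. It is only a sketch: it uses the `--image_dir` and `--output_dir` flags from the quickstart, assumes that the train sequence is laid out like snowboard (`./example/train/images`), and the helper names `build_command` and `run_all` are hypothetical. By default it only prints the commands.

```python
import shlex
import subprocess
from typing import List


def build_command(sequence: str,
                  example_root: str = "./example",
                  output_root: str = "./outputs") -> List[str]:
    """Assemble the quickstart command for one example sequence."""
    return [
        "python", "run_particlesfm.py",
        "--image_dir", f"{example_root}/{sequence}/images",  # assumed layout
        "--output_dir", f"{output_root}/{sequence}/",
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


commands = [build_command(name) for name in ("snowboard", "train")]
run_all(commands, dry_run=True)
```

Call `run_all(commands, dry_run=False)` from the repository root, with the `particlesfm` environment active, to actually launch the reconstructions.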

Usage

  1. Given an image sequence, put all the images in the same folder. The sorted order of the file names must match the frame order in the sequence.

  2. Use the following command to run our whole pipeline:

    python run_particlesfm.py --image_dir /path/to/the/image/folder/ \
                              --output_dir /path/to/output/workspace/
    

    This will sequentially run optical flow -> point trajectory -> motion segmentation -> SfM. The final results will be saved inside the image data folder in COLMAP output format.

    If you know beforehand that the scene to be reconstructed is fully static, you can skip the motion segmentation module with --assume_static. Conversely, if you only want to run the motion segmentation, add --skip_sfm to the command.

    To speed up the pipeline:

    • Use "--skip_path_consistency" to skip the path consistency optimization of point trajectories.
    • Try a higher down-sampling ratio for optimizing point trajectories, e.g. "--sample_ratio 4".
  3. Visualize the outputs using the COLMAP GUI (download the COLMAP binary and import the data sequence directory) or your own visualizer.
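The requirement in step 1 on file-name ordering is easy to violate with unpadded frame numbers (`frame10.png` sorts before `frame2.png` as a string). The sketch below flags such cases and proposes zero-padded names. It assumes that the pipeline relies on plain string sorting and that the numeric order of the existing names is the true frame order; `natural_key`, `sort_mismatches` and `rename_plan` are hypothetical helpers, not part of this repository.

```python
import os
import re
from typing import Dict, List


def natural_key(name: str) -> list:
    """Split a name into text and integer chunks so 'frame10' sorts after 'frame2'."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def sort_mismatches(names: List[str]) -> List[str]:
    """Return the names whose position differs between string and numeric sorting."""
    plain = sorted(names)
    natural = sorted(names, key=natural_key)
    return [n for p, n in zip(plain, natural) if p != n]


def rename_plan(names: List[str], width: int = 5) -> Dict[str, str]:
    """Map each name to a zero-padded name that sorts correctly as a string."""
    ordered = sorted(names, key=natural_key)
    return {old: f"{i:0{width}d}{os.path.splitext(old)[1]}"
            for i, old in enumerate(ordered)}


names = ["frame1.png", "frame2.png", "frame10.png"]
print(sort_mismatches(names))  # non-empty: string sorting would reorder the frames
print(rename_plan(names))      # proposed renames; nothing is touched on disk
```

In practice `names` would come from `os.listdir` on the image folder, and the plan could be applied with `os.rename` after a manual check.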

Evaluation

MPI Sintel dataset

  1. Download the Sintel dataset. You also need to download the ground-truth camera motion data and the generated motion masks to evaluate the camera poses and the motion segmentation.
  2. Prepare the sequences:
    python scripts/prepare_sintel.py --src_dir /path/to/your/sintel/training/final/ \
                                     --tgt_dir /path/to/the/data/root/dir/want/to/save/
  3. Run ParticleSfM reconstructions:
    python run_particlesfm.py --root_dir /path/to/the/data/root/dir/
  4. To evaluate the camera poses:
    python ./evaluation_evo/eval_sintel.py --input_dir /path/to/the/data/root/dir/ \
                                           --gt_dir /path/to/the/sintel/training/data/camdata_left/ \
                                           --dataset sintel

    This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.

  5. To evaluate the motion segmentation:
    python ./motion_seg/eval_traj_iou.py --root_dir /path/to/the/data/root/dir/ \
                                         --gt_dir /path/to/the/sintel/rigidity/
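The name of the motion segmentation script suggests an intersection-over-union (IoU) score over trajectory labels. For readers unfamiliar with the metric, here is the generic definition for binary labels (1 = moving, 0 = static). This is a textbook sketch, not the code in `eval_traj_iou.py`; how that script samples labels along trajectories and aggregates over sequences is not covered here, and the function name `binary_iou` is hypothetical.

```python
from typing import Sequence


def binary_iou(pred: Sequence[int], gt: Sequence[int]) -> float:
    """IoU of the positive (moving) class between two equally long label lists."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    intersection = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Convention chosen for this sketch: two all-static label sets agree perfectly.
    return intersection / union if union else 1.0


pred_labels = [1, 1, 0, 0, 1]
gt_labels = [1, 0, 0, 1, 1]
print(binary_iou(pred_labels, gt_labels))
```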

ScanNet dataset

  1. Download the test split of the ScanNet dataset and extract the data from the .sens files using the official script.

  2. Prepare the sequences:

    python scripts/prepare_scannet.py --src_dir /path/to/your/scannet/test/scans_test/ \
                                      --tgt_dir /path/to/the/data/root/dir/want/to/save/

    We use the first 20 sequences of the test split, downsample them with stride 3, and resize the images to 640x480.

  3. Run ParticleSfM reconstructions:

    python run_particlesfm.py --root_dir /path/to/the/data/root/dir/ \
                              --flow_check_thres 3.0 --assume_static

  4. To evaluate the camera poses:

    python ./evaluation_evo/eval_scannet.py --input_dir /path/to/the/data/root/dir/ \
                                            --gt_dir /path/to/the/scannet/test/scans_test/ \
                                            --dataset scannet

    This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.
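The stride-3 downsampling mentioned in the ScanNet preparation step amounts to keeping every third frame in temporal order. The sketch below shows that selection on file names only. It is not `prepare_scannet.py`: the function names are hypothetical, the frame names (`0.jpg`, `1.jpg`, ...) and the choice to start at the first frame are assumptions, and the resize to 640x480 is omitted because it needs an image library.

```python
from pathlib import PurePosixPath
from typing import List


def frame_index(name: str) -> int:
    """Parse the integer frame index from a name such as '12.jpg' (assumed scheme)."""
    return int(PurePosixPath(name).stem)


def subsample_frames(names: List[str], stride: int = 3) -> List[str]:
    """Sort frames by numeric index and keep every `stride`-th one."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    ordered = sorted(names, key=frame_index)
    return ordered[::stride]


frames = [f"{i}.jpg" for i in range(10)]
kept = subsample_frames(frames, stride=3)
print(kept)
```

Sorting by the parsed index rather than by string avoids placing `10.jpg` before `2.jpg` when the names are not zero-padded.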

Training

  1. Download the FlyingThings3D dataset from the official website. We need the RGB images (finalpass) and the optical flow data.

  2. Download the generated binary motion labels from here, and unpack this archive into the root directory of the FlyingThings3D dataset. We thank the authors of MPNet for kindly sharing it.

  3. Prepare the training data:

    python ./scripts/prepare_flyingthings3d.py --src_dir /path/to/your/flyingthings3d/data/root/

  4. To launch the training, edit your config file inside ./motion_seg/configs/ and then run:

    cd ./motion_seg/
    python train_seq.py ./configs/your-config-file
    cd ..

Citation

@inproceedings{zhao2022particlesfm,
  author    = {Zhao, Wang and Liu, Shaohui and Guo, Hengkai and Wang, Wenping and Liu, Yong-Jin},
  title     = {ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2022}
}

More related projects

  • DynaSLAM. Bescos et al. DynaSLAM: Tracking, Mapping and Inpainting in Dynamic Scenes. IROS 2018.
  • TrianFlow. Zhao et al. Towards Better Generalization: Joint Depth-Pose Learning without PoseNet. CVPR 2020.
  • VOLDOR. Min et al. VOLDOR-SLAM: For the times when feature-based or direct methods are not good enough. ICRA 2021.
  • DROID-SLAM. Teed et al. DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras. NeurIPS 2021.

Acknowledgements

This project would not have been possible without the great open-source work of COLMAP, Theia, hloc, RAFT, MiDaS and OANet. We sincerely thank them all.
