
sc-sfmlearner's Introduction

SC-Depth

This codebase implements the system described in the paper:

Unsupervised Scale-consistent Depth Learning from Video

Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid

IJCV 2021 [PDF]

This is an extended version of NeurIPS 2019 [PDF] [Project webpage]

Point cloud visualization on KITTI (left) and real-world data (right)

Dense Voxel reconstruction (left) using the estimated depth (bottom right)


Contributions

  1. A geometry consistency loss, which makes the predicted depths globally scale-consistent.
  2. A self-discovered mask, which detects moving objects and occlusions to boost accuracy.
  3. Scale-consistent predictions, which can be used in a monocular visual SLAM system.

If you find our work useful in your research, please consider citing our paper:
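
The first two contributions can be illustrated with a minimal sketch. It assumes the formulation described in the paper: a per-pixel normalized difference between the depth of one view projected into the other and the depth predicted for that other view. Averaging this difference gives the geometry consistency loss, and one minus the difference gives the mask. The function names and toy values below are hypothetical, and the sketch uses nested lists instead of the batched, warped PyTorch tensors a real training loop would use.

```python
def depth_inconsistency(d_projected, d_predicted):
    """Per-pixel normalized depth difference, in [0, 1) for positive depths."""
    return [
        [abs(a - b) / (a + b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(d_projected, d_predicted)
    ]


def geometry_consistency_loss(diff):
    """Mean inconsistency over all pixels (assumes every pixel is valid)."""
    values = [v for row in diff for v in row]
    return sum(values) / len(values)


def self_discovered_mask(diff):
    """Weight mask: 1 where the two depths agree, lower where they disagree."""
    return [[1.0 - v for v in row] for row in diff]


# Toy 2x2 example with hypothetical depth values.
d_a_in_b = [[2.0, 4.0], [10.0, 3.0]]  # depth of view a projected into view b
d_b = [[2.0, 4.0], [10.0, 9.0]]       # depth predicted for view b at the same pixels

diff = depth_inconsistency(d_a_in_b, d_b)
loss = geometry_consistency_loss(diff)
mask = self_discovered_mask(diff)
print(loss)  # mean of the per-pixel differences
print(mask)  # the inconsistent bottom-right pixel gets a lower weight
```

In this toy case only the bottom-right pixel disagrees, so it alone is down-weighted. In real sequences such pixels tend to lie on moving objects or occlusions.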

@article{bian2021ijcv, 
  title={Unsupervised Scale-consistent Depth Learning from Video}, 
  author={Bian, Jia-Wang and Zhan, Huangying and Wang, Naiyan and Li, Zhichao and Zhang, Le and Shen, Chunhua and Cheng, Ming-Ming and Reid, Ian}, 
  journal= {International Journal of Computer Vision (IJCV)}, 
  year={2021} 
}

Updates (Compared with NeurIPS version)

Note that this is an improved version. The NeurIPS version, which reproduces the results reported in that paper, is available under 'Release / NeurIPS Version'. Compared with the NeurIPS version, we:

  1. Changed the networks to use ResNet18 and ResNet50 models pretrained on ImageNet for the depth and pose encoders.
  2. Added the 'auto_mask' from Monodepth2 to remove stationary points.
  3. Integrated the depth and pose predictions into the ORB-SLAM system.
  4. Added training and testing on the NYUv2 indoor dataset. See Unsupervised-Indoor-Depth for details.
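
The 'auto_mask' mentioned above can be sketched as follows. This is a simplified illustration of the idea from Monodepth2, not the code in this repository. A pixel is kept only if the warped reference image matches the target better than the unwarped reference does. This discards pixels that look static relative to the camera. The function name, the single-channel images, and the plain absolute-difference error are assumptions made for brevity.

```python
def auto_mask(target, warped_ref, ref):
    """Return 1.0 where the warped reference explains the target better
    than the unwarped reference, else 0.0 (single-channel images)."""
    mask = []
    for t_row, w_row, r_row in zip(target, warped_ref, ref):
        mask.append([
            1.0 if abs(t - w) < abs(t - r) else 0.0
            for t, w, r in zip(t_row, w_row, r_row)
        ])
    return mask


# Hypothetical 1x3 grayscale intensities in [0, 1].
target = [[0.50, 0.80, 0.20]]
warped_ref = [[0.52, 0.70, 0.20]]
ref = [[0.90, 0.80, 0.20]]

mask = auto_mask(target, warped_ref, ref)
print(mask)  # [[1.0, 0.0, 0.0]]
```

The second and third pixels are identical in the target and the unwarped reference, so warping cannot improve on them and they are masked out.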

Preamble

This codebase was developed and tested with Python 3.6, PyTorch 1.0.1, and CUDA 10.0 on Ubuntu 16.04. It is based on Clement Pinard's SfMLearner implementation.

Prerequisite

pip3 install -r requirements.txt

Datasets

See "scripts/run_prepare_data.sh".

For the KITTI Raw dataset, download the data using the script (http://www.cvlibs.net/download.php?file=raw_data_downloader.zip) provided on the official website.

For the KITTI Odometry dataset, download the version with color images.

Alternatively, you can download our pre-processed datasets from the following links:

kitti_256 (for kitti raw) | kitti_vo_256 (for kitti odom) | kitti_depth_test (eigen split) | kitti_vo_test (seqs 09-10)

Training

The "scripts" folder provides several examples for training and testing.

You can train the depth model on KITTI Raw by running

sh scripts/train_resnet18_depth_256.sh

or train the pose model on KITTI Odometry by running

sh scripts/train_resnet50_pose_256.sh

Then you can start a tensorboard session in this folder by

tensorboard --logdir=checkpoints/

and visualize the training progress by opening http://localhost:6006 in your browser.

Evaluation

You can evaluate depth on Eigen's split by running

sh scripts/test_kitti_depth.sh

evaluate visual odometry by running

sh scripts/test_kitti_vo.sh

and visualize depth by running

sh scripts/run_inference.sh

Pretrained Models

Latest Models

To evaluate the NeurIPS models, please download the code from 'Release/NeurIPS version'.

Depth Results

KITTI raw dataset (Eigen's split)

Models | Abs Rel | Sq Rel | RMSE | RMSE(log) | Acc.1 | Acc.2 | Acc.3
resnet18 | 0.119 | 0.857 | 4.950 | 0.197 | 0.863 | 0.957 | 0.981
resnet50 | 0.114 | 0.813 | 4.706 | 0.191 | 0.873 | 0.960 | 0.982

NYUv2 dataset (Original Video)

Models | Abs Rel | Log10 | RMSE | Acc.1 | Acc.2 | Acc.3
resnet18 | 0.159 | 0.068 | 0.608 | 0.772 | 0.939 | 0.982
resnet50 | 0.157 | 0.067 | 0.593 | 0.780 | 0.940 | 0.984

NYUv2 dataset (Rectified Images by Unsupervised-Indoor-Depth)

Models | Abs Rel | Log10 | RMSE | Acc.1 | Acc.2 | Acc.3
resnet18 | 0.143 | 0.060 | 0.538 | 0.812 | 0.951 | 0.986
resnet50 | 0.142 | 0.060 | 0.529 | 0.813 | 0.952 | 0.987
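
For readers unfamiliar with the column names in the depth tables, the sketch below computes them using the definitions commonly used in the depth estimation literature. Acc.1, Acc.2, and Acc.3 are assumed to be the usual threshold accuracies, with max(gt/pred, pred/gt) below 1.25, 1.25^2, and 1.25^3. The function name and toy values are hypothetical. The repository's evaluation scripts may also apply steps not shown here, such as depth capping, cropping, or scale alignment.

```python
import math


def depth_metrics(gt, pred):
    """Common depth error and accuracy metrics over paired positive depths."""
    n = len(gt)
    pairs = list(zip(gt, pred))
    ratios = [max(g / p, p / g) for g, p in pairs]
    return {
        "abs_rel": sum(abs(g - p) / g for g, p in pairs) / n,
        "sq_rel": sum((g - p) ** 2 / g for g, p in pairs) / n,
        "rmse": math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n),
        "rmse_log": math.sqrt(
            sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
        ),
        "log10": sum(abs(math.log10(g) - math.log10(p)) for g, p in pairs) / n,
        # Fraction of pixels whose ratio is below 1.25, 1.25^2, 1.25^3.
        "acc1": sum(r < 1.25 for r in ratios) / n,
        "acc2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "acc3": sum(r < 1.25 ** 3 for r in ratios) / n,
    }


# Hypothetical ground-truth and predicted depths for four pixels.
gt = [10.0, 20.0, 5.0, 8.0]
pred = [10.0, 22.0, 4.0, 16.0]

m = depth_metrics(gt, pred)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Lower is better for the error columns (Abs Rel, Sq Rel, RMSE, RMSE(log), Log10), and higher is better for the accuracy columns.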

Visual Odometry Results on KITTI odometry dataset

Network prediction (trained on 00-08)

Metric | Seq. 09 | Seq. 10
t_err (%) | 7.31 | 7.79
r_err (degree/100m) | 3.05 | 4.90

Pseudo-RGBD SLAM output (Integration of SC-Depth in ORB-SLAM2)

Metric | Seq. 09 | Seq. 10
t_err (%) | 5.08 | 4.32
r_err (degree/100m) | 1.05 | 2.34

Related projects


sc-sfmlearner's Issues

How to visualize the mask

I looked through the code but could not find any file that visualizes the masks. Could you help me with this?
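
One possible approach, offered as a sketch and not as the repository's own tooling: once a mask has been extracted as a 2D array of values in [0, 1], it can be written out as a grayscale image. The example below uses only the standard library and the binary PGM format. The function name and output path are hypothetical, and how the mask is obtained from the training code is left as an assumption.

```python
import os
import tempfile


def mask_to_pgm(mask, path):
    """Write a 2D mask with values in [0, 1] as a binary 8-bit PGM image.

    Returns the raw pixel bytes that were written after the header.
    """
    height, width = len(mask), len(mask[0])
    pixels = bytearray()
    for row in mask:
        for v in row:
            clamped = min(max(v, 0.0), 1.0)        # guard against out-of-range values
            pixels.append(int(clamped * 255 + 0.5))  # scale to 0..255 with rounding
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        f.write(pixels)
    return bytes(pixels)


# Hypothetical 2x2 mask: white = fully trusted pixel, black = fully masked.
mask = [[1.0, 0.5], [0.0, 1.0]]
out_path = os.path.join(tempfile.gettempdir(), "mask_example.pgm")
data = mask_to_pgm(mask, out_path)
print(out_path)
```

Most image viewers open PGM files directly. Bright regions are pixels the mask trusts, and dark regions are pixels it down-weights.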
