indoor-sfmlearner's Introduction

Indoor SfMLearner

PyTorch implementation of our ECCV2020 paper:

P2Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

Zehao Yu*, Lei Jin*, Shenghua Gao

(* Equal Contribution)

Getting Started

Installation

pip install -r requirements.txt

Then install PyTorch with:

conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

Any PyTorch version >= 0.4.1 should work.

Download pretrained model

Please download the pretrained model from OneDrive and extract it:

tar -xzvf ckpts.tar.gz 
rm ckpts.tar.gz

Prediction on single image

Run the following command to predict on a single image:

python inference_single_image.py --image_path=/path/to/image

By default, the script saves the predicted depth map to the same folder as the input image.
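
If you want to run it over a whole folder of images, a minimal sketch (using only the --image_path flag documented above; the folder path and .jpg extension are placeholders) is:

import subprocess
from pathlib import Path

# Run inference_single_image.py for every image in a folder.
image_dir = Path("/path/to/images")           # hypothetical input folder
for image_path in sorted(image_dir.glob("*.jpg")):
    subprocess.run(
        ["python", "inference_single_image.py", f"--image_path={image_path}"],
        check=True,
    )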

Evaluation

Download the test data from OneDrive, put it under ./data, and extract it:

cd data
tar -xzvf nyu_test.tar.gz 
tar -xzvf scannet_test.tar.gz
tar -xzvf scannet_pose.tar.gz
cd ../

NYUv2 Depth

CUDA_VISIBLE_DEVICES=1 python evaluation/nyuv2_eval_depth.py \
    --data_path ./data \
    --load_weights_folder ckpts/weights_5f \
    --post_process  
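
The --post_process flag most likely enables Monodepth2-style flipping post-processing: predict on the image and on its horizontal flip, then average the two. A simplified sketch of that idea (the repo's exact blending may differ):

import torch

def flip_post_process(predict_depth, image):
    # predict_depth: any callable mapping a BxCxHxW tensor to a BxCxHxW depth map.
    depth = predict_depth(image)
    depth_flipped = predict_depth(torch.flip(image, dims=[3]))
    # Flip the second prediction back and average the two estimates.
    return 0.5 * (depth + torch.flip(depth_flipped, dims=[3]))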

NYUv2 Normal

CUDA_VISIBLE_DEVICES=1 python evaluation/nyuv2_eval_norm.py \
    --data_path ./data \
    --load_weights_folder ckpts/weights_5f \
    # --post_process

ScanNet Depth

CUDA_VISIBLE_DEVICES=1 python evaluation/scannet_eval_depth.py \
    --data_path ./data/scannet_test \
    --load_weights_folder ckpts/weights_5f \
    --post_process

ScanNet Pose

CUDA_VISIBLE_DEVICES=1 python evaluation/scannet_eval_pose.py \
    --data_path ./data/scannet_pose \
    --load_weights_folder ckpts/weights_5f \
    --frame_ids 0 1
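
The evaluation scripts report the standard depth error metrics (abs_rel, sq_rel, rmse, rmse_log and the threshold accuracies a1/a2/a3). For reference, a sketch of how these are conventionally computed over the valid ground-truth pixels:

import numpy as np

def depth_metrics(gt, pred):
    # gt, pred: 1-D arrays of valid ground-truth and predicted depths.
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3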

Training

First, download NYU Depth V2 from the official website and unzip the raw data to DATA_PATH.

Extract Superpixel

Run the following command to extract superpixels:

python extract_superpixel.py --data_path DATA_PATH --output_dir ./data/segments
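
extract_superpixel.py implements the repo's own segmentation pipeline. As a rough illustration of what a superpixel/segment map looks like, here is a minimal sketch using scikit-image's Felzenszwalb segmentation (the actual algorithm and parameters used by the script may differ):

import numpy as np
from skimage.io import imread
from skimage.segmentation import felzenszwalb

# Illustrative only: segment an RGB image into superpixel-like regions.
image = imread("example_rgb.jpg")                                # placeholder path
segments = felzenszwalb(image, scale=100, sigma=0.5, min_size=50)
print(segments.shape, segments.max() + 1, "segments")
np.save("example_segments.npy", segments)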

3-frames

Run the following command to train our network:

CUDA_VISIBLE_DEVICES=1 python train_geo.py \
    --model_name 3frames \
    --data_path DATA_PATH \
    --val_path ./data \
    --segment_path ./data/segments \
    --log_dir ./logs \
    --lambda_planar_reg 0.05 \
    --batch_size 12 \
    --scales 0 \
    --frame_ids_to_train 0 -1 1
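
The training objective is the usual self-supervised one inherited from Monodepth2: neighbouring frames are warped into the reference view with the predicted depth and pose, and the photometric error is penalized. A sketch of the standard SSIM + L1 photometric term is below; note that P2Net evaluates this error over patches around DSO keypoints rather than densely, so this per-pixel form only conveys the general idea.

import torch
import torch.nn.functional as F

def ssim(x, y):
    # Simplified SSIM over 3x3 windows, as used in Monodepth2-style losses.
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    mu_x = F.avg_pool2d(x, 3, 1, padding=1)
    mu_y = F.avg_pool2d(y, 3, 1, padding=1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, padding=1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, padding=1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, padding=1) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return torch.clamp((1 - num / den) / 2, 0, 1)

def photometric_loss(pred, target, alpha=0.85):
    # Weighted combination of SSIM and L1 reprojection error.
    l1 = torch.abs(pred - target).mean(1, keepdim=True)
    return alpha * ssim(pred, target).mean(1, keepdim=True) + (1 - alpha) * l1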

5-frames

Initializing from the pretrained model of the 3-frame setting gives better results.

CUDA_VISIBLE_DEVICES=1 python train_geo.py \
    --model_name 5frames \
    --data_path DATA_PATH \
    --val_path ./data \
    --segment_path ./data/segments \
    --log_dir ./logs \
    --lambda_planar_reg 0.05 \
    --batch_size 12 \
    --scales 0 \
    --load_weights_folder FOLDER_OF_3FRAMES_MODEL \
    --frame_ids_to_train 0 -2 -1 1 2
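
--lambda_planar_reg weights the plane-regularization term: depths inside each superpixel are encouraged to lie on a common 3D plane. A rough sketch of that idea, fitting a plane n^T p = 1 to the back-projected points of each segment by least squares and penalizing the residual (the actual implementation in the repo may differ):

import torch

def planar_reg_loss(points3d, segments):
    # points3d: (N, 3) back-projected points, segments: (N,) superpixel id per point.
    seg_ids = segments.unique()
    loss = 0.0
    for seg_id in seg_ids:
        p = points3d[segments == seg_id]              # points of one superpixel
        if p.shape[0] < 3:
            continue
        ones = torch.ones(p.shape[0], 1, device=p.device)
        n = torch.linalg.lstsq(p, ones).solution      # plane normal with n^T p ~= 1
        loss = loss + torch.abs(p @ n - ones).mean()  # distance-to-plane residual
    return loss / max(len(seg_ids), 1)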

Acknowledgements

This project is built upon Monodepth2. We thank the authors of Monodepth2 for their great work and repo.

License

TBD

Citation

Please cite our paper if you use this work for any purpose.

@inproceedings{IndoorSfMLearner,
  author    = {Zehao Yu and Lei Jin and Shenghua Gao},
  title     = {P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation},
  booktitle = {ECCV},
  year      = {2020}
}

indoor-sfmlearner's Issues

Correspondence between color and depth in the predicted depth maps

Hello.
I am working on my undergraduate thesis and would like to use the depth estimation module from your paper. Is the predicted depth absolute or relative, and what is the correspondence between color and depth in the output image? Could you also tell me which .py file in the network folder covers this part?

Differences between the test images and those used in DeepV2D

Hello, and thanks for your earlier reply; I have a new question.
I want to build a multi-frame self-supervised indoor depth estimation network myself, so I have been referring to DeepV2D and indoor sfmlearner.
I found that the test-set images of the two are not exactly the same.
This is the test set provided by DeepV2D: wget https://www.dropbox.com/s/numnge239p7ll7o/nyu_test.zip
In it, 000/000.png and the RGB of 00001.h5 in nyu_test are not exactly the same.
I undistorted 000.png with the _undistort method from indoor sfmlearner and compared again, and they are still different: the sticker on the refrigerator in 00001.h5 is noticeably smaller, while the sticker is clearly visible in both the original and the undistorted 000.png. I cannot figure out why.
[screenshots: deepv2d, undistorted_deepv2d, sfmindoor]
The first image is the original DeepV2D image, the second is that image undistorted following the indoor sfmlearner method, and the third is the image from 00001.h5.

The 00001.h5 image clearly looks rotated relative to the original DeepV2D image. If I compare only the region [44:471, 40:601, :],
np.allclose on the two arrays returns False, and the element-wise difference between them is fairly large.

I also compared the depth ground truth (wget https://www.dropbox.com/s/u5pu0j2ysed64ja/nyu_groundtruth.npy);
within the region [44:471, 40:601, :], the two are identical.

Fail to undistort images

Hi!

I'm trying to train my model using this repo's code for loading the NYU V2 dataset, but I run into the following error:

[screenshot of the error]

How can I solve this?

Thanks a lot!

Two questions about the dataloader

Thanks for your great work!

Can I ask two questions about the dataloader?

  1. In 'NYUDataset', we undistort images.

    color = self._undistort(color)

    But in 'NYUTestDataset', we don't undistort images.
    rgb, depth, norm, valid_mask = self.loader(line)

    Is this because raw images are loaded in 'NYUDataset'?

  2. Here we use self.full_res_shape (608, 448), instead of (640, 480), to compute the normalized intrinsics. Will this have a negative influence?

    w, h = self.full_res_shape
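
For reference, the Monodepth2-style convention this loader appears to follow stores intrinsics normalized by the image width/height and rescales them to the actual input resolution, so which resolution is used for normalization does matter. A small sketch (the numbers are illustrative, not the repo's calibration):

import numpy as np

def denormalize_intrinsics(K_norm, width, height):
    # Scale normalized intrinsics back to pixel units for a given input size.
    K = K_norm.copy()
    K[0, :] *= width    # fx, cx
    K[1, :] *= height   # fy, cy
    return K

K_norm = np.array([[0.81, 0.0, 0.51, 0.0],
                   [0.0, 1.08, 0.54, 0.0],
                   [0.0, 0.0, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]], dtype=np.float32)  # illustrative values only
print(denormalize_intrinsics(K_norm, 608, 448))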

Why assume the patch has the same depth?

It seems that the keypoints extracted by DSO are mainly distributed around the edges of objects, so the depth variance within a patch may be large. I'm wondering whether the same-depth assumption is plausible; could you please share the reasoning behind this implementation?

shocked by the initial val result

Before training, the val function evaluates the initial model on the NYUv2 test set, and the result is:

abs_rel | sq_rel | rmse  | rmse_log | a1    | a2    | a3
0.323   | 0.448  | 1.002 | 0.365    | 0.520 | 0.783 | 0.905

That surprised me. Am I doing something wrong? Why does the untrained model already perform this well on the NYUv2 test set?

NYUv2 dataset

Hi, is there any chance someone could share the raw NYUv2 dataset? The download link on the NYUv2 website (nyu-v2) seems to be invalid at the moment, and I suspect my existing copy of the dataset may be inconsistent.

Thanks in advance.

multi scale question

Thank you for your excellent work. Why is the multi-scale training strategy disabled in your code? The multi-scale loss works very well in Monodepth2.

pretrained model

I cannot download the pretrained model; could you provide another link?
Thanks very much.

collapse issues when training

Hi,

I tried to train the model without the superpixel planar regularization, but the training seems to collapse and the output depth is all zeros. Have you faced this problem and managed to solve it?

Thanks in advance.

Keypoints extraction

Hi, I noticed that you've adopted the point-selection strategy from DSO for its effectiveness and efficiency. Points in DSO are sampled at pixels with large intensity gradients. I'm wondering why this strategy is effective and efficient, and why not simply use a Sobel filter to pick points with large intensity gradients?
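
For context, a minimal illustration of a plain Sobel-based point selector (illustrative only; DSO's selector additionally spreads points over the image with per-block adaptive thresholds):

import numpy as np
from scipy.ndimage import sobel

def gradient_keypoints(gray, num_points=3000):
    # Pick the pixels with the largest Sobel gradient magnitude.
    gx = sobel(gray.astype(np.float32), axis=1)
    gy = sobel(gray.astype(np.float32), axis=0)
    mag = np.hypot(gx, gy)
    idx = np.argpartition(mag.ravel(), -num_points)[-num_points:]
    ys, xs = np.unravel_index(idx, mag.shape)
    return np.stack([xs, ys], axis=1)                 # (num_points, 2) as (x, y)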

About monodepth2 result in the paper

Thanks for sharing code!
How did you get this result? Did you evaluate the Monodepth2 pretrained model directly, or did you train Monodepth2 on the NYUv2 dataset before evaluating?
By the way, are networks.py and partialconv.py unrelated to your paper?

nyu2 raw dataset does not match the split

Thank you very much for your work. When I tried to run the training scripts, I ran into a problem with the NYUv2 dataset: the download link for the entire dataset is broken, so I downloaded the different parts of the dataset individually and unzipped them all, but the scenes do not match the training split. For example, classroom__0016 is not in the dataset I downloaded. Could you give me some advice?
