This repository contains an implementation for processing a single image or a directory of images for segmentation and metric depth estimation. It accompanies several depth models developed in the following papers:
- Digging Into Self-Supervised Monocular Depth Estimation
- Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
- MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation
- Vision Transformers for Dense Prediction
- ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
For further information and details on how to work with the models themselves, refer to the original GitHub repositories below.
- Clone the main repository and its submodules:

  ```shell
  git submodule update --init --recursive
  ```
- Pick one of the estimator types and download the weights:
  - Monodepth2: automatically downloads the weights on the first run
  - MiDaS and DPT: download the models from their repositories and store them inside the `weights` folder of the corresponding directory (e.g. `Depth_Estimation/MiDaS/weights`)
    - DPT: `dpt_hybrid`, `dpt_hybrid_kitti`
    - MiDaS: `dpt_swin2_tiny_256`, `midas_v21_384`, `dpt_beit_large_384`
  - ZoeDepth: works with `torch.hub` and also downloads the weights automatically
- Download the segmentator model weights from (here)
- Download (here) the KITTI Depth Prediction Evaluation validation and test sets into the root directory (needed for alignment; some of the images can later also be moved to `input/images`)
- Set up dependencies (PowerShell):

  ```powershell
  python -m venv venv
  .\venv\Scripts\activate
  pip install -r requirements.txt
  ```
- Place one or more input images in the folder `input/images`.
- Run the model with

  ```shell
  python run.py -et <estimator_type> -s
  ```

  where `<estimator_type>` is chosen from `[Mono2, MiDaS, DPT, ZoeDepth]`. If the `-s` flag is set, the image is also segmented. Additionally, choose one of the model types from the corresponding model type list within `run.py`.
- The resulting depth maps are written to the `output/images` folder. For each estimator type there is a subdirectory where all the depth maps are stored; each depth map is named after its input image and its specific model type. Additionally, the segmented image and the mean metric depth per segmented object are stored in a `*_mean_depth_per_object.csv`. This file contains the class name and the mean metric depth, in ascending order of depth.
- For MiDaS and DPT, alignments (relative depth → metric depth) have to be done on the first run; the steps are stored in the `assets` folder.
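For context, aligning relative predictions to metric depth is commonly done by fitting a scale and a shift in a least-squares sense against the metric ground truth. A minimal sketch of that idea with NumPy — function and variable names are illustrative and not taken from this repository:

```python
import numpy as np

def align_scale_shift(pred_rel, gt_metric, mask=None):
    """Fit s, t minimizing ||s * pred_rel + t - gt_metric||^2 over valid pixels."""
    if mask is None:
        mask = gt_metric > 0  # ground truth is sparse; zeros mark missing depth
    x = pred_rel[mask].ravel()
    y = gt_metric[mask].ravel()
    # Closed-form 1D linear regression: stack [x, 1] and solve least squares.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t

# Example: recover a known scale/shift from noiseless synthetic data.
rel = np.random.rand(64, 64)
gt = 5.0 * rel + 2.0
s, t = align_scale_shift(rel, gt)
```

The fitted `s` and `t` can then be reused to convert further relative predictions to metric depth, which matches the idea of storing the alignment steps once after the first run.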
For evaluating the different depth estimators, make sure that the KITTI Depth Prediction Evaluation dataset is downloaded. You only have to specify the path to the raw predictions of the model and the path to the ground-truth depth maps. An example for Mono2:

```shell
python .\evaluation.py --pred_path ".\output\images\Mono2\mono+stereo_640x192\raw" --gt_path ".\depth_selection\val_selection_cropped\groundtruth_depth\" --garg_crop
```
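For reference, KITTI-style evaluation scripts typically compute the standard error metrics (abs-rel, RMSE, δ < 1.25) over the Garg crop of each image. A rough, self-contained sketch of such a metric computation — illustrative only, not this repository's `evaluation.py`:

```python
import numpy as np

def garg_crop_mask(h, w):
    """Garg et al. evaluation crop, as commonly used for KITTI depth evaluation."""
    mask = np.zeros((h, w), dtype=bool)
    mask[int(0.40810811 * h):int(0.99189189 * h),
         int(0.03594771 * w):int(0.96405229 * w)] = True
    return mask

def depth_metrics(pred, gt):
    """abs-rel, RMSE and delta<1.25 accuracy over valid, cropped pixels."""
    valid = (gt > 1e-3) & garg_crop_mask(*gt.shape)
    p, g = pred[valid], gt[valid]
    abs_rel = np.mean(np.abs(p - g) / g)
    rmse = np.sqrt(np.mean((p - g) ** 2))
    a1 = np.mean(np.maximum(p / g, g / p) < 1.25)
    return abs_rel, rmse, a1

# Synthetic check: a uniform 10% over-estimate of a 10 m scene.
gt = np.full((375, 1242), 10.0)
abs_rel, rmse, a1 = depth_metrics(gt * 1.1, gt)
```

On this synthetic input the metrics come out as abs-rel ≈ 0.1, RMSE ≈ 1.0 m, and δ < 1.25 accuracy of 1.0, which is a quick sanity check for a metric implementation.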
Copyright © Sandro Sage. All rights reserved. Please see the license file for terms.