SMMetrDE: Segmentation-guided Monocular Metric Depth Estimation

This repository contains code for processing a single image or a directory of images to produce segmentation masks and metric depth estimates. It builds on several depth models developed in the following papers:

Digging Into Self-Supervised Monocular Depth Estimation

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer

MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation

Vision Transformers for Dense Prediction

ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

GitHub references

For further information on how to work with the models themselves, refer to the original GitHub repositories below.

  • Monodepth2 1
  • MiDaS 2
  • DPT 3
  • ZoeDepth 4

Setup

  1. Clone the main repository and its submodules:

     git submodule update --init --recursive

  2. Pick one of the estimator types and download its weights:
  • Monodepth2: automatically downloads the weights on the first run

  • MiDaS and DPT: download the models from their repositories and store them inside the weights folder of the corresponding directory (e.g. Depth_Estimation/MiDaS/weights)

    → DPT: dpt_hybrid, dpt_hybrid_kitti

    → MiDaS: dpt_swin2_tiny_256, midas_v21_384, dpt_beit_large_384

  • ZoeDepth: works with torch.hub and downloads the weights automatically.

  3. Download the segmentator model weights from (here)

  4. Download (here) the KITTI Depth Prediction Evaluation 5 validation and test sets into the root directory (needed for alignment; some images can later also be moved to input/images)

  5. Set up the dependencies:

    PowerShell:

    python -m venv venv
    .\venv\Scripts\activate
    pip install -r requirements.txt
    

Usage

  1. Place one or more input images in the folder input/images.

  2. Run the model with

    python run.py -et <estimator_type> -s

    where <estimator_type> is chosen from [Mono2, MiDaS, DPT, ZoeDepth]. If the -s flag is set, the image is also segmented.

    Additionally, choose one of the model types from the corresponding model type list within run.py.

  3. The resulting depth maps are written to the output/images folder. Each estimator type has its own subdirectory where all of its depth maps are stored, named after the input image and the specific model type. Additionally, the segmented image and the mean metric depth per segmented object are stored; the latter goes into a *_mean_depth_per_object.csv file containing the class name and the mean metric depth, sorted in ascending order of depth.

  4. For MiDaS and DPT, the alignment from relative to metric depth has to be performed on the first run; the alignment steps are stored in the assets folder.
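The per-object CSV can be inspected with a few lines of Python. This is a sketch only: the column names (`class_name`, `mean_depth_m`) are assumptions, since the exact header depends on the run.py implementation, so adjust them to match the actual file:

```python
import csv

def read_mean_depths(csv_path):
    """Read a *_mean_depth_per_object.csv into (class, mean depth) pairs.

    Column names are assumed, not taken from this repository; adapt them
    to the header that run.py actually writes.
    """
    with open(csv_path, newline="") as f:
        rows = [(r["class_name"], float(r["mean_depth_m"]))
                for r in csv.DictReader(f)]
    # The file is written in ascending depth order; sort defensively anyway.
    return sorted(rows, key=lambda r: r[1])
```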

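The relative-to-metric alignment for MiDaS/DPT can be pictured as fitting a scale and shift between the relative prediction and sparse metric ground truth by least squares, the approach commonly used for MiDaS-style outputs. The sketch below illustrates the idea and is not this repository's exact procedure (which may, for instance, operate in inverse-depth space):

```python
import numpy as np

def align_scale_shift(pred_rel, gt_metric, valid_mask):
    """Fit gt_metric ~= s * pred_rel + t over valid ground-truth pixels.

    Illustrative least-squares alignment of relative depth to metric depth;
    the repository's actual alignment steps may differ.
    """
    x = pred_rel[valid_mask].ravel()
    y = gt_metric[valid_mask].ravel()
    A = np.stack([x, np.ones_like(x)], axis=1)  # design matrix [x, 1]
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t
```

Once `s` and `t` are known, `s * pred_rel + t` yields a metric depth map for every subsequent prediction of the same model.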
Evaluation

For evaluating the different depth estimators, make sure the KITTI Depth Prediction Evaluation dataset is downloaded. You only have to specify the path to the raw predictions of the model and the path to the ground-truth depth maps. An example for Mono2:

python .\evaluation.py --pred_path ".\output\images\Mono2\mono+stereo_640x192\raw" --gt_path ".\depth_selection\val_selection_cropped\groundtruth_depth\" --garg_crop
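The standard KITTI-style error metrics behind such an evaluation can be sketched as follows. This is an illustration, not the metric set evaluation.py necessarily computes; the Garg crop fractions below are the ones commonly used in the monocular depth literature:

```python
import numpy as np

def depth_metrics(pred, gt):
    """Common monocular depth metrics over valid (gt > 0) pixels."""
    mask = gt > 0
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)            # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))           # root mean squared error
    delta1 = np.mean(np.maximum(p / g, g / p) < 1.25)  # accuracy under threshold
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}

def garg_crop_mask(h, w):
    """Boolean mask for the commonly used Garg evaluation crop."""
    mask = np.zeros((h, w), dtype=bool)
    mask[int(0.40810811 * h):int(0.99189189 * h),
         int(0.03594771 * w):int(0.96405229 * w)] = True
    return mask
```

Applying `garg_crop_mask` before `depth_metrics` restricts the evaluation to the image region where LiDAR ground truth is reliable, which is what the --garg_crop flag above is for.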

👩‍⚖️ License

Copyright © Sandro Sage. All rights reserved. Please see the license file for terms.

Footnotes

  1. https://github.com/nianticlabs/monodepth2

  2. https://github.com/isl-org/MiDaS

  3. https://github.com/isl-org/DPT

  4. https://github.com/isl-org/ZoeDepth

  5. https://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction
