DeepMVS: Learning Multi-View Stereopsis

DeepMVS is a deep convolutional neural network that learns to estimate pixel-wise disparity maps from an arbitrary number of unordered images whose camera poses are already known or have been estimated.

If you use our code or datasets in your work, please cite:

@inproceedings{DeepMVS,
  author       = "Huang, Po-Han and Matzen, Kevin and Kopf, Johannes and Ahuja, Narendra and Huang, Jia-Bin",
  title        = "DeepMVS: Learning Multi-View Stereopsis",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
  year         = "2018"
}

For the paper and other details of DeepMVS or the MVS-Synth Dataset, please see our project webpage.

Training

Requirements

  • python 2.7
  • numpy 1.13.1
  • pytorch 0.3.0 and torchvision: Follow the instructions from their website.
  • opencv 3.1.0: Run conda install -c menpo opencv or pip install opencv-python.
  • imageio 2.2.0 (with the freeimage plugin): Run conda install -c conda-forge imageio or pip install imageio. To install the freeimage plugin, run the following Python script once:
    import imageio
    imageio.plugins.freeimage.download()
  • h5py 2.7.0: Run conda install h5py or pip install h5py.
  • lz4 0.23.1: Run pip install lz4.
  • cuda 8.0.61 and 16GB GPU RAM (required for GPU support): The training code uses up to 14GB of GPU RAM with the default configuration. We trained our model on an NVIDIA Tesla P100 GPU. To reduce GPU RAM usage, try smaller values for --patch_width, --patch_height, --num_depths, and --max_num_neighbors. However, the resulting model may not match the performance reported in our paper.
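
Before starting a multi-day training run, it can help to confirm that the packages listed above can actually be imported. The following is a minimal sketch and not part of the DeepMVS code base: the helper name check_modules is hypothetical, and the import names (for example cv2 for opencv and torch for pytorch) are assumed to be the conventional ones for these packages.

```python
import importlib

# Import names for the training requirements listed above.
# Assumption: these are the conventional module names (e.g. opencv -> cv2).
TRAINING_MODULES = ["numpy", "torch", "torchvision", "cv2", "imageio", "h5py", "lz4"]


def check_modules(names):
    """Map each module name to its version string.

    The value is "unknown" if the module imports but exposes no
    __version__ attribute, and None if the import fails.
    """
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found
```

Calling check_modules(TRAINING_MODULES) returns a dictionary in which any entry set to None marks a package that still needs to be installed. The sketch only reports versions; it does not compare them against the versions listed above.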

Instructions

  1. Download the training datasets.
    python python/download_training_datasets.py # This may take up to 1-2 days to complete.
    Update: The training datasets were updated on May 18, 2018 to fix errors in the camera poses. If you downloaded the old version, please delete the files and download them again.
  2. Train the network.
    python python/train.py # This may take up to 4-6 days to complete, depending on which GPU is used.

Testing

Requirements

  • python 2.7
  • numpy 1.13.1
  • pytorch 0.3.0 and torchvision: Follow the instructions from their website.
  • opencv 3.1.0: Run conda install -c menpo opencv or pip install opencv-python.
  • imageio 2.2.0: Run conda install -c conda-forge imageio or pip install imageio.
  • pyquaternion 0.9.0: Run pip install pyquaternion.
  • pydensecrf: Run pip install pydensecrf.
  • cuda 8.0.61 and 6GB GPU RAM (required for GPU support): The testing code uses up to 4GB of GPU RAM with the default configuration.
  • COLMAP 3.2: Follow the instructions from their website.

Instructions

  1. Download the trained model.

    python python/download_trained_model.py
  2. Run the sparse reconstruction and the image_undistorter using COLMAP. The image_undistorter will generate an images folder, which contains the undistorted images, and a sparse folder, which contains three .bin files.

  3. Run the testing script with the paths to the undistorted images and the sparse reconstruction model.

    python python/test.py --load_bin --image_path path/to/images --sparse_path path/to/sparse --output_path path/to/output/directory

    By default, the script resizes the images to a height of 540px to reduce the running time. To run the model at another resolution, pass the arguments --image_width XXX and --image_height XXX. If your COLMAP outputs .txt files instead of .bin files for the sparse reconstruction, simply remove the --load_bin flag.

  4. To evaluate the predicted results, run

    python python/eval.py --load_bin --image_path path/to/images --sparse_path path/to/sparse --output_path path/to/output/directory --gt_path path/to/gt/directory --image_width 810 --image_height 540 --size_mismatch crop_pad

    In gt_path, the ground truth disparity maps should be stored in npy format with filenames of the form <image_name>.depth.npy. If the ground truths are depth maps instead of disparity maps, please add the --gt_type depth flag.
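
The choice described in step 3, passing --load_bin only when COLMAP wrote .bin files, can be automated when running the testing script over many scenes. The following is a minimal sketch under stated assumptions: the helper name build_test_command is hypothetical, and it assumes the sparse folder holds either .bin or .txt model files directly (not in a subfolder). It uses only the flags shown above.

```python
import os


def build_test_command(image_path, sparse_path, output_path):
    """Assemble the argument list for python/test.py.

    --load_bin is added only when the sparse folder contains .bin files,
    following the note in step 3 about COLMAP .txt output.
    """
    has_bin = any(name.endswith(".bin") for name in os.listdir(sparse_path))
    command = ["python", "python/test.py"]
    if has_bin:
        command.append("--load_bin")
    command += [
        "--image_path", image_path,
        "--sparse_path", sparse_path,
        "--output_path", output_path,
    ]
    return command
```

The returned list can be passed to subprocess.check_call from the repository root. Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.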

License

DeepMVS is licensed under the BSD 2-Clause License.
