Code to extract stereo frame pairs from 3D videos, as used in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, arXiv:1907.01341"
Thank you for your excellent work! The idea is very interesting, and thanks for sharing the code.
I would like to use the code to generate my own training data, and I have a few questions about it:
①I'm not sure how the disparity map and the semantic segmentation results are combined during training; this does not seem to be covered in get_disp_and_uncertainty.py.
②The paper says: "In a final step, we detect pixels that belong to sky regions using a pre-trained semantic segmentation model and set their disparity to the minimum disparity in the image". I'm not sure which direction this goes: are the sky pixels overwritten with the minimum disparity found in the image, or is the image's minimum disparity defined by the sky region?
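To make my question concrete, here is how I currently read that sentence, as a minimal NumPy sketch (the array values and the `sky_mask` are made up for illustration; this is only my interpretation, not the repository's actual code):

```python
import numpy as np

# Hypothetical disparity map from stereo matching and a hypothetical
# boolean mask where a segmentation model predicts "sky".
disparity = np.array([[0.8, 0.6],
                      [0.2, 0.9]], dtype=np.float32)
sky_mask = np.array([[True, False],
                     [False, False]])

# My reading: every sky pixel is overwritten with the minimum
# disparity found anywhere in the image (i.e. pushed to the
# farthest depth), computed before the overwrite.
disparity[sky_mask] = disparity.min()
```

Is this what the final step does, or is the minimum taken only over non-sky pixels?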
Thank you for your kind consideration of these questions.
Best regards.