
sc_depth_pl's Introduction

SC_Depth:

This repo provides the PyTorch Lightning implementation of SC-Depth (V1, V2, and V3) for self-supervised learning of monocular depth from video.

In SC-DepthV1 (IJCV 2021 & NeurIPS 2019), we propose (i) a geometry consistency loss for scale-consistent depth prediction over time and (ii) a self-discovered mask that detects and removes dynamic regions and occlusions during training, leading to higher accuracy. The predicted depth is sufficiently accurate and consistent for use in the ORB-SLAM2 system. The video below shows the estimated depth as a point cloud (top) and a color map (bottom right).
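
For intuition, a minimal sketch of the geometry consistency loss and self-discovered mask described above (simplified and not the repo's exact code; see losses/loss_functions.py for the actual implementation, which also handles valid-pixel masking):

import torch

def geometry_consistency(computed_depth, projected_depth):
    """Sketch of the V1 geometry consistency loss and self-discovered mask.

    computed_depth:  reference-frame depth warped into the target view
    projected_depth: target depth sampled at the projected pixel locations
    Both are assumed to be B x 1 x H x W tensors.
    """
    # normalized depth inconsistency in [0, 1)
    diff = (computed_depth - projected_depth).abs() / (computed_depth + projected_depth)
    geometry_loss = diff.mean()    # penalizes scale-inconsistent depth over time
    weight_mask = 1.0 - diff       # down-weights dynamic/occluded pixels in the photometric loss
    return geometry_loss, weight_mask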

In SC-DepthV2 (TPAMI 2022), we show that the large relative rotational motion in handheld-camera videos is the main challenge for unsupervised monocular depth estimation in indoor scenes. Based on this finding, we propose an auto-rectify network (ARN) to handle the large relative rotation between consecutive video frames. It is integrated into SC-DepthV1 and jointly trained with the self-supervised losses, greatly boosting the performance.

In SC-DepthV3 (TPAMI 2023), we propose a robust learning framework for accurate and sharp monocular depth estimation in (highly) dynamic scenes. Because the photometric loss, the main loss in self-supervised methods, is not valid in dynamic object regions and occlusions, previous methods show poor accuracy in dynamic scenes and blurred depth predictions at object boundaries. We leverage an external pretrained depth estimation network to generate a single-image depth prior, based on which we propose effective losses to constrain self-supervised depth learning. Evaluation on six challenging datasets covering both static and dynamic scenes demonstrates the efficacy of the proposed method.

Qualitative depth estimation results: DDAD, BONN, TUM, IBIMS-1

Demo Videos

ddad_video.mp4
bonn_video.mp4
tum_video.mp4

Install

conda create -n sc_depth_env python=3.8
conda activate sc_depth_env
conda install pytorch torchvision pytorch-cuda=11.7 -c pytorch -c nvidia
pip install -r requirements.txt

Dataset

We organize the video datasets into the following format for training and testing models:

Dataset
  -Training
    --Scene0000
      ---*.jpg (list of color images)
      ---cam.txt (3x3 camera intrinsic matrix)
      ---depth (a folder containing ground-truth depth maps, optional for validation)
      ---leres_depth (a folder containing pseudo-depth generated by LeReS; required for training SC-DepthV3)
    --Scene0001
    ...
    train.txt (containing training scene names)
    val.txt (containing validation scene names)
  -Testing
    --color (containing testing images)
    --depth (containing ground-truth depths)
    --seg_mask (containing semantic segmentation masks for depth evaluation on dynamic/static regions)
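
As a quick sanity check that your data follows this layout, here is a hedged sketch (not part of the repo; the paths and the pseudo-depth check are illustrative assumptions):

from pathlib import Path
import numpy as np

def check_training_scene(scene_dir, need_pseudo_depth=False):
    """Rough sanity check for one training scene (illustrative only)."""
    scene = Path(scene_dir)
    images = sorted(scene.glob("*.jpg"))
    assert images, f"no color images found in {scene}"
    intrinsics = np.genfromtxt(scene / "cam.txt", dtype=np.float32).reshape(3, 3)
    assert intrinsics.shape == (3, 3), "cam.txt should hold a 3x3 intrinsic matrix"
    if need_pseudo_depth:  # required for SC-DepthV3
        assert len(list((scene / "leres_depth").iterdir())) == len(images)
    return len(images)

dataset_dir = Path("Dataset/Training")  # hypothetical path
for name in (dataset_dir / "train.txt").read_text().split():
    n = check_training_scene(dataset_dir / name, need_pseudo_depth=True)
    print(f"{name}: {n} frames")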

We provide pre-processed datasets:

[kitti, nyu, ddad, bonn, tum]

Training

We provide a bash script ("scripts/run_train.sh"), which shows how to train on the kitti, nyu, and other datasets. Generally, you need to edit the config file (e.g., "configs/v1/kitti.txt") according to your devices and run

python train.py --config $CONFIG --dataset_dir $DATASET

Then you can start a tensorboard session in this folder by running

tensorboard --logdir=ckpts/

By opening http://localhost:6006 in your browser, you can watch the training progress.

Train on Your Own Data

You need to re-organize your own video datasets into the above-mentioned format for training. Then, you may face three problems: (1) no ground-truth depth for validation; (2) it is hard to choose an appropriate frame rate (FPS) for subsampling videos; (3) no pseudo-depth for training V3.

No GT depth for validation

Add "--val_mode photo" to the training script or the config file, which uses the photometric loss for validation.

python train.py --config $CONFIG --dataset_dir $DATASET --val_mode photo

Subsample video frames (to have sufficient motion) for training

We provide a script ("generate_valid_frame_index.py") that computes and saves a "frame_index.txt" in each training scene. It uses an OpenCV-based optical flow method to compute the camera shift between consecutive frames. You might need to change its parameters so that sufficient keypoints are detected in your images (usually you do not need to). Once your dataset is prepared in the above-mentioned format, you can call the script by running

python generate_valid_frame_index.py --dataset_dir $DATASET
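
For intuition, the frame-filtering idea looks roughly like the sketch below. This is a simplified illustration using dense Farneback flow and a hypothetical min_shift threshold, not the actual script, which relies on keypoint detection with its own parameters:

import cv2
import numpy as np

def mean_flow_magnitude(gray_a, gray_b):
    """Approximate camera shift between two grayscale frames via dense optical flow."""
    flow = cv2.calcOpticalFlowFarneback(gray_a, gray_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    return np.linalg.norm(flow, axis=2).mean()

def select_frames(gray_frames, min_shift=5.0):  # min_shift is a hypothetical threshold
    """Keep a frame only when it has moved enough w.r.t. the last kept one."""
    kept = [0]
    for i in range(1, len(gray_frames)):
        if mean_flow_magnitude(gray_frames[kept[-1]], gray_frames[i]) > min_shift:
            kept.append(i)
    return kept  # indices that would go into frame_index.txt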

Then, you can add "--use_frame_index" to the training script or the config file to train models on the filtered frames.

python train.py --config $CONFIG --dataset_dir $DATASET --use_frame_index

Generating Pseudo-depth for training V3

We use LeReS to generate pseudo-depth in this project. You need to install it and generate pseudo-depth for your own images (the pseudo-depth for the standard datasets is provided above). More specifically, you can refer to the code in this line for saving the pseudo-depth.

Besides, it is also possible to use other state-of-the-art monocular depth estimation models to generate pseudo-depth, such as DPT.
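
As an illustration, the saving step referred to above boils down to normalizing the predicted depth and writing it as a 16-bit PNG scaled to [0, 60000]. The sketch below follows that convention; the file naming for the leres_depth folder is an assumption and should match what the dataloader expects:

import os
import cv2
import numpy as np

def save_pseudo_depth(pred_depth, image_path, out_dir):
    """Save a relative (pseudo) depth map as a 16-bit PNG (scaled to [0, 60000])."""
    os.makedirs(out_dir, exist_ok=True)
    name = os.path.splitext(os.path.basename(image_path))[0] + ".png"
    depth_u16 = (pred_depth / pred_depth.max() * 60000).astype(np.uint16)
    cv2.imwrite(os.path.join(out_dir, name), depth_u16)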

Pretrained models

[Models]

You need to uncompress it and put it into the "ckpts" folder. Then you can run "scripts/run_test.sh" or "scripts/run_inference.sh" with the pretrained model.

For v1, we provide models trained on KITTI and DDAD.

For v2, we provide models trained on NYUv2.

For v3, we provide models trained on KITTI, NYUv2, DDAD, BONN, and TUM.

Testing (Evaluation on Full Images)

We provide a script ("scripts/run_test.sh") that shows how to test on the kitti, nyu, and ddad datasets. The script only evaluates depth accuracy on full images. See the "Evaluation on dynamic/static regions" section below for evaluating depth on dynamic and static regions separately.

python test.py --config $CONFIG --dataset_dir $DATASET --ckpt_path $CKPT

Demo

A simple demo is given here. Put your images in the "demo/input/" folder and run

python inference.py --config configs/v3/nyu.txt \
--input_dir demo/input/ \
--output_dir demo/output/ \
--ckpt_path ckpts/nyu_scv3/epoch=93-val_loss=0.1384.ckpt \
--save-vis --save-depth

You will see the results saved in the "demo/output/" folder.
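
If you want to inspect the saved depth maps programmatically, a rough sketch is shown below; it assumes "--save-depth" writes per-image .npy arrays, so check inference.py for the actual output format:

import glob
import numpy as np
import matplotlib.pyplot as plt

for path in sorted(glob.glob("demo/output/**/*.npy", recursive=True)):
    depth = np.load(path)              # H x W array of relative depth
    plt.imshow(depth, cmap="plasma")
    plt.colorbar(label="relative depth")
    plt.title(path)
    plt.show()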

Evaluation on dynamic/static regions

First run "scripts/run_inference.sh" to save the predicted depth, and then use "scripts/run_evaluation.sh" to do the evaluation. A demo on the DDAD dataset is provided in these files. Generally, you need to do the following.

Inference

python inference.py --config $YOUR_CONFIG \
--input_dir $TESTING_IMAGE_FOLDER \
--output_dir $RESULTS_FOLDER \
--ckpt_path $YOUR_CKPT \
--save-vis --save-depth

Evaluation

python eval_depth.py \
--dataset $DATASET_FOLDER \
--pred_depth=$RESULTS_FOLDER \
--gt_depth=$GT_FOLDER \
--seg_mask=$SEG_MASK_FOLDER
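
For reference, the standard depth metrics reported by the evaluation look roughly like the sketch below. This is illustrative only: the depth range and the median-scaling step are dataset-dependent, and the repo's compute_errors is the authoritative implementation.

import numpy as np

def depth_metrics(gt, pred, min_depth=1e-3, max_depth=10.0):
    """Standard monocular depth metrics for one image (illustrative sketch).

    min_depth/max_depth are dataset-dependent (indoor datasets often cap at
    10 m, KITTI at 80 m); adjust them to match the evaluation protocol.
    """
    valid = (gt > min_depth) & (gt < max_depth)
    gt, pred = gt[valid], pred[valid]
    pred = pred * np.median(gt) / np.median(pred)   # median scaling
    pred = np.clip(pred, min_depth, max_depth)

    thresh = np.maximum(gt / pred, pred / gt)
    a1, a2, a3 = [(thresh < 1.25 ** i).mean() for i in (1, 2, 3)]
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return dict(abs_rel=abs_rel, sq_rel=sq_rel, rmse=rmse,
                rmse_log=rmse_log, a1=a1, a2=a2, a3=a3)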

References

SC-DepthV1:

Unsupervised Scale-consistent Depth Learning from Video (IJCV 2021)
Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid [paper]

@article{bian2021ijcv, 
  title={Unsupervised Scale-consistent Depth Learning from Video}, 
  author={Bian, Jia-Wang and Zhan, Huangying and Wang, Naiyan and Li, Zhichao and Zhang, Le and Shen, Chunhua and Cheng, Ming-Ming and Reid, Ian}, 
  journal= {International Journal of Computer Vision (IJCV)}, 
  year={2021} 
}

which is an extension of the previous conference version:

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video (NeurIPS 2019)
Jia-Wang Bian, Zhichao Li, Naiyan Wang, Huangying Zhan, Chunhua Shen, Ming-Ming Cheng, Ian Reid [paper]

@inproceedings{bian2019neurips,
  title={Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video},
  author={Bian, Jiawang and Li, Zhichao and Wang, Naiyan and Zhan, Huangying and Shen, Chunhua and Cheng, Ming-Ming and Reid, Ian},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year={2019}
}

SC-DepthV2:

Auto-Rectify Network for Unsupervised Indoor Depth Estimation (TPAMI 2022)
Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Tat-Jun Chin, Chunhua Shen, Ian Reid [paper]

@article{bian2021tpami, 
  title={Auto-Rectify Network for Unsupervised Indoor Depth Estimation}, 
  author={Bian, Jia-Wang and Zhan, Huangying and Wang, Naiyan and Chin, Tat-Jun and Shen, Chunhua and Reid, Ian}, 
  journal= {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, 
  year={2021} 
}

SC-DepthV3:

SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes (TPAMI 2023)
Libo Sun*, Jia-Wang Bian*, Huangying Zhan, Wei Yin, Ian Reid, Chunhua Shen [paper]
* denotes equal contribution and joint first author

@article{sc_depthv3, 
  title={SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes}, 
  author={Sun, Libo and Bian, Jia-Wang and Zhan, Huangying and Yin, Wei and Reid, Ian and Shen, Chunhua}, 
  journal= {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)}, 
  year={2023} 
}


sc_depth_pl's Issues

loss=nan

Hi, JiawangBian,

I constructed my own dataset; since none of the data has GT, I only generated pseudo-depth images. When I train the model on it, I get loss=nan in the first epoch, and it stays the same after reducing the learning rate. I would appreciate your help with this question!

why? self.alpha = 10 self.beta = 0.01

class DepthDecoder(nn.Module):
    def __init__(self, num_ch_enc, scales=range(4), num_output_channels=1, use_skips=True):
        super(DepthDecoder, self).__init__()

        self.alpha = 10
        self.beta = 0.01

why?

About multi-gpu training

It seems multi-GPU training has some errors, while single-GPU training works well.


My environment:
torch 1.13.1
pytorch-lightning 1.7.3
python 3.8.16
cuda 11.7.1
On Nvidia GTX 3090


When I train SC-DepthV3 on the NYU dataset, it works well on a single GPU (batch_size=32) but fails on multiple GPUs (even with batch_size=16) due to an OOM error.

Here is the training log:

 (sc_depth_env) ➜  sc_depth_pl git:(master) ✗ sh scripts/run_train.sh 
/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/torchvision/models/_utils.py:252: UserWarning: Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and may be removed in the future. Please access them via the appropriate Weights Enum instead.
  warnings.warn(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
----------------------------------------------------------------------------------------------------

26295 samples found for training
1646 samples found for validation
26295 samples found for training
1646 samples found for validation
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [4,5]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [4,5]

  | Name      | Type     | Params
---------------------------------------
0 | depth_net | DepthNet | 14.8 M
1 | pose_net  | PoseNet  | 13.0 M
---------------------------------------
27.9 M    Trainable params
0         Non-trainable params
27.9 M    Total params
111.417   Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:209: UserWarning: num_workers>0, persistent_workers=False, and strategy=ddp_spawn may result in data loading bottlenecks. Consider setting persistent_workers=True (this is a limitation of Python .spawn() and PyTorch)
  rank_zero_warn(
Epoch 0:   0%|                                                                                                                                                                                                                              | 0/912 [00:00<?, ?it/s]Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ssd/***/sc_depth_pl/train.py", line 7, in <module>
    from SC_Depth import SC_Depth
  File "/ssd/***/sc_depth_pl/SC_Depth.py", line 5, in <module>
    import losses.loss_functions as LossF
  File "/ssd/***/sc_depth_pl/losses/loss_functions.py", line 53, in <module>
    normal_ranking_loss = EdgeguidedNormalRankingLoss().to(device)
  File "/ssd/***/sc_depth_pl/losses/normal_ranking_loss.py", line 156, in __init__
    self.kernel = torch.tensor(
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/ssd/***/sc_depth_pl/train.py", line 7, in <module>
    from SC_Depth import SC_Depth
  File "/ssd/***/sc_depth_pl/SC_Depth.py", line 5, in <module>
    import losses.loss_functions as LossF
  File "/ssd/***/sc_depth_pl/losses/loss_functions.py", line 53, in <module>
    normal_ranking_loss = EdgeguidedNormalRankingLoss().to(device)
  File "/ssd/***/sc_depth_pl/losses/normal_ranking_loss.py", line 156, in __init__
    self.kernel = torch.tensor(
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
^C/ssd/***/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:653: UserWarning: Detected KeyboardInterrupt, attempting graceful shutdown...
  rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown...")
^C

Any solutions? Is it related to DDP?
Thank you very much for your help!

ImportError: dlopen: cannot load any more object with static TLS

I set up the env following the guidance, but when I run sh scripts/run_inference.sh I get the following error:

(sc_depth_env) [root@sc_depth_pl]# sh scripts/run_inference.sh 
Traceback (most recent call last):
  File "inference.py", line 10, in <module>
    from SC_Depth import SC_Depth
  File ".../Models/sc_depth_pl/SC_Depth.py", line 8, in <module>
    from visualization import *
  File ".../Models/sc_depth_pl/visualization.py", line 1, in <module>
    import cv2
  File "/data/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/data/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/data/anaconda3/envs/sc_depth_env/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: dlopen: cannot load any more object with static TLS

kitti dataset

I downloaded the kitti dataset you provided; do you know its timestamps? I want to test it with ORB-SLAM3. I downloaded the odometry dataset (color, 65 GB) but cannot find the correspondence; the folders inside are named by sequence number, not by date.

training loss

Hi, I trained the SC-DepthV2 model on the TUM dataset, and I found that the loss does not converge to a low value even after training for 70 epochs.
Are there any tricks for training? What should I pay attention to during training?

How to train on my own datasets without ground truth?

Could you please tell me how to train on my own datasets without ground truth?

"python train.py --config my_config --dataset_dir my_dataset"

It tells me to provide "val.txt" and "my_dataset/depth". Isn't the depth optional for validation?

Training error

Thanks for your great work!
When I train on Colab like this, there is a problem:

!python train.py --config configs/v3/ddad.txt --dataset_dir ddad --val_mode photo

/usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:253: UserWarning: Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and will be removed in 0.15. Please access them via the appropriate Weights Enum instead.
"Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and will "
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
22896 samples found for training
22896 samples found for validation
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | depth_net | DepthNet | 14.8 M
1 | pose_net | PoseNet | 13.0 M

27.9 M Trainable params
0 Non-trainable params
27.9 M Total params
111.417 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:566: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1163, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/usr/lib/python3.7/queue.py", line 179, in get
self.not_empty.wait(remaining)
File "/usr/lib/python3.7/threading.py", line 300, in wait
gotit = waiter.acquire(True, timeout)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 782) is killed by signal: Killed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "train.py", line 60, in
trainer.fit(system, dm)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 697, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1274, in _run_train
self._run_sanity_check()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_sanity_check
val_loop.run()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 127, in advance
batch = next(data_fetcher)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/fetching.py", line 184, in next
return self.fetching_function()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/fetching.py", line 263, in fetching_function
self._fetch_next_batch(self.dataloader_iter)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/fetching.py", line 277, in _fetch_next_batch
batch = next(iterator)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 681, in next
data = self._next_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1359, in _next_data
idx, data = self._get_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1315, in _get_data
success, data = self._try_get_data()
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1176, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 782) exited unexpectedly

Since there is only one GPU, I set num_workers=1 (0 gives the same error), and then it comes to:

/usr/local/lib/python3.7/dist-packages/torchvision/models/_utils.py:253: UserWarning: Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and will be removed in 0.15. Please access them via the appropriate Weights Enum instead.
"Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and will "
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
22896 samples found for training
22896 samples found for validation
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | depth_net | DepthNet | 14.8 M
1 | pose_net | PoseNet | 13.0 M

27.9 M Trainable params
0 Non-trainable params
27.9 M Total params
111.417 Total estimated model params size (MB)
Sanity Checking DataLoader 0: 0% 0/5 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 60, in
trainer.fit(system, dm)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 697, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1274, in _run_train
self._run_sanity_check()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_sanity_check
val_loop.run()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 143, in advance
output = self._evaluation_step(**kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 240, in _evaluation_step
output = self.trainer._call_strategy_hook(hook_name, *kwargs.values())
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 370, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/content/sc_depth_pl/SC_DepthV3.py", line 90, in validation_step
tgt_depth = self.depth_net(tgt_img)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/content/sc_depth_pl/models/DepthNet.py", line 133, in forward
features = self.encoder(x)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/content/sc_depth_pl/models/resnet_encoder.py", line 100, in forward
x = self.encoder.conv1(x)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 457, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 454, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[4, 320, 256, 320] to have 3 channels, but got 320 channels instead

What should be changed? Thank you very much for your help!

AttributeError: module 'torch' has no attribute 'profiler'

Hi!

Thanks for sharing this great work!

When I run the code, it occurs:

Traceback (most recent call last):
File "../train.py", line 1, in
from pytorch_lightning import Trainer
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/init.py", line 34, in
from pytorch_lightning.callbacks import Callback # noqa: E402
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/callbacks/init.py", line 14, in
from pytorch_lightning.callbacks.callback import Callback
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/callbacks/callback.py", line 25, in
from pytorch_lightning.utilities.types import STEP_OUTPUT
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/utilities/init.py", line 18, in
from pytorch_lightning.utilities.apply_func import move_data_to_device # noqa: F401
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 29, in
from pytorch_lightning.utilities.imports import _compare_version, _TORCHTEXT_LEGACY
File "/home/hzc/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/utilities/imports.py", line 146, in
_KINETO_AVAILABLE = torch.profiler.kineto_available()
AttributeError: module 'torch' has no attribute 'profiler'

I install the env by requirements.txt. How can I solve this?

error when I train my own data

Hi, @JiawangBian

When I train my own data with the V3 model, I get the issue below; can you please guide me?

(sc_depth_env) [admin@localhost sc_depth_pl]$ python train.py --dataset_dir zylds --config configs/v3/th.txt --val_mode photo
/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/torchvision/models/_utils.py:252: UserWarning: Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and may be removed in the future. Please access them via the appropriate Weights Enum instead.
warnings.warn(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
367 samples found for training
367 samples found for validation
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | depth_net | DepthNet | 14.8 M
1 | pose_net | PoseNet | 13.0 M

27.9 M Trainable params
0 Non-trainable params
27.9 M Total params
111.417 Total estimated model params size (MB)
Sanity Checking: 0it [00:00, ?it/s]/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:225: PossibleUserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument(try 72 which is the number of cpus on this machine) in theDataLoader` init to improve performance.
rank_zero_warn(
Sanity Checking DataLoader 0: 0%| | 0/5 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 60, in
trainer.fit(system, dm)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1274, in _run_train
self._run_sanity_check()
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_sanity_check
val_loop.run()
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 143, in advance
output = self._evaluation_step(**kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 240, in _evaluation_step
output = self.trainer._call_strategy_hook(hook_name, *kwargs.values())
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/admin/anaconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 370, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/home/admin/sc_depth_pl/SC_DepthV3.py", line 88, in validation_step
tgt_img, ref_imgs, intrinsics = batch
ValueError: too many values to unpack (expected 3)

training time

Hi,

We have tried to train the models by following "scripts/run_train.sh" on a single 2080Ti, which is slightly slower than a V100.

Before training, we first changed the value of "limit_train_batches" (train.py line 41) to 1.0 so that all the images are used in one epoch. As a result, the number of batches in each epoch is 3493. The processed dataset we used was downloaded through your link.
Another change is "num_epochs" in "scripts/run_train.sh": we use 50 epochs, the same as described in your article.

Then, we followed "scripts/run_train.sh" to train on nyuv2. However, the total time is about 15 hours and each epoch takes about 8 minutes. This is much less than the description in the paper, which is about 44 hours (Section 6.3, Timing).

We are troubled by this problem and would appreciate your help in pointing out possible errors in our setup.

Thanks

Training error with index_anchors variable

Hi,
I am trying to train the network on my own dataset. I don't have ground-truth depth, but following the instructions I've generated pseudo-depth with LeReS. I am training with --val_mode photo.
However, during training the following error arises (I've pasted only the most representative part of the stack trace):

[...]
File "/home/gabriele/development/sc_depth_pl/losses/normal_ranking_loss.py", line 286, in forward
    ) = edgeGuidedSampling(
  File "/home/gabriele/development/sc_depth_pl/losses/normal_ranking_loss.py", line 73, in edgeGuidedSampling
    index_anchors = torch.randint(
RuntimeError: random_ expects 'from' to be less than 'to', but got from=0 >= to=0

This happens either with or without the --use_frame_index flag.
What am I doing wrong?
Thank you!

what's the measurement unit of depth ?

Hi, @JiawangBian

I tried my own data with V3 and found the output depths were like [20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44]. I actually took the pictures at a distance of around 10 cm, so I wonder what the measurement unit of the depth in the output depth file is. It seems to be neither mm nor cm.

Thank you so much!

quat2mat

rotMat = torch.stack([w2 + x2 - y2 - z2, 2*xy - 2*wz, 2*wy + 2*xz,

If we use the quaternion coefficients as shown in image 1, then according to the Rodrigues formula we can derive the rotation matrix shown in image 2. I find there are differences between the formula and the code. Could you tell me the details of the changes you have made?
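
For reference, the standard rotation matrix of a unit quaternion q = (w, x, y, z) is:

R(q) = | w^2 + x^2 - y^2 - z^2    2(xy - wz)               2(xz + wy)             |
       | 2(xy + wz)               w^2 - x^2 + y^2 - z^2    2(yz - wx)             |
       | 2(xz - wy)               2(yz + wx)               w^2 - x^2 - y^2 + z^2  |

The quoted first row matches this form; apparent differences from a Rodrigues-based derivation usually reduce to normalization by the squared quaternion norm or to a different component-ordering convention.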

No module named 'imageio.v2'

After installing the environment and running bash scripts/run_train.sh, I get the following error on "from imageio.v2 import imread":
ModuleNotFoundError: No module named 'imageio.v2'. What can I do about this?

absolute value of depth for a specific object in high resolution images and without any GT data

Hi,

The work is really appreciated!! Good job.

I have the following basic queries:

i. I am trying to generate depth images for a custom dataset using the given pre-trained models. The images have 4K resolution, while the models were trained at small resolutions (e.g., 320x256 for the bonn dataset). For a single 4K image, the corresponding depth image is 320x256. To get depth values for the original 4K images, should I read the depth from the small depth image and simply map it back onto the original image, or is there another way around it?

ii. Since I have a custom dataset without any Ground Truth (GT) data, after predicting the depth images, how can I calculate the absolute value of depth (in meters) for a particular object in an image? I have the maximum value of depth of the room where the camera is installed.

pseudo_depth in the data_modules.py

Dear Dr.Bian,
I train on my dataset with --val_mode photo, but I find with_pseudo_depth=False in data_modules.py. If it is False, will the pseudo-depth generated by LeReS still take effect?
Thank you very much for your patient reply!

elif self.hparams.hparams.val_mode == 'photo':
            self.val_dataset = TrainFolder(
                self.hparams.hparams.dataset_dir,
                train=False,
                transform=self.valid_transform,
                sequence_length=self.hparams.hparams.sequence_length,
                skip_frames=self.hparams.hparams.skip_frames,
                use_frame_index=self.hparams.hparams.use_frame_index,
                with_pseudo_depth=False
            )

How to visualize masks

Hi, I looked through a lot of the code, but none of the files visualize the masks. I saw you answered this question and pointed to the relevant code in SC-SfMLearner, but it was difficult for me. I sincerely hope for your help.

Read GT depth

Thank you for your great work!
When I try to load the GT depth, I have a question: what is the meaning of "/5000" and "/1000" in the following code? Where did they come from? Also, when using LeReS to generate pseudo-depth, what is the meaning of "*60000"?
@JiawangBian Thank you very much!

# load gt depth
if args.dataset in ['nyu']:
    gt_depths[i] = imread(gt_depths[i]).astype(np.float32) / 5000
elif args.dataset in ['scannet', 'bonn', 'tum']:
    gt_depths[i] = imread(gt_depths[i]).astype(np.float32) / 1000
elif args.dataset == 'kitti':
    gt_depths[i] = np.load(gt_depths[i])

cv2.imwrite(os.path.join(image_dir_out, img_name[:-4]+'-depth_raw.png'), (pred_depth_ori/pred_depth_ori.max() * 60000).astype(np.uint16))

about point cloud

What method did you use to obtain the fused point cloud? Can you provide a script? Thank you.

About Subsample video frames

Hi, @JiawangBian

I generate my own data with a Blender script; the basic idea is to take pictures while the camera moves around the object, so I get ~100 pictures in each scene (taking one picture whenever the camera moves a certain number of degrees). Before training, I use the command below to generate valid frames:

python generate_valid_frame_index.py --dataset_dir $DATASET

However, I only get 1 picture in each scene. Is something wrong with how I take the pictures?

The pictures look like the example rendering attached to the issue.

Get Absolute distances value from the Kitti GT Depth map

@JiawangBian Thanks for the wonderful work !!

I wanted to get the absolute distances of objects from the provided kitti GT depth maps. I have downloaded the kitti raw dataset provided in the repo.

To load the kitti GT depth map, I used the following code:

import glob
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy.sparse import csr_matrix

images = sorted(glob.glob("kitti/training/2011_09_26_drive_0001_sync_02/*.jpg"))
depth_maps = sorted(glob.glob("kitti/training/2011_09_26_drive_0001_sync_02/depth/*.npz"))

print("There are",len(images),"images with ",len(depth_maps)," depth maps")

index = 32

#load the gt depth map 
gt_depth_map = np.load(depth_maps[index])
gt_depth_map = csr_matrix((gt_depth_map['data'],gt_depth_map['indices'],gt_depth_map['indptr']),shape=gt_depth_map['shape'])
gt_depth_map = gt_depth_map.toarray()

f, (ax1, ax2) = plt.subplots(1, 2, figsize=(20,10))
ax1.imshow(cv2.cvtColor(cv2.imread(images[index]), cv2.COLOR_BGR2RGB))
ax1.set_title('Image', fontsize=30)
ax2.imshow(gt_depth_map)
ax2.set_title('Depth Map', fontsize=30)

The resulting output is shown in the attached screenshot (image and GT depth map side by side).

Then, to get the bounding boxes, I used a YOLOv4 model:

yolo = YOLOv4()
yolo.classes = "Yolov4/coco.names"
yolo.make_model()
yolo.load_weights("Yolov4/yolov4.weights", weights_type="yolo")

def run_obstacle_detection(img):
    start_time=time.time()
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    resized_image = yolo.resize_image(img)
    # 0 ~ 255 to 0.0 ~ 1.0
    resized_image = resized_image / 255.
    #input_data == Dim(1, input_size, input_size, channels)
    input_data = resized_image[np.newaxis, ...].astype(np.float32)

    candidates = yolo.model.predict(input_data)

    _candidates = []
    for candidate in candidates:
        batch_size = candidate.shape[0]
        grid_size = candidate.shape[1]
        _candidates.append(tf.reshape(candidate, shape=(1, grid_size * grid_size * 3, -1)))
        # candidates == Dim(batch, candidates, (bbox))
        candidates = np.concatenate(_candidates, axis=1)

        # pred_bboxes == Dim(candidates, (x, y, w, h, class_id, prob))
        pred_bboxes = yolo.candidates_to_pred_bboxes(candidates[0], iou_threshold=0.35, score_threshold=0.40)
        pred_bboxes = pred_bboxes[~(pred_bboxes==0).all(1)] #https://stackoverflow.com/questions/35673095/python-how-to-eliminate-all-the-zero-rows-from-a-matrix-in-numpy?lq=1
        pred_bboxes = yolo.fit_pred_bboxes_to_original(pred_bboxes, img.shape)
        exec_time = time.time() - start_time
        print("time: {:.2f} ms".format(exec_time * 1000))
        result = yolo.draw_bboxes(img, pred_bboxes)
    return result, pred_bboxes

img = cv2.imread(images[index])
result, pred_bboxes = run_obstacle_detection(img)
plt.figure(figsize = (14, 10))
plt.imshow(result)
plt.show() 


I overlaid the bboxes on the depth map and took the depth value from the center point, as shown:

def find_distances(depth_map, pred_bboxes, img, method="center"):
    depth_list = []
    h, w, _ = img.shape
    print("shape :",img.shape)
    #h, w, _  = 256,256,3
    for box in pred_bboxes:
        x1 = int(box[0]*w - box[2]*w*0.5) # center_x - width /2
        y1 = int(box[1]*h-box[3]*h*0.5) # center_y - height /2
        x2 = int(box[0]*w + box[2]*w*0.5) # center_x + width/2
        y2 = int(box[1]*h+box[3]*h*0.5) # center_y + height/2
        obstacle_depth = depth_map[y1:y2, x1:x2]
        if method=="closest":
            depth_list.append(obstacle_depth.min()) # take the closest point in the box
        elif method=="average":
            depth_list.append(np.mean(obstacle_depth)) # take the average
        elif method=="median":
            depth_list.append(np.median(obstacle_depth)) # take the median
        else:
            depth_list.append(depth_map[int(box[1]*h)][int(box[0]*w)]) # take the center
     
    return depth_list

def add_depth(depth_list, result, pred_bboxes):
    h, w, _ = result.shape
    res = result.copy()
    for i, distance in enumerate(depth_list):
        cv2.putText(res, '{0:.2f} m'.format(distance), (int(pred_bboxes[i][0]*w - pred_bboxes[i][2]*w*0.2),int(pred_bboxes[i][1]*h)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1, cv2.LINE_AA)    
    return res

depth_list = find_distances(gt_depth_map, pred_bboxes, img, method="center")
print(depth_list)
res = add_depth(depth_list, result, pred_bboxes)

plt.figure(figsize = (14,10))
plt.imshow(res)

The resulting output (attached screenshot) shows distances of 0 m.

Do we have to do any other pre-processing before using the kitti GT depth maps?

3D reconstruction visualization code

Hi, your work looks great and I hope to use it for 3D reconstruction. Since I am new here, I am wondering whether you can share the 3D reconstruction visualization code or software shown on YouTube. Thank you so much!

About static camera.

Hi, thank you for your great work. Recently I have been trying to apply SC-DepthV3 to my own dataset. I note that before training, it is required to subsample video frames (to have sufficient motion), which will filter out static-camera frames. However, my videos are actually taken with a static camera, similar to a surveillance camera. I am wondering what will happen if I train on such a dataset? Will the program crash?

How can I test pose ?

Hi dear author, I want to know how to test pose accuracy, since I did not find ground-truth poses in the dataset that you provided.

Is it possible to get indoor depth without gt or retraining?

I understand that I need a scaling factor to scale the model's predictions, but I don't have a GT depth map (I only want to do inference), so I can't get a median of the GT.
I can however get the maximum depth inside a room, can I somehow use it to get the absolute distance in the frame?
Let's say the wall of the room is 5 meters away from the camera, other objects (or people) are closer, but I don't know how much closer... will doing something like
depth = (depth / depth.max()) * 5 # 5 meters....
work?

Test with median scaling?

Hi, I'm interested in your impressive work!

If I understood correctly, this algorithm produces scale-consistent depth and pose estimation, so as mentioned in the paper, one does not need to do median scaling per frame (instead, only a single scaling factor would be enough).

However, in the code I found that you still use per-frame median scaling in test.py (see compute_errors; sorry, I'm using my phone now so it's hard to paste the link).

That is different from what you mentioned in the paper:

Note that the comparison is highly disadvantageous to the proposed method: i)we align per-frame scale to the ground truth scale for [7] due to its scale-inconsistency, while we only align one global scale for our method.
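
For clarity, the two alignment strategies being compared look roughly like this (a sketch, not the repo's code; see compute_errors in the repo for what test.py actually does):

import numpy as np

def per_frame_median_scaling(preds, gts):
    """Align each predicted depth map to its GT independently (per-frame scale)."""
    return [p * np.median(g) / np.median(p) for p, g in zip(preds, gts)]

def single_global_scaling(preds, gts):
    """Align all predictions with one shared scale factor (global scale)."""
    scale = np.median(np.concatenate([g.ravel() for g in gts])) / \
            np.median(np.concatenate([p.ravel() for p in preds]))
    return [p * scale for p in preds]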

error

Why do I get these errors?

inference.py: error: Unable to open config file: $CONFIG. Error: No such file or directory
test.py: error: Unable to open config file: $CONFIG. Error: No such file or directory
inference.py: error: Unable to open config file: configs. Error: Permission denied

photo_and_geometry_loss missing 1 required positional argument when val_mode = photo

I am training with my own dataset. I set the val_mode = photo because I do not have any ground truth images. I get the following error:

File "sc_depth_pl/SC_Depth.py", line 158, in validation_step
intrinsics, poses, poses_inv)
TypeError: photo_and_geometry_loss() missing 1 required positional argument: 'hparams'

It seems that the error occurs because this function takes 8 inputs by default but only 7 are given here. Is this intentional?

abs_diff

What does abs_diff mean? I found this graph in training.

full code

Hi! I want to use your system to obtain the global 6-DoF camera trajectory and the dense 3D map from monocular RGB video. Could you please share the code for combining your system with ORB-SLAM2? Thanks!

Error training V3 with my own dataset

Thank you for your excellent work.
I built my dataset in your format, with GT, and also added pseudo-depth; all image sizes are 256x256.
My dataset consists of endoscope images, with 500 pictures in the validation set and 2000 pictures in the training set.
An example picture (image_0001) is attached to the issue.
I enter the command: CUDA_VISIBLE_DEVICES=0 python train.py --config /home/ubuntu/wl/sc_depth_pl-master/configs/v3/nyu.txt --dataset_dir /home/ubuntu/data0/dataset_endo_colon
and get this error:
/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/torchvision/models/_utils.py:252: UserWarning: Accessing the model URLs via the internal dictionary of the module is deprecated since 0.13 and will be removed in 0.15. Please access them via the appropriate Weights Enum instead.
warnings.warn(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Trainer(limit_val_batches=1.0) was configured so 100% of the batches will be used..
1998 samples found for training
500 samples found for validation
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

| Name | Type | Params

0 | depth_net | DepthNet | 14.8 M
1 | pose_net | PoseNet | 13.0 M

27.9 M Trainable params
0 Non-trainable params
27.9 M Total params
111.417 Total estimated model params size (MB)
Sanity Checking DataLoader 0: 0%| | 0/5 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 60, in
trainer.fit(system, dm)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
self._call_and_handle_interrupt(
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1166, in _run
results = self._run_stage()
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1252, in _run_stage
return self._run_train()
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1274, in _run_train
self._run_sanity_check()
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1343, in _run_sanity_check
val_loop.run()
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 155, in advance
dl_outputs = self.epoch_loop.run(self._data_fetcher, dl_max_batches, kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 200, in run
self.advance(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 143, in advance
output = self._evaluation_step(**kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 240, in _evaluation_step
output = self.trainer._call_strategy_hook(hook_name, *kwargs.values())
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1704, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 370, in validation_step
return self.model.validation_step(*args, **kwargs)
File "/home/ubuntu/wl/sc_depth_pl-master/SC_DepthV3.py", line 82, in validation_step
errs = LossF.compute_errors(gt_depth, tgt_depth, self.hparams.hparams.dataset_name)
File "/home/ubuntu/miniconda3/envs/sc_depth_env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/wl/sc_depth_pl-master/losses/loss_functions.py", line 193, in compute_errors
batch_size, h, w = gt.size()
ValueError: too many values to unpack (expected 3)

I want to ask you a few questions:
Are 2000 training images and 500 validation images too few?
Is SC-DepthV3 suitable for endoscopic settings (like inside the large intestine)?
In config, I can only choose bonn, ddad, kitti_raw, nyu, tum, etc., but which one should I choose when I use my own dataset? My images are 256x256; should I resize them to match one of the five options above? I do not know the exact image sizes of those five datasets. I guess the above error is caused by a mismatch between my dataset and the nyu config. I am looking forward to your reply.

BONN and TUM depth

Hi,
What is the scale coefficient of the ground-truth depth images in the BONN and TUM datasets you provided? I only found out how to deal with nyu, kitti, and ddad.

if self.dataset == 'nyu':
    depth = torch.from_numpy(
        imread(self.depth[index]).astype(np.float32)).float()/5000
elif self.dataset == 'kitti' or self.dataset == 'ddad':
    depth = torch.from_numpy(load_sparse_depth(
        self.depth[index]).astype(np.float32))

Looking forward to your reply. Thanks.

depthmap

Can I use the depth maps generated by sc_depth for dense mapping? I want to use the built map to navigate outdoors.

Own data training loss

Hello @JiawangBian

I'm training the model with my own data and found the issue below: all the losses (including val_loss) except geometry_loss look fine, as shown in the attached loss curves.

I wonder whether I should stop training at around 10K iterations; however, after that point all the other losses still keep dropping.

Would you please give some advice on why this happens?

Thanks so much.
