
packnet-sfm's People

Contributors

adriengaidon-tri, ivasiljevic, kuanleetri, spillai, vitorguizilini, vitorguizilini-tri

packnet-sfm's Issues

Issue with evaluation of the KITTI dataset (first model: ResNet18, Self-Supervised, 192x640, ImageNet → KITTI (K))

Traceback (most recent call last):
  File "scripts/eval.py", line 66, in <module>
    test(args.checkpoint, args.config, args.half)
  File "scripts/eval.py", line 52, in test
    model_wrapper.load_state_dict(state_dict)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ModelWrapper:

Size of produced checkpoint (.ckpt) files

I have a question about the size of checkpoint files. Why do they all have the same size (519.6 MB)? Even when I train on a tiny dataset, the produced checkpoint file has the same size as the ones you shared. Is there a specific reason?
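(For reference, a quick generic way to see what dominates a .ckpt file; this is plain PyTorch, not repo-specific code, and the path is only an example. The saved state_dict, and any optimizer state, depend only on the model architecture, not on how much data was used, which would explain why the file size never changes.)

import torch

ckpt = torch.load("PackNet01_MR_selfsup_K.ckpt", map_location="cpu")  # example path
for key, value in ckpt.items():
    if isinstance(value, dict):
        # rough count of tensor elements stored under this entry
        n_elems = sum(v.numel() for v in value.values() if torch.is_tensor(v))
        print(f"{key}: {n_elems / 1e6:.1f}M tensor elements")
    else:
        print(f"{key}: {type(value).__name__}")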

Training on NuScenes

Hello!

First and foremost I would like to congratulate you on your great work!

I have read in your paper that you have done some testing on the NuScenes dataset, and I was wondering whether, in the meantime, you have also done some training on it. This is something I am trying to do right now.

I read about the hierarchy of files and folders in the dataset.proto file, but I could not find anything about how those .json files are generated.

What have you used to generate them?

I am asking because the nuScenes dataset also has .json files, but they are structured very differently, and I don't know how to generate .json files with the same structure as yours from the nuScenes ones.

Thank you in advance for your response!

Issue with saving the output from infer.py

My issue is that although I'm running the program correctly and it printed "saving ... to ...", I cannot see the output.

I have no idea why I am getting this issue, but I assume it is related to a permissions problem.

I would appreciate it if someone could help me and give me an idea of how to fix it.
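(A generic sanity check, not repo code: verify that the output directory exists and is writable from inside the container. If you run through Docker, remember that anything written outside a mounted volume is only visible inside the container and disappears when it exits.)

import os

out_dir = "/workspace/packnet-sfm/output"  # hypothetical path; use whatever you passed to infer.py
print("exists:  ", os.path.isdir(out_dir))
print("writable:", os.access(out_dir, os.W_OK))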

How to configure this project in an IDE like PyCharm?

Hi,
This is really great work. I'm trying to use this network on my own data, and I'm able to run the code in a shell using the Makefile you provided. But I hope to run it in PyCharm so that debugging is easier. However, since the project is packaged in Docker and the commands are written in the Makefile, I'm not sure how to configure the project in PyCharm. I searched for many tutorials online but couldn't find a working one, maybe due to my limited understanding of Docker.
I'm wondering whether you used an IDE during the development of this project. Could you give some instructions on how to configure it in PyCharm?
Best,

Two-stream ego-motion

Hi there,

Thanks for this great work.

I am trying to apply the two-stream ego-motion network to map the internal part of the shoulder recorded during arthroscopy. So far, I have tried to use the pose and depth networks separately. The video frames from the internal part of the joints have a serious problem with lack of texture, so I pretrained the monodepth network (Godard 2017) on high-texture environments first and then trained it with the arthroscopic images, and I achieved some success in getting depth for the arthroscopic frames. However, with the pose network I have had little success so far. I trained the pose network in a supervised manner where the coordinates of each frame are known with respect to the center (absolute pose), and it only works well when the same joint is being used.

I was thinking of using the two-stream ego-motion network, which gives me the relative pose of the target image I_t with respect to the source image I_s. But I am guessing this may not work for my case, as in the arthroscopic videos there are occasions when the whole frame is suddenly obscured by tissue while the camera is moving. In that case, I guess there will be no similarity between I_t and I_s (which is the previous frame?). This could lead to a huge drift over time. But then, when I was reading your paper, I noticed that the similarity matching cost function (Eq. 3) tries to match I_t to the context images I_S (which I guess are all the available training images?). In that case, and for my application, the pose network should be able to estimate the pose of the first frame right after the obscuring tissue is removed. Or is it otherwise, and after training the pose network only estimates the pose of I_t with respect to I_s?

Regards,
Jacob

Training with ImageDataset?

Hi,
I'm trying to train this model with my private video data, but the program fails to save the checkpoint file.
I'm afraid that something is wrong with ImageDataset or with my config for the dataset.
Could you give me some advice?

The console output is as follows:

Epoch 0 | Avg.Loss 0.0722: 100%|██████████| 856/856 [09:35<00:00,  1.49 images/s]
session1-frame_{:05d}-: 100%|██████████| 746/746 [04:08<00:00, 3.00 images/s]
Traceback (most recent call last):
  File "scripts/train.py", line 62, in <module>
    train(args.file)
  File "scripts/train.py", line 57, in train
    trainer.fit(model_wrapper)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 61, in fit
    self.check_and_save(module, validation_output)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/base_trainer.py", line 42, in check_and_save
    self.checkpoint.check_and_save(module, output)
  File "/workspace/packnet-sfm/packnet_sfm/models/model_checkpoint.py", line 128, in check_and_save
    filepath = self.format_checkpoint_name(epoch, metrics)
  File "/workspace/packnet-sfm/packnet_sfm/models/model_checkpoint.py", line 117, in format_checkpoint_name
    filename = filename.format(**metrics)
ValueError: unexpected '{' in field name

My dataset config is as follows:

datasets:
    ...
    train:
        batch_size: 2
        dataset: ['Image']
        path: ['/workspace/packnet-sfm/data/my_video_001/session0']
        split: ["frame_{:05d}"]
        repeat: [1]

I have a number of images for training data and the path is like:
/workspace/packnet-sfm/data/my_video_001/session0/frame_00123.png

I looked into the code and found that ModelCheckpoint's filename had the value of:
epoch={epoch:02d}_session1-frame_0000{={session1-frame_0000{:01d}--loss:.3f}.
This was the direct cause of the error above.
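(For anyone hitting the same thing, here is a simplified plain-Python illustration of the failure, not the repo's code: once the split pattern's '{:05d}' leaks into the checkpoint filename template, str.format() finds a '{' inside a field name and raises exactly this ValueError.)

# Simplified stand-in for the generated ModelCheckpoint filename template:
template = "epoch={epoch:02d}_{session1-frame_{:05d}-loss:.3f}"
try:
    template.format(epoch=0)
except ValueError as err:
    print(err)  # unexpected '{' in field name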

Thanks in advance,

Evaluation of models

KITTI_raw-eigen_test_files-velodyne: 0.00 images [00:00, ? images/s]
Traceback (most recent call last):
  File "scripts/eval.py", line 66, in <module>
    test(args.checkpoint, args.config, args.half)
  File "scripts/eval.py", line 61, in test
    trainer.test(model_wrapper)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 128, in test
    self.evaluate(test_dataloaders, module)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 152, in evaluate
    return module.test_epoch_end(all_outputs)
  File "/workspace/packnet-sfm/packnet_sfm/models/model_wrapper.py", line 262, in test_epoch_end
    output_data_batch, self.test_dataset, self.metrics_name)
  File "/workspace/packnet-sfm/packnet_sfm/utils/reduce.py", line 53, in all_reduce_metrics
    names = [key for key in list(output_data_batch[0][0].keys()) if key.startswith(name)]
IndexError: list index out of range
Makefile:77: recipe for target 'docker-run' failed

I got this error when I tried to evaluate the model.
Also, I did not use the whole KITTI dataset; I only used part of it.
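(A plain-Python illustration of the failure, not repo code: the progress bar shows 0.00 images, so the eigen test split apparently matched no files from my partial data; the per-dataloader output list is then empty and indexing it fails.)

output_data_batch = [[]]  # one test dataloader, but zero evaluated batches collected
output_data_batch[0][0]   # IndexError: list index out of range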

Fine-tuning performs poorly on my own dataset

Hi, thanks for the great work! I am trying to fine-tune 'PackNet01_MR_selfsup_K.ckpt' on my own dataset, but the performance is not good. Here are some examples:

If I use 'PackNet01_MR_selfsup_K.ckpt' without fine-tuning, the result looks like this:
[image]

After fine-tuning this model, I get this result:
[image]

The following is the YAML file I used for training.

arch:
    max_epochs: 10
checkpoint:
    filepath: /workspace/packnet-sfm/results/chept
    save_top_k: -1
model:
    name: 'SelfSupModel'
    checkpoint_path: /data/models/PackNet01_MR_semisup_CStoK.ckpt
    optimizer:
        name: 'Adam'
        depth:
            lr: 0.0002
        pose:
            lr: 0.0002
    scheduler:
        name: 'StepLR'
        step_size: 30
        gamma: 0.5
    depth_net:
        name: 'PackNet01'
        version: '1A'
    pose_net:
        name: 'PoseNet'
        version: ''
    params:
        crop: 'garg'
        min_depth: 0.0
        max_depth: 80.0
datasets:
    augmentation:
        image_shape: (192, 640)
    train:
        batch_size: 1
        dataset: ['Image']
        path: ['my_training_dataset']
        split: ['{:03d}']
        repeat: [1]
    validation:
        dataset: ['Image']
        path: ['my_eval_dataset']
        split: ['{:03d}']
    test:
        dataset: ['Image']
        path: ['my_test_dataset']
        split: ['{:03d}']

I also changed the dummy_calibration parameters inside datasets/image_dataset.py:

def dummy_calibration(image):
    w, h = [float(d) for d in image.size]
    return np.array([[343.169585, 0.0, 321.358181],
                     [0.0, 344.619087, 93.813611],
                     [0.0, 0.0, 1.0]])

My dataset contains 3k images at resolution (192, 640) (mono images only).
The training is done on 4 x Tesla M40 GPUs, with NVIDIA-SMI 418.56, Driver Version 418.56, CUDA Version 10.1.
After training, the 'Avg.Loss' drops from 0.0370 to 0.0350 (it drops really slowly).

I know my number of training epochs is too small; I am still training now, but the Avg.Loss stays around 0.0350.
Are there any other possible reasons for this poor result? Any help or suggestion will be appreciated. Thanks!

Why 3D conv?

First of all, congrats on the impressive work! The image reconstruction sanity check is highly inspiring.

I have a question regarding why PackNet uses 3d conv.

I think what PackNet wants to do is blend the 2x2 spatial content that is now scattered into the channel dimension, so PackNet uses the 3rd dimension to blend the channels. Maybe a grouped conv makes more sense in this application?
[image]

Another comment: the paper mentions that "2D conv are not designed to directly leverage the tiled structure of this feature space, instead, we propose to first learn to expand this structured representation via a 3d conv layer." I did not actually see in the ablation study how this is the case -- I only see that the results get better with the 3D conv, but perhaps that is due to the increased number of parameters in the model?
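To make the question concrete, here is a rough toy sketch of the packing idea as I read it from the paper (my own code, not the repo's implementation): space-to-depth folds each 2x2 patch into channels, a 3D convolution then mixes that tiled channel structure, and a 2D convolution folds it back. The 3D conv is the step that a grouped 2D conv might replace.

import torch
import torch.nn as nn

class PackingSketch(nn.Module):
    """Toy packing block: space-to-depth + 3D conv over the channel axis + 2D conv."""
    def __init__(self, in_channels, d=8):
        super().__init__()
        self.space2depth = nn.PixelUnshuffle(2)  # PyTorch >= 1.8; (B, C, H, W) -> (B, 4C, H/2, W/2)
        self.conv3d = nn.Conv3d(1, d, kernel_size=3, padding=1)
        self.conv2d = nn.Conv2d(4 * in_channels * d, in_channels, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.space2depth(x)                            # pack 2x2 neighbourhoods into channels
        b, c, h, w = x.shape
        x = self.conv3d(x.unsqueeze(1))                    # treat channels as a depth axis: (B, d, C, H/2, W/2)
        x = x.view(b, c * self.conv3d.out_channels, h, w)  # flatten the expanded features back to 2D
        return self.conv2d(x)

x = torch.randn(2, 16, 64, 64)
print(PackingSketch(16)(x).shape)  # torch.Size([2, 16, 32, 32])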

Thank you very much for your insights!

Error at step 5: bash evaluate_kitti.sh

docker build -t packnet-sfm:master-latest . -f docker/Dockerfile
ERRO[0001] failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied
error during connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=docker%2FDockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&session=73lzetomzynf87189u2c8lnrk&shmsize=0&t=packnet-sfm%3Amaster-latest&target=&ulimits=null&version=1: context canceled
Makefile:33: recipe for target 'docker-build' failed
make: *** [docker-build] Error 1

Query regarding Self Supervised learning model

Hi, thanks for uploading the code. I had a few queries:

  1. Since we specify the camera intrinsics, I believe the pre-trained models won't give good results on datasets other than the ones they were trained on?
  2. Will the model learn depth in a scenario with very few distinct objects and an environment that looks like a repeating texture, e.g. deserts, farms, etc.? Are these the situations where the model can benefit from some sort of (semi-)supervision?
  3. For the custom Image model, since we specify the image resolution in the config YAML file, do we need to account for the new resolution in the dummy camera matrix manually, or is there a function somewhere that premultiplies the camera matrix with a scale matrix? (A generic scaling sketch follows below this list.)
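A generic sketch of how a pinhole intrinsics matrix is typically rescaled when the image resolution changes (this is standard geometry; whether and where the repo does this internally is exactly my question):

import numpy as np

def scale_intrinsics(K, orig_wh, new_wh):
    """Rescale fx/cx with the width ratio and fy/cy with the height ratio."""
    sx = new_wh[0] / orig_wh[0]
    sy = new_wh[1] / orig_wh[1]
    K = K.copy()
    K[0, 0] *= sx; K[0, 2] *= sx
    K[1, 1] *= sy; K[1, 2] *= sy
    return K

K = np.array([[343.17, 0.0, 321.36],
              [0.0, 344.62, 93.81],
              [0.0, 0.0, 1.0]])
print(scale_intrinsics(K, orig_wh=(640, 192), new_wh=(1280, 384)))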

About ADE20k datasets

Hi @VitorGuizilini,

Thanks for your code and the great work!
I am trying to train your model on the ADE20K (MIT) dataset, which has no depth information and no parameters like camera intrinsics. In your paper you trained your model on Cityscapes, whose situation is similar to ADE20K. Could you tell me how to train on datasets like ADE20K and Cityscapes?
Thank you very much!!

can't get the desired training result

Hi there,
Thanks so much for sharing the training code. I tried to run the self-supervised model on the KITTI dataset, but it seems to give a weird result after running for a few epochs from the pretrained model.

[Screenshot from 2020-05-19 13-46-18]

The loss gets smaller during training, but the evaluation error metrics get higher every epoch. Have you ever encountered this situation?

@AdrienGaidon-TRI @VitorGuizilini-TRI @spillai

About the training speed

Thanks for sharing this wonderful work!

I'm trying to train the PackNet model with a single RTX 2080 card, using batch_size=1 (due to the limited memory), but it needs about 7 seconds per iteration (forward and backward), while DispNet only needs 0.2 s. Is this the expected training speed for this model, or is something wrong?
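For what it's worth, this is the generic pattern I use to time one forward/backward pass on the GPU (a stand-in module below, not the actual PackNet model); without torch.cuda.synchronize() the asynchronous CUDA calls can make per-iteration timings misleading:

import time
import torch
import torch.nn as nn

model = nn.Conv2d(3, 64, 7, padding=3).cuda()       # stand-in module, just to show the pattern
batch = torch.randn(1, 3, 192, 640, device="cuda")

torch.cuda.synchronize()
start = time.time()
loss = model(batch).mean()
loss.backward()
torch.cuda.synchronize()
print(f"{time.time() - start:.3f} s per forward/backward")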

reproduction of the monodepth2 with resnet18

Hey! Thanks for your wonderful work!

Now I'm trying to reproduce the result of monodepth2 with a ResNet18 backbone. I used train_kitti.yaml and trained for about 50 epochs, but the result is not good: the loss is around 0.075 but the abs_rel is only 0.125. I think it may be overfitting due to imperfect hyperparameters. Could you share the .yaml file you use for training monodepth2?

Inference on video

Hi @VitorGuizilini,

Thanks for your code and the great work! I am trying to run inference on a custom video/image sequence, and was wondering whether this needs a major change in the config or whether it should be done frame by frame on the video.
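For context, a minimal sketch of the frame-by-frame route I had in mind (OpenCV for frame extraction; the infer.py flags at the end are assumptions from memory, please check scripts/infer.py, and the paths are placeholders):

import cv2
import os

video_path, out_dir = "my_video.mp4", "frames"  # placeholder paths
os.makedirs(out_dir, exist_ok=True)

cap, i = cv2.VideoCapture(video_path), 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(os.path.join(out_dir, f"frame_{i:06d}.png"), frame)
    i += 1
cap.release()

# then, e.g. (flags assumed, not verified):
# python3 scripts/infer.py --checkpoint model.ckpt --input frames --output depth_out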

Thanks in advance

using pre-trained semantic segmentation

Hi,
First of all thank you for releasing this awesome repository and for your continuous support.
You mention in your readme that you "can also use a fixed pre-trained semantic segmentation network to guide the representation learning further", but I didn't see any reference to it in the code.

Do you plan to include this feature in any of your upcoming releases?
thanks

Details about your TensorRT implementation

Dear authors, I'm trying to accelerate the provided models using TensorRT, but I ran into some problems.
My environment is as follows:
TensorRT 7.0, onnx-tensorrt, PyTorch 1.4.0, onnx 1.6.0
When converting the ONNX model to TensorRT, I found that some operations (Pad, group norm) of the given model could not be parsed successfully. I was thinking that using the newest TensorRT might cover all operations. What TensorRT version do you use? If possible, would you mind sharing your converted ONNX model with me? @AdrienGaidon-TRI @VitorGuizilini-TRI @spillai

Simple overfitting test error

[Screen Shot 2020-07-21 at 1 19 23 PM]
I am getting this error: "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()". How should I fix this?
Thanks!

question about provided models & results

Hi,

Pretty cool work, thanks for sharing!
I have some questions regarding the provided KITTI models (linked in the README file) and their performance:

  1. Is the performance (RMSE etc.) of all the models reported w.r.t. the improved GT
    (your paper's Table 3 refers to "Improved [42]")?
  2. Does the last row (with "densified GT") mean you also train with the improved ground truth from ref. [42]? (Generally, it would be nice if you could add a column saying which supervision each model uses, just like in your Table 3.)
  3. In your paper, the last row in Table 3 is absolutely amazing: 3.293 m RMSE, in absolute scale, without any lidar ground truth, only pose information. This makes the gap to supervised methods much smaller. Is this model one of the provided models here?

Thanks for the help!
Z.

Issue with saving our own checkpoint (.ckpt) file after training

Hi,

I have a question regarding training on my own dataset. I first tried to use the tiny KITTI dataset for training and to build my own .ckpt file; however, I was not able to. I made some changes in the .yaml file and defined the save directory for checkpoints there, but I could not find the checkpoint afterwards and I do not know how to do it. I would appreciate your help.

Thanks,
Soheil.

Did you compare PackNet to ResNet50 monodepth on the KITTI dataset?

Hello,

Very nice paper! In Table 3 of your paper, you show that the absolute relative error for monodepth2 (ResNet) was 0.115, while PackNet's was 0.111. This doesn't seem like a very big difference. Was the monodepth2 ResNet a ResNet18 or a ResNet50? If it was only ResNet18, then I would suspect that ResNet50 would close this small accuracy gap even further. But on your DDAD dataset results, the difference between PackNet and monodepth2 with ResNet18 and ResNet50 is huge.

Can you please elaborate on this?

Thank you.

Reconstruction loss experiment

Hello, thanks for the great work!

In your paper, in Figure 4, we can see a comparison of the input image, the reconstruction from the classical conv + max-pool + bilinear upsampling, and the reconstruction from the PackNet architecture.

Can you provide code for this experiment?

Solving the Infinite Depth Problem

I read the paper carefully, but it seems that you didn't solve this problem specifically. The weak velocity supervision doesn't explicitly supervise this part, in my opinion, as there is no clear mathematical evidence that it does. You show three attractive depth predictions in the readme where the holes are gone, but I don't think that means you solved it...

Did I miss anything in the paper? Thank you.

Failed to convert depth_net model to TensorRT model

Hi, I noticed that in your CVPR paper you said:

an inference time of 60ms on a Titan V100 GPU, which can be further improved to < 30ms using TensorRT

So I tried to use TensorRT to improve the performance:

  1. Use the following code to export the ONNX model:
dummy_input = torch.randn(1, 3, 192, 640, device='cuda')
torch.onnx.export(model_wrapper.model.depth_net, dummy_input, "/my_folder/dump.onnx", opset_version=11)
  2. Use the conversion tool provided by TensorRT to convert the ONNX model to a TRT engine:
./trtexec --onnx=/my_folder/dump.onnx --saveEngine=dump.trt

However, step 2 failed; the log output is below:

[07/23/2020-17:14:21] [I] === Model Options ===
[07/23/2020-17:14:21] [I] Format: ONNX
[07/23/2020-17:14:21] [I] Model: /data/exp_results/complex_model/dump.onnx
[07/23/2020-17:14:21] [I] Output:
[07/23/2020-17:14:21] [I] === Build Options ===
[07/23/2020-17:14:21] [I] Max batch: 1
[07/23/2020-17:14:21] [I] Workspace: 16 MB
[07/23/2020-17:14:21] [I] minTiming: 1
[07/23/2020-17:14:21] [I] avgTiming: 8
[07/23/2020-17:14:21] [I] Precision: FP32
[07/23/2020-17:14:21] [I] Calibration: 
[07/23/2020-17:14:21] [I] Safe mode: Disabled
[07/23/2020-17:14:21] [I] Save engine: dump.trt
[07/23/2020-17:14:21] [I] Load engine: 
[07/23/2020-17:14:21] [I] Builder Cache: Enabled
[07/23/2020-17:14:21] [I] NVTX verbosity: 0
[07/23/2020-17:14:21] [I] Inputs format: fp32:CHW
[07/23/2020-17:14:21] [I] Outputs format: fp32:CHW
[07/23/2020-17:14:21] [I] Input build shapes: model
[07/23/2020-17:14:21] [I] Input calibration shapes: model
[07/23/2020-17:14:21] [I] === System Options ===
[07/23/2020-17:14:21] [I] Device: 0
[07/23/2020-17:14:21] [I] DLACore: 
[07/23/2020-17:14:21] [I] Plugins:
[07/23/2020-17:14:21] [I] === Inference Options ===
[07/23/2020-17:14:21] [I] Batch: 1
[07/23/2020-17:14:21] [I] Input inference shapes: model
[07/23/2020-17:14:21] [I] Iterations: 10
[07/23/2020-17:14:21] [I] Duration: 3s (+ 200ms warm up)
[07/23/2020-17:14:21] [I] Sleep time: 0ms
[07/23/2020-17:14:21] [I] Streams: 1
[07/23/2020-17:14:21] [I] ExposeDMA: Disabled
[07/23/2020-17:14:21] [I] Spin-wait: Disabled
[07/23/2020-17:14:21] [I] Multithreading: Disabled
[07/23/2020-17:14:21] [I] CUDA Graph: Disabled
[07/23/2020-17:14:21] [I] Skip inference: Disabled
[07/23/2020-17:14:21] [I] Inputs:
[07/23/2020-17:14:21] [I] === Reporting Options ===
[07/23/2020-17:14:21] [I] Verbose: Disabled
[07/23/2020-17:14:21] [I] Averages: 10 inferences
[07/23/2020-17:14:21] [I] Percentile: 99
[07/23/2020-17:14:21] [I] Dump output: Disabled
[07/23/2020-17:14:21] [I] Profile: Disabled
[07/23/2020-17:14:21] [I] Export timing to JSON file: 
[07/23/2020-17:14:21] [I] Export output to JSON file: 
[07/23/2020-17:14:21] [I] Export profile to JSON file: 
[07/23/2020-17:14:21] [I] 
----------------------------------------------------------------
Input filename:   /data/exp_results/complex_model/dump.onnx
ONNX IR version:  0.0.4
Opset version:    11
Producer name:    pytorch
Producer version: 1.3
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
ERROR: builtin_op_importers.cpp:2179 In function importPad:
[8] Assertion failed: inputs.at(1).is_weights()
[07/23/2020-17:14:23] [E] Failed to parse onnx file
[07/23/2020-17:14:23] [E] Parsing model failed
[07/23/2020-17:14:23] [E] Engine creation failed
[07/23/2020-17:14:23] [E] Engine set up failed

I tried both the DepthResNet and PackNet01 models; both failed with this error:

ERROR: builtin_op_importers.cpp:2179 In function importPad:
[8] Assertion failed: inputs.at(1).is_weights()

Have you met this error before? Any suggestions? Thanks.
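One workaround I have seen for this particular importPad assertion (unverified against this repo, so treat it as a hedged suggestion): export with an opset where Pad parameters are still attributes rather than tensor inputs, and enable constant folding.

torch.onnx.export(
    model_wrapper.model.depth_net, dummy_input, "/my_folder/dump.onnx",
    opset_version=10,          # Pad still carries its pads as an attribute in opset <= 10
    do_constant_folding=True,  # fold constant shape/pad computations at export time
)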

About the warp_ref_image

Hi Vitor,

in

cams.append(Camera(K=K.float()).scaled(scale_factor).to(device))
ref_cams.append(Camera(K=ref_K.float(), Tcw=pose).scaled(scale_factor).to(device))

you only provide one scale factor to the camera's scaling function. Wouldn't this mean that, in case my image isn't scaled equally in the x and y directions, the camera intrinsics matrix is scaled incorrectly, or is this addressed somewhere else?

Thanks for your time and patience in explaining your code to all of us ^^

PackNet training on our own dataset, and a question about train_kitti.yaml

Hi,

I want to use the training part on my own dataset, but I'm not sure what information I need for training.
Do we need point clouds in addition to the images of our own dataset?

Also, in the config files, the train_kitti.yaml file has:

" params:
crop: 'garg'
min_depth: 0.0
max_depth: 80.0 "

What are these params, and how did you determine the min and max depth? Are they based only on the KITTI dataset or on something else?

Thank you so much!
Soheil.

About memory leaks

Hi! When I try to train the SelfSupModel, I find that memory usage grows slowly (from 21 GB to 38 GB in one epoch). Does that mean there is a memory leak? Any solution?

Which GPU is used? ("Titan V100 GPU")

Thanks for your contribution first of all.

The paper states "an inference time of 60ms on a Titan V100 GPU". If I'm not mistaken, such a GPU does not exist. It is either the "Titan V" or the "Tesla V100". Could you clarify please?

NuScenes Dataset Loader

Hi,

first of all, I want to join the others in congratulating you on your great work and for even providing this well-documented code! Thanks.

I am currently trying to write a data loader for the NuScenes dataset. The data split files are formatted as follows

sample_token | backward_context_png | forward_context_png

and the sample is then constructed using this routine

[image]

However, my training results look as follows:
[evalImg_ep00002]

One problem I encountered, which I am not certain is related, is that I had to change the cfg.checkpoint.monitor entry from "loss" to "abs_rel", since the "loss" entry is not initialized if training is False.

My questions are:

  • Do you already have a NuScenes dataloader that you could share, since you mentioned experimenting with NuScenes in your paper, too?
  • Have you experienced such behavior before, or do you have any insight into what might be going wrong?
  • Why did you set cfg.checkpoint.monitor = "loss" when "loss" cannot be defined in the validation step? I am guessing that I did something wrong, but I don't know what...
  • How should the ground-truth depth maps be defined? Zeros everywhere we don't have detections and depth on pixels with detections, or do you specify them directly as inverse depth maps? (See the sketch after this list.)
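For the last point, here is a generic sketch (my own illustration, not the repo's loader) of the convention I have usually seen for sparse ground-truth depth maps: metric depth, zero wherever there is no lidar return, with any inverse-depth conversion done later.

import numpy as np

def sparse_depth_map(points_uv, depths, height, width):
    """Scatter projected lidar points (pixel coords + metric depth) into a dense map,
    leaving zeros wherever there is no return."""
    depth_map = np.zeros((height, width), dtype=np.float32)
    u = points_uv[:, 0].astype(int)
    v = points_uv[:, 1].astype(int)
    valid = (u >= 0) & (u < width) & (v >= 0) & (v < height) & (depths > 0)
    depth_map[v[valid], u[valid]] = depths[valid]
    return depth_map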

Thanks in advance for your time and help.

Needed GPU Memory

Is there any possibility to use the network with a 4 GB GPU device? Thanks in advance.

Issue with Using Image dataset (the one without depth)

I was trying to use the image_dataset to train on my own dataset; however, it did not work and could not detect the dataset that I provided. It gives me this error:

### Preparing Model
Model: SelfSupModel
DepthNet: PackNet01
PoseNet: PoseNet

Preparing Datasets

Setup train datasets

######### 0 (x1): ./data/datasets/CITY_tiny/city_tiny.txt

Setup validation datasets

######### 0: ./data/datasets/CITY_tiny/city_tiny.txt
######### 0: ./data/datasets/CITY_tiny/city_tiny.txt

Setup test datasets

######### 0: ./data/datasets/CITY_tiny/city_tiny.txt

########################################################################################################################

Config: configs.default_config -> ..configs.train_city_tiny.yaml

Name: default_config-train_city_tiny-2020.07.27-19h11m01s

########################################################################################################################
config:
-- name: default_config-train_city_tiny-2020.07.27-19h11m01s
-- debug: False
-- arch:
---- seed: 42
---- min_epochs: 1
---- max_epochs: 50
-- checkpoint:
---- filepath: ./data/experiments_new/default_config-train_city_tiny-2020.07.27-19h11m01s/{epoch:02d}_{CITY_tiny-city_tiny-abs_rel_pp_gt:.3f}
---- save_top_k: 5
---- monitor: CITY_tiny-city_tiny-abs_rel_pp_gt
---- monitor_index: 0
---- mode: min
---- s3_path:
---- s3_frequency: 1
---- s3_url:
-- save:
---- folder:
---- depth:
------ rgb: True
------ viz: True
------ npz: True
------ png: True
---- pretrained:
-- wandb:
---- dry_run: True
---- name:
---- project:
---- entity:
---- tags: []
---- dir:
---- url:
-- model:
---- name: SelfSupModel
---- checkpoint_path:
---- optimizer:
------ name: Adam
------ depth:
-------- lr: 0.0002
-------- weight_decay: 0.0
------ pose:
-------- lr: 0.0002
-------- weight_decay: 0.0
---- scheduler:
------ name: StepLR
------ step_size: 30
------ gamma: 0.5
------ T_max: 20
---- params:
------ crop: garg
------ min_depth: 0.0
------ max_depth: 80.0
---- loss:
------ num_scales: 4
------ progressive_scaling: 0.0
------ flip_lr_prob: 0.5
------ rotation_mode: euler
------ upsample_depth_maps: True
------ ssim_loss_weight: 0.85
------ occ_reg_weight: 0.1
------ smooth_loss_weight: 0.001
------ C1: 0.0001
------ C2: 0.0009
------ photometric_reduce_op: min
------ disp_norm: True
------ clip_loss: 0.0
------ padding_mode: zeros
------ automask_loss: True
------ velocity_loss_weight: 0.1
------ supervised_method: sparse-l1
------ supervised_num_scales: 4
------ supervised_loss_weight: 0.9
---- depth_net:
------ name: PackNet01
------ checkpoint_path:
------ version: 1A
------ dropout: 0.0
---- pose_net:
------ name: PoseNet
------ checkpoint_path:
------ version:
------ dropout: 0.0
-- datasets:
---- augmentation:
------ image_shape: (192, 640)
------ jittering: (0.2, 0.2, 0.2, 0.05)
---- train:
------ batch_size: 1
------ num_workers: 16
------ back_context: 1
------ forward_context: 1
------ dataset: ['Image']
------ path: ['./data/datasets/CITY_tiny']
------ split: ['city_tiny.txt']
------ depth_type: ['']
------ cameras: [[]]
------ repeat: [1]
------ num_logs: 5
---- validation:
------ batch_size: 1
------ num_workers: 8
------ back_context: 0
------ forward_context: 0
------ dataset: ['Image', 'Image']
------ path: ['./data/datasets/CITY_tiny', './data/datasets/CITY_tiny']
------ split: ['city_tiny.txt', 'city_tiny.txt']
------ depth_type: ['', '']
------ cameras: [[], []]
------ num_logs: 5
---- test:
------ batch_size: 1
------ num_workers: 8
------ back_context: 0
------ forward_context: 0
------ dataset: ['Image']
------ path: ['./data/datasets/CITY_tiny']
------ split: ['city_tiny.txt']
------ depth_type: ['']
------ cameras: [[]]
------ num_logs: 5
-- config: ./configs/train_city_tiny.yaml
-- default: configs/default_config
-- prepared: True
########################################################################################################################

Config: configs.default_config -> ..configs.train_city_tiny.yaml

Name: default_config-train_city_tiny-2020.07.27-19h11m01s

########################################################################################################################

0.00 images [00:00, ? images/s]
Traceback (most recent call last):
  File "scripts/train.py", line 64, in <module>
    train(args.file)
  File "scripts/train.py", line 59, in train
    trainer.fit(model_wrapper)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 58, in fit
    self.train(train_dataloader, module, optimizer)
  File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 97, in train
    return module.training_epoch_end(outputs)
  File "/workspace/packnet-sfm/packnet_sfm/models/model_wrapper.py", line 219, in training_epoch_end
    loss_and_metrics = average_loss_and_metrics(output_batch, 'avg_train')

I'm afraid the way I set up my own dataset is wrong, or something similar. I would appreciate it if you could help me and tell me how I can use Image as the dataset.

How to split the Cityscapes for pretraining?

Is there a .txt file containing the training files, like the eigen_zhou_files you provide in the readme?

I'd appreciate it if you could share the Cityscapes dataset file and its training file list. Thanks!

Working directory mismatch in Docker environment

Hi,
In commit f41342d, when you run make docker-run and other Docker-related targets,
the default working directory is set to /workspace/packnet-sfm/,
but everything (code, data, configs...) is prepared under /workspace/packnet/.

This causes python3 scripts/train.py checkpoint.ckpt to fail, because train.py is actually at ../packnet/scripts/train.py.

Question about using monocular camera images dataset

I was checking the KITTI dataset that you used and noticed there are right- and left-side camera images for the training process. Is it possible to use only one camera for training with PackNet (i.e., not using both left and right camera images)?

Thanks,
Soheil

Two stage training code to solve infinite depth

Hi,

First off, really good work :) Are there any plans to add the code for the two-stage training technique? Is it handled as a post-processing step by creating point clouds from the depth estimates and applying RANSAC to determine the ground plane?
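To make the second question concrete, here is a generic RANSAC ground-plane sketch (my own illustration of the post-processing idea I am describing, not the authors' method): fit a plane to the predicted point cloud and use it to reject spurious far-away points.

import numpy as np

def ransac_ground_plane(points, iters=200, thresh=0.05):
    """Fit a plane (normal, offset) to an (N, 3) point cloud by RANSAC."""
    best_inliers, best_plane = 0, None
    rng = np.random.default_rng(0)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-8:
            continue                      # degenerate (collinear) sample
        normal = normal / norm
        d = -normal.dot(sample[0])
        inliers = int((np.abs(points @ normal + d) < thresh).sum())
        if inliers > best_inliers:
            best_inliers, best_plane = inliers, (normal, d)
    return best_plane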

Convert pose sequence to trajectory

Hi,
Can you please share the code you used in one of your experiments for constructing the traversed route from the sequence of pose estimates?
For some reason I'm having trouble with that...
Thanks!
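For context, this is the usual way I would chain relative pose estimates into a trajectory (generic code, not the authors' experiment script), in case the problem is on my side:

import numpy as np

def poses_to_trajectory(relative_poses):
    """Accumulate a list of 4x4 relative transforms (frame t -> t+1) into global positions."""
    T = np.eye(4)
    trajectory = [T[:3, 3].copy()]
    for rel in relative_poses:
        T = T @ rel
        trajectory.append(T[:3, 3].copy())
    return np.asarray(trajectory)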

How to perform multi-GPU training?

Hello,
I'm trying to train a model with VelSupModel, and I found that only a single GPU is being used. How can I perform multi-GPU training? Thanks.

How to force the model to learn the depth in meters?

Hi!

Thanks for your great work! But I'm curious how to force the model to learn depth in meters, as the DepthDecoder only predicts values in 0~1.

In my mind, the PoseNet may learn the true scale from the velocity supervision, but the output of the DepthNet must be multiplied by a factor to restore the correct scale. Is that right?
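For reference, this is the monodepth-style convention I had in mind (the repo's exact scaling may differ, so treat this as a sketch): the network's sigmoid output in [0, 1] is mapped to an inverse-depth range and then inverted to get depth in meters.

def disp_to_depth(disp, min_depth=0.1, max_depth=100.0):
    """Map a sigmoid output in [0, 1] to depth in meters via an inverse-depth range."""
    min_disp = 1.0 / max_depth
    max_disp = 1.0 / min_depth
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    return 1.0 / scaled_disp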
