tri-ml / packnet-sfm
TRI-ML Monocular Depth Estimation Repository
Home Page: https://tri-ml.github.io/packnet-sfm/
License: MIT License
It seems that this project only works in a Docker environment when I try to evaluate your pre-trained models. How can I test the project in an environment without Docker?
Traceback (most recent call last):
File "scripts/eval.py", line 66, in <module>
test(args.checkpoint, args.config, args.half)
File "scripts/eval.py", line 52, in test
model_wrapper.load_state_dict(state_dict)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ModelWrapper:
I have a question about the size of the checkpoint files. Why do they all have the same size (519.6 MB)? Even when I train on the tiny version of the dataset, the checkpoint file it produces has the same size as the ones you shared. Is there any specific reason?
Hello!
First and foremost I would like to congratulate you on your great work!
I read in your paper that you have done some testing on the nuScenes dataset, and I was wondering whether you have since done any training on it as well. This is something I am trying to do right now.
I read about the hierarchy of files and folders in the dataset.proto file, but I could not find anything about how those .json files are generated.
What did you use to generate them?
I am asking because the nuScenes dataset also has some .json files, but they are structured very differently, and I don't know how to generate .json files with the same structure as yours from the nuScenes ones.
Thank you in advance for your response!
My issue is that although the program runs correctly and prints "saving ... to ...", I cannot see the output.
I have no idea why I am getting this issue, but I assume it is related to a permission problem.
I would appreciate it if someone could help me and give me an idea of how to fix it.
Hi,
This is really great work. I'm trying to use this network on my data, and I'm able to run the code in a shell using the Makefile you provided. But I hope to run it in PyCharm so that debugging is easier. However, since the project is packaged in Docker and the commands are written in the Makefile, I'm not sure how to configure the project in PyCharm. I searched for many tutorials online but couldn't find a working one, maybe due to my limited understanding of Docker.
I'm wondering if you used an IDE during the development of this project. Could you give some instructions on how to configure this project in PyCharm?
Best,
Hi,
I want to get the 3D point cloud from the LiDAR points in the Velodyne directory.
However, I am not sure which software is suitable for converting these point clouds and viewing them all in 3D.
I would appreciate it if you could help me with this.
Best Regards,
Soheil.
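For the Velodyne question above, here is a minimal sketch of loading one scan. It assumes the files follow the KITTI raw layout, where each .bin file is a flat array of little-endian float32 values, four per point (x, y, z, reflectance). The helper names read_velodyne_bin and write_xyz are hypothetical and not part of packnet-sfm. The exported plain-text file can then be opened in a general-purpose point cloud viewer such as CloudCompare or MeshLab.

```python
import struct


def read_velodyne_bin(path):
    """Read a KITTI-style Velodyne scan as a list of (x, y, z, reflectance)."""
    with open(path, "rb") as f:
        data = f.read()
    n_floats = len(data) // 4  # each value is a 4-byte float32
    values = struct.unpack("<%df" % n_floats, data[: n_floats * 4])
    # Group the flat array into 4-tuples, dropping any incomplete trailing point.
    usable = n_floats - n_floats % 4
    return [tuple(values[i:i + 4]) for i in range(0, usable, 4)]


def write_xyz(points, path):
    """Write points as plain-text 'x y z' lines (reflectance is dropped)."""
    with open(path, "w") as f:
        for x, y, z, _ in points:
            f.write("%.4f %.4f %.4f\n" % (x, y, z))
```

Plain Python is used here only to keep the sketch dependency-free; for full scans an array library would be considerably faster.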
Hi there,
Thanks for this great work.
I am trying to apply the two-stream ego-motion network to map the internal part of the shoulder as recorded during arthroscopy. So far, I have tried to use the pose and depth networks separately. The video frames from inside the joint have a serious lack of texture. So I first pretrained the monocular depth network (Godard 2017) on frames from high-texture environments and then trained it on the arthroscopic images, and I achieved some success in getting depth for the arthroscopic frames. However, I have had little success with the pose network so far. I trained the pose network in a supervised manner, where the coordinates of each frame are known with respect to the center (absolute pose), and it only works well when the same joint is used.
I was thinking of using the two-stream ego-motion network, which gives me the relative pose of the target image I_t with respect to the source image I_s. But I am guessing this may not work for my case, as in arthroscopic videos there are occasions when the whole frame is suddenly obscured by tissue while the camera is moving. In that case, I guess there will be no similarity between I_t and I_s (which is the previous frame?). This could lead to a huge drift over time. But then, when reading your paper, I noticed that the similarity matching cost function (Eq 3) tries to match I_t to the context images I_S (which I guess are all the available training images?). In that case, and for my application, the pose network should be able to estimate the pose of the first frame right after the obscuring tissue is removed. Or is it otherwise, and after training the pose network only estimates the pose of I_t with respect to I_s?
Regards,
Jacob
Hi,
I'm trying to train this model with my private video data, but the program fails to save the checkpoint file.
I'm afraid that something is wrong in ImageDataset or in my config for the dataset.
Could you give me some advice?
The console output is as follows:
Epoch 0 | Avg.Loss 0.0722: 100%|██████████| 856/856 [09:35<00:00, 1.49 images/s]
session1-frame_{:05d}-: 100%|██████████| 746/746 [04:08<00:00, 3.00 images/s]
Traceback (most recent call last):
File "scripts/train.py", line 62, in <module>
train(args.file)
File "scripts/train.py", line 57, in train
trainer.fit(model_wrapper)
File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 61, in fit
self.check_and_save(module, validation_output)
File "/workspace/packnet-sfm/packnet_sfm/trainers/base_trainer.py", line 42, in check_and_save
self.checkpoint.check_and_save(module, output)
File "/workspace/packnet-sfm/packnet_sfm/models/model_checkpoint.py", line 128, in check_and_save
filepath = self.format_checkpoint_name(epoch, metrics)
File "/workspace/packnet-sfm/packnet_sfm/models/model_checkpoint.py", line 117, in format_checkpoint_name
filename = filename.format(**metrics)
ValueError: unexpected '{' in field name
My dataset config is as follows:
datasets:
    ...
    train:
        batch_size: 2
        dataset: ['Image']
        path: ['/workspace/packnet-sfm/data/my_video_001/session0']
        split: ["frame_{:05d}"]
        repeat: [1]
I have a number of images for training data and the path is like:
/workspace/packnet-sfm/data/my_video_001/session0/frame_00123.png
I looked into the code and found that ModelCheckpoint's filename had the value:
epoch={epoch:02d}_session1-frame_0000{={session1-frame_0000{:01d}--loss:.3f}
This was the direct cause of the error above.
Thanks in advance,
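The failure reported above can be reproduced in isolation: str.format rejects a replacement field whose name itself contains a brace, which is what happens when a split pattern such as frame_{:05d} ends up inside a metric name. The sketch below shows the error and one possible workaround, namely sanitising the metric key before it is placed into a filename template. The helper safe_metric_key is hypothetical and is not how the repository handles this; it is only an illustration under that assumption.

```python
import re


def safe_metric_key(name):
    """Replace characters that str.format treats specially in a field name."""
    return re.sub(r"[{}:!\[\].]", "_", name)


raw_key = "session1-frame_{:05d}-loss"

# A template built from the raw key nests a '{' inside a field name.
bad_template = "epoch={epoch:02d}_{" + raw_key + ":.3f}"
try:
    bad_template.format(epoch=0, **{raw_key: 0.0722})
    raised = False
except ValueError:
    raised = True  # the nested brace makes the template unparseable

# The same template with a sanitised key formats without error.
key = safe_metric_key(raw_key)
good_template = "epoch={epoch:02d}_{" + key + ":.3f}"
filename = good_template.format(epoch=0, **{key: 0.0722})
```

Whether the cleaner fix is to sanitise the key or to avoid braces in the dataset name in the first place depends on how the checkpoint code derives its metric names.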
If I want to get the real distance per pixel for a custom image, should I apply the inv2depth function to the infer.py output (depth from model_wrapper.depth)? For now, the predicted distances are quite different from the real ones.
Where can I find this file?
KITTI_raw-eigen_test_files-velodyne: 0.00 images [00:00, ? images/s]
Traceback (most recent call last):
File "scripts/eval.py", line 66, in <module>
test(args.checkpoint, args.config, args.half)
File "scripts/eval.py", line 61, in test
trainer.test(model_wrapper)
File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 128, in test
self.evaluate(test_dataloaders, module)
File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(*args, **kwargs)
File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 152, in evaluate
return module.test_epoch_end(all_outputs)
File "/workspace/packnet-sfm/packnet_sfm/models/model_wrapper.py", line 262, in test_epoch_end
output_data_batch, self.test_dataset, self.metrics_name)
File "/workspace/packnet-sfm/packnet_sfm/utils/reduce.py", line 53, in all_reduce_metrics
names = [key for key in list(output_data_batch[0][0].keys()) if key.startswith(name)]
IndexError: list index out of range
Makefile:77: recipe for target 'docker-run' failed
I got this error when I tried to evaluate the model.
Note that I did not use the whole KITTI dataset; I tried to use only part of it.
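The progress bar above reports 0.00 images, and the IndexError is raised when the first element of an output list is read, which suggests that the test set may have ended up empty. With a partial KITTI download, one thing worth checking is whether the files listed in the split file actually exist. The sketch below is a generic check, assuming each line of the split file starts with an image path relative to the dataset root; the helper name missing_split_entries is hypothetical and not part of the repository.

```python
import os


def missing_split_entries(root_dir, split_file):
    """Return split-file entries whose image is not present under root_dir."""
    missing = []
    with open(split_file) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            rel_path = parts[0]  # assumed: first token is the relative image path
            if not os.path.isfile(os.path.join(root_dir, rel_path)):
                missing.append(rel_path)
    return missing
```

If every entry is reported as missing, the dataset path in the config or the directory layout is the more likely culprit than the model itself.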
Hi, thanks for the great work! I am trying to fine-tune 'PackNet01_MR_selfsup_K.ckpt' on my own dataset, but the performance is not good. Here are some examples:
If I use 'PackNet01_MR_selfsup_K.ckpt' without fine-tuning, the result looks like this:
After fine-tuning this model, I get this result:
The following is the yaml file I used for training.
arch:
    max_epochs: 10
checkpoint:
    filepath: /workspace/packnet-sfm/results/chept
    save_top_k: -1
model:
    name: 'SelfSupModel'
    checkpoint_path: /data/models/PackNet01_MR_semisup_CStoK.ckpt
    optimizer:
        name: 'Adam'
        depth:
            lr: 0.0002
        pose:
            lr: 0.0002
    scheduler:
        name: 'StepLR'
        step_size: 30
        gamma: 0.5
    depth_net:
        name: 'PackNet01'
        version: '1A'
    pose_net:
        name: 'PoseNet'
        version: ''
    params:
        crop: 'garg'
        min_depth: 0.0
        max_depth: 80.0
datasets:
    augmentation:
        image_shape: (192, 640)
    train:
        batch_size: 1
        dataset: ['Image']
        path: ['my_training_dataset']
        split: ['{:03d}']
        repeat: [1]
    validation:
        dataset: ['Image']
        path: ['my_eval_dataset']
        split: ['{:03d}']
    test:
        dataset: ['Image']
        path: ['my_test_dataset']
        split: ['{:03d}']
I also changed the dummy_calibration parameters inside datasets/image_dataset.py:
def dummy_calibration(image):
    w, h = [float(d) for d in image.size]
    return np.array([[343.169585, 0.        , 321.358181],
                     [0.        , 344.619087,  93.813611],
                     [0.        , 0.        ,   1.      ]])
My dataset contains 3k images with a resolution of (192, 640), mono images only.
The training is done on 4 x Tesla M40, with NVIDIA-SMI 418.56, Driver Version 418.56, CUDA Version 10.1.
After training, the 'Avg.Loss' drops from 0.0370 to 0.0350 (it drops really slowly).
I know I have trained for too few epochs, and I am still training now, but the Avg.Loss stays around 0.0350.
Are there any other possible reasons for this poor result? Any help or suggestion will be appreciated. Thanks!
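One detail in the config above: with step_size: 30 and max_epochs: 10, a step-decay scheduler never reaches its first decay, so the learning rate stays at its initial value for the whole run. The sketch below assumes the usual step rule, lr = base_lr * gamma ** floor(epoch / step_size); the helper step_lr is hypothetical and only illustrates the arithmetic, and this alone may well not explain the results.

```python
def step_lr(base_lr, gamma, step_size, epoch):
    """Learning rate at a given epoch under a step-decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


# Values taken from the config above: lr 0.0002, gamma 0.5, step_size 30.
lrs = [step_lr(0.0002, 0.5, 30, epoch) for epoch in range(10)]
first_decayed = step_lr(0.0002, 0.5, 30, 30)
```

Here every entry of lrs equals the base rate, and the first halving would only occur at epoch 30.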
First of all, congrats on the impressive work! The image reconstruction sanity check is highly inspiring.
I have a question regarding why PackNet uses 3d conv.
I think what the PackNet wants to do is to blend the 2x2 spatial content that is now scattered into the channel dimension. So PackNet used the 3rd dimension to blend the channel. Maybe group conv makes more sense in this application?
Another comment is that the paper mentioned that "2D conv are not designed to directly leverage the tiled structure of this feature space, instead, we propose to first learn to expand this structured representation via a 3d conv layer." I actually did not see in the ablation study how this is the case -- I only see that with 3d conv the results went better, but perhaps this is due to increased parameters in the model?
Thank you very much for your insights!
docker build -t packnet-sfm:master-latest . -f docker/Dockerfile
ERRO[0001] failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied
error during connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=docker%2FDockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&session=73lzetomzynf87189u2c8lnrk&shmsize=0&t=packnet-sfm%3Amaster-latest&target=&ulimits=null&version=1: context canceled
Makefile:33: recipe for target 'docker-build' failed
make: *** [docker-build] Error 1
Hi, thanks for uploading the code. I had a few queries:
I wonder why training script is not provided like other related projects
Why is the test set part of the validation set?
https://github.com/TRI-ML/packnet-sfm/blob/master/configs/train_kitti.yaml#L37
It may be just an incorrect sample config file.
Using the test set for validation would impact the overall performance in a positive way.
Would this performance gain be noticeable?
Thanks, Fabian
Hi @VitorGuizilini,
Thanks for your code and the great work!
I am trying to train your model on the ADE20k (MIT) dataset, which has no depth information and no parameters such as camera intrinsics. In your paper, you trained your model on Cityscapes, whose situation is similar to ADE20k. Could you tell me how to train on datasets like ADE20k and Cityscapes?
Thank you very much!!
Hi there,
Thanks so much for sharing the training code. I tried to run the self-supervised model on the KITTI dataset, but I get a weird result after running for a few epochs starting from the pretrained model.
The loss gets smaller during training, but the evaluation error metrics get higher every epoch. Have you ever encountered this situation?
Thanks for sharing this wonderful work!
I'm trying to train the PackNet model on a single RTX 2080 card, using batch_size=1 (due to the limited memory). But it needs about 7 seconds to perform one iteration (forward and backward), while DispNet only needs 0.2 s. Is this the expected training speed for the model, or is something wrong?
Hey! Thanks for your wonderful work!
I'm now trying to reproduce the monodepth2 result with a resnet18 backbone. I used train_kitti.yaml to train for about 50 epochs, but the result is not good: the loss is around 0.075 but absrel is only 0.125. I think it may be overfitting because of imperfect hyperparameters. Could you share the .yaml file for training monodepth2?
Hi @VitorGuizilini,
Thanks for your code and the great work! I am trying to run inference on a custom video/image sequence, and I was wondering whether this needs a major change in the config or whether it has to be done frame by frame.
Thanks in advance
Hi,
First of all thank you for releasing this awesome repository and for your continuous support.
You mention in your README that you "can also use a fixed pre-trained semantic segmentation network to guide the representation learning further", but I didn't see any reference to it in the code.
Do you plan to include this feature in any of your upcoming releases?
thanks
Dear author, I'm trying to accelerate your provided models using TensorRT. However, I ran into some problems.
My environment is as follows:
TensorRT7.0 onnx-tensorrt pytorch1.4.0 onnx1.6.0
When converting the ONNX model to TensorRT, I found that some operations (Pad, group norm) of the given model couldn't be parsed successfully. I was thinking that using the newest TensorRT might cover all operations. So what is your TensorRT version? If possible, would you mind sharing your converted ONNX model with me? @AdrienGaidon-TRI @VitorGuizilini-TRI @spillai
Hi,
Pretty cool work, thanks for sharing!
I have some questions regarding the provided KITTI models (linked in the README file), and their performance:
Thanks for the help!
Z.
First of all thank you for sharing your paper and code. One question concerning "Semantically-Guided Representation Learning for Self-Supervised Monocular Depth": Is the pretrained semantic network also necessary during inference or only during training?
Hi,
I have a question regarding training on my own dataset. I first tried to use the tiny_KITTI dataset for training, to build my own .ckpt file. However, I was not able to do that. I made some changes in the .yaml file and defined the save directory for checkpoints there, but I could not find any saved checkpoint and I do not know how to produce one. I would appreciate it if you could help me.
Thanks,
Soheil.
Hello,
Great paper! In Table 3 of your paper, you show that the absolute relative error for the monodepth2 ResNet was 0.115, while PackNet's was 0.111. This doesn't seem like a very big difference. Was the monodepth2 ResNet a resnet18 or a resnet50? If it was only resnet18, then I would suspect that resnet50 would narrow this small gap even further. But in your DDAD dataset results, the difference between PackNet and monodepth2 with resnet18 and resnet50 is huge.
Can you please elaborate on this?
Thank you.
Hello, thanks for the great work!
In your paper, in figure 4, we can see a comparison of the input image, the reconstruction of the classical conv+maxpool+bilinear upsampling, and the reconstruction of the packnet architecture.
Can you provide code for this experiment?
I read the paper carefully, but it seems that you didn't solve this problem specifically. The weak supervision of velocity doesn't explicitly supervise on this part in my opinion, as there is no clear mathematical evidence that it does. You have 3 attracting depth predictions in the readme where the holes are gone, but I don't think it means you solved it...
Did I miss anything in the paper? Thank you.
Hi, I noticed that in your CVPR paper, you said:
an inference time of 60ms on a Titan V100 GPU, which can be further improved to < 30ms using TensorRT
So I tried to use TensorRT to improve the performance:
dummy_input = torch.randn(1, 3, 192, 640, device='cuda')
torch.onnx.export(model_wrapper.model.depth_net, dummy_input, "/my_folder/dump.onnx",opset_version=11)
./trtexec --onnx=/my_folder/dump.onnx --saveEngine=dump.trt
However, step 2 failed, the log info is below:
[07/23/2020-17:14:21] [I] === Model Options ===
[07/23/2020-17:14:21] [I] Format: ONNX
[07/23/2020-17:14:21] [I] Model: /data/exp_results/complex_model/dump.onnx
[07/23/2020-17:14:21] [I] Output:
[07/23/2020-17:14:21] [I] === Build Options ===
[07/23/2020-17:14:21] [I] Max batch: 1
[07/23/2020-17:14:21] [I] Workspace: 16 MB
[07/23/2020-17:14:21] [I] minTiming: 1
[07/23/2020-17:14:21] [I] avgTiming: 8
[07/23/2020-17:14:21] [I] Precision: FP32
[07/23/2020-17:14:21] [I] Calibration:
[07/23/2020-17:14:21] [I] Safe mode: Disabled
[07/23/2020-17:14:21] [I] Save engine: dump.trt
[07/23/2020-17:14:21] [I] Load engine:
[07/23/2020-17:14:21] [I] Builder Cache: Enabled
[07/23/2020-17:14:21] [I] NVTX verbosity: 0
[07/23/2020-17:14:21] [I] Inputs format: fp32:CHW
[07/23/2020-17:14:21] [I] Outputs format: fp32:CHW
[07/23/2020-17:14:21] [I] Input build shapes: model
[07/23/2020-17:14:21] [I] Input calibration shapes: model
[07/23/2020-17:14:21] [I] === System Options ===
[07/23/2020-17:14:21] [I] Device: 0
[07/23/2020-17:14:21] [I] DLACore:
[07/23/2020-17:14:21] [I] Plugins:
[07/23/2020-17:14:21] [I] === Inference Options ===
[07/23/2020-17:14:21] [I] Batch: 1
[07/23/2020-17:14:21] [I] Input inference shapes: model
[07/23/2020-17:14:21] [I] Iterations: 10
[07/23/2020-17:14:21] [I] Duration: 3s (+ 200ms warm up)
[07/23/2020-17:14:21] [I] Sleep time: 0ms
[07/23/2020-17:14:21] [I] Streams: 1
[07/23/2020-17:14:21] [I] ExposeDMA: Disabled
[07/23/2020-17:14:21] [I] Spin-wait: Disabled
[07/23/2020-17:14:21] [I] Multithreading: Disabled
[07/23/2020-17:14:21] [I] CUDA Graph: Disabled
[07/23/2020-17:14:21] [I] Skip inference: Disabled
[07/23/2020-17:14:21] [I] Inputs:
[07/23/2020-17:14:21] [I] === Reporting Options ===
[07/23/2020-17:14:21] [I] Verbose: Disabled
[07/23/2020-17:14:21] [I] Averages: 10 inferences
[07/23/2020-17:14:21] [I] Percentile: 99
[07/23/2020-17:14:21] [I] Dump output: Disabled
[07/23/2020-17:14:21] [I] Profile: Disabled
[07/23/2020-17:14:21] [I] Export timing to JSON file:
[07/23/2020-17:14:21] [I] Export output to JSON file:
[07/23/2020-17:14:21] [I] Export profile to JSON file:
[07/23/2020-17:14:21] [I]
----------------------------------------------------------------
Input filename: /data/exp_results/complex_model/dump.onnx
ONNX IR version: 0.0.4
Opset version: 11
Producer name: pytorch
Producer version: 1.3
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
ERROR: builtin_op_importers.cpp:2179 In function importPad:
[8] Assertion failed: inputs.at(1).is_weights()
[07/23/2020-17:14:23] [E] Failed to parse onnx file
[07/23/2020-17:14:23] [E] Parsing model failed
[07/23/2020-17:14:23] [E] Engine creation failed
[07/23/2020-17:14:23] [E] Engine set up failed
I tried the DepthResNet and PackNet01 models, and both failed with this error:
ERROR: builtin_op_importers.cpp:2179 In function importPad:
[8] Assertion failed: inputs.at(1).is_weights()
Have you met this error before? Any suggestions? Thanks.
Are there any scripts to convert the depth result to a point cloud?
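For the point cloud question above, the usual approach is to back-project each pixel through a pinhole camera model. The sketch below is a generic illustration, not a script from the repository: it assumes a metric depth map given as nested lists and known intrinsics fx, fy, cx, cy, and the helper name depth_to_points is hypothetical.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map (nested lists) to a list of 3D points.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with non-positive depth are skipped.
    """
    points = []
    for v, row in enumerate(depth):      # v indexes image rows
        for u, z in enumerate(row):      # u indexes image columns
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

Nested lists keep the sketch dependency-free; with real images the same formula is normally applied to whole arrays at once. The points are expressed in the camera frame, and their scale is only metric if the predicted depth is.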
Hi Vitor,
in
packnet-sfm/packnet_sfm/losses/multiview_photometric_loss.py
Lines 156 to 157 in f824ffc
Thanks for your time and patience for explaining your code to all of us ^^
Hi,
I want to use the training code on my own dataset. However, I'm not sure what information I need for training.
Do we need point clouds in addition to the images of our own dataset?
Also, in the train_kitti.yaml config file:
" params:
crop: 'garg'
min_depth: 0.0
max_depth: 80.0 "
What are params, and how did you determine the min and max depth? Are they based only on the KITTI dataset or on something else?
Thank you so much!
Soheil.
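To illustrate the min_depth / max_depth question above: in KITTI-style depth evaluation, ground-truth pixels outside a chosen depth range are commonly ignored before error metrics are computed. The sketch below assumes the params act as this kind of evaluation filter, which is an assumption here (the repository code is the authority on what they do). The helper abs_rel_in_range is hypothetical and works on flat lists of per-pixel values.

```python
def abs_rel_in_range(pred, gt, min_depth=0.0, max_depth=80.0):
    """Mean |pred - gt| / gt over pixels with min_depth < gt < max_depth."""
    errors = [abs(p - g) / g
              for p, g in zip(pred, gt)
              if min_depth < g < max_depth]
    if not errors:
        raise ValueError("no ground-truth pixels inside the depth range")
    return sum(errors) / len(errors)


# Pixels with ground truth 100.0 (too far) and 0.0 (no measurement) are ignored.
example = abs_rel_in_range([10.0, 22.0, 50.0, 5.0], [8.0, 20.0, 100.0, 0.0])
```

For a different sensor or scene type, the range would presumably be chosen to match the distances at which the ground truth is reliable.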
Hi! When I train the SelfSupModel, I find that memory usage grows slowly (from 21G to 38G in one epoch). Does that mean there is a memory leak? Is there any solution?
Thanks for your contribution first of all.
The paper states "an inference time of 60ms on a Titan V100 GPU". If I'm not mistaken, such a GPU does not exist. It is either the "Titan V" or the "Tesla V100". Could you clarify please?
Hi,
first of all, I want to join the others in congratulating you for your great work and for even providing this well documented code! Thanks.
I am currently trying to write a data loader for the NuScenes dataset. The data split files are formatted as follows
sample_token | backward_context_png | forward_context_png
and the sample is then constructed using this routine
However, my training results look as follows
One problem I encountered, which I am not certain whether it's related or not, is that I had to change the cfg.checkpoint.monitor entry from "loss" to "abs_rel", since the entry "loss" is not initialized if training is False.
My questions are:
Thanks in advance for your time and help.
Is there any possibility of using the network with a 4 GB GPU? Thanks in advance.
I was trying to use the image_dataset to train on my own dataset. However, it did not work: it could not detect the dataset that I provided. It gives me this error:
`### Preparing Model
Model: SelfSupModel
DepthNet: PackNet01
PoseNet: PoseNet
######### 0 (x1): ./data/datasets/CITY_tiny/city_tiny.txt
######### 0: ./data/datasets/CITY_tiny/city_tiny.txt
######### 0: ./data/datasets/CITY_tiny/city_tiny.txt
######### 0: ./data/datasets/CITY_tiny/city_tiny.txt
########################################################################################################################
########################################################################################################################
config:
-- name: default_config-train_city_tiny-2020.07.27-19h11m01s
-- debug: False
-- arch:
---- seed: 42
---- min_epochs: 1
---- max_epochs: 50
-- checkpoint:
---- filepath: ./data/experiments_new/default_config-train_city_tiny-2020.07.27-19h11m01s/{epoch:02d}_{CITY_tiny-city_tiny-abs_rel_pp_gt:.3f}
---- save_top_k: 5
---- monitor: CITY_tiny-city_tiny-abs_rel_pp_gt
---- monitor_index: 0
---- mode: min
---- s3_path:
---- s3_frequency: 1
---- s3_url:
-- save:
---- folder:
---- depth:
------ rgb: True
------ viz: True
------ npz: True
------ png: True
---- pretrained:
-- wandb:
---- dry_run: True
---- name:
---- project:
---- entity:
---- tags: []
---- dir:
---- url:
-- model:
---- name: SelfSupModel
---- checkpoint_path:
---- optimizer:
------ name: Adam
------ depth:
-------- lr: 0.0002
-------- weight_decay: 0.0
------ pose:
-------- lr: 0.0002
-------- weight_decay: 0.0
---- scheduler:
------ name: StepLR
------ step_size: 30
------ gamma: 0.5
------ T_max: 20
---- params:
------ crop: garg
------ min_depth: 0.0
------ max_depth: 80.0
---- loss:
------ num_scales: 4
------ progressive_scaling: 0.0
------ flip_lr_prob: 0.5
------ rotation_mode: euler
------ upsample_depth_maps: True
------ ssim_loss_weight: 0.85
------ occ_reg_weight: 0.1
------ smooth_loss_weight: 0.001
------ C1: 0.0001
------ C2: 0.0009
------ photometric_reduce_op: min
------ disp_norm: True
------ clip_loss: 0.0
------ padding_mode: zeros
------ automask_loss: True
------ velocity_loss_weight: 0.1
------ supervised_method: sparse-l1
------ supervised_num_scales: 4
------ supervised_loss_weight: 0.9
---- depth_net:
------ name: PackNet01
------ checkpoint_path:
------ version: 1A
------ dropout: 0.0
---- pose_net:
------ name: PoseNet
------ checkpoint_path:
------ version:
------ dropout: 0.0
-- datasets:
---- augmentation:
------ image_shape: (192, 640)
------ jittering: (0.2, 0.2, 0.2, 0.05)
---- train:
------ batch_size: 1
------ num_workers: 16
------ back_context: 1
------ forward_context: 1
------ dataset: ['Image']
------ path: ['./data/datasets/CITY_tiny']
------ split: ['city_tiny.txt']
------ depth_type: ['']
------ cameras: [[]]
------ repeat: [1]
------ num_logs: 5
---- validation:
------ batch_size: 1
------ num_workers: 8
------ back_context: 0
------ forward_context: 0
------ dataset: ['Image', 'Image']
------ path: ['./data/datasets/CITY_tiny', './data/datasets/CITY_tiny']
------ split: ['city_tiny.txt', 'city_tiny.txt']
------ depth_type: ['', '']
------ cameras: [[], []]
------ num_logs: 5
---- test:
------ batch_size: 1
------ num_workers: 8
------ back_context: 0
------ forward_context: 0
------ dataset: ['Image']
------ path: ['./data/datasets/CITY_tiny']
------ split: ['city_tiny.txt']
------ depth_type: ['']
------ cameras: [[]]
------ num_logs: 5
-- config: ./configs/train_city_tiny.yaml
-- default: configs/default_config
-- prepared: True
########################################################################################################################
########################################################################################################################
0.00 images [00:00, ? images/s]
Traceback (most recent call last):
File "scripts/train.py", line 64, in <module>
train(args.file)
File "scripts/train.py", line 59, in train
trainer.fit(model_wrapper)
File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 58, in fit
self.train(train_dataloader, module, optimizer)
File "/workspace/packnet-sfm/packnet_sfm/trainers/horovod_trainer.py", line 97, in train
return module.training_epoch_end(outputs)
File "/workspace/packnet-sfm/packnet_sfm/models/model_wrapper.py", line 219, in training_epoch_end
loss_and_metrics = average_loss_and_metrics(output_batch, 'avg_train')`
I'm afraid that the way I set up my own dataset is wrong, or that there is some similar issue. I would appreciate it if you could help me and tell me how to use Image as the dataset.
Is there a .txt file containing the training files, like the eigen_zhou_files you provide in the README?
I would appreciate it if you could share the dataset file for CS and its list of training files. Thanks!
Hi,
In commit f41342d, when you run make docker-run and other Docker-related targets, the default working directory is set to /workspace/packnet-sfm/, but everything (code, data, confs...) is prepared under /workspace/packnet/.
This causes python3 scripts/train.py checkpoint.ckpt to fail because train.py is at ../packnet/scripts/train.py.
Thanks for such excellent work! I notice that you extract camera pose from IMU and just use translation in velocity loss. Is it possible to replace the PoseNet by the camera pose from IMU?
I was checking the KITTI dataset that you used and noticed that there are right and left camera images for the training process. Is it possible to use only one camera for training with PackNet (that is, without using both left and right camera images)?
Thanks,
Soheil
Hi,
First off, really good work :) Are there any plans to add the code for the two-stage training technique? Is it handled as a post-processing step, by creating point clouds from the depth estimates and applying RANSAC to determine the ground plane?
Hi,
Can you please share the code you used in one of your experiments for constructing the traversed route from the sequence of pose estimations?
For some reason I’m having trouble with that..
Thanks!
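For the route question above, a common way to build a trajectory is to chain the relative pose estimates into a running camera-to-world transform and read off its translation. The sketch below is not the authors' code. It assumes each estimate is a 4x4 homogeneous matrix that maps points from frame k+1 into frame k; if the network predicts the opposite direction, each matrix would need to be inverted first. The helper names matmul4 and accumulate_trajectory are hypothetical.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def accumulate_trajectory(relative_poses):
    """Chain relative 4x4 poses and return the camera positions.

    Assumes relative_poses[k] maps points from frame k+1 into frame k.
    The first camera is placed at the origin.
    """
    current = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    positions = [(0.0, 0.0, 0.0)]
    for rel in relative_poses:
        current = matmul4(current, rel)  # pose of frame k+1 in the first frame
        positions.append((current[0][3], current[1][3], current[2][3]))
    return positions
```

A frequent source of trouble is multiplying in the wrong order or mixing up the direction of the transform, so it is worth testing on a short sequence with known motion. For models trained without any scale supervision, the recovered route is only defined up to an unknown scale.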
Hello,
I'm trying to train a model with VelSupModel, and I found that only a single GPU is being used. How can I perform multi-GPU training? Thanks.
Hi!
Thanks for your great work! I'm curious how the model is forced to learn depth in meters, as the DepthDecoder only predicts values in the range 0~1.
As I understand it, the PoseNet may predict the true scale with the velocity supervision, but the output of the DepthNet must be multiplied by a factor to restore the correct scale. Is that right?
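One widely used way to express the "multiply by a factor" idea above, when evaluating scale-ambiguous monocular depth, is median scaling against ground truth. The sketch below illustrates that general technique and is not a statement about how this repository recovers metric scale. The helper median_scale is hypothetical and works on flat lists of per-pixel values.

```python
from statistics import median


def median_scale(pred, gt):
    """Scale factor that aligns the median of pred with the median of gt.

    Only pixels with valid (positive) ground truth are used.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return median(g for _, g in pairs) / median(p for p, _ in pairs)


# The last pixel has no ground truth (0.0) and is ignored.
scale = median_scale([0.1, 0.2, 0.3, 0.9], [2.0, 4.0, 6.0, 0.0])
scaled_pred = [p * scale for p in [0.1, 0.2, 0.3, 0.9]]
```

A model that is scale-aware by construction, for example through velocity supervision as discussed above, would ideally yield a factor close to 1.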