abhi1kumar / deviant

[ECCV 2022] Official PyTorch Code of DEVIANT: Depth Equivariant Network for Monocular 3D Object Detection

Home Page: https://arxiv.org/abs/2207.10758

License: MIT License

CMake 0.01% Shell 0.80% C++ 64.72% Python 34.47%
object-detection 3d-object-detection autonomous-driving eccv2022 equivariance monocular-3d-detection monocular-3d-localization one-stage-detector depth-equivariance eccv

deviant's Introduction

Modern neural networks use building blocks such as convolutions that are equivariant to arbitrary 2D translations $(t_u, t_v)$. However, these vanilla blocks are not equivariant to arbitrary 3D translations $(t_x, t_y, t_z)$ in the projective manifold. Even so, all monocular 3D detectors use vanilla blocks to obtain the 3D coordinates, a task for which vanilla blocks are not designed. This paper takes the first step towards convolutions equivariant to arbitrary 3D translations in the projective manifold. Since depth is the hardest quantity to estimate for monocular detection, this paper proposes the Depth EquiVarIAnt NeTwork (DEVIANT), built with existing scale equivariant steerable blocks. As a result, DEVIANT is equivariant to depth translations $(t_z)$ in the projective manifold, whereas vanilla networks are not. The additional depth equivariance forces DEVIANT to learn consistent depth estimates, and therefore DEVIANT achieves state-of-the-art monocular 3D detection results on the KITTI and Waymo datasets in the image-only category and performs competitively with methods that use extra information. Moreover, DEVIANT works better than vanilla networks in cross-dataset evaluation.
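In equation form, equivariance of a feature extractor Φ to a transformation group reads as follows (notation ours, summarizing the claim above rather than quoting the paper):

% Transforming the input and then extracting features equals extracting
% features and then applying a (possibly different) transformation:
\Phi\bigl(T_g\, x\bigr) = T'_g\, \Phi(x) \qquad \forall\, g
% Vanilla convolutions satisfy this for 2D image translations (t_u, t_v);
% DEVIANT additionally satisfies it for depth translations t_z, which act on
% the image as scalings under pinhole projection, hence the SES blocks.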

Much of the codebase is based on GUP Net. Some implementations are from GrooMeD-NMS and PCT. Scale Equivariant Steerable (SES) implementations are from SiamSE.

Citation

If you find our work useful in your research, please consider starring the repo and citing:

@inproceedings{kumar2022deviant,
   title={{DEVIANT: Depth EquiVarIAnt NeTwork for Monocular $3$D Object Detection}},
   author={Kumar, Abhinav and Brazil, Garrick and Corona, Enrique and Parchami, Armin and Liu, Xiaoming},
   booktitle={ECCV},
   year={2022}
}

Setup

  • Requirements

    1. Python 3.7
    2. PyTorch 1.10
    3. Torchvision 0.11
    4. CUDA 11.3
    5. Ubuntu 18.04/Debian 8.9

This code has been tested with an NVIDIA A100 GPU; other platforms have not been tested. Clone the repo first. Unless otherwise stated, the scripts and instructions below assume the working directory is DEVIANT:

git clone https://github.com/abhi1kumar/DEVIANT.git
cd DEVIANT
  • CUDA & Python

Build the DEVIANT environment by installing the requirements:

conda create --name DEVIANT --file conda_GUP_environment_a100.txt
conda activate DEVIANT
pip install opencv-python pandas
  • KITTI, nuScenes and Waymo Data

Follow the instructions in data_setup_README.md to set up KITTI, nuScenes and Waymo as follows:

DEVIANT
├── data
│      ├── KITTI
│      │      ├── ImageSets
│      │      ├── kitti_split1
│      │      ├── training
│      │      │     ├── calib
│      │      │     ├── image_2
│      │      │     └── label_2
│      │      │
│      │      └── testing
│      │            ├── calib
│      │            └── image_2
│      │
│      ├── nusc_kitti
│      │      ├── ImageSets
│      │      ├── training
│      │      │     ├── calib
│      │      │     ├── image
│      │      │     └── label
│      │      │
│      │      └── validation
│      │            ├── calib
│      │            ├── image
│      │            └── label
│      │
│      └── waymo
│             ├── ImageSets
│             ├── training
│             │     ├── calib
│             │     ├── image
│             │     └── label
│             │
│             └── validation
│                   ├── calib
│                   ├── image
│                   └── label
│
├── experiments
├── images
├── lib
├── nuscenes-devkit        
│ ...
  • AP Evaluation

Run the following to generate the KITTI binaries corresponding to R40:

sudo apt-get install libopenblas-dev libboost-dev libboost-all-dev gfortran
sh data/KITTI/kitti_split1/devkit/cpp/build.sh

Finally, we set up the Waymo evaluation. The Waymo evaluation uses a separate environment, py36_waymo_tf, to avoid package conflicts with our DEVIANT environment:

# Set up environment
conda create -n py36_waymo_tf python=3.7
conda activate py36_waymo_tf
conda install cudatoolkit=11.3 -c pytorch

# Newer versions of tf are not available in conda; install tf 2.4 via pip.
pip install tensorflow-gpu==2.4
conda install pandas
pip3 install waymo-open-dataset-tf-2-4-0 --user

To verify that your Waymo evaluation is working correctly, pass the ground truth labels as predictions for a sanity check. Type the following:

/mnt/home/kumarab6/anaconda3/envs/py36_waymo_tf/bin/python -u data/waymo/waymo_eval.py --sanity

You should see AP numbers of 100 for every entry after running this sanity check.

Training

Train the model:

chmod +x scripts_training.sh
./scripts_training.sh

The current Waymo config files use the full val set during training. For the Waymo models, we subsampled the Waymo validation set by a factor of 10 (4k images) to save training time, as in DD3D. Change val_split_name from 'val' to 'val_small' in the Waymo configs to use the subsampled Waymo val set.

Testing Pre-trained Models

Model Zoo

We provide logs, models and predictions for the main experiments on the KITTI Val, KITTI Test and Waymo Val data splits, available to download here.

| Data_Splits | Method  | Config (Run) | Weight/Pred | Metrics | All (0.7) | Easy (0.7) | Med (0.7) | Hard (0.7) | All (0.5) | Easy (0.5) | Med (0.5) | Hard (0.5) |
|-------------|---------|--------------|-------------|---------|-----------|------------|-----------|------------|-----------|------------|-----------|------------|
| KITTI Val   | GUP Net | run_201      | gdrive      | AP40    | -         | 21.10      | 15.48     | 12.88      | -         | 58.95      | 43.99     | 38.07      |
| KITTI Val   | DEVIANT | run_221      | gdrive      | AP40    | -         | 24.63      | 16.54     | 14.52      | -         | 61.00      | 46.00     | 40.18      |
| KITTI Test  | DEVIANT | run_250      | gdrive      | AP40    | -         | 21.88      | 14.46     | 11.89      | -         | -          | -         | -          |
| Waymo Val   | GUP Net | run_1050     | gdrive      | APH-L1  | 2.27      | 6.11       | 0.80      | 0.03       | 9.94      | 24.59      | 4.78      | 0.22       |
| Waymo Val   | DEVIANT | run_1051     | gdrive      | APH-L1  | 2.67      | 6.90       | 0.98      | 0.02       | 10.89     | 26.64      | 5.08      | 0.18       |

Testing

Make an output folder in the DEVIANT directory:

mkdir output

Place models in the output folder as follows:

DEVIANT
├── output
│      ├── config_run_201_a100_v0_1
│      ├── run_221
│      ├── run_250
│      ├── run_1050
│      └── run_1051
│
│ ...

Then, to test, run:

chmod +x scripts_inference.sh
./scripts_inference.sh

Cross-Dataset Evaluation of KITTI on nuScenes Frontal Val

See scripts_inference.sh

Qualitative Plots/Visualization

To get qualitative plots and visualize the predicted+GT boxes, type the following:

python plot/plot_qualitative_output.py --dataset kitti --folder output/run_221/results_test/data
python plot/plot_qualitative_output.py --dataset waymo --folder output/run_1051/results_test/data

Type the following to reproduce our other plots:

python plot/plot_sesn_basis.py
python plot/visualize_output_of_cnn_and_sesn.py

FAQ

  • Inference on an older CUDA version: type the following before running inference:
source cuda_9.0_env
  • Correct Waymo version: you should see a 16th column in each ground-truth file inside data/waymo/validation/label/. This column is num_lidar_points_per_box. If you do not see it, run:
cd data/waymo
python waymo_check.py 

to check whether num_lidar_points_per_box is printed. If nothing is printed, you are using the wrong Waymo dataset version and should download the correct one.
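As a quick stand-in for what waymo_check.py verifies, a minimal equivalent snippet (the .txt extension and exact field layout are assumptions here) is:

import glob

# Check the first object line of a few ground-truth files in
# data/waymo/validation/label/ for a 16th column (num_lidar_points_per_box).
for path in sorted(glob.glob("data/waymo/validation/label/*.txt"))[:5]:
    with open(path) as f:
        fields = f.readline().split()
    if fields:
        status = fields[15] if len(fields) >= 16 else "MISSING"
        print(path, "columns:", len(fields), "num_lidar_points_per_box:", status)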

  • Cannot convert a symbolic Tensor (strided_slice:0) to a numpy array: this error indicates that a Tensor is being passed to a NumPy call, which means you have the wrong numpy version. Install the correct numpy with:
pip install numpy==1.19.5

Acknowledgements

We thank the authors of GUP Net, GrooMeD-NMS, SiamSE, PCT and patched nuscenes-devkit for their awesome codebases. Please also consider citing them.

Contributions

We welcome contributions to the DEVIANT repo. Feel free to raise a pull request.


Contact

For questions, feel free to post here or drop an email to this address: [email protected]

deviant's People

Contributors

abhi1kumar

deviant's Issues

About the Total Loss Function

I have encountered a problem that I do not quite understand; it exists in multiple monocular 3D object detection algorithms.
The total loss in the algorithm is obtained by summing multiple loss terms, but some of these terms can become negative during training, as shown in the attached loss curves (screenshots omitted). Will this simple summation have a negative impact on the total loss?
Through investigation I found that the negative values come from laplacian_aleatoric_uncertainty_loss.
I have also tried offsetting the negative loss terms so that all terms stay positive, but the resulting model performance is not ideal.
I have not fully understood this and hope to receive guidance. Looking forward to your reply, thank you very much.
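For reference, a minimal sketch of the Laplacian aleatoric uncertainty loss in question (the form used in GUP Net-style depth heads; the exact implementation in this repo may differ), showing why negative values are expected:

import torch

def laplacian_aleatoric_uncertainty_loss(pred, target, log_sigma):
    # Negative log-likelihood of a Laplace distribution (up to constants):
    # sqrt(2) * exp(-log_sigma) * |pred - target| + log_sigma
    return 1.4142 * torch.exp(-log_sigma) * torch.abs(pred - target) + log_sigma

# When the predicted uncertainty is small (log_sigma < 0) and the error is
# also small, the log_sigma term dominates and the loss goes negative. A
# negative log-likelihood is unbounded below; only its gradient matters.
print(laplacian_aleatoric_uncertainty_loss(torch.tensor(10.0),
                                           torch.tensor(10.05),
                                           torch.tensor(-2.0)))  # ≈ -1.48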

Cannot find nuScenes blobs for converting to KITTI format in data_setup_README

First, many thanks for your work. I am new to object detection and have spent a lot of time on this data_setup part, yet I cannot set up the data in your structure.
A light suggestion for the step "Download the nuScenes and Waymo datasets": as I understand it, these two datasets can be downloaded somewhere outside the project, because a soft link is used to connect them to the DEVIANT project.

My confusion lies in this step:
Then follow the instructions at convert_nuscenes_to_kitti_format_and_evaluate.sh to get nusc_kitti_org folder.

I think I should download the datasets following your and nuScenes' GitHub instructions, but I found it hard to follow convert_nuscenes_to_kitti_format_and_evaluate.sh, as I do not have v1.0-trainval#number_blobs_camera.tgz, v1.0-trainval01_blobs_lidar.tgz, or many of the other directories referenced in this .sh file.

So I am not able to generate the nusc_kitti_org folder and continue.

There may be some errors in class

Regarding rect_to_img: in your project, I used my own training data to train the model and found that some targets could never be trained. I searched for the problem and traced it to the projection onto the image.
[Screenshots of pts_rect, P2, and the resulting projection omitted.]
But my image size is 2560×1150, and the projected 3D coordinates exceeded the image boundary, so these targets could not enter training.
According to the coordinate transformation rules (screenshot omitted), I think this line should be

pts_img = (pts_2d_hom[:, 0:2].T / pts_2d_hom[:, 2]).T

rather than

pts_img = (pts_2d_hom[:, 0:2].T / pts_rect_hom[:, 2]).T

However, after changing this, I still have not achieved ideal results. Is my thinking correct? If there is an error, please point it out; if it is correct, are there any other related places that need to be changed, and why did I not achieve the desired result?
Sincerely in need of help, thank you very much
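For reference, a minimal sketch of the projection being discussed, assuming P2 is the 3×4 camera projection matrix (the function name follows the issue; the actual implementation in the repo may differ):

import numpy as np

def rect_to_img(P2, pts_rect):
    # pts_rect: (N, 3) points in the rectified camera frame
    pts_rect_hom = np.hstack([pts_rect, np.ones((pts_rect.shape[0], 1))])  # (N, 4)
    pts_2d_hom = pts_rect_hom @ P2.T                                       # (N, 3)
    # Divide by the third homogeneous coordinate of the projected point, as the
    # issue suggests; for KITTI's P2 this nearly equals the depth
    # pts_rect_hom[:, 2], but the two differ whenever P2[2, 3] != 0.
    pts_img = pts_2d_hom[:, 0:2] / pts_2d_hom[:, 2:3]
    return pts_img

# Example usage with a dummy intrinsics-only P2
P2 = np.array([[2173.4, 0.0, 961.9, 0.0],
               [0.0, 2322.0, 588.3, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
print(rect_to_img(P2, np.array([[1.0, 0.5, 20.0]])))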

CUDA memory issue and multi-GPU training

Hi,
First of all, thanks for the code.
I installed everything by the steps you provided, and I'm trying to run only the deviant model using the following cmd taken from scripts_training.sh:

CUDA_VISIBLE_DEVICES=0 python -u tools/train_val.py --config=experiments/run_221.yaml

My problem is that I get a CUDA out-of-memory error in the first epoch, just after the weights are logged.
I did manage to run the code on a very small subset of the KITTI dataset (100 images).
Do you have any advice on how to approach this error?

How can I visualize the 3D boxes?

Hi all!
I managed to run the experiments on kitti, no issues.
But I would like to run the model and visualize the output of a single image. Is there a script to do that?
thank you!

Run on raw live video

I downloaded the pre-trained weights. I would like to use the KITTI weights to run on my raw video/webcam and get the output with 3D boxes and a bird's-eye view. How can I do that? Also, how do I include my extrinsic and intrinsic camera calibration parameters?

Inference Error

Firstly, thanks for your excellent work!

When I try to validate my checkpoint on the KITTI validation set, i.e. CUDA_VISIBLE_DEVICES=7 python -u tools/train_val.py --config=experiments/run_221.yaml --resume_model output/run_221/checkpoints/checkpoint_epoch_20.pth -e,
I get FileNotFoundError: [Errno 2] No such file or directory: 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object': 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object'

The log is as follows:

Traceback (most recent call last):
  File "tools/train_val.py", line 155, in <module>
    main()
  File "tools/train_val.py", line 132, in main
    tester.test()
  File "/home/linhb/code/DEVIANT-main/code/lib/helpers/tester_helper.py", line 118, in test
    use_logging= True, logger= self.logger)
  File "/home/linhb/code/DEVIANT-main/code/lib/helpers/rpn_util.py", line 254, in evaluate_kitti_results_verbose
    results_obj.main = run_kitti_eval_script(eval_binary_path, results_data= stats_save_folder, gt_folder= gt_folder, lbls= lbls, use_40=True)
  File "/home/linhb/code/DEVIANT-main/code/lib/helpers/rpn_util.py", line 345, in run_kitti_eval_script
    _ = subprocess.check_output([eval_binary_path, results_data, gt_folder], stderr=devnull)
  File "/home/linhb/miniconda3/envs/DEVIANT/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/home/linhb/miniconda3/envs/DEVIANT/lib/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/home/linhb/miniconda3/envs/DEVIANT/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/linhb/miniconda3/envs/DEVIANT/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object': 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object'

Any help?

Testing on Rope3D Dataset

I converted the Rope3D dataset to KITTI format and tried to test the model with the KITTI pre-trained weights you provided, but the result is not very satisfactory.
A description of the data after converting Rope3D:

  1. Original image resolution: 1920×1080; validation with the resolution set to 960×512
  2. calib:
    P0: 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00
    P1: 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00
    P2: 2.173379882812e+03 0.000000000000e+00 9.618704833984e+02 0.000000000000e+00 0.000000000000e+00 2.322043945312e+03 5.883443603516e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
    P3: 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00
    R0_rect: 1.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00
    Tr_velo_to_cam: 1.994594966642e-03 -9.998606204387e-01 1.657520384002e-02 -1.115697257486e-01 -2.372202408477e-01 -1.657520384002e-02 -9.713144706380e-01 6.538036584690e+00 9.714538501993e-01 -1.994594966642e-03 -2.372202408477e-01 1.596758475422e+00
    Tr_imu_to_velo: 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00
  3. Visualisation of validation results (screenshot omitted)

So I have the following thoughts:

  1. For KITTI, the data is captured with the camera on the collection vehicle roughly parallel to the ground. In Rope3D, the camera is mounted on a roadside traffic-light gantry and is not parallel to the ground. So the geometric projection priors used in conventional 3D object detection do not apply to roadside datasets like Rope3D, do they?
  2. In addition, when validating my private dataset I only provided the P2 intrinsics, but I do not know how to incorporate the rotation and translation matrices of the extrinsics (I think this is a gap in my knowledge base). I could not find a corresponding question and answer, so I am asking here and hope you can answer.
  3. As a next step, I would like to train on Rope3D. Thank you very much for your outstanding contribution and activity. Salute!

P-matrix in KITTI

What exactly does the P matrix in the KITTI calib files mean?
According to my search, it is the product of the intrinsic and extrinsic parameters of the camera, where the extrinsic parameters are the rotation matrix and translation vector.
Here is a simple matrix I obtained using checkerboard calibration in MATLAB. It seems to have problems at [0,3] and [1,3]. When I use this matrix to replace P2 in calib, the inference results are empty. It seems worse than before, when you suggested that I use a matrix containing only the intrinsic parameters without the extrinsics. It is possible to get inference results using only the intrinsics, except that the 3D boxes do not match exactly.
I'm sorry to bother you again, but this question has been bothering me for a long time and I can't find any useful information!
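For reference, the standard pinhole decomposition being asked about, in general form (KITTI's P2 additionally bakes in rectification and the offset of camera 2 from the reference camera, which is why its last column is non-zero):

% Projection matrix = intrinsics times extrinsics (3x4):
P = K\,[\,R \mid t\,], \qquad
K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
% A homogeneous 3D point \tilde{X} projects to pixel coordinates via
\tilde{x} = P\,\tilde{X}, \qquad (u, v) = \bigl(\tilde{x}_1/\tilde{x}_3,\; \tilde{x}_2/\tilde{x}_3\bigr)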

Local vs Global Orientation

I'm very sorry, but I have come across another place that I do not quite understand and would like to ask for advice.
The KITTI dataset is annotated with alpha, so why obtain alpha by converting from ry? I also found through testing that the converted alpha and the annotated alpha are not exactly the same. I really do not understand why everyone does this in practice.
Hope to receive guidance, thank you very much
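For reference, the standard KITTI relation between the global yaw ry and the observation angle alpha is alpha = ry - arctan(x/z); a minimal sketch (the helper name is ours, not from the repo):

import numpy as np

def ry_to_alpha(ry, x, z):
    # alpha is the yaw observed along the camera ray through the object at
    # camera coordinates (x, z); wrap the result to [-pi, pi).
    alpha = ry - np.arctan2(x, z)
    return (alpha + np.pi) % (2 * np.pi) - np.pi

# An object straight ahead (x = 0) has alpha == ry; off-center objects differ
# by their viewing angle, and small mismatches with the annotated alpha are
# usually annotation rounding.
print(ry_to_alpha(ry=0.5, x=5.0, z=20.0))  # ≈ 0.255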

nan loss after 5 epochs on custom dataset

Hi,
Thanks for sharing your work.
I was training on a custom dataset.
The losses after 6 epochs are nan. I tried reducing the learning rate, but that did not help either. Wondering whether you, @abhi1kumar, encountered this issue while training.

INFO  ------ TRAIN EPOCH 006 ------
INFO  Learning Rate: 0.001250
INFO  Weights:  depth_:nan, heading_:nan, offset2d_:1.0000, offset3d_:nan, seg_:1.0000, size2d_:1.0000, size3d_:nan,
INFO  BATCH[0020/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,
INFO  BATCH[0040/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,
INFO  BATCH[0060/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,
INFO  BATCH[0080/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,
INFO  BATCH[0100/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,
INFO  BATCH[0120/3150] depth_loss:nan, heading_loss:nan, offset2d_loss:nan, offset3d_loss:nan, seg_loss:nan, size2d_loss:nan, size3d_loss:nan,

Before epoch 6, the losses decrease as expected.

For submission on KITTI 3D OD benchmark

Hello. Thank you for the great research.

I have two questions.

  1. When submitting to the KITTI dataset benchmark, is it correct to use all 7,481 images in the training folder for training?

  2. If so, could you please explain how you validated the model before submitting to the test set?

Thank you.

Waymo Dataset Filtering

Hi,
congratulations on your very nice paper!

I would have a question regarding Waymo. In the paper you mention that you filter out objects with depth <= 2m and objects with too few lidar points (car: 100, pedestrian / cyclists: 50). In general I think it makes sense to do that.

I wonder whether that is even strict enough. Here is an example where I used your approach for data generation and plotted the results (image omitted).
Here are the labels for the two cars on the right, which are not even visible anymore:
Car 0 0 -10 1858.27 625.86 1920.0 872.42 1.84 2.14 4.75 9.95 1.78 17.5 1.52 1094
Car 0 0 -10 1769.02 728.22 1920.0 1280.0 1.8 2.16 4.86 4.17 2.15 5.16 -1.62 7286

Do you do more filtering that I am not aware of at the moment?
And do you also filter the ground truth labels in the same way for evaluation as for training? If not, what is the difference?

Best wishes

Johannes

evaluate result is too low

Hello, thanks for your great work~
I trained for 20 epochs on 2× GTX 1080, but I find the evaluation result is too low; please help.

Here is the comparison between yours (left) and mine (right) (screenshot omitted):

Here is my training log:
20230802_171210.txt

Waymo Dataset Training Epochs

I'm trying to train DEVIANT on Waymo, and I read 1051.yaml. Is it OK to train for 30 epochs on Waymo, or is that just a pre-trained model config and I should train for more epochs?

Failure during Multi-GPU evaluation

Hi,
I'm encountering an error in the first eval epoch (screenshot of the error omitted).
I am running the gupnet model training:

CUDA_VISIBLE_DEVICES=0,1 python -u tools/train_val.py --config=experiments/run_221.yaml 

I successfully trained the model on a sub-dataset of only 300 images. The error appears when I train on the full dataset.
Any suggestions?

Nuscenes evalution

Thanks for your great work!

Do you happen to know of methods for evaluating on nuScenes with metrics other than the MAE used in your code, such as mAP?

Waymo Dataset converter.py

First of all, thank you for the great work. I used the converter as instructed to convert the Waymo dataset. Afterward, I counted the number of images and calibrations in the Waymo validation_org set and found that there were 39,987 of each, but only 39,047 labels. When I applied the setup_split, I discovered that some data had not been converted due to the lack of labels. As a result, I only have 51,257 training samples and 38,960 validation samples, while your paper states that there are 52,386 training and 39,848 validation samples. How can I obtain the same number of samples as described in the paper (52,386, 39,848)?

Waymo dataset convert

Thanks for the great work. I used the conversion script you provided to convert to the KITTI format, but there are no labels in the label_all folder. I then tried the label.zip from #5 and found that it does not correspond to the converted images. Do you have any experience with this?

Understanding hard-coded values

Thank you again for your great work. I am still failing to make it work on another custom dataset. I tried getting more images, now more than 10,000, but the model still does not seem to make any predictions at all. Looking at the code, I noticed certain assumptions about the maximum depth in kitti_utils.py that were not mentioned elsewhere:

  • L302 : np.linspace(2,78,wsize*hsize).reshape(hsize,wsize,1)],-1)).reshape(-1,3)

  • L333 : random_depth = np.linspace(2,78,wsize*hsize).reshape(hsize,wsize,1)

Is there any documentation of similar assumptions made to fit the KITTI dataset?

Shape of tensor a does not match the shape of tensor b

I am trying to work with the model on a custom dataset. I made a config file which is very similar to run_221.yaml.
I changed the dataset type (I created another one to fit my custom classes and dimensions) and the resolution. The resolution of my images is W×H = 1024×750. I believe the downsampling factor is causing the error, which states that the shape of tensor a (188,) does not match the shape of tensor b (187,). Note that 750/4 = 187.5, right in the middle of the shape mismatch.
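A minimal sketch of the divisibility constraint described above (the stride of 4 follows from the observation 750/4 = 187.5; the exact stride used by the network is an assumption here):

def round_up_to_stride(size, stride=4):
    # Round an image dimension up to the nearest multiple of the network
    # stride so the downsampled feature-map size is an integer.
    return ((size + stride - 1) // stride) * stride

# 750 is not divisible by 4 (feature height is ambiguous between 187 and 188),
# whereas 752 and 1024 are.
print(round_up_to_stride(750), round_up_to_stride(1024))  # 752 1024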

After several tries of playing with the resolution, the one that worked and was closest to my original aspect ratio was 704×512. I would like to know whether there is some part of the code I could change so that I can keep my original image size. More generally, does this cause problems for the model to learn? Also, do I need to change the sesn_scales?

Thanks in advance!

I'm new to 3D object detection. I have run into some trouble; can you give me some advice? Thanks!

INFO ------ EVAL EPOCH 020 ------
Evaluation Progress: 100%|████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:54<00:00, 2.75s/it]
2023-10-07 15:28:18,751 INFO ==> Saving results in output/config_run_201_a100_v0_1/result_20
Traceback (most recent call last):
  File "/home/ys/DEVIANT/code/tools/train_val.py", line 158, in <module>
    main()
  File "/home/ys/DEVIANT/code/tools/train_val.py", line 152, in main
    trainer.train()
  File "/home/ys/DEVIANT/code/lib/helpers/trainer_helper.py", line 87, in train
    self.eval_one_epoch()
  File "/home/ys/DEVIANT/code/lib/helpers/trainer_helper.py", line 207, in eval_one_epoch
    use_logging= True, logger= self.logger)
  File "/home/ys/DEVIANT/code/lib/helpers/rpn_util.py", line 254, in evaluate_kitti_results_verbose
    results_obj.main = run_kitti_eval_script(eval_binary_path, results_data= stats_save_folder, gt_folder= gt_folder, lbls= lbls, use_40=True)
  File "/home/ys/DEVIANT/code/lib/helpers/rpn_util.py", line 345, in run_kitti_eval_script
    _ = subprocess.check_output([eval_binary_path, results_data, gt_folder], stderr=devnull)
  File "/home/ys/.conda/envs/DEVIANT/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/home/ys/.conda/envs/DEVIANT/lib/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/home/ys/.conda/envs/DEVIANT/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/home/ys/.conda/envs/DEVIANT/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object': 'data/KITTI/kitti_split1/devkit/cpp/evaluate_object'
