
cvmi-lab / pla


(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR 2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding

License: Apache License 2.0

Python 89.46% C++ 5.12% Cuda 3.93% C 1.32% Shell 0.17%
3d-scene-understanding cvpr2023 deep-learning open-vocabulary open-world

pla's People

Contributors

dingry · jihanyang


pla's Issues

Bus error (core dumped) while running "generate_caption_idx.py"

Hello!
Thank you for the interesting work.
I am getting this error when I try to run the generate_caption_idx.py file. It happens when I put the train, val, and test folders generated by the pre-processing step used in PointGroup into the data/scannetv2 folder. Can you please advise what the underlying issue might be?
[screenshot]

About instance segmentation

Hi,
Thank you so much for the great work. I am writing to clarify a question regarding the instance segmentation:

I can see that you have modified the SoftGroup architecture by replacing anything that revealed novel category class labels.
My question is whether the ground-truth (GT) class-agnostic instance masks are needed during training. My understanding is yes, because that information is needed to train the Class-agnostic Score Head as well as the Offset Head, both of which require GT class-agnostic mask information.
Please correct me if I am wrong. Thank you so much for your time
Best,
Zhening

How can I segment any novel class I want? Do I need to train the model from scratch?

I want to reproduce the results of Figure 4 in the paper, segmenting unannotated categories in the dataset.
Do I need to modify the cfg file and then train from scratch, or can I directly use model weights like B13/N4 during inference?
When I directly used model weights like B13/N4 on ScanNet, added some unannotated novel classes to the cfg, and computed text embeddings online at inference time, I just got 'nan' on the novel classes.
So I want to ask: do I need to train from scratch when I add novel classes?
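
For reference, if the problem lies with the text embeddings, here is a minimal sketch of precomputing CLIP embeddings offline for an extended class list (the class names, prompt template, and output path are illustrative assumptions, not the repo's actual pipeline):

import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

# Hypothetical class list: base categories plus extra novel ones to segment.
class_names = ['wall', 'floor', 'chair', 'sofa', 'television']

model, _ = clip.load('ViT-B/16', device='cuda')
with torch.no_grad():
    tokens = clip.tokenize([f'a photo of a {c}' for c in class_names]).cuda()
    text_embed = model.encode_text(tokens).float()
    # L2-normalize so cosine logits stay bounded; unnormalized fp16
    # embeddings are a common source of nan scores at inference.
    text_embed = text_embed / text_embed.norm(dim=-1, keepdim=True)

torch.save(text_embed.cpu(), 'scannet_clip-ViT-B16_custom.pth')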

A Question About S3DIS

Hi, I noticed that the PLA method relies on 2D images to obtain the correspondence between natural language and point clouds. However, since S3DIS does not include 2D images, I am curious how you obtained this correspondence for that dataset.

question about training time

Thanks for the great work! I notice that 8 GPUs are used to train a model. I wonder what model of graphics card you use, and how long it takes for 8 GPUs to finish a training run?

How to get captions' corresponding point indices

Hi, thanks for your interesting work!
I'm wondering how you got the captions' corresponding point indices; in other words, how did you get these pickle files (e.g., scannetv2_entity_vit-gpt2_matching_idx.pickle)?

caption_idx

Hi, thanks for your great work!
I understand the meaning of equation (18), but I found in the code that the target of the loss function is caption_idx. Could you explain the exact meaning of caption_idx?

Why use "force_fp32"?

Hi,

I notice that some of the functions use "force_fp32" even though training runs under AMP. What is the rationale for this?
[screenshot]
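
For context, here is a sketch of the force_fp32 pattern as it works in mmcv-style codebases (illustrative of the general mechanism, not this repo's exact code): under AMP, numerically sensitive steps such as softmax-based losses are decorated so that their tensor arguments are cast back to fp32 regardless of the surrounding autocast state.

from mmcv.runner import force_fp32
import torch.nn as nn
import torch.nn.functional as F

class SegHead(nn.Module):
    def __init__(self):
        super().__init__()
        # mmcv only applies the cast when fp16_enabled is set on the module.
        self.fp16_enabled = True

    # Under AMP the forward pass may produce fp16 logits; the decorator
    # casts `seg_logits` back to fp32 before the loss, avoiding
    # overflow/underflow in softmax + cross-entropy.
    @force_fp32(apply_to=('seg_logits',))
    def loss(self, seg_logits, labels):
        return F.cross_entropy(seg_logits, labels)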

How to get the captions' 2D rendered images?

Hi, thanks for your interesting work!
I can only find the captions and the corresponding point indices, but I can't find the corresponding 2D rendered images. How can I get them?
Thanks a lot!!

2D version of S3DIS

Hello!
Your work is very interesting. Thank you for sharing it.
Could you please provide the link to the 2D version of the S3DIS dataset? I checked the Dataset.md file, but the link there is very confusing.

Regards
Rabbia Hassan

How to train a fully-supervised model?

Hi,

I want to train a fully-supervised baseline ("Fully-sup." in Table 1). Are the data augmentation, learning rate, and optimizer the same as for PLA?

RegionPLC codes

Hi, @Dingry,

Thanks for your great work.
When will you release the source code of RegionPLC?

Thanks a lot!

scripts to generate the captions for S3DIS dataset

Hello!
Can you please provide the link to the scripts for generating the captions for the S3DIS dataset? I believe they are not included in the scripts you shared in the Dataset.md file.

I look forward to your response

Thanks

Command to generate RegionPLC captions

Hello,

Thank you for your time and for sharing your work!

I'm interested in reproducing the generated caption dataset provided here. However, I couldn't find specific instructions on how to generate this dataset.

Could you please provide the exact command(s) or script used to produce the generated captions?
In particular, the one used for the scannet spconv_clip_base15 experiment, i.e., caption_detic-template_and_kosmos_125k_iou0.2.json and scannet_caption_idx_detic-template_and_kosmos_125k_iou0.2.pkl?
This would be extremely helpful for reproducing your results and understanding the generation process.

Question about table 5 in RegionPLC

Hi,

I have a question about Table 5 in RegionPLC, where you compare with OpenScene. OpenScene uses a different feature extractor from PLA. Do you use the same feature extractor for all the baselines in Table 5?

About Formula 10 in your paper

Hi, thanks for your interesting work.

I want to ask about the point selection for the view-level caption association. Can you explain why the voxelization in Formula 10 is necessary? Directly using the camera's intrinsic and extrinsic matrices can already give the pixel-point pairs for a view. Is that right?

Thanks for your attention.
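
For reference, the direct projection the question describes would look roughly like the sketch below, assuming a standard pinhole model with a 3x3 intrinsic matrix and a 4x4 world-to-camera extrinsic (names and conventions are illustrative):

import numpy as np

def project_points(points_world, extrinsic, intrinsic, h, w):
    """Project Nx3 world points into an HxW image; return pixel coords and a visibility mask."""
    # World -> camera frame (extrinsic is 4x4 world-to-camera).
    pts_h = np.concatenate([points_world, np.ones((len(points_world), 1))], axis=1)
    pts_cam = (extrinsic @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0
    # Camera -> pixel coordinates via the intrinsics, then perspective divide.
    uvw = (intrinsic @ pts_cam.T).T
    uv = uvw[:, :2] / np.clip(uvw[:, 2:3], 1e-8, None)
    in_image = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, in_front & in_image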

error when training on 1 gpu

Hi,
Thanks for releasing the code! When I train on S3DIS with one GPU, I get the error below. It seems that not all modules participate during training. How can I solve this?
[screenshot]
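
If the error is the usual DistributedDataParallel complaint about parameters that did not receive gradients (an assumption, since the screenshot is not reproduced here), a common workaround, not necessarily the authors' recommendation, is to enable unused-parameter detection when wrapping the model:

import torch.nn as nn

def wrap_model(model: nn.Module, local_rank: int) -> nn.Module:
    # find_unused_parameters=True tells DDP to tolerate submodules that
    # receive no gradient on a given iteration (e.g. heads that are only
    # active for some batches), at some synchronization cost.
    return nn.parallel.DistributedDataParallel(
        model.cuda(local_rank), device_ids=[local_rank],
        find_unused_parameters=True)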

Question about binary head

Thanks for your great work and high-quality open-source code! I have some questions regarding the paper.
Since the open-vocabulary problem can only access the annotations of the base categories during training, how are the ground-truth labels (i.e., y^{b} in Eq. (4) of the main paper) for the base and novel category sets determined when calculating the binary loss (Eq. (4) of the main paper)? Theoretically, any information about the novel classes should be unseen during training.
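
One plausible reading of how such binary targets can be derived without peeking at novel labels (a sketch of the general recipe, not the repo's exact code): points carrying a base-category annotation get label 1, and all remaining points, which include the unannotated novel ones, get label 0, so no novel-class identity is ever consulted.

import torch

def binary_targets(sem_labels: torch.Tensor, base_class_ids) -> torch.Tensor:
    """1 for points annotated with a base category, 0 for everything else.

    Only base annotations are consulted; unannotated (potentially novel)
    points fall into the 0 bucket without their identity being read.
    """
    base_ids = torch.tensor(base_class_ids, device=sem_labels.device)
    is_base = (sem_labels.unsqueeze(-1) == base_ids).any(dim=-1)
    return is_base.float()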

Training result on s3dis dataset is low

Hi! Thanks for your great work!

I followed the experiment settings in your paper: 8 A100s, batch size 32. But the results on S3DIS are as follows, far from the results reported in your paper. Do you have any suggestions for training?
s3dis B8/N4
[screenshot]

About the meanings of the data in a batch

Hi, thanks for the great work! I'm facing the problem that I don't know the exact meaning of each component in a batch, such as points_xyz_voxel_scale, v2p_map, original_idx, offsets, and so on.
Could you explain them and their usage? Thanks a lot!
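
As a rough orientation, inferred from how these fields are used elsewhere in the codebase rather than from any official documentation: points_xyz_voxel_scale holds point coordinates rescaled onto the integer voxel grid, v2p_map is the voxel-to-point mapping used to scatter/gather features between the sparse voxel tensor and the original points, original_idx records each point's index before augmentation and subsampling, and offsets stores cumulative per-scene point counts. A sketch of the offsets convention:

def scene_points(batch_dict, b):
    # offsets holds cumulative point counts, so scene b of the
    # concatenated batch is the half-open slice below (the same pattern
    # the caption head uses to slice origin_idx).
    offsets = batch_dict['offsets']
    return batch_dict['points'][offsets[b]:offsets[b + 1]]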

Training issue: OSError: [Errno 24] Too many open files

Hi~
When I run python train.py --cfg_file cfgs/scannet_models/spconv_clip_base15_caption_adamw.yaml, I get the following output:


2023-08-29 12:51:48,581   INFO  cfg_file         cfgs/scannet_models/spconv_clip_base15_caption_adamw.yaml
2023-08-29 12:51:48,583   INFO  batch_size       4
2023-08-29 12:51:48,584   INFO  epochs           128
2023-08-29 12:51:48,586   INFO  workers          4
...
2023-08-29 12:51:55,038   INFO  **********************Start training scannet_models/spconv_clip_base15_caption_adamw(default)**********************
2023-08-29 12:53:11,734   INFO  Epoch [1/128][20/1201] LR: 0.004, ETA: 6 days, 18:09:08, Data: 0.08 (0.42), Iter: 1.91 (3.80), Accuracy: 0.56, loss_seg=1.61, binary_loss=0.46, caption_view=0.17, caption_entity=0.15, loss=2.38, n_captions=13.2(13.1)
2023-08-29 12:53:51,830   INFO  Epoch [1/128][40/1201] LR: 0.004, ETA: 5 days, 3:52:20, Data: 0.05 (0.25), Iter: 1.97 (2.90), Accuracy: 0.54, loss_seg=1.55, binary_loss=0.29, caption_view=0.16, caption_entity=0.14, loss=2.15, n_captions=13.2(12.8)
2023-08-29 12:54:29,663   INFO  Epoch [1/128][60/1201] LR: 0.004, ETA: 4 days, 13:28:58, Data: 0.07 (0.19), Iter: 2.13 (2.56), Accuracy: 0.61, loss_seg=1.42, binary_loss=0.36, caption_view=0.17, caption_entity=0.15, loss=2.11, n_captions=16.2(12.8)
2023-08-29 12:55:09,310   INFO  Epoch [1/128][80/1201] LR: 0.004, ETA: 4 days, 7:14:50, Data: 0.09 (0.16), Iter: 2.17 (2.42), Accuracy: 0.67, loss_seg=1.28, binary_loss=0.35, caption_view=0.19, caption_entity=0.17, loss=1.99, n_captions=18.8(12.7)
2023-08-29 12:55:47,967   INFO  Epoch [1/128][100/1201] LR: 0.004, ETA: 4 days, 3:05:21, Data: 0.10 (0.14), Iter: 2.03 (2.32), Accuracy: 0.66, loss_seg=1.24, binary_loss=0.17, caption_view=0.16, caption_entity=0.15, loss=1.72, n_captions=13.2(12.7)
2023-08-29 12:56:31,885   INFO  Epoch [1/128][120/1201] LR: 0.004, ETA: 4 days, 2:10:41, Data: 0.06 (0.13), Iter: 1.77 (2.30), Accuracy: 0.70, loss_seg=1.20, binary_loss=0.21, caption_view=0.14, caption_entity=0.10, loss=1.65, n_captions=7.5(12.9)
2023-08-29 12:57:11,898   INFO  Epoch [1/128][140/1201] LR: 0.004, ETA: 4 days, 0:20:09, Data: 0.08 (0.12), Iter: 1.93 (2.26), Accuracy: 0.63, loss_seg=1.25, binary_loss=0.19, caption_view=0.15, caption_entity=0.14, loss=1.73, n_captions=9.8(12.9)
2023-08-29 12:57:50,126   INFO  Epoch [1/128][160/1201] LR: 0.004, ETA: 3 days, 22:28:16, Data: 0.07 (0.11), Iter: 2.11 (2.21), Accuracy: 0.67, loss_seg=1.21, binary_loss=0.19, caption_view=0.16, caption_entity=0.15, loss=1.71, n_captions=12.0(12.8)
2023-08-29 12:58:31,512   INFO  Epoch [1/128][180/1201] LR: 0.004, ETA: 3 days, 21:46:22, Data: 0.07 (0.11), Iter: 1.72 (2.20), Accuracy: 0.61, loss_seg=1.41, binary_loss=0.28, caption_view=0.16, caption_entity=0.09, loss=1.93, n_captions=9.0(12.9)
2023-08-29 12:59:12,194   INFO  Epoch [1/128][200/1201] LR: 0.004, ETA: 3 days, 21:03:14, Data: 0.07 (0.11), Iter: 2.43 (2.18), Accuracy: 0.54, loss_seg=1.49, binary_loss=0.26, caption_view=0.18, caption_entity=0.17, loss=2.09, n_captions=18.0(13.0)
2023-08-29 12:59:51,518   INFO  Epoch [1/128][220/1201] LR: 0.004, ETA: 3 days, 20:12:49, Data: 0.05 (0.10), Iter: 1.69 (2.16), Accuracy: 0.74, loss_seg=0.96, binary_loss=0.26, caption_view=0.14, caption_entity=0.10, loss=1.46, n_captions=7.8(12.9)
2023-08-29 13:00:32,747   INFO  Epoch [1/128][240/1201] LR: 0.004, ETA: 3 days, 19:50:16, Data: 0.10 (0.10), Iter: 2.08 (2.15), Accuracy: 0.60, loss_seg=1.40, binary_loss=0.28, caption_view=0.16, caption_entity=0.15, loss=2.01, n_captions=12.5(12.9)
2023-08-29 13:01:13,779   INFO  Epoch [1/128][260/1201] LR: 0.004, ETA: 3 days, 19:29:12, Data: 0.06 (0.10), Iter: 2.38 (2.15), Accuracy: 0.62, loss_seg=1.29, binary_loss=0.23, caption_view=0.18, caption_entity=0.16, loss=1.86, n_captions=19.0(12.9)
2023-08-29 13:01:52,565   INFO  Epoch [1/128][280/1201] LR: 0.004, ETA: 3 days, 18:51:06, Data: 0.06 (0.09), Iter: 1.83 (2.13), Accuracy: 0.66, loss_seg=1.18, binary_loss=0.27, caption_view=0.15, caption_entity=0.12, loss=1.72, n_captions=9.8(12.9)
Traceback (most recent call last):
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 322, in reduce_storage
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/reduction.py", line 194, in DupFd
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
2023-08-29 13:02:32,142   INFO  Epoch [1/128][300/1201] LR: 0.004, ETA: 3 days, 18:24:18, Data: 0.08 (0.09), Iter: 1.62 (2.12), Accuracy: 0.52, loss_seg=1.61, binary_loss=0.20, caption_view=0.16, caption_entity=0.13, loss=2.10, n_captions=12.0(12.9)

Then I reduced num_workers and batch_size:

2023-08-29 14:35:17,045   INFO  **********************Start logging**********************
2023-08-29 14:35:17,045   INFO  CUDA_VISIBLE_DEVICES=ALL
2023-08-29 14:35:17,046   INFO  cfg_file         cfgs/scannet_models/spconv_clip_base15_caption_adamw.yaml
2023-08-29 14:35:17,058   INFO  batch_size       1
2023-08-29 14:35:17,060   INFO  epochs           128
2023-08-29 14:35:17,062   INFO  workers          1
...
2023-08-29 14:35:24,926   INFO  **********************Start training scannet_models/spconv_clip_base15_caption_adamw(default)**********************
2023-08-29 14:35:43,175   INFO  Epoch [1/128][20/4804] LR: 0.004, ETA: 6 days, 9:57:34, Data: 0.01 (0.03), Iter: 0.80 (0.90), Accuracy: 0.57, loss_seg=1.62, binary_loss=0.61, caption_view=0.08, caption_entity=0.06, loss=2.37, n_captions=11.0(11.8)
2023-08-29 14:35:57,832   INFO  Epoch [1/128][40/4804] LR: 0.004, ETA: 5 days, 19:34:41, Data: 0.02 (0.03), Iter: 0.67 (0.82), Accuracy: 0.77, loss_seg=1.19, binary_loss=0.52, caption_view=0.04, caption_entity=0.00, loss=1.75, n_captions=3.0(12.0)
2023-08-29 14:36:12,468   INFO  Epoch [1/128][60/4804] LR: 0.004, ETA: 5 days, 14:40:01, Data: 0.03 (0.03), Iter: 0.80 (0.79), Accuracy: 0.59, loss_seg=1.53, binary_loss=0.60, caption_view=0.11, caption_entity=0.07, loss=2.32, n_captions=15.0(12.6)
2023-08-29 14:36:25,845   INFO  Epoch [1/128][80/4804] LR: 0.004, ETA: 5 days, 9:36:36, Data: 0.01 (0.02), Iter: 0.28 (0.76), Accuracy: 0.43, loss_seg=2.38, binary_loss=0.31, caption_view=0.10, caption_entity=0.07, loss=2.86, n_captions=13.0(12.4)
2023-08-29 14:36:39,422   INFO  Epoch [1/128][100/4804] LR: 0.004, ETA: 5 days, 6:51:36, Data: 0.02 (0.02), Iter: 0.46 (0.74), Accuracy: 0.43, loss_seg=1.95, binary_loss=0.26, caption_view=0.09, caption_entity=0.07, loss=2.37, n_captions=10.0(12.9)
2023-08-29 14:36:54,429   INFO  Epoch [1/128][120/4804] LR: 0.004, ETA: 5 days, 7:03:13, Data: 0.02 (0.02), Iter: 0.70 (0.74), Accuracy: 0.78, loss_seg=0.90, binary_loss=0.33, caption_view=0.08, caption_entity=0.04, loss=1.34, n_captions=9.0(12.9)
2023-08-29 14:37:07,454   INFO  Epoch [1/128][140/4804] LR: 0.004, ETA: 5 days, 4:47:05, Data: 0.02 (0.03), Iter: 0.79 (0.73), Accuracy: 0.64, loss_seg=1.29, binary_loss=0.16, caption_view=0.06, caption_entity=0.03, loss=1.55, n_captions=7.0(12.8)
2023-08-29 14:37:21,984   INFO  Epoch [1/128][160/4804] LR: 0.004, ETA: 5 days, 4:40:53, Data: 0.03 (0.03), Iter: 0.91 (0.73), Accuracy: 0.70, loss_seg=1.09, binary_loss=0.18, caption_view=0.11, caption_entity=0.08, loss=1.47, n_captions=15.0(13.1)
2023-08-29 14:37:35,895   INFO  Epoch [1/128][180/4804] LR: 0.004, ETA: 5 days, 4:02:25, Data: 0.01 (0.02), Iter: 0.45 (0.73), Accuracy: 0.67, loss_seg=1.09, binary_loss=0.24, caption_view=0.06, caption_entity=0.00, loss=1.39, n_captions=4.0(13.1)
2023-08-29 14:37:49,432   INFO  Epoch [1/128][200/4804] LR: 0.004, ETA: 5 days, 3:11:09, Data: 0.03 (0.02), Iter: 0.70 (0.72), Accuracy: 0.78, loss_seg=0.98, binary_loss=0.14, caption_view=0.09, caption_entity=0.06, loss=1.27, n_captions=10.0(13.1)
2023-08-29 14:38:02,928   INFO  Epoch [1/128][220/4804] LR: 0.004, ETA: 5 days, 2:27:42, Data: 0.01 (0.02), Iter: 0.55 (0.72), Accuracy: 0.83, loss_seg=0.75, binary_loss=0.26, caption_view=0.09, caption_entity=0.02, loss=1.12, n_captions=7.0(13.3)
2023-08-29 14:38:18,365   INFO  Epoch [1/128][240/4804] LR: 0.004, ETA: 5 days, 3:14:10, Data: 0.03 (0.02), Iter: 0.45 (0.72), Accuracy: 0.52, loss_seg=2.06, binary_loss=0.30, caption_view=0.11, caption_entity=0.12, loss=2.59, n_captions=24.0(13.4)
2023-08-29 14:38:31,951   INFO  Epoch [1/128][260/4804] LR: 0.004, ETA: 5 days, 2:40:36, Data: 0.02 (0.02), Iter: 0.39 (0.72), Accuracy: 0.78, loss_seg=0.89, binary_loss=0.15, caption_view=0.07, caption_entity=0.03, loss=1.15, n_captions=6.0(13.2)
2023-08-29 14:38:45,580   INFO  Epoch [1/128][280/4804] LR: 0.004, ETA: 5 days, 2:13:03, Data: 0.02 (0.02), Iter: 0.49 (0.72), Accuracy: 0.44, loss_seg=2.29, binary_loss=0.32, caption_view=0.11, caption_entity=0.05, loss=2.77, n_captions=12.0(13.1)
2023-08-29 14:38:59,079   INFO  Epoch [1/128][300/4804] LR: 0.004, ETA: 5 days, 1:44:38, Data: 0.02 (0.02), Iter: 0.70 (0.71), Accuracy: 0.69, loss_seg=1.06, binary_loss=0.17, caption_view=0.09, caption_entity=0.07, loss=1.39, n_captions=10.0(13.0)
2023-08-29 14:39:13,142   INFO  Epoch [1/128][320/4804] LR: 0.004, ETA: 5 days, 1:37:37, Data: 0.02 (0.02), Iter: 0.95 (0.71), Accuracy: 0.72, loss_seg=1.03, binary_loss=0.86, caption_view=0.13, caption_entity=0.13, loss=2.15, n_captions=28.0(13.1)
Traceback (most recent call last):
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 321, in reduce_storage
RuntimeError: unable to open shared memory object </torch_21979_3890372270> in read-write mode

I guess it is because of the open-file limit, so I set ulimit -n 10240.

But I still got

2023-08-29 16:39:54,976   INFO  Epoch [1/128][920/2402] LR: 0.004, ETA: 4 days, 2:55:56, Data: 0.05 (0.04), Iter: 0.87 (1.16), Accuracy: 0.77, loss_seg=0.80, binary_loss=0.28, caption_view=0.13, caption_entity=0.11, loss=1.32, n_captions=13.0(13.1)
Traceback (most recent call last):
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 322, in reduce_storage
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/reduction.py", line 194, in DupFd
  File "/home/usrs/wangjuan/anaconda3/envs/PLA/lib/python3.7/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files

Have you ever had a similar problem? Looking forward to your reply!
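
Besides raising ulimit, a commonly used PyTorch-level workaround for this error (a general remedy, not specific to this repo) is to switch the tensor sharing strategy away from file descriptors, e.g. at the top of train.py:

import torch.multiprocessing

# 'file_system' shares tensors between worker processes via files instead
# of holding one open file descriptor per shared tensor, sidestepping the
# per-process fd limit that triggers Errno 24.
torch.multiprocessing.set_sharing_strategy('file_system')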

Reproducing scannet200 zeroshot

Hi!
I tried to process ScanNet200 myself and evaluated with the provided checkpoints. However, I got the following results:
B170
hIoU/mIoU/IoU_base/IoU_novel: 16.89 / 20.28 / 21.48 / 13.91

B150
hIoU/mIoU/IoU_base/IoU_novel: 14.32 / 19.17 / 22.28 / 10.55

zeroshot
2024-06-20 10:57:30,169 INFO mIoU: 6.32
2024-06-20 10:57:30,169 INFO mAcc: 16.05

zeroshot+openscene
2024-06-20 11:06:15,668 INFO mIoU: 7.76
2024-06-20 11:06:15,668 INFO mAcc: 15.44

This is strange: the B150 and B170 eval results are normal, but the zero-shot results drop a lot.

I followed #42 and changed the .tsv file and the class remapper to generate the 200-class .pth files. I also tried using the official ScanNet processing code to generate them (the difference I found is that the official code aligns the point-cloud axes).

Could you please share your processing script or give me more details on processing ScanNet200? Thanks a lot!

ScannetV2_train/test.txt Path

Thank you for sharing this interesting work. I'm curious about your caption module, but there seems to be a dataset_path folder for scannetv2_train.txt that cannot be found in your updated folder. Do I need to download it from somewhere?

eval_freq

Hi, thanks for your great work!
I want to know why the default "eval_freq" during training is 10 instead of 1. Would setting it to 1 result in better outcomes?

visualization problem

Hi, I found a visualization problem when following your docs/INFER.md.

  1. python test.py --cfg_file cfgs/scannet_models/spconv_clip_base15_caption_adamw.yaml --ckpt output/scannet_models/spconv_clip_base15_caption/exp_tag/ckpt/checkpoint_epoch_128_6488.pth --save_results semantic,instance
  2. python visual_utils/visualize_indoor.py
Traceback (most recent call last):
  File "visual_utils/visualize_indoor.py", line 217, in <module>
    rooms = sorted(os.listdir(opt.prediction_path + '/pred_instance'))
FileNotFoundError: [Errno 2] No such file or directory: './results/pred_instance'

Then I got into eval_utils.py,

save_npy(eval_output_dir, 'semantic_pred', scene_names, pred)

The value of the parameter eval_output_dir is /PLA/output/scannet_models/spconv_clip_base15_caption_adamw/default/eval/epoch_6488. So I'm confused: when I run test.py with --save_results semantic,instance, I expect it to generate files ending in .npy and .pth in the PLA/data/scannetv2/val/ and PLA/data/scannetv2/val_pth/ directories, but it does not.

Or am I misunderstanding the steps of the visualization? Looking forward to your reply. Any assistance would be appreciated.

LSeg-3D

Hello, your work is very good. In the paper, I saw that you implemented 3D semantic segmentation based on LSeg and used it as a baseline. Would it be possible to share the code for this part?

scene captions?

Dear authors, thank you for your great work.

I was skimming through the code and found out that scene captions are never encountered in caption_head.py.

I also noticed that the caption_idx directory of your shared onedrive does not contain matching indices of scenes.

Do you plan to release them later?

Thanks,

Generate_caption_idx.py

Hello!
Thank you for sharing this very interesting work.
Can you please confirm which version of ScanNet you refer to here in the generate_caption_idx.py file? Do you refer to the path of the ScanNetV2 point-cloud data pre-processed with PointGroup in the previous step, or to the path of the scannet_frames_25k data? Your reply will be really appreciated. Thanks.
[screenshot]

About nuscene caption data

Hi!
It seems that the provided nuscenes_caption_idx_kosmos2_and_detic_iou0.3-0.0.pkl does not match caption_kosmos_and_sw_iou0.3-0.0.json:
[screenshot]
(a is nuscenes_caption_idx_kosmos2_and_detic_iou0.3-0.0.pkl and b is caption_kosmos_and_sw_iou0.3-0.0.json)

Please check it, thanks!

Missing caption_2d_intersect_v3.json

Hi, the caption files caption_2d_intersect_v3.json and scannetv2_matching_idx_intersect_v3.pickle are missing. I would also like to know how to generate them.
Thanks!

training with captions

Hello!
Thank you for sharing your work.
I was wondering: if we want to train using the captions, do we have to uncomment the "commented part" in train.py, as shown below?
[screenshot]

But when we do so, we get the error below. Is there something missing in the code that you provided?
[screenshot]

Many thanks

training from scratch does not reproduce the results (scannet B15/N4)

Hello!
Thank you for your interesting work.
When I trained your model from scratch (using the ScanNet B15/N4 split), it produced very low results at inference. I am attaching the results for your consideration.
[screenshot]
Meanwhile, when I test the pre-trained model you provided, it gives the following results.
[screenshot]
It would be great if you could comment on the possible underlying issue here.
Many thanks for your time.
Regards.

Missing part of the caption data.

Hi, in your released ScanNet caption data I could not find the file scannet_caption_idx_kosmos_and_detic-template_125k_iou0.2.pkl, which is used at line 54 of tools/cfgs/scannet200_models/spconv_clip_base170_caption.yaml and line 52 of tools/cfgs/scannet200_models/spconv_clip_base170_caption.yaml.

The 2D-3D projection on S3DIS

Dear Runyu and Jihan,

Thanks for this inspiring work. I'm curious about the 2D-3D projection on S3DIS.
The provided "s3dis_view_vit-gpt2_matching_idx.zip" data seems to already include the corresponding indices; how did you obtain them?

regards,
zihui

Question about the caption data.

Hi, thank you for your great work! I found that the novel category names are already included in the generated captioning data, for example, "a image of sofa" and "a image of toilet". So does this mean that the model has seen the novel category names during the training phase?

Problems with the code corresponding to Eq. (9)

Thanks for your previous answers! When trying to match the code with the equations in the paper (PLA), I have some questions.

Regarding the following function:

def _forward_given_type_caption(self, batch_dict, caption_info, adapter_feats):
    frame_corr_idx = caption_info['select_image_corr']
    pooled_feats = []
    real_n_points = []
    if_has_pts = []
    batch_idx = batch_dict['batch_idxs']
    for b in range(len(frame_corr_idx)):
        _frame_corr_idx = frame_corr_idx[b]
        offsets = batch_dict['offsets']
        origin_idx = batch_dict['origin_idx'][offsets[b]: offsets[b + 1]] if 'origin_idx' in batch_dict else None
        pc_count = batch_dict['pc_count'][b]
        batch_if_has_pts = torch.zeros(len(_frame_corr_idx), dtype=torch.bool).cuda()
        for i, idx in enumerate(_frame_corr_idx):
            selected_mask = self.get_point_mask_for_point_img_points(pc_count, idx, origin_idx)
            # visualization debug code
            # import tools.visual_utils.open3d_vis_utils as vis
            # points = batch_dict['points'][batch_idx == b]
            # points_batch = points[selected_mask]
            # points_colors = batch_dict['rgb'][batch_idx == b][selected_mask]
            #
            # vis_dict = {
            #     'points': points_batch[:, 1:].detach().cpu().numpy(),
            #     'point_colors': points_colors.detach().cpu().numpy(),
            #     'point_size': 2.0
            # }
            # vis.dump_vis_dict(vis_dict, './vis_dict_2.pkl')
            # import ipdb; ipdb.set_trace(context=20)
            _pooled_feats = adapter_feats[batch_idx == b][selected_mask]
            batch_if_has_pts[i] = selected_mask.sum() > 0
            if selected_mask.sum() > 0:
                real_n_points.append(selected_mask.sum().view(1))
                pooled_feats.append(_pooled_feats.mean(0, keepdim=True))
        if_has_pts.append(batch_if_has_pts)
    return pooled_feats, real_n_points, if_has_pts

What is the meaning of select_image_corr in caption_info? As presented in the paper (View-Level Point-Caption Association section), an RGB image v is back-projected to 3D space using the depth information d to get its corresponding point set, but I can't find the back-projection process in the code. Also, how are the corresponding view images or cropped image regions selected for a given scene?

I would be very grateful if you could reply.
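
For reference, the back-projection described in the paper (RGB-D pixels lifted to 3D with depth) usually looks like the sketch below; the precomputed select_image_corr indices would then be whichever point indices such a step recovers per frame. Names and conventions here are illustrative assumptions:

import numpy as np

def backproject_depth(depth, intrinsic, cam_to_world):
    """Lift an HxW depth map to world-frame 3D points (pinhole model)."""
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    z = depth.reshape(-1)
    valid = z > 0
    # Pixel -> camera frame using the inverse of the 3x3 intrinsics.
    x = (u.reshape(-1) - intrinsic[0, 2]) * z / intrinsic[0, 0]
    y = (v.reshape(-1) - intrinsic[1, 2]) * z / intrinsic[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)[valid]
    # Camera -> world frame with the 4x4 camera-to-world pose.
    return (cam_to_world @ pts_cam.T).T[:, :3]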

Install issues

Hi, thanks for your great work!
I am trying to run PLA, but when I run python3 setup.py build_ext develop, I encounter the following problem:

running build_ext
building 'softgroup_ops' extension
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/cuda/__init__.py:104: UserWarning: 
NVIDIA RTX A6000 with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the NVIDIA RTX A6000 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
Emitting ninja build file /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/bin/nvcc --generate-dependencies-with-compile --dependency-output /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/cuda.o.d -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/cuda.cu -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++14
FAILED: /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/cuda.o 
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/bin/nvcc --generate-dependencies-with-compile --dependency-output /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/cuda.o.d -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/cuda.cu -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 -std=c++14
In file included from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/bfs_cluster/bfs_cluster.h:9,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/bfs_cluster/bfs_cluster.cu:7,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/cuda.cu:4:
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/cuda/CUDAContext.h:10:10: fatal error: cusolverDn.h: No such file or directory
   10 | #include <cusolverDn.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.
[2/3] c++ -MMD -MF /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_api.o.d -pthread -B /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_api.cpp -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_api.o -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_api.o 
c++ -MMD -MF /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_api.o.d -pthread -B /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_api.cpp -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_api.o -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:140,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:13,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/extension.h:4,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_api.cpp:1:
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:83: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
   83 | #pragma omp parallel for if ((end - begin) >= grain_size)
      | 
In file included from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/bfs_cluster/bfs_cluster.h:9,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_ops.h:3,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_api.cpp:4:
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/cuda/CUDAContext.h:10:10: fatal error: cusolverDn.h: No such file or directory
   10 | #include <cusolverDn.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.
[3/3] c++ -MMD -MF /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_ops.o.d -pthread -B /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_ops.cpp -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_ops.o -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
FAILED: /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_ops.o 
c++ -MMD -MF /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_ops.o.d -pthread -B /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/anaconda3/envs/pt18/include/ -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/TH -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/THC -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include -I/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/include/python3.8 -c -c /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_ops.cpp -o /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/build/temp.linux-x86_64-cpython-38/ops/src/softgroup_ops.o -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=softgroup_ops -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:140,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:13,
                 from /home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/torch/extension.h:4,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_ops.cpp:3:
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:83: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
   83 | #pragma omp parallel for if ((end - begin) >= grain_size)
      | 
In file included from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/bfs_cluster/bfs_cluster.h:9,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/bfs_cluster/bfs_cluster.cpp:9,
                 from /var/autofs/home/hale/usrs/wangjuan/code/PLA/pcseg/external_libs/softgroup_ops/ops/src/softgroup_ops.cpp:5:
/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/include/ATen/cuda/CUDAContext.h:10:10: fatal error: cusolverDn.h: No such file or directory
   10 | #include <cusolverDn.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1667, in _run_ninja_build
    subprocess.run(
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "setup.py", line 6, in <module>
    setup(
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/dist.py", line 1234, in run_command
    super().run_command(command)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 84, in run
    _build_ext.run(self)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
    self.build_extensions()
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 708, in build_extensions
    build_ext.build_extensions(self)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
    self._build_extensions_serial()
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
    objects = self.compiler.compile(
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 529, in unix_wrap_ninja_compile
    _write_ninja_file_and_compile_objects(
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1354, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "/home/usrs/wangjuan/anaconda3/envs/softgroup3.8/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

I noticed it says here that this is caused by a mismatch between the PyTorch and CUDA versions, so I tried the versions suggested in that answer, but it didn't work. Then I noticed that the versions in your install.md are not the same as those in SoftGroup's install.md.
In PLA:

CUDA 11.1
torch==1.8.1+cu111
torchvision==0.9.1+cu111

But in SoftGroup:
cuda=10.2

So I tried the following two commands:

conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia

Also, since my CUDA is 11.6, I tried:
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge

Unfortunately, none of these versions worked, so may I ask what your environment is?

mine:
Ubuntu 18.04
GCC 7.5
CUDA 11.6

Looking forward to your reply; any assistance would be appreciated.

How about CUDA 10.2?

If I use CUDA 10.2 and torch==1.8, might I run into problems?
Is CUDA 10.2 feasible?

Regarding the freezing of the training process

Hello, when I executed the dist_train.sh script, it froze at the very beginning of training, and the GPU was never actually engaged for training. I tried changing the port, but it still froze even with an idle port number. Could you please help me find out the reason?
[screenshot]

FileNotFoundError when running test.py

Thanks for this significant and interesting work.
I am running the inference code following your guidance, but a FileNotFoundError occurs when running test.py, and I don't know how to fix it.

=> loaded text embedding from path '../data/scannetv2/text_embed/scannet_clip-ViT-B16_id.pth'
Traceback (most recent call last):
  File "test.py", line 236, in <module>
    main()
  File "test.py", line 232, in main
    eval_output_dir=eval_output_dir)
  File "test.py", line 61, in eval_single_ckpt
    epoch_id = model.load_params_from_file(filename=args.ckpt, logger=logger, epoch_id=epoch_id, to_cpu=dist_test)
  File "/workspace/PLA/pcseg/models/vision_networks/network_template.py", line 200, in load_params_from_file
    raise FileNotFoundError
FileNotFoundError

All error information is in the attached file.
err-230614.log

How to self-train?

Thanks a lot for your great work! However, I have some uncertainty regarding the method "Self-Bootstrap with Novel Category Prior" in the paper.
Does the provided code actually implement the "with Novel Category Prior" part? If it doesn't, could you please share the code showing how to implement it?

Based on my understanding, the TextSegHead loads CLIP text features for all categories (including the novel classes) through the "set_cls_head_with_text_embed" function. During training, the model appears to be aware of the names of novel categories.

Once again, thank you for your valuable work!
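
To illustrate the kind of self-training the question refers to (a sketch of one plausible reading of the paper's idea, not the authors' implementation): pseudo-labels are taken from the semantic head's argmax restricted to novel classes, but only applied where the binary head already predicts "novel", so the novel-category prior gates where pseudo-supervision lands.

import torch

def novel_pseudo_labels(sem_logits, binary_novel_prob, novel_class_ids,
                        thresh=0.5, ignore_index=-100):
    # Argmax over novel classes only, applied where the binary head is
    # sufficiently confident a point is novel; all other points keep
    # ignore_index and contribute no pseudo-supervision.
    novel_ids = torch.tensor(novel_class_ids, device=sem_logits.device)
    pseudo = novel_ids[sem_logits[:, novel_ids].argmax(dim=1)]
    labels = torch.full((sem_logits.shape[0],), ignore_index,
                        dtype=torch.long, device=sem_logits.device)
    mask = binary_novel_prob > thresh
    labels[mask] = pseudo[mask]
    return labels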

pc_mean.json file not found

Hi! Thank you for sharing your work. Can you please tell us where to find the file named pc_mean.json? It is not available in the ScanNetV2 dataset or anywhere on the web.
[screenshot]
Thanks
