Dear Authors, I am confused about the pc_type. When will you set the

pc_type about 3d-vista HOT 10 CLOSED

dingjiansw101 commented on July 18, 2024

pc_type

from 3d-vista.

Comments (10)

zhuziyu-edward commented on July 18, 2024

Hi, thank you for your interest in our work. Evaluation on these 3D-VL tasks has two settings: results using ground-truth masks and results using predicted masks. On benchmarks like Sr3D and Nr3D, you should use gt masks for evaluation. On benchmarks like ScanRefer, ScanQA, Scan2Cap, and SQA3D, you should set pc_type to pred.

Best,
Ziyu

dingjiansw101 commented on July 18, 2024

Hi Ziyu,

Thanks for your reply. However, I found that the pc_type is set to "gt" during training and testing in the "ScanQA" task.

Best,
Jian Ding

zhuziyu-edward commented on July 18, 2024

Yes, setting pc_type to "gt" makes testing faster (useful for checking model performance only). If you want to submit results to the benchmark, you should change it to "pred" so they are comparable with other papers.

Best,
Ziyu
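
The guidance in the two replies above can be summarised as a small lookup. This is a minimal sketch, not code from the 3D-VISTA repo: the helper `choose_pc_type` and its `quick_check` flag are hypothetical names introduced for illustration; only the benchmark names and the "gt"/"pred" values come from the thread.

```python
# Hypothetical helper: only the benchmark names and the "gt"/"pred"
# values are taken from the thread, everything else is illustrative.
GT_MASK_BENCHMARKS = {"sr3d", "nr3d"}
PRED_MASK_BENCHMARKS = {"scanrefer", "scanqa", "scan2cap", "sqa3d"}


def choose_pc_type(benchmark: str, quick_check: bool = False) -> str:
    """Return the pc_type suggested in the thread for a benchmark.

    quick_check=True mirrors the advice that "gt" is faster and fine for
    a local sanity check, but not for numbers reported on a benchmark.
    """
    name = benchmark.lower()
    if name in GT_MASK_BENCHMARKS:
        return "gt"
    if name in PRED_MASK_BENCHMARKS:
        return "gt" if quick_check else "pred"
    raise ValueError(f"unknown benchmark: {benchmark!r}")


if __name__ == "__main__":
    for bench in ("Nr3D", "ScanQA"):
        print(bench, choose_pc_type(bench))
```

With this reading, ScanQA configs that ship with pc_type "gt" correspond to the `quick_check=True` case and would need to be switched to "pred" before a benchmark submission.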

dingjiansw101 commented on July 18, 2024

Is the data for pc_type "pred" read from the path "./data/scanfamily/save_mask"? How are these files generated? Are they predicted by 3D-VISTA or by another model?

Best,
Jian Ding

zhuziyu-edward commented on July 18, 2024

They are predicted by the Mask3D segmentation model. The masks can be found in issue #12.

Best,
Ziyu

dingjiansw101 commented on July 18, 2024

Thanks for the reply.
Did you fine-tune Mask3D on the ScanQA labels, or did you just use the pre-trained Mask3D model?

Best,
Jian Ding

dingjiansw101 commented on July 18, 2024

An additional question: have you evaluated object localization performance on the ScanQA dataset?

zhuziyu-edward commented on July 18, 2024

We use the pre-trained Mask3D model from their repo (ScanNet200 checkpoint).
In our implementation, object localization accuracy on ScanQA is around 56% when using the ground-truth masks.

Best,
Ziyu

dingjiansw101 commented on July 18, 2024

Hi Ziyu,
Thanks for your reply. I still have a few questions since I am new to this area.

1. What does "using the ground-truth mask" mean? Did you use the ground-truth masks only for evaluation, or did you feed the ground-truth masks to the model and predict only the localization scores? Is 56% under the metric [email protected]?
2. Have you included the evaluation code for object localization in the repo?
3. The ScanQA paper found that the object localization loss helps the QA task. Did you have a similar finding?
4. It seems that you used the 607 raw categories from ScanNetV2, whereas the ScanQA paper used 18 categories. I am confused about this. Are the 18 categories merged from the 607 categories? And will the category difference influence QA performance?

Best,
Jian Ding

zhuziyu-edward commented on July 18, 2024

1. It means "setting pc to gt", so localization accuracy is evaluated regardless of IoU.
2. Yes, it is the "og_acc" metric in the "eval_qa" function.
3. We did not conduct rigorous experiments to study the effect of this localization loss. It is commonly used on ScanQA to prevent overfitting, so we follow that setting and add the loss.
4. The raw 607 classes are the full set of ScanNet semantic labels. They can be merged into 18 categories (ScanNet20 with wall and floor removed) or 200 categories (ScanNet200).

Best,
Ziyu
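
To make "localization accuracy regardless of IoU" concrete: with ground-truth masks, the candidate objects are the annotated instances themselves, so a prediction is either the target object or it is not, and no IoU threshold is involved. The sketch below illustrates that idea only. It is not the repo's "og_acc" implementation; the function name `grounding_accuracy` and the single-target-per-question assumption are hypothetical simplifications.

```python
from typing import Sequence


def grounding_accuracy(
    scores: Sequence[Sequence[float]], targets: Sequence[int]
) -> float:
    """Fraction of samples whose top-scoring object is the annotated target.

    scores[i][j] is the model's score for object j in sample i, and
    targets[i] is the index of the target object. Assumes one target per
    sample, which is a simplification made for this sketch.
    """
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        return 0.0
    correct = 0
    for sample_scores, target in zip(scores, targets):
        # Index of the highest-scoring candidate object (first one on ties).
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        correct += int(predicted == target)
    return correct / len(targets)


if __name__ == "__main__":
    toy_scores = [[0.1, 0.9], [0.8, 0.2]]
    toy_targets = [1, 1]
    print(grounding_accuracy(toy_scores, toy_targets))  # 0.5
```

With predicted masks the candidates no longer coincide with the annotated instances, which is where IoU-thresholded metrics come in.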
