Comments (10)
Hi, thank you for your interest in our work. Evaluation on these 3D-VL tasks has two settings: results using ground-truth masks and results using predicted masks. On benchmarks like Sr3D and Nr3D, you should use the gt masks for evaluation. On benchmarks like ScanRefer, ScanQA, Scan2Cap, and SQA3D, you should set pc_type to pred.
Best,
Ziyu
from 3d-vista.
Hi Ziyu,
Thanks for your reply. However, I found that the pc_type is set to "gt" during training and testing in the "ScanQA" task.
Best,
Jian Ding
Yes, setting pc_type to "gt" makes testing faster (useful for checking model performance only). If you want to submit results to the benchmark, you should change it to "pred" so that they are comparable with other papers.
Best,
Ziyu
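
The guidance so far can be summarized in a small helper. This is only a sketch: the function `choose_pc_type`, the `for_submission` flag, and the two set names are hypothetical and not part of the 3D-VisTA code. Only the `pc_type` values "gt" and "pred" and the benchmark names come from the replies above.

```python
# Hypothetical helper, NOT part of the 3D-VisTA repo.
# It only encodes the advice given in this thread.

# Benchmarks that are evaluated with ground-truth masks.
GT_MASK_BENCHMARKS = {"sr3d", "nr3d"}
# Benchmarks whose reported numbers use predicted masks.
PRED_MASK_BENCHMARKS = {"scanrefer", "scanqa", "scan2cap", "sqa3d"}


def choose_pc_type(benchmark: str, for_submission: bool = True) -> str:
    """Return the pc_type ("gt" or "pred") suggested in the thread."""
    name = benchmark.lower()
    if name in GT_MASK_BENCHMARKS:
        return "gt"
    if name in PRED_MASK_BENCHMARKS:
        # "gt" is faster and fine for a quick check of the model,
        # but results meant for comparison should use "pred".
        return "pred" if for_submission else "gt"
    raise ValueError(f"Unknown benchmark: {benchmark!r}")


print(choose_pc_type("ScanQA"))                        # submission setting
print(choose_pc_type("ScanQA", for_submission=False))  # quick local check
print(choose_pc_type("Nr3D"))
```

In practice the value would be written into the task's config file under `pc_type` rather than computed at runtime; the helper is just a compact way to state the rule.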
Are the data for pc_type "pred" read from the path "./data/scanfamily/save_mask"? How are these files generated? Are they predicted by 3D-VisTA or by another model?
Best,
Jian Ding
They are predicted by the Mask3D segmentation model. The masks can be found in issue #12.
Best,
Ziyu
Thanks for the reply.
Have you fine-tuned Mask3D on the ScanQA labels, or did you just use the pretrained Mask3D model?
Best,
Jian Ding
An additional question: have you evaluated object localization performance on the ScanQA dataset?
Hi
- We use the pre-trained Mask3D model from their repo (ScanNet200 checkpoint).
- In our implementation, object localization accuracy is around 56% using the ground-truth mask in ScanQA.
Best,
Ziyu
Hi Ziyu,
Thanks for your reply. I still have a few questions since I am new to this area.
- What does "using the ground-truth mask" mean? Did you use the ground-truth masks only for evaluation, or did you feed them to the model and only predict the localization scores? Is 56% under the metric [email protected]?
- Have you included the evaluation code for object localization in the repo?
- An additional question: the ScanQA paper found that the object localization loss is helpful for the QA task. Did you have a similar finding?
- It seems that you used the 607 raw categories from ScanNetV2, whereas the ScanQA paper used 18 categories. I am confused about this. Are the 18 categories merged from the 607 categories? And will the category difference influence QA performance?
Best,
Jian Ding
Hi
- It means setting pc_type to "gt" and evaluating localization accuracy regardless of IoU.
- Yes, it is the "og_acc" metric in the "eval_qa" function.
- We did not conduct rigorous experiments to study the effect of this localization loss. It is commonly used in ScanQA to prevent overfitting, so we follow that setting and add the loss.
- The raw 607 classes are the full set of ScanNet semantic labels. They can be merged into 18 categories (ScanNet20 with wall and floor removed) or 200 categories (ScanNet200).
Best,
Ziyu
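
To make "localization accuracy regardless of IoU" concrete: with ground-truth masks the candidate objects are already the annotated ones, so the check reduces to whether the top-scoring candidate is the target object. The sketch below illustrates that idea only. It is not the repo's "og_acc" code, the function name `localization_accuracy` is hypothetical, and it assumes a single target object per question.

```python
from typing import Sequence


def localization_accuracy(
    scores: Sequence[Sequence[float]],
    target_ids: Sequence[int],
) -> float:
    """Fraction of samples whose top-scoring candidate is the target.

    scores[i][j] is the model's score for candidate object j in sample i.
    target_ids[i] is the index of the annotated target object in sample i
    (assumption: exactly one target per sample).
    """
    if len(scores) != len(target_ids):
        raise ValueError("scores and target_ids must have the same length")
    if not scores:
        return 0.0

    correct = 0
    for sample_scores, target in zip(scores, target_ids):
        # Index of the highest-scoring candidate (argmax).
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        correct += int(predicted == target)
    return correct / len(scores)


# Two toy samples with three candidate objects each.
toy_scores = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]
toy_targets = [1, 2]  # first sample is localized correctly, second is not
print(localization_accuracy(toy_scores, toy_targets))
```

No IoU threshold appears because the candidates are ground-truth objects. With predicted masks, a prediction would additionally have to overlap the target box sufficiently to count as correct.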