Dataset	Acc@0.25	Acc@0.5	Model	Log (train)	Log (test)
ScanRefer	54.59	42.26	OneDrive*	54_59.txt¹ / 54_44.txt²	log.txt
ScanRefer (Single-Stage)	53.83	41.70	OneDrive	53_83.txt¹ / 53_47.txt²	log.txt
SR3D	68.1	-	OneDrive	68_1.txt¹ / 67_6.txt²	log.txt
NR3D	52.1	-	OneDrive	52_1.txt¹ / 54_7.txt²	log.txt

Should we add `self.text_encoder.eval()`?

To freeze the text encoder, you set param.requires_grad to False:

self.tokenizer = RobertaTokenizerFast.from_pretrained(t_type, local_files_only=True)
self.text_encoder = RobertaModel.from_pretrained(t_type, local_files_only=True)
for param in self.text_encoder.parameters():
    param.requires_grad = False

However, to make sure the Dropout layers in RobertaModel behave as in evaluation (no neurons are dropped at random), we need to add self.text_encoder.eval() at the end of the code above:

self.tokenizer = RobertaTokenizerFast.from_pretrained(t_type, local_files_only=True)
self.text_encoder = RobertaModel.from_pretrained(t_type, local_files_only=True)
for param in self.text_encoder.parameters():
    param.requires_grad = False
self.text_encoder.eval()
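
One caveat with the snippet above: in PyTorch, calling train() on a parent module switches every submodule back to training mode, so an eval() call made in the constructor is undone as soon as the training loop calls model.train(). A minimal sketch of one way to keep the frozen encoder in eval mode follows. The FrozenEncoderWrapper class and the small Linear + Dropout stand-in for RobertaModel are hypothetical and only serve to keep the example self-contained.

```python
import torch
from torch import nn


class FrozenEncoderWrapper(nn.Module):
    """Hypothetical wrapper; the tiny encoder stands in for RobertaModel."""

    def __init__(self):
        super().__init__()
        # Stand-in for the pretrained text encoder (assumption for illustration).
        self.text_encoder = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
        for param in self.text_encoder.parameters():
            param.requires_grad = False  # stops gradient updates only
        self.text_encoder.eval()  # disables dropout, until train() is called
        self.head = nn.Linear(8, 2)  # trainable part

    def train(self, mode=True):
        # Let the rest of the model follow `mode`, then force the frozen
        # encoder back to eval so dropout stays disabled.
        super().train(mode)
        self.text_encoder.eval()
        return self


model = FrozenEncoderWrapper()
model.train()  # what a typical training loop does every epoch
encoder_is_training = model.text_encoder.training  # False
head_is_training = model.head.training  # True
```

Without the train() override, model.text_encoder.training would be True after model.train(), and dropout inside the frozen encoder would be active again.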

Question about "point_instance_label"

Hi, thanks for your great work.

But I have a question about "point_instance_label" when reading the code.

In src/joint_det_dataset.py, function _get_target_boxes:

point_instance_label = -np.ones(len(scan.pc))
for t, tid in enumerate(tids):
    point_instance_label[scan.three_d_objects[tid]['points']] = t

Is there any problem with setting the label of the point to the sequence index t?

In my opinion, the code should be modified as follows:

point_instance_label = -np.ones(len(scan.pc))
for t, tid in enumerate(tids):
    point_instance_label[scan.three_d_objects[tid]['points']] = tid

In this way, all points in the same scene can be correctly classified into the corresponding object IDs.

If we set point_instance_label to t, the label of a point can only be 0 on the ScanRefer dataset when joint_det is false, which leads to logic errors.
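
To make the difference between the two variants concrete, here is a small sketch on toy data. The objects dictionary, tids, and target_boxes are made-up stand-ins for scan.three_d_objects and the per-target boxes, and the sketch does not settle which variant the rest of the pipeline expects. It only shows that t is a position in the target list, while tid is a scene-level object ID.

```python
import numpy as np

num_points = 10
# Hypothetical stand-in for scan.three_d_objects: object ID -> point indices.
objects = {3: {"points": np.array([0, 1, 2])}, 7: {"points": np.array([5, 6])}}
tids = [7, 3]  # target object IDs, in the order the targets are listed


def label_points(tids, use_object_id):
    """Label each point with the target position t or with the object ID tid."""
    labels = -np.ones(num_points, dtype=np.int64)
    for t, tid in enumerate(tids):
        labels[objects[tid]["points"]] = tid if use_object_id else t
    return labels


by_index = label_points(tids, use_object_id=False)
# [ 1,  1,  1, -1, -1,  0,  0, -1, -1, -1]
by_id = label_points(tids, use_object_id=True)
# [ 3,  3,  3, -1, -1,  7,  7, -1, -1, -1]

# If downstream code stores one box per target (row t belongs to tids[t]),
# the index-based labels can be used directly to look up a point's box.
target_boxes = np.zeros((len(tids), 6))
box_of_point_5 = target_boxes[by_index[5]]  # row 0, the box of object 7
```

With ID-based labels, the same lookup would need an extra mapping from tid back to its row, because 7 is not a valid row of a two-row array.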

Question about performance on ScanRefer

When I run train_scanrefer.sh, the results in the output log are extremely bad, and I don't know why.

last_ position alignment Acc0.25: Top-1: 0.00032, Top-5: 0.00294, Top-10: 0.00936
last_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
last_ semantic alignment Acc0.25: Top-1: 0.00179, Top-5: 0.00947, Top-10: 0.01536
last_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00032, Top-10: 0.00032
proposal_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
proposal_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
proposal_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
proposal_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
0head_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
0head_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
0head_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
0head_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
1head_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
1head_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
1head_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
1head_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
2head_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
2head_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
2head_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
2head_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
3head_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
3head_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
3head_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
3head_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
4head_ position alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
4head_ position alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
4head_ semantic alignment Acc0.25: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000
4head_ semantic alignment Acc0.50: Top-1: 0.00000, Top-5: 0.00000, Top-10: 0.00000

Analysis
Acc@0.25
easy 0.0
hard 0.002426491578646874
vd 0.0014448257178977786
vid 0.002266431629312516
unique 0.0
multi 0.0021016194832488566
Acc@0.5
easy50 0.0
hard50 0.0
vd50 0.0
vid50 0.0
unique50 0.0
multi50 0.0
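
For readers interpreting these numbers: Acc at an IoU threshold is conventionally the fraction of samples whose predicted box overlaps the ground-truth box with IoU at or above that threshold. The sketch below illustrates this under the assumption of axis-aligned boxes given as (cx, cy, cz, w, h, d). The function names are hypothetical, and the repository's own evaluation code may differ.

```python
def box_iou_3d(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, cz, w, h, d)."""
    inter = 1.0
    for axis in range(3):
        lo = max(a[axis] - a[axis + 3] / 2, b[axis] - b[axis + 3] / 2)
        hi = min(a[axis] + a[axis + 3] / 2, b[axis] + b[axis + 3] / 2)
        inter *= max(0.0, hi - lo)  # zero if the boxes do not overlap here
    vol_a = a[3] * a[4] * a[5]
    vol_b = b[3] * b[4] * b[5]
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_iou(preds, gts, threshold):
    """Fraction of (prediction, ground truth) pairs with IoU >= threshold."""
    if not gts:
        return 0.0
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


gt = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
pred = (1.0, 0.0, 0.0, 2.0, 2.0, 2.0)  # shifted by half a box width
iou = box_iou_3d(pred, gt)  # 1/3: counts at threshold 0.25, not at 0.5
```

Values near zero at both thresholds, as in the log above, mean that almost no predicted box overlaps its ground truth at all, which usually points to a setup or data problem rather than an undertrained model.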

Ask for visualization code

Can I ask you for the visualization code for plotting the bounding boxes on the RGB-D data and saving the result as a 2D RGB image, as shown in your figures?
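
Until the authors share their script, a generic starting point is the standard pinhole projection of 3D box corners into image pixels. The sketch below is not the authors' visualization code. It assumes the corner points are already expressed in the camera frame (z pointing forward) and that the intrinsics fx, fy, cx, cy of the RGB frame are known. The function name is hypothetical.

```python
def project_points(points_cam, fx, fy, cx, cy):
    """Project camera-frame 3D points to pixel coordinates (pinhole model).

    Points at or behind the camera plane (z <= 0) cannot be projected and
    are returned as None.
    """
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            pixels.append(None)
        else:
            pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


# A point on the optical axis lands on the principal point.
center_pixel = project_points([(0.0, 0.0, 2.0)], 500.0, 500.0, 320.0, 240.0)[0]
# (320.0, 240.0)
```

Drawing line segments between the projected corners of each box edge on the RGB frame, with any image library, then gives the 2D overlay. Boxes stored in world coordinates first need the world-to-camera pose of that frame.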

The log does not contain the evaluation values. Can the number of epochs be reduced? Which metric represents overall performance?

The log file only contains loss information, and the evaluation results are only printed in the terminal, which is very inconvenient for a 400-epoch training run. I have to test each epoch again.
Are 400 epochs necessary? In my experiments, the validation loss showed overfitting near epoch 50. Can the number of epochs be reduced to 75 or another relatively small value? Will it affect the results? Have you done any corresponding experiments?
How long does it take to run the complete 400 epochs?
Your public training log only reaches epoch 72. In issue 3, I see that you load epoch 60, and the overall result comes from last_semantic alignment Acc0.25: Top-1: 0.54586.
But in the training log, the results of epoch 60 are:

last_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
last_ Box given span (contrastive) Acc0.25: Top-1: 0.546, Top-5: 0.680, Top-10: 0.736
last_ Box given span (contrastive) Acc0.50: Top-1: 0.423, Top-5: 0.573, Top-10: 0.627
proposal_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
proposal_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
proposal_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
proposal_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
0head_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
0head_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
0head_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
0head_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
1head_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
1head_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
1head_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
1head_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
2head_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
2head_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
2head_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
2head_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
3head_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
3head_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
3head_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
3head_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
4head_ Box given span (soft-token) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
4head_ Box given span (soft-token) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
4head_ Box given span (contrastive) Acc0.25: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
4head_ Box given span (contrastive) Acc0.50: Top-1: 0.000, Top-5: 0.000, Top-10: 0.000
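
On the first point of this issue (evaluation values appear only in the terminal), one generic workaround is to send the evaluation summary through a logger that writes to both the console and a file. This is a minimal sketch using Python's standard logging module. The build_logger helper, the logger name, and the file name are hypothetical and not part of the repository.

```python
import logging
import sys


def build_logger(log_path, name="eval"):
    """Return a logger that writes every message to stdout and to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # do not add handlers twice on repeated calls
        formatter = logging.Formatter("%(asctime)s %(message)s")
        handlers = (logging.StreamHandler(sys.stdout),
                    logging.FileHandler(log_path))
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger


# Usage (hypothetical file name):
# logger = build_logger("eval_log.txt")
# logger.info("epoch 60 last_ Acc0.25 Top-1: 0.546")
```

In a distributed run it is usually enough to create the file handler on rank 0 only, so that the processes do not write the same lines several times.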

Why is one segment and one box?

Problem with evaluation

When evaluating the model, how can I get the overall accuracy? I can't find it in the output log.
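
Judging from the logs quoted in the earlier issues, the Top-1 value on the "last_ ... Acc0.25" line (0.546 in the epoch-60 log) matches the Acc@0.25 figure in the results table (54.59), so that line appears to be the headline number. This is an inference from the thread, not something the authors confirm here. Under that assumption, a small parser like the following sketch can pull the values out of a saved log. The function names are hypothetical.

```python
import re

# Matches lines such as:
# last_ Box given span (contrastive) Acc0.25: Top-1: 0.546, Top-5: 0.680, Top-10: 0.736
LINE_RE = re.compile(
    r"^(?P<head>\S+)_\s+(?P<metric>.+?)\s+Acc(?P<iou>\d\.\d+):\s+"
    r"Top-1:\s+(?P<top1>[\d.]+),\s+Top-5:\s+(?P<top5>[\d.]+),\s+"
    r"Top-10:\s+(?P<top10>[\d.]+)"
)


def parse_eval_lines(text):
    """Turn evaluation log lines into a list of dictionaries."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append({
                "head": match["head"],
                "metric": match["metric"],
                "iou": float(match["iou"]),
                "top1": float(match["top1"]),
                "top5": float(match["top5"]),
                "top10": float(match["top10"]),
            })
    return rows


def best_top1(rows, head="last", iou=0.25):
    """Highest Top-1 row for one prediction head and IoU threshold."""
    candidates = [r for r in rows if r["head"] == head and r["iou"] == iou]
    return max(candidates, key=lambda r: r["top1"], default=None)
```

Calling best_top1(parse_eval_lines(log_text)) returns the row with the highest Top-1 for the last decoder layer at IoU 0.25, or None if the log contains no matching line.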

When will the code be released?

Thanks for your great work!
I am really interested in this paper. I wonder when the code will be released.

group_free_pred_bboxes

How do you get the bboxes? Are the bbox sizes normalized? I cannot visualize them in the ScanNetv2 dataset.
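
Whether the stored sizes are normalized is for the authors to confirm. As a quick sanity check, one can expand a box into its eight corners and compare them against the extent of the scene's point cloud. The sketch below assumes an axis-aligned box given as a center and a full size in the same units as the scene. This format is an assumption, not the confirmed layout of group_free_pred_bboxes, and the function names are hypothetical.

```python
from itertools import product


def box_corners(center, size):
    """Eight corners of an axis-aligned box from its center and full size."""
    cx, cy, cz = center
    w, h, d = size
    return [
        (cx + sx * w / 2, cy + sy * h / 2, cz + sz * d / 2)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


def box_edges():
    """Index pairs of corners joined by an edge (12 in total).

    Corner i encodes its three signs as the bits of i, so two corners share
    an edge exactly when their indices differ in a single bit.
    """
    return [
        (i, j)
        for i in range(8)
        for j in range(i + 1, 8)
        if bin(i ^ j).count("1") == 1
    ]


corners = box_corners(center=(1.0, 2.0, 3.0), size=(2.0, 4.0, 6.0))
# first corner (0.0, 0.0, 0.0), last corner (2.0, 4.0, 6.0)
```

If the corners fall far outside the range of the scene's point coordinates, the boxes are probably stored in a different convention (for example normalized, or as min and max corners) and need converting before they can be drawn.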

Question of visualization

Thanks for your excellent work. I have run your code, but because I entered this research field only recently, I have trouble visualizing your results, such as the detection boxes in Figure 1. Would you please release the visualization code? Thank you very much!

ImportError: Could not import _ext module.

I created the environment as the README says and trained on SR3D. Here are the errors:

/home/sd/anaconda3/envs/EDA/lib/python3.7/site-packages/torch/distributed/launch.py:164: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
"The module torch.distributed.launch is deprecated "
The module torch.distributed.launch is deprecated and going to be removed in future.Migrate to torch.distributed.run

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

WARNING:torch.distributed.run:--use_env is deprecated and will be removed in future releases.
Please read local_rank from os.environ('LOCAL_RANK') instead.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
entrypoint : train_dist_mod.py
min_nodes : 1
max_nodes : 1
nproc_per_node : 2
run_id : none
rdzv_backend : static
rdzv_endpoint : 127.0.0.1:3333
rdzv_configs : {'rank': 0, 'timeout': 900}
max_restarts : 3
monitor_interval : 5
log_dir : None
metrics_cfg : {}

INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_2x3bsv88/none_l9qwazu3
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/home/sd/anaconda3/envs/EDA/lib/python3.7/site-packages/torch/distributed/elastic/utils/store.py:53: FutureWarning: This is an experimental API and will be changed in future.
"This is an experimental API and will be changed in future.", FutureWarning
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=0
master_addr=127.0.0.1
master_port=3333
group_rank=0
group_world_size=1
local_ranks=[0, 1]
role_ranks=[0, 1]
global_ranks=[0, 1]
role_world_sizes=[2, 2]
global_world_sizes=[2, 2]

INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_2x3bsv88/none_l9qwazu3/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_2x3bsv88/none_l9qwazu3/attempt_0/1/error.json
Traceback (most recent call last):
  File "/home/sd/Harddisk/sba/BS/EDA-master/pointnet2/pointnet2_utils.py", line 26, in <module>
    import pointnet2._ext as _ext
ImportError: /home/sd/.local/lib/python3.7/site-packages/pointnet2-0.0.0-py3.7-linux-x86_64.egg/pointnet2/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNSt15__exception_ptr13exception_ptr9_M_addrefEv
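
Two details in this traceback may help narrow things down. First, the interpreter runs from the conda environment (/home/sd/anaconda3/envs/EDA), but the compiled extension is loaded from the user site directory (/home/sd/.local/lib/python3.7/site-packages), so it may be a leftover build from another setup. Second, the missing symbol demangles to std::__exception_ptr::exception_ptr::_M_addref(), a C++ standard library symbol, which often indicates that the extension was compiled with a different toolchain than the one in use at import time. The following is a diagnostic sketch, with hypothetical helper names, for checking where a package would be imported from:

```python
import importlib.util
import os
import sys


def is_under(path, prefix):
    """True if path lies inside the directory prefix."""
    path, prefix = os.path.abspath(path), os.path.abspath(prefix)
    try:
        return os.path.commonpath([path, prefix]) == prefix
    except ValueError:  # e.g. paths on different drives on Windows
        return False


def locate(module_name, env_prefix=sys.prefix):
    """Return (file the module would load from, whether it is inside the env).

    Note: for a dotted name, find_spec imports the parent package first.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None, False
    if spec is None or spec.origin is None:
        return None, False
    return spec.origin, is_under(spec.origin, env_prefix)


# Usage inside the training environment:
# origin, inside_env = locate("pointnet2")
```

If inside_env is False for pointnet2, removing the copy under ~/.local and rebuilding the extension inside the active environment is a reasonable next step to try.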