imtfa's People

Contributors: danganea

imtfa's Issues

Why is seed_9 used for iMTFA configs for the NOVEL + ALL classes?

Dear authors,

Thank you for your great work and your code.

I followed your instructions to re-run your code, but when I run iMTFA in the novel phase, I found that the data is taken from 'coco_trainval_novel_1shot_seed9'. Could you please explain this setup?
I would like to know why seed 9 is chosen rather than another seed. Is seed 9 the one used for the results reported in your paper?

Thank you in advance! I hope to hear your response soon.

Sincerely,
DuyNN

Paper architecture

Hi Dan
Thanks for your implementation. I have a question: is meta-learning applied in the iMTFA model?

Thank you.

NaN losses

Hi,
I am observing NaN losses just after the 1st or 2nd iteration. I am running the following command: `python3 tools/run_train.py --config-file configs/coco-experiments/mask_rcnn_R_50_FPN_fc_fullclsag_base.yaml`

The following is the error I am getting:

```
Traceback (most recent call last):
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 217, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 240, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=2!
loss_dict = {'loss_cls': tensor(6.5565, device='cuda:0', grad_fn=<NllLossBackward>), 'loss_box_reg': tensor(0.0876, device='cuda:0', grad_fn=<DivBackward0>), 'loss_mask': tensor(nan, device='cuda:0', grad_fn=<BinaryCrossEntropyWithLogitsBackward>), 'loss_rpn_cls': tensor(0.7022, device='cuda:0', grad_fn=<MulBackward0>), 'loss_rpn_loc': tensor(0.0971, device='cuda:0', grad_fn=<MulBackward0>)}
[07/25 13:14:56 d2.engine.hooks]: Total training time: 0:00:01 (0:00:00 on hooks)
Traceback (most recent call last):
  File "tools/run_train.py", line 151, in <module>
    args=(args,),
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/launch.py", line 72, in launch
    main_func(*args)
  File "tools/run_train.py", line 126, in main
    return trainer.train()
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/defaults.py", line 393, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 132, in train
    self.run_step()
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 217, in run_step
    self._detect_anomaly(losses, loss_dict)
  File "/home/puneet/segmentation/iMTFA/detectron2/engine/train_loop.py", line 240, in _detect_anomaly
    self.iter, loss_dict
FloatingPointError: Loss became infinite or NaN at iteration=2!
loss_dict = {'loss_cls': tensor(6.5565, device='cuda:0', grad_fn=<NllLossBackward>), 'loss_box_reg': tensor(0.0876, device='cuda:0', grad_fn=<DivBackward0>), 'loss_mask': tensor(nan, device='cuda:0', grad_fn=<BinaryCrossEntropyWithLogitsBackward>), 'loss_rpn_cls': tensor(0.7022, device='cuda:0', grad_fn=<MulBackward0>), 'loss_rpn_loc': tensor(0.0971, device='cuda:0', grad_fn=<MulBackward0>)}
Segmentation fault (core dumped)
```

Any idea what could be the issue here?
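For what it's worth, the traceback already narrows the problem down: only `loss_mask` is NaN. A minimal, dependency-free sketch of the kind of check detectron2's `_detect_anomaly` performs, using plain floats standing in for detectron2's 0-dim tensors (on real tensors you would call `.item()` first), can help confirm which term goes non-finite first:

```python
import math

def find_nonfinite_losses(loss_dict):
    """Return the loss terms that are NaN or infinite.

    loss_dict maps loss names to scalar values; plain floats are used
    here for illustration, whereas detectron2 yields 0-dim tensors.
    """
    return {k: v for k, v in loss_dict.items() if not math.isfinite(v)}

# Mirrors the loss_dict from the traceback above
losses = {
    "loss_cls": 6.5565,
    "loss_box_reg": 0.0876,
    "loss_mask": float("nan"),
    "loss_rpn_cls": 0.7022,
    "loss_rpn_loc": 0.0971,
}
print(sorted(find_nonfinite_losses(losses)))  # ['loss_mask']
```

Since the other losses are finite, the usual suspects (an assumption, not a diagnosis) would be degenerate or empty ground-truth masks reaching the binary cross-entropy, or a learning rate too high for the batch size.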

map and recall after training of every session

Dear authors,

I read your paper, it's a great work.
I could not find the dynamic change of accuracy during training. For example, when training in a 5-way K-shot setting, there should be four sets of results.
Where can I find this part?

Thanks!

Training on ImageNet to compare with Siamese-Mask-RCNN

The proposed MTFA/iMTFA networks are compared in the paper with Siamese Mask R-CNN.
However, the networks in those comparison experiments are trained on the 1K ImageNet classes, which slightly violates few-shot conventions. At the same time, lists of 687 classes (ImageNet-1K without COCO overlap) and 771 classes (ImageNet-1K without Pascal VOC overlap) were uploaded at bethgelab/siamese-mask-rcnn/data/.

Moreover, weights of models trained on the 687- and 771-class subsets were released at bethgelab/siamese-mask-rcnn/releases.

Additional experiments training MTFA from a (687/771)-class backbone would reveal the difference between using base classes at the pretraining stage and a fully pretrained feature extractor.

'model_reset.pth' in second stage

When I try to run iMTFA's second training stage using configs/coco-experiments/mask_rcnn_R_50_FPN_ft_fullclsag_cos_bh_base.yaml, how can I get model_reset.pth?

Models for the Mask-RCNN

Via the link in the README I have reached only faster_rcnn_xxx config files. This seems strange because the work is dedicated to instance segmentation. Where can I download the mask_rcnn models?

Usage of few shots of novel classes

After analysing the code, it did not become clear how the few shots of the novel classes are used.
It would be great to get feedback on this for both MTFA and iMTFA. Could you please point out the relevant scripts?

Assuming MTFA is a two-stage approach, there has to be a place in the code where the cls, bbox, and mask heads/layers grow when novel classes are added.

On the other hand, since iMTFA captures information about novel classes in another way, somewhere in the code the class vectors must be computed and stored.
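For the iMTFA side of this question, my reading of the paper is that the stored class vector is simply the re-normalized mean of the L2-normalized embeddings of the K shots, appended as a new row of the cosine classifier. A dependency-free sketch of that step (function and variable names are illustrative, not the repo's):

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def compute_class_weights(support):
    """support: {class_name: [embedding, ...]} with K shot embeddings per class.

    Each class weight is the re-normalized mean of the L2-normalized
    shot embeddings, i.e. the new classifier row for that novel class.
    """
    weights = {}
    for cls, shots in support.items():
        shots = [l2_normalize(e) for e in shots]
        mean = [sum(dim) / len(shots) for dim in zip(*shots)]
        weights[cls] = l2_normalize(mean)
    return weights

# Two 2-shot novel classes with toy 2-D embeddings
w = compute_class_weights({
    "novel_cat": [[1.0, 0.0], [1.0, 0.2]],
    "novel_dog": [[0.0, 1.0], [0.1, 1.0]],
})
```

Where exactly this happens in the repo is best confirmed by the authors; the sketch only captures the averaging described in the paper.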

test on VOC dataset (coco2voc)

Hi, thanks for sharing the codes.
Could you please share this file "VOCOutput/annotations/val_converted.json" for voc testing?
I am getting the error below when testing with --set-voc-test:

FileNotFoundError: [Errno 2] No such file or directory: 'datasets/VOCOutput/annotations/val_converted.json'

Thanks in advance!

How to convert VOC XML to COCO format

Hi, thank you for sharing your amazing work.

According to the README, you manually converted the VOC annotations to COCO format.
Could I get the source code that converts the entire VOC dataset to COCO JSON?

Best,
Dongmin Choi
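This is not the authors' converter, but the core of the mapping is small: VOC stores boxes as `[xmin, ymin, xmax, ymax]`, while COCO expects `[x, y, width, height]`. A minimal sketch for one VOC `<object>` element, assuming standard VOC XML (the category/image IDs here are illustrative):

```python
import xml.etree.ElementTree as ET

def voc_object_to_coco_ann(obj, image_id, ann_id, category_ids):
    """Convert one VOC <object> element to a COCO annotation dict.

    VOC boxes are [xmin, ymin, xmax, ymax]; COCO boxes are
    [x, y, width, height].
    """
    box = obj.find("bndbox")
    xmin = float(box.find("xmin").text)
    ymin = float(box.find("ymin").text)
    xmax = float(box.find("xmax").text)
    ymax = float(box.find("ymax").text)
    w, h = xmax - xmin, ymax - ymin
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_ids[obj.find("name").text],
        "bbox": [xmin, ymin, w, h],
        "area": w * h,
        "iscrowd": 0,
    }

xml_snippet = """<annotation><object><name>dog</name>
<bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
</object></annotation>"""
root = ET.fromstring(xml_snippet)
ann = voc_object_to_coco_ann(root.find("object"), image_id=1, ann_id=1,
                             category_ids={"dog": 18})
print(ann["bbox"])  # [10.0, 20.0, 100.0, 200.0]
```

A full converter would additionally iterate over all XML files, collect the `images` and `categories` lists, and dump one JSON per split; the authors' exact category mapping may differ.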

ValueError: path '/home/danganea/Desktop/paper/iMTFA_code/detectron2/layers/csrc/vision.cpp' cannot be absolute

I tried to install on Windows 10 with CPU versions of torch and torchvision, Cython, and pycocotools (warning: not pycocotools-windows).
I cloned and entered the iMTFA directory.
After `pip3 install -e` an error occurred that required changing entries in the detectron2.egg-info/SOURCES.txt file from

/home/danganea/Desktop/paper/iMTFA_code/detectron2/layers/csrc/[vision, ...].cpp to
detectron2/layers/csrc/[vision, ...].cpp, or to an absolute path like
C:/Users/tooHotSpot/.../iMTFA/detectron2/layers/csrc/[vision, ...].cpp

A small error which just requires remapping to the cloned directory path.
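The remapping can be scripted; a hedged sketch (the stale prefix is the one from the error message, and the helper name is made up for illustration):

```python
def remap_sources(lines, stale_prefix="/home/danganea/Desktop/paper/iMTFA_code/"):
    """Drop the original author's absolute path prefix so each
    SOURCES.txt entry becomes repo-relative and works on any machine."""
    return [line.replace(stale_prefix, "") for line in lines]

# To rewrite detectron2.egg-info/SOURCES.txt in place, one could do:
# from pathlib import Path
# p = Path("detectron2.egg-info/SOURCES.txt")
# p.write_text("\n".join(remap_sources(p.read_text().splitlines())) + "\n")
```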

Parsing to windows

Hey,

I'm trying to make this repo Windows-friendly.

I adapted detectron2 to work on Windows. Installation went smoothly, and I've set up the data as instructed. When I try to run the MTFA first training stage (or any stage, for that matter) I get the following error: "Dataset 'coco_trainval_base' is not registered!"

Should this registration be done manually, or should it have happened during installation?

How to run the model in an N-way K-shot manner?

Hi,

Thank you so much for an amazing paper and for publishing this code!

I have a question regarding how to use one of your trained models in an N-way K-shot manner. I can see the repository closely follows detectron2, which means I can load the model's weights and get predictions via DefaultPredictor. That class's __call__ only takes an input image, though. I'm struggling to see how to incorporate the N*K embeddings of the support set (and how to obtain them in the first place), and then use the predictor to get results. (I've downloaded the weights for the mask_rcnn_R_50_FPN_ft_fullclsag_cos_bh_all_5shot_manyiters_V2_correct config.)

I'm very likely missing something obvious; apologies if this is a naive question. I'd appreciate any pointers on using the model in an N-way K-shot manner!

Thank you so much again
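As I understand the paper (not the repo's actual API), once the N per-class weight vectors have been built from the N*K support embeddings, novel-class inference reduces to cosine similarity: the dot product between the L2-normalized RoI embedding and each stored unit-norm class weight. A dependency-free sketch with made-up names:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_scores(query_embedding, class_weights):
    """class_weights: {class_name: unit-norm weight vector} built from
    the support set; returns the cosine similarity per class."""
    q = l2_normalize(query_embedding)
    return {c: sum(a * b for a, b in zip(q, w))
            for c, w in class_weights.items()}

# Toy 2-D example: two classes, one query RoI embedding
weights = {"cat": l2_normalize([1.0, 0.1]), "dog": l2_normalize([0.0, 1.0])}
scores = cosine_scores([0.9, 0.1], weights)
best = max(scores, key=scores.get)
print(best)  # cat
```

How the support embeddings are extracted and plugged into DefaultPredictor in this repo is a question for the authors; the sketch only shows the scoring step the paper describes.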
