junjiehe96 / fastinst

[CVPR2023] FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation

License: MIT License

Python 100.00%

fastinst's Issues

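Training on two 3090s gave a final result of 39.6258, short of your 40.1. Where might the problem be?

I used the default settings of fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml without any changes.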

[01/05 12:26:03 d2.evaluation.testing]: copypaste: Task: segm
[01/05 12:26:03 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[01/05 12:26:03 d2.evaluation.testing]: copypaste: 39.6315,61.6258,41.5632,17.0921,43.0538,62.6699

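RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch) during training

I ran into the same problem as #4. After changing the code to "with autocast()", the following error appeared:

I have not been able to resolve this problem so far.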

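Confusion about WEIGHT_DECAY

Hello, I see that MaskFormer uses WEIGHT_DECAY: 0.0001. Why is WEIGHT_DECAY set so much larger in your model? Is there a particular reason? I would appreciate your guidance.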

Regarding the issue of resume

Why does training get stuck while loading the weights of the pre-trained model?

[Checkpointer] Loading from “****.pth” ...

It always hangs at this message.

Question about the experiment

In the paper, Table 2 (IA-guided queries) mentions the zero-initialized object query and the learnable object query. Would you mind telling me how you implemented the zero-initialized object query and the learnable object query?

Best regards,
xiaolin

About Visualizing the results

Hi, thanks for your wonderful work. I have trained FastInst on a custom dataset and now want to visualize the prediction results. I used demo.py, but the visualization failed. Is there another way to visualize the results?

The gap between AP val and AP

AP val means mAP on validation set and AP means mAP on test set, right? The gap between them is large, is it normal?

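How do I run the COCO test-dev evaluation?

I changed DATASETS to:
TRAIN: ("coco_2017_train",)
TEST: ("coco_2017_test-dev",)
After running the test, uploading the generated JSON to the official site reports an error.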

AssertionError:Attribute 'thing_classes' in the metadata of 'coco_2017_val' cannot be set to a different value!

AssertionError:Attribute 'thing_classes' in the metadata of 'coco_2017_val' cannot be set to a different value!
['person', 'bicycle', 'car', 'motorcycle', 'airplane', ...]

Model does not output bounding boxes

I am visualizing the results of the FastInst model using demo.py. The mask segmentation of the objects is correct, but all bounding boxes are in the top-left corner of the image. Is this behaviour expected?
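One possible explanation, not confirmed in this thread, is that the model predicts only masks and leaves the box field unfilled (for example all zeros), so every box is drawn at the origin. If that is the case, boxes can be derived from the masks before visualization. A minimal sketch in plain Python; `mask_to_box` is a hypothetical helper, not part of FastInst or Detectron2:

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]


def mask_to_box(mask: List[List[int]]) -> Optional[Box]:
    """Return the tight (x0, y0, x1, y1) box around nonzero pixels.

    x1 and y1 are exclusive. Returns None for an empty mask.
    """
    # Rows that contain at least one foreground pixel.
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices of every foreground pixel.
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols) + 1, rows[-1] + 1)


example_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
example_box = mask_to_box(example_mask)
print(example_box)
```

With tensor masks the same idea applies per instance; the list-based version above only illustrates the computation.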

About datasets/prepare_coco_semantic_annos_from_panoptic_annos.py

Thanks for your work. I have a question: why does COCO instance segmentation need panoptic_annos at the eval stage?

I want to use swin_transformer as my backbone

If I do this, how should I write my cfg file?

Not satisfied with overall results of the model.

I was able to finish training the model for roughly 280,000 iterations, and I am not satisfied with the overall results.
The overall mean AP for ResNet-50 with the batch size set to 6 comes out to only 27%, and the AP for small objects is only 9%.

What could be causing this drop in AP? Is it the batch size?
I also noticed that whenever I resume training, it starts at iteration 0 instead of resuming from where it left off.
An example of the instance segmentation on a COCO image is shown below.
Original image

Result image

Got an error: An error occurred: '>' not supported between instances of 'NoneType' and 'int'

I am a beginner. When running the code I got this error: An error occurred: '>' not supported between instances of 'NoneType' and 'int'
The run output is below.

[10/22 19:13:14 detectron2]: Command line arguments: Namespace(config_file='./configs/coco/instance-segmentation/Fast-COCO-InstanceSegmentation.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)
[10/22 19:13:14 detectron2]: Contents of args.config_file=./configs/coco/instance-segmentation/Fast-COCO-InstanceSegmentation.yaml:
[10/22 19:13:14 detectron2]: Full config saved to ./output/config.yaml
[10/22 19:13:20 d2.engine.defaults]: Model:
[10/22 19:13:20 fastinst.data.dataset_mappers.fastinst_instance_dataset_mapper]: [FastInstInstanceDatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(416, 448, 480, 512, 544, 576, 608, 640), max_size=853, sample_style='choice'), RandomFlip()]
[10/22 19:13:39 d2.data.datasets.coco]: Loading datasets/coco/annotations/instances_train2017.json takes 18.91 seconds.
[10/22 19:13:40 d2.data.datasets.coco]: Loaded 118287 images in COCO format from datasets/coco/annotations/instances_train2017.json
[10/22 19:13:47 d2.data.build]: Removed 1021 images with no usable annotations. 117266 images left.
[10/22 19:13:51 d2.data.build]: Distribution of instances among all 80 categories:
[10/22 19:13:51 d2.data.build]: Using training sampler TrainingSampler
[10/22 19:13:51 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[10/22 19:13:51 d2.data.common]: Serializing 117266 elements to byte tensors and concatenating them all ...
[10/22 19:13:54 d2.data.common]: Serialized dataset takes 451.21 MiB
After that, the error was raised:
An error occurred: '>' not supported between instances of 'NoneType' and 'int'

errors when converting timm to d2

The problems are as follows

About the model input

Hello, is the model's input size dynamic or a fixed, predefined size? What is the format of the size?

Inference demo.py

0%| | 0/76 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data3/shenbaoyue/code/FastInst-try-maskattention/demo/demo.py", line 123, in
img = read_image(path, format="BGR")
File "/data3/shenbaoyue/code/FastInst-try-maskattention/detectron2/detectron2/data/detection_utils.py", line 180, in read_image
with PathManager.open(file_name, "rb") as f:
File "/data3/shenbaoyue/anaconda3/envs/fastry/lib/python3.8/site-packages/iopath/common/file_io.py", line 1012, in open
bret = handler._open(path, mode, buffering=buffering, **kwargs) # type: ignore
File "/data3/shenbaoyue/anaconda3/envs/fastry/lib/python3.8/site-packages/iopath/common/file_io.py", line 604, in _open
return open( # type: ignore
IsADirectoryError: [Errno 21] Is a directory: '/'

Process finished with exit code 1

How can I resolve this error, which appears when I configure the parameters to run demo/demo.py? The training process is normal. I have searched online for a solution without success. I would greatly appreciate any assistance.
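The traceback shows `read_image` being called with the path '/', which suggests that the list of inputs contained a directory, or that a single path string was iterated character by character. Neither is confirmed in this thread. A minimal sketch of expanding the inputs into image files before the loop; `collect_image_files` and the extension set are hypothetical and not part of demo.py:

```python
import os
import tempfile
from typing import Iterable, List

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_image_files(paths: Iterable[str]) -> List[str]:
    """Expand directories into their image files and skip anything unreadable."""
    files: List[str] = []
    for path in paths:
        if os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                full = os.path.join(path, name)
                extension = os.path.splitext(name)[1].lower()
                if os.path.isfile(full) and extension in IMAGE_EXTENSIONS:
                    files.append(full)
        elif os.path.isfile(path):
            files.append(path)
    return files


# Small self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "a.jpg"), "wb").close()
    open(os.path.join(root, "notes.txt"), "wb").close()
    os.mkdir(os.path.join(root, "sub"))
    found = [os.path.basename(p) for p in collect_image_files([root])]
print(found)
```

Note that the argument is a list of paths; passing a bare string would iterate over its characters, which is one way a path such as '/' can reach the image reader.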

Conversion to ONNX

Do you provide a method or script to convert the model to ONNX format?

GPU computational resources quantity

Dear Authors,
Thank you for your very interesting work and source code.

Could you please confirm the number of GPUs used in the training process? Whether it is 1x V100 or 4x V100?
In the paper, it is indicated that 1x A100 GPU is used to evaluate and infer. But in the source code, 4x GPU is pre-set up in the script.

Many thanks in advance.

Training error

Running train_net.py raised TypeError: __init__() got an unexpected keyword argument 'dtype'. The error is located in detectron2/engine/train_loop.py.
Has anyone encountered the same problem? How did you solve it?

How to test models with "coco_2017_test-dev"

I downloaded "image_info_test-dev2017.json" from COCO and used "coco_2017_test-dev" as the TEST dataset. When I run train_net.py to evaluate the model, I cannot get the accuracy, and the result shows "Annotations are not available for evaluation."

I would really appreciate any help in solving this problem.

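When retraining, I hit the error below. After investigating, I found that some values become NaN when the backbone extracts features. How should I handle this?

demo.py visualization: the threshold has no effect and the masks overlap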

./demo/demo.py
--config-file .n/configs/coco/instance-segmentation/fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml
--input /hy-tmp/datasets/coco/val2017/000000001818.jpg
--output ./可视化结果
--confidence-threshold 0.5
--opts MODEL.WEIGHTS ./fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth

Convert to ONNX inference model

Can you please provide the script for converting Fastinst's pytorch model to ONNX inference model?

AttributeError: 'collections.OrderedDict' object has no attribute 'detach'

(fastinst) root@/FastInst# python tools/convert-timm-to-d2.py checkpoint_file/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pth checkpoint_file/fastinst_R50-vd-dcn_ppm-fpn_x3_640_40.1.pkl
model -> backbone.model
Traceback (most recent call last):
File "tools/convert-timm-to-d2.py", line 34, in
newmodel[k] = obj.pop(old_k).detach().numpy()
AttributeError: 'collections.OrderedDict' object has no attribute 'detach'
The error above occurred when I tried to convert the model with your tool. How can I solve it?
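The log line "model -> backbone.model" followed by the failure on `.detach()` suggests that the input file has a top-level "model" key holding a whole nested state dict, as training checkpoints typically do, whereas the script appears to expect a flat mapping from parameter names to tensors. This reading is an assumption and is not confirmed in the thread. A minimal sketch for spotting such nested entries; `nested_keys` is a hypothetical helper, and plain lists stand in for tensors:

```python
from collections import OrderedDict
from collections.abc import Mapping
from typing import Any, List


def nested_keys(state: Mapping) -> List[Any]:
    """Return the top-level keys whose values are themselves mappings."""
    return [key for key, value in state.items() if isinstance(value, Mapping)]


# Stand-in for a training checkpoint: weights wrapped under "model".
training_checkpoint = OrderedDict(
    model=OrderedDict({"backbone.conv1.weight": [0.1, 0.2]}),
    iteration=1000,
)
# Stand-in for a flat backbone state dict.
flat_state_dict = OrderedDict({"conv1.weight": [0.1, 0.2]})

wrapped = nested_keys(training_checkpoint)
flat = nested_keys(flat_state_dict)
print(wrapped, flat)
```

If the file turns out to be an already trained FastInst checkpoint, it may not need this conversion at all; that is for the authors to confirm.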

How to resume training from last checkpoint

Hi, I came across your work on instance segmentation and I am currently trying to reproduce the results. I was previously able to train the model for 90,000 iterations, but when I tried resuming the training from the last checkpoint, I got errors related to the configuration file not being loaded properly.

As I am new to detectron2, could you provide pointers on how to resume training from an existing checkpoint? Does the resume option expect a cfg file as an argument, or does it expect model weights?
Thanks.
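For background, Detectron2-style checkpointers write a small text file named `last_checkpoint` into the output directory, and resuming reads that file to find the latest weights, so the resume switch is a flag and the config file is still passed as usual. Whether this repository's train_net.py follows that convention unchanged is an assumption. A minimal sketch for checking what a resume would pick up; `find_last_checkpoint` is a hypothetical helper and the checkpoint file name is illustrative only:

```python
import os
import tempfile
from typing import Optional


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint path recorded in `last_checkpoint`, if any."""
    marker = os.path.join(output_dir, "last_checkpoint")
    if not os.path.isfile(marker):
        return None
    with open(marker, "r") as f:
        name = f.read().strip()
    return os.path.join(output_dir, name) if name else None


# Demonstration on a temporary directory with an illustrative file name.
with tempfile.TemporaryDirectory() as out_dir:
    before = find_last_checkpoint(out_dir)
    with open(os.path.join(out_dir, "last_checkpoint"), "w") as f:
        f.write("model_0089999.pth")
    after = os.path.basename(find_last_checkpoint(out_dir))
print(before, after)
```

If the helper returns None for the real output directory, a resumed run has nothing to continue from and would be expected to start at iteration 0.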

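About adding background images without objects to training

Hello, we would like to add background images to the training set, but we found that they are not used at all. We tried several suggested approaches, including facebookresearch/detectron2#819, without success. Is this because the network design itself does not allow background images, or are they already filtered out somewhere in the dataloader?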
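For context, Detectron2 has a config option `DATALOADER.FILTER_EMPTY_ANNOTATIONS` that drops images without usable annotations when the dataset is built, and a training log elsewhere on this page reports "Removed 1021 images with no usable annotations". Whether FastInst's dataset mapper and loss also handle empty targets is not stated here. A minimal sketch of what that kind of filtering does, in plain Python; `drop_images_without_instances` is a hypothetical stand-in, not the library function:

```python
from typing import Any, Dict, List


def drop_images_without_instances(
    dataset_dicts: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Keep only images that have at least one non-crowd annotation."""

    def has_instance(annotations: List[Dict[str, Any]]) -> bool:
        return any(a.get("iscrowd", 0) == 0 for a in annotations)

    return [d for d in dataset_dicts if has_instance(d.get("annotations", []))]


dataset = [
    {"file_name": "a.jpg", "annotations": [{"iscrowd": 0}]},
    {"file_name": "background.jpg", "annotations": []},
    {"file_name": "c.jpg", "annotations": [{"iscrowd": 1}]},
]
kept = [d["file_name"] for d in drop_images_without_instances(dataset)]
print(kept)
```

The background image is removed before training ever sees it, which matches the symptom described above.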

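About analyzing the model's output

Hello, when I print the detection results for a single image, the output consists of premask, bbox, scores, and classes. Why are all the values in the 100x4 bbox tensor 0? Also, given that a threshold is already applied, why are there always 100 output classes?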

Why do the ResNet with DCN and the ResNet without DCN use two different pre-trained models?

I am a beginner and do not quite understand this.

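How did you obtain the test results on the COCO 2017 test-dev set? Why is the JSON file generated by my eval on COCO 2017 test-dev so large, and why does uploading it to the official COCO site for evaluation fail?

Hello, this is great work!
I am reproducing the results of your paper, but ran into a problem. I ran the following command: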
python train_net.py --eval-only --num-gpus 2 --config-file configs/coco/instance-segmentation/fastinst_R50_ppm-fpn_x1_576.yaml MODEL.WEIGHTS /path/to/checkpoint_file
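The weight file was downloaded from the provided link. I changed the test dataset to COCO 2017 test-dev and downloaded the corresponding annotation JSON file. After evaluation finished, a JSON file was generated under the output path, but it is very large, about 1.1 GB; if I use test instead of test-dev, it is 2.6 GB. I then renamed the file as required by the official COCO evaluation and uploaded the JSON file to the server, but none of the uploads succeeded.
Error from the first upload: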
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
File "/tmp/codalab/tmpa2BTU6/run/program/run.py", line 112, in
res.extend(json.load(data_file))
File "/opt/conda/lib/python2.7/json/__init__.py", line 291, in load
**kw)
File "/opt/conda/lib/python2.7/json/__init__.py", line 339, in loads
return _default_decoder.decode(s)
File "/opt/conda/lib/python2.7/json/decoder.py", line 364, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/conda/lib/python2.7/json/decoder.py", line 380, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 48971776 (char 48971775)
Error from the second upload:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
Traceback (most recent call last):
File "/tmp/codalab/tmpC6m7hj/run/program/run.py", line 120, in
cocoDt=cocoGt.loadRes(resFile)
File "/opt/conda/lib/python2.7/site-packages/pycocotools/coco.py", line 309, in loadRes
anns = json.load(open(resFile))
File "/opt/conda/lib/python2.7/json/__init__.py", line 287, in load
return loads(fp.read(),
MemoryError
Error from the third upload:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
/opt/conda/lib/python2.7/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
I would appreciate your advice. Thank you!
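The first server error is a JSON parse failure and the second a MemoryError, so it may help to confirm locally that the file parses and to see how many entries it holds per image before uploading. This is only a diagnostic sketch; `summarize_results` is a hypothetical helper, and it assumes the file is a JSON list of entries that each carry an `image_id`, as in the COCO results format:

```python
import json
import os
import tempfile
from collections import Counter
from typing import Dict


def summarize_results(path: str) -> Dict[str, float]:
    """Parse a results file and report entry and image counts."""
    with open(path, "r") as f:
        # Raises json.JSONDecodeError if the file is truncated or malformed.
        results = json.load(f)
    if not isinstance(results, list):
        raise ValueError("expected a JSON list of result entries")
    per_image = Counter(entry["image_id"] for entry in results)
    num_images = len(per_image)
    mean = len(results) / num_images if num_images else 0.0
    return {"entries": len(results), "images": num_images, "mean_per_image": mean}


# Demonstration with a tiny made-up results file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "results.json")
    with open(demo_path, "w") as f:
        json.dump(
            [
                {"image_id": 1, "category_id": 1, "score": 0.9},
                {"image_id": 1, "category_id": 3, "score": 0.2},
                {"image_id": 2, "category_id": 1, "score": 0.7},
            ],
            f,
        )
    summary = summarize_results(demo_path)
print(summary)
```

A very high mean number of entries per image would point to the file size coming from the number of kept predictions, but that is a guess to be checked against the real file.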

Hello, would it be possible to provide the visualization code for the IA-guided queries?

Thank you.

Predicting on an image

How do I use the trained model to run prediction on a single image?

Cannot import add_deeplab_config, build_lr_scheduler from detectron2.projects.deeplab

In train_net.py, the line "from detectron2.projects.deeplab import add_deeplab_config, build_lr_scheduler" fails: the module cannot be imported or found.

How can I view the loss curve during training?

About the number of instances

There are only six categories, but the inference result shows that a total of 100 instances are detected, what is the reason?
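A likely explanation, not confirmed in this thread, is that a query-based model emits a fixed number of predictions per image (one per query or a fixed top-k), so the count does not depend on the number of categories and most entries simply carry low scores. A minimal sketch of filtering by score after inference; `keep_confident` and the 0.5 threshold are illustrative assumptions:

```python
from typing import List, Sequence, Tuple


def keep_confident(
    scores: Sequence[float], classes: Sequence[int], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Return (index, class_id, score) for predictions at or above the threshold."""
    return [
        (index, class_id, score)
        for index, (score, class_id) in enumerate(zip(scores, classes))
        if score >= threshold
    ]


# Four made-up predictions standing in for the fixed-size model output.
example_scores = [0.92, 0.08, 0.61, 0.03]
example_classes = [2, 5, 2, 0]
confident = keep_confident(example_scores, example_classes)
print(confident)
```

The returned indices can then be used to select the matching masks.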

Hello, I am trying to improve your model and would like to ask how you calculate the frame rate.

For example, given this output:
[03/30 02:34:58 d2.evaluation.evaluator]: Inference done 1192/1250. Dataloading: 0.0016 s/iter. Inference: 0.0399 s/iter. Eval: 0.1135 s/iter. Total: 0.1550 s/iter. ETA=0:00:08
[03/30 02:35:03 d2.evaluation.evaluator]: Inference done 1227/1250. Dataloading: 0.0016 s/iter. Inference: 0.0399 s/iter. Eval: 0.1132 s/iter. Total: 0.1548 s/iter. ETA=0:00:03
[03/30 02:35:08 d2.evaluation.evaluator]: Total inference time: 0:03:13.537369 (0.155452 s / iter per device, on 4 devices)
[03/30 02:35:08 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:49 (0.039901 s / iter per device, on 4 devices)

Do you compute the frame rate directly as 1 / 0.039901?
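Whether the authors compute FPS this way is not answered here, but the arithmetic behind the question can be written out. The sketch below assumes one image per iteration per device; `fps_from_seconds_per_iter` is a hypothetical helper, and the two inputs are the per-iteration times from the log above:

```python
def fps_from_seconds_per_iter(seconds_per_iter: float, images_per_iter: int = 1) -> float:
    """Convert a per-iteration time into images per second."""
    if seconds_per_iter <= 0:
        raise ValueError("seconds_per_iter must be positive")
    return images_per_iter / seconds_per_iter


# "Pure compute" time from the log: model forward pass only.
pure_compute_fps = fps_from_seconds_per_iter(0.039901)
# Total time from the log: includes data loading and evaluation overhead.
total_fps = fps_from_seconds_per_iter(0.155452)
print(round(pure_compute_fps, 2), round(total_fps, 2))
```

The pure compute figure comes to roughly 25 FPS and the total figure to roughly 6.4 FPS, so which time is used changes the reported frame rate substantially.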

magic_number = pickle_module.load(f, **pickle_load_args)

Where can I download the file resnet50d_ra2-464e36ba.pkl?

training log file

Hi,
Thank you for your hard work on FastInst. Could you please provide me with a copy of your training log file?

Best regards,
[Xiaolin]

Problem when trying to train: TypeError: __init__() got an unexpected keyword argument 'dtype'

Hello, I hit the error above when trying to train. The full error output is as follows:
File "train_net.py", line 416, in
launch(
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/launch.py", line 84, in launch
main_func(*args)
File "train_net.py", line 410, in main
return trainer.train()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/defaults.py", line 486, in train
super().train(self.start_iter, self.max_iter)
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/train_loop.py", line 155, in train
self.run_step()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/defaults.py", line 496, in run_step
self._trainer.run_step()
File "/home/h/deep_learn/f_d/detectron2/detectron2/engine/train_loop.py", line 493, in run_step
with autocast(dtype=self.precision):
TypeError: __init__() got an unexpected keyword argument 'dtype'
How can I solve this?
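The traceback shows `autocast(dtype=self.precision)` failing, which usually means the installed `autocast` is older than the one the Detectron2 checkout expects; that diagnosis is an assumption and is not confirmed in this thread. A minimal sketch for checking whether a callable accepts a given keyword, using stand-in classes so that it runs without PyTorch; `accepts_keyword`, `OldAutocast`, and `NewAutocast` are all hypothetical:

```python
import inspect
from typing import Callable


def accepts_keyword(func: Callable, name: str) -> bool:
    """Return True if `func` can be called with `name` as a keyword argument."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False  # Signature not introspectable.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return True
    param = params.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


class OldAutocast:
    """Stand-in for an older autocast without a dtype parameter."""

    def __init__(self, enabled=True):
        self.enabled = enabled


class NewAutocast:
    """Stand-in for a newer autocast that accepts dtype."""

    def __init__(self, enabled=True, dtype=None):
        self.enabled = enabled
        self.dtype = dtype


old_ok = accepts_keyword(OldAutocast, "dtype")
new_ok = accepts_keyword(NewAutocast, "dtype")
print(old_ok, new_ok)
```

In a real environment the same check could be pointed at `torch.cuda.amp.autocast`; if it reports False, aligning the PyTorch and Detectron2 versions is the more robust fix than editing the call site.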

Cannot reproduce your frame rate on a V100

Originally posted by @junjiehe96 in #37 (comment)

[04/06 17:46:17 d2.evaluation.evaluator]: Inference done 370/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1694 s/iter. Total: 0.2281 s/iter. ETA=0:17:36
[04/06 17:46:22 d2.evaluation.evaluator]: Inference done 391/5000. Dataloading: 0.0022 s/iter. Inference: 0.0567 s/iter. Eval: 0.1699 s/iter. Total: 0.2289 s/iter. ETA=0:17:35
[04/06 17:46:27 d2.evaluation.evaluator]: Inference done 412/5000. Dataloading: 0.0022 s/iter. Inference: 0.0568 s/iter. Eval: 0.1704 s/iter. Total: 0.2296 s/iter. ETA=0:17:33
[04/06 17:46:32 d2.evaluation.evaluator]: Inference done 433/5000. Dataloading: 0.0023 s/iter. Inference: 0.0567 s/iter. Eval: 0.1712 s/iter. Total: 0.2303 s/iter. ETA=0:17:31
[04/06 17:46:37 d2.evaluation.evaluator]: Inference done 457/5000. Dataloading: 0.0022 s/iter. Inference: 0.0565 s/iter. Eval: 0.1706 s/iter. Total: 0.2295 s/iter. ETA=0:17:22
[04/06 17:46:42 d2.evaluation.evaluator]: Inference done 480/5000. Dataloading: 0.0022 s/iter. Inference: 0.0564 s/iter. Eval: 0.1704 s/iter. Total: 0.2292 s/iter. ETA=0:17:16

Why can't I reproduce your frame rate on a V100? Using the default settings of fastinst_R50-vd-dcn_ppm-fpn_x3_640.yaml without any changes, I get fewer than 20 FPS.

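A question about small objects

The paper says that this segmenter is not very friendly to small objects. Is that because the 1/8-size feature map from the last layer of the pixel decoder is fed into the transformer decoder? If I upsample the 1/8-size feature map and apply a conv to bring it to 1/4 size before feeding it into the transformer decoder, would that improve segmentation accuracy and give better results on small objects? Sacrificing some inference speed would be acceptable.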

Not able to Classify the dataset

Hello,

I'm trying to use this model on my own dataset, which has 6 different types of objects, and I need to perform instance segmentation on them. I'm currently using the ResNet-101 architecture and the SGD optimizer with an LR of 0.0001. Even though I trained the model for 1 lakh (100,000) epochs, the model is not able to classify the images.

Can you please let me know what mistake I might have made during training?

On a server with two 3090 GPUs and on another with two A6000 GPUs, both reported a pure inference time of ~0.022 s (about 45 FPS), which is much faster than what is reported in the paper and on the main page of the FastInst GitHub repository. Is there any post-processing that is not counted in the 'pure inference time'? If so, could you please tell me how to measure the full inference time? Or is it simply caused by differences in GPU/CPU/CUDA/PyTorch/...?
I look forward to your response. Thank you.