
rfbnet's Introduction

Receptive Field Block Net for Accurate and Fast Object Detection

By Songtao Liu, Di Huang, Yunhong Wang

Update (2021/07/23): YOLOX is here! A stronger YOLO, with ONNX, TensorRT, ncnn, and OpenVINO support!

Update: we propose a new method that reaches 42.4 mAP at 45 FPS on COCO; the code is available here.

Introduction

Inspired by the structure of Receptive Fields (RFs) in human visual systems, we propose a novel RF Block (RFB) module, which takes the relationship between the size and eccentricity of RFs into account, to enhance the discriminability and robustness of features. We further assemble the RFB module on top of SSD with a lightweight CNN backbone, constructing the RFB Net detector. You can use this code to train and evaluate the RFB Net for object detection. For more details, please refer to our ECCV paper.

VOC2007 Test

System                mAP   FPS (Titan X Maxwell)
Faster R-CNN (VGG16)  73.2  7
YOLOv2 (Darknet-19)   78.6  40
R-FCN (ResNet-101)    80.5  9
SSD300* (VGG16)       77.2  46
SSD512* (VGG16)       79.8  19
RFBNet300 (VGG16)     80.7  83
RFBNet512 (VGG16)     82.2  38

COCO

System                         test-dev mAP  Time (Titan X Maxwell)
Faster R-CNN++ (ResNet-101)    34.9          3.36s
YOLOv2 (Darknet-19)            21.6          25ms
SSD300* (VGG16)                25.1          22ms
SSD512* (VGG16)                28.8          53ms
RetinaNet500 (ResNet-101-FPN)  34.4          90ms
RFBNet300 (VGG16)              30.3          15ms
RFBNet512 (VGG16)              33.8          30ms
RFBNet512-E (VGG16)            34.4          33ms

MobileNet

System         COCO minival mAP  #parameters
SSD MobileNet  19.3              6.8M
RFB MobileNet  20.7              7.4M

Citing RFB Net

Please cite our paper in your publications if it helps your research:

@InProceedings{Liu_2018_ECCV,
author = {Liu, Songtao and Huang, Di and Wang, Yunhong},
title = {Receptive Field Block Net for Accurate and Fast Object Detection},
booktitle = {The European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}
}

Contents

  1. Installation
  2. Datasets
  3. Training
  4. Evaluation
  5. Models

Installation

  • Install PyTorch-0.4.0 by selecting your environment on the website and running the appropriate command.
  • Clone this repository. This repository is mainly based on ssd.pytorch and Chainer-ssd; huge thanks to them.
    • Note: We currently only support PyTorch-0.4.0 and Python 3+.
  • Compile the nms and coco tools:
./make.sh

Note: Check your GPU architecture support in utils/build.py, line 131. The default is:

'nvcc': ['-arch=sm_52',
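
For example, on a Pascal card such as a GTX 1080 (compute capability 6.1), this line would become (a hypothetical edit; keep the rest of the flag list unchanged):

'nvcc': ['-arch=sm_61',
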
  • Then download the dataset by following the instructions below, and install OpenCV:
conda install opencv

Note: For training, we currently support VOC and COCO.

Datasets

To make things easy, we provide simple VOC and COCO dataset loaders that inherit torch.utils.data.Dataset, making them fully compatible with the torchvision.datasets API.
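
A usage sketch (the class and helper names below are the ones train_RFB.py imports from the data package, as quoted in an issue further down this page; the constructor arguments are illustrative, so check data/voc0712.py for the exact signatures):

from torch.utils.data import DataLoader
from data import VOCroot, VOCDetection, AnnotationTransform, preproc, detection_collate

# build a VOC trainval dataset; preproc resizes to 300x300 and subtracts the channel means
dataset = VOCDetection(VOCroot, [('2007', 'trainval'), ('2012', 'trainval')],
                       preproc(300, (104, 117, 123), 0.6), AnnotationTransform())
loader = DataLoader(dataset, batch_size=32, shuffle=True, collate_fn=detection_collate)
images, targets = next(iter(loader))  # images: (32, 3, 300, 300); targets: list of box/label tensors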

VOC Dataset

Download VOC2007 trainval & test
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
Download VOC2012 trainval
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>

COCO Dataset

Install the MS COCO dataset at /path/to/coco from the official website; the default is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. It should have this basic structure:

$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/

UPDATE: COCO has since released the new train2017 and val2017 sets, which are just new splits of the same image set.

Training

First download the reduced-fc VGG-16 base network weights into a weights/ folder:

mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
  • To train RFBNet using the train script, simply specify the parameters listed in train_RFB.py as flags or change them manually.
python train_RFB.py -d VOC -v RFB_vgg -s 300 
  • Note:
    • -d: choose datasets, VOC or COCO.
    • -v: choose backbone version, RFB_vgg, RFB_E_vgg or RFB_mobile.
    • -s: image size, 300 or 512.
    • You can pick up training from a checkpoint by specifying its path as one of the training parameters (again, see train_RFB.py for options); a hypothetical example follows this list.
    • To reproduce the results in the paper, the VOC model should be trained for about 240 epochs, while the COCO version needs about 130 epochs.
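
A hypothetical resume invocation (the --resume_net and --resume_epoch flag names are assumptions; check train_RFB.py for the exact option names; the checkpoint filename matches one reported in the issues below):

python train_RFB.py -d VOC -v RFB_vgg -s 300 --resume_net weights/RFB_vgg_VOC_epoches_130.pth --resume_epoch 130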

Evaluation

To evaluate a trained network:

python test_RFB.py -d VOC -v RFB_vgg -s 300 --trained_model /path/to/model/weights

By default, it will directly output the mAP results on VOC2007 test or COCO minival2014. For VOC2012 test and COCO test-dev results, manually change the dataset in test_RFB.py, then save the detection results and submit them to the evaluation server.
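
For example, evaluating an RFB512-E model on COCO looks like this (mirroring an invocation reported in the issues below):

python test_RFB.py -d COCO -v RFB_E_vgg -s 512 --trained_model weights/RFB512_E_34_4.pth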

Models


rfbnet's Issues

TEST error

When I finished the training step and used the trained weights to test, the mismatch error below happened.
I used my own data to make a VOC-like dataset and trained the net. What should I change when I use the weights to test?
RuntimeError: Error(s) in loading state_dict for RFBNet:
size mismatch for conf.0.weight: copying a param of torch.Size([24, 512, 3, 3]) from checkpoint, where the shape is torch.Size([126, 512, 3, 3]) in current model.
size mismatch for conf.0.bias: copying a param of torch.Size([24]) from checkpoint, where the shape is torch.Size([126]) in current model.

loss_l = inf in training?

Hello, first of all, thanks for your code releasing.
I got a training loss of inf, actually loss_l = inf. I use your original code (only fixed some bugs), but I don't know why I get inf.
Parameters: lr:0.004, batchsize:32, base_model:vgg_reducedfc.pth
GPU: 1080ti

Any comments will be appreciated.
Thanks very much!

Final detection_eval = 0.737428

Thanks for your code! I ran your code for training; after 120K iterations the loss = 3.09611, but detection_eval = 0.737428. That's lower than your paper's result. Is something wrong with my training?

More details about the parameters in Table 3 of your paper

Thanks for your brilliant work. I want to know how the parameters in Table 3 were calculated.
For example, I cannot reproduce RFB's 34.5M parameters in Table 3; can you give me some details about it?
[screenshot of Table 3]
I calculated the parameters as follows:
(1)VGG16(base net)
conv1_1: weights 64(output numbers)x3(input numbers)x3x3(kernel size)=1728; bias 64(output numbers)
conv1_2: weights 64x64x3x3=36864; bias 64
......
fc7: weights 1024x1024x1x1=1048576; bias 1024
total: 12220096(weights)+6272(bias)=12226368=12.226368M
(2)RFB(fc7)
branch0: (1x1conv)weights 256x1024x1x1=262144, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
branch1:(1x1conv)weights 128x1024x1x1=131072, bn(batch norm) 128(output numbers)x2=256
(3x3conv)weights 256x128x3x3=294912, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
branch2:(1x1conv)weights 128x1024x1x1=131072, bn(batch norm) 128(output numbers)x2=256
(3x3conv)weights 192x128x3x3=221184, bn(batch norm) 192(output numbers)x2=384
(3x3conv)weights 256x192x3x3=442368, bn(batch norm) 256(output numbers)x2=512
(3x3conv)weights 256x256x3x3=589824, bn(batch norm) 256(output numbers)x2=512
ConvLinear: weights 1024x768x1x1=786432, bn(batch norm) 1024(output numbers)x2=2048
shortcut: weights 1024x1024x1x1=1048576, bn(batch norm) 1024(output numbers)x2=2048
total: 5087232(weights)+8064(bn)=5095296=5.095296M
(3)RFB(stride 2 or conv8)
total: 4169728(weights)+6016(bn)=4175744=4.175744M
(4)RFB(stride 2 or conv9)
total: 1042432(weights)+3008(bn)=1045440=1.04544M
(5)conv10_1: weights 128x256=32768; bn 128x2=256
conv10_2: weights 256x128x3x3=294912; bn 256x2=512
conv11_1: weights 128x256=32768; bn 128x2=256
conv11_2: weights 256x128x3x3=294912; bn 256x2=512
total: 655360(weights)+1536(bn)=656896=0.656896M
(6)multi_box
conv4_3: conf weights (21x6)x512x3x3=580608; conf bias 21x6=126
loc weights (4x6)x512x3x3=110592; loc bias 4x6=24
fc7:conf weights (21x6)x1024x3x3=1161216; conf bias 21x6=126
loc weights (4x6)x1024x3x3=221184; loc bias 4x6=24
conv8:conf weights (21x6)x512x3x3=580608; conf bias 21x6=126
loc weights (4x6)x512x3x3=110592; loc bias 4x6=24
conv9:conf weights (21x6)x256x3x3=290304; conf bias 21x6=126
loc weights (4x6)x256x3x3=55296; loc bias 4x6=24
conv10_2:conf weights (21x4)x256x3x3=193536; conf bias 21x4=84
loc weights (4x4)x256x3x3=36864; loc bias 4x4=16
conv11_2:conf weights (21x4)x256x3x3=193536; conf bias 21x4=84
loc weights (4x4)x256x3x3=36864; loc bias 4x4=16
total: 3571200(weights)+800(bias)=3572000=3.572M
Altogether, the total parameters are: 12.226368+5.095296+4.175744+1.04544+0.656896+3.572 = 26.771744M < 34.5M.
Do you think my calculation is correct? Why do I get so many fewer parameters?
Any comments will be appreciated.
Thanks.
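
For reference, such a count can be checked empirically in a few lines of PyTorch (a sketch; build_net is called the same way train_RFB.py calls it, as quoted in another issue on this page):

from models.RFB_Net_vgg import build_net

net = build_net('train', 300, 21)   # phase, image size, num_classes (21 for VOC)
total = sum(p.numel() for p in net.parameters())
print('%.2fM parameters' % (total / 1e6))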

USE multi-scale testing strategy

Hello, do you use a multi-scale testing strategy in test_RFB.py?
Your paper does not seem to mention using one.
Thanks!

Cannot test my dataset

When I used my own dataset to train the net, after 300 epochs the lowest location loss was under 1 and the class loss still exceeded 1. Then I used the weights to test my test set, but I found that if I changed class_num in the test script, the mismatch error happened; so I used the VOC class_num of 21 and the test script ran normally. But the result is none: the weights I trained for 300 epochs detect nothing, and the AP/mAP is 0. Can you give me some suggestions about how to fix this? Thanks.

I met a problem.

Traceback (most recent call last):
File "test_RFB.py", line 49, in
from models.RFB_Net_E_vgg import build_net
File "/media/media_share/linkfile/RFBNet/models/RFB_Net_E_vgg.py", line 405
return RFBNet(phase, size, *multibox(size, vgg(base[str(size)], 3),add_extras(size, extras[str(size)], 1024),mbox[str(size)], num_classes), num_classes)
SyntaxError: only named arguments may follow *expression

could you help me?

thank you very much

ImportError: cannot import name '_mask'

Hi, thank you for releasing your code. The idea of this paper is amazing.
I tried to train the model, but I met an error as follows:

python train_RFB.py
Traceback (most recent call last):
File "train_RFB.py", line 14, in
from data import VOCroot, COCOroot, VOC_300, VOC_512, COCO_300, COCO_512, COCO_mobile_300, AnnotationTransform, COCODetection, VOCDetection, detection_collate, BaseTransform, preproc
File "/data_1/models/RFBNet-master/data/init.py", line 3, in
from .coco import COCODetection
File "/data_1/models/RFBNet-master/data/coco.py", line 21, in
from utils.pycocotools.coco import COCO
File "/data_1/models/RFBNet-master/utils/pycocotools/coco.py", line 55, in
from . import mask as maskUtils
File "/data_1/models/RFBNet-master/utils/pycocotools/mask.py", line 4, in
from . import _mask
ImportError: cannot import name '_mask'

How can I fix it? Thanks a lot.

How can I get the MS COCO 'train2014(or 2017)_gt_roidb.pkl' file in the cache folder?

I'm trying to train this code on my PC following your instructions.
But there is a problem with the step that generates the ~_gt_roidb.pkl file.
I've read the dataset instructions, but nothing explains how the MS COCO cache folder and the ~_gt_roidb.pkl file are generated, and I cannot find the method at https://github.com/rbgirshick/py-faster-rcnn/blob/77b773655505599b94fd8f3f9928dbf1a9a776c7/data/README.md either.
Is there any way to get that file so I can train this model?

Reproducibility of RFB speed

Hi,

I think you should add torch.cuda.synchronize() inside the timer (e.g., after net(x)), because CUDA is asynchronous.
After adding this, I get ~0.12 s per forward pass.
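
A minimal timing sketch of what that looks like (the build_net call mirrors the one quoted in another issue on this page; whether 'test' is a valid phase should be checked against models/RFB_Net_vgg.py):

import time
import torch
from models.RFB_Net_vgg import build_net

net = build_net('test', 300, 21).cuda().eval()
x = torch.randn(1, 3, 300, 300).cuda()

torch.cuda.synchronize()    # drain queued kernels before starting the clock
t0 = time.time()
out = net(x)                # the forward pass is queued asynchronously
torch.cuda.synchronize()    # wait until it has actually finished
print('forward: %.4f s' % (time.time() - t0))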

High overhead GPU to CPU

The conversion of the boxes (CUDA float tensors) returned from the detector's forward pass to CPU float tensors has extremely high overhead. (I ignored the conversion to a numpy array, which takes about a microsecond.)

boxes = boxes.cpu().numpy()

It takes approximately 22 milliseconds at a 512 input size (detection time is approximately 9 milliseconds).
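
The same caution as in the previous issue applies when timing this copy: synchronizing first separates the transfer cost from any still-queued GPU work (a sketch; boxes stands for the CUDA tensor returned by the detector):

import time
import torch

torch.cuda.synchronize()         # make sure the forward pass is not still running
t0 = time.time()
boxes_np = boxes.cpu().numpy()   # device-to-host copy, then a zero-copy numpy view
print('transfer: %.4f s' % (time.time() - t0))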

VOC and COCO results reproduction problem

Hello, first of all, thanks for your code releasing.

Since I wanted to see whether the reported accuracy is reproducible, I trained exactly the code on the git. However, even after several tries, your reported accuracies of VOC2007 (80.5%) and COCO (29.9%) are not attainable; I got 79.9% and 28.8% respectively.

For a fair comparison, I trained SSD using the same training scheme as RFBNet and obtained 78.8%.

Any comments will be appreciated.

Thanks.

Contributors

Hi,

Are you looking for contributors / partners in science? :)

Lukas

More detail about Table 3 in paper

Hi, @ruinmessi.

Thanks for your brilliant work. I want to ask something about the results in Table 3.

Table 3 shows "Performance comparison of different block architectures". The architecture of the RFB block is similar to the inception [34] module in GoogLeNet, so I can regard the results between RFB and inception as a comparison between different block architectures. But the other architectures, such as Deformable CNN [4] and Dilated Conv [3], are not the same as (or similar to) RFB; actually they are deeply embedded in CNN networks such as VGG and ResNet. I wonder whether the results in Table 3 come from replacing components in the RFB block or from directly replacing the whole network with Deformable CNN [4], Dilated Conv [3] and the others. I have been confused by these results for a long time and need some clearer details on the experiments.

Thanks.

error when running test_RFB.py without cuda

test_RFB.py runs OK when cuda is set to True. However, when I set cuda to False in test_RFB.py, I get the following error:

Traceback (most recent call last):
File "test_RFB.py", line 193, in
top_k, thresh=0.01)
File "demo_RFB.py", line 91, in test_net
out = net(x) # forward pass
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/topspinn/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 185, in forward
x = self.basek
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 72, in forward
input = module(input)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/conv.py", line 282, in forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected object of type Variable[torch.FloatTensor] but found type Variable[torch.cuda.FloatTensor] for argument #1 'weight'
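
For reference, a sketch of what CPU-only inference requires: both the checkpoint and the module have to end up on the CPU, not just the input (the filename and argument values are illustrative; 81 classes assumes a COCO-trained checkpoint and must match whatever checkpoint is loaded):

import torch
from models.RFB_Net_mobile import build_net   # module path as in the traceback above

net = build_net('test', 300, 81)              # num_classes must match the checkpoint
state = torch.load('weights/RFB_mobile_20_7.pth', map_location='cpu')  # remap CUDA tensors to CPU
net.load_state_dict(state)
net = net.cpu().eval()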

Can I use my own dataset to train this code?

I used my own data to make a VOC-like dataset and changed the VOC classes like:

VOC_CLASSES = ('background',  # always index 0
               'Car', 'Cyclist', 'Pedestrain')

but the error below happened:
Traceback (most recent call last):
File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1664, in
main()
File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/jsu/下载/pycharm-community-2018.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/jsu/yuyijie/RFBNet/train_RFB.py", line 257, in
train()
File "/home/jsu/yuyijie/RFBNet/train_RFB.py", line 208, in train
images, targets = next(batch_iterator)
File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in next
return self._process_next_batch(batch)
File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
KeyError: 'Traceback (most recent call last):\n File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop\n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/jsu/anaconda3/envs/torch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in \n samples = collate_fn([dataset[i] for i in batch_indices])\n File "/home/jsu/yuyijie/RFBNet/data/voc0712.py", line 185, in getitem\n target = self.target_transform(target)\n File "/home/jsu/yuyijie/RFBNet/data/voc0712.py", line 136, in call\n label_idx = self.class_to_ind[name]\nKeyError: 'car'\n'

VOC dataset mAP with my trained-model is 10% lower than RFB300_80_5.pth

Hello,

I just followed your guide step by step and trained a new VOC model (RFB300-VGG16). I stopped it when the epochs reached 130, so I finally used RFB_vgg_VOC_epoches_130.pth in test_RFB.py, but I only get 70% mAP...
Could you tell me how to train the RFB model to results as good as yours?
Thanks a lot.

How to train from scratch

Hi
Thanks for sharing your code.
Is it possible to train from scratch, without using the pretrained weights?

RuntimeError: randperm is only implemented for CPU

Traceback (most recent call last):
File "train_RFB.py", line 253, in
train()
File "train_RFB.py", line 189, in train
shuffle=True, num_workers=args.num_workers, collate_fn=detection_collate))
File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 451, in iter
return _DataLoaderIter(self)
File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 247, in init
self._put_indices()
File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 295, in _put_indices
indices = next(self.sample_iter, None)
File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 139, in iter
for idx in self.sampler:
File "/home/user/anaconda2/envs/py3/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 53, in iter
return iter(torch.randperm(len(self.data_source)).tolist())
RuntimeError: randperm is only implemented for CPU
I am using torch 0.4 and Python 3.5 on Ubuntu 14.04, and I have this problem during training.

demo.py

@ruinmessi

I'd like to test your pre-trained model on several images I have.
Do you have a demo.py which takes an image name as input and displays the detection result?

Thanks,

In config.py, coco-512 settings are different from those of voc-512.

Hi, ruinmessi,

Thank you for sharing your impressive work, but I have some questions from reading the code.
When I use RFB_NET_E_512, I found that the anchor-box sizes (min size, max size) of COCO-512 and VOC-512 differ from each other. Is there any special reason? I expected the anchor boxes to be fixed for the same network.
Thanks & Regards

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda

Environment:
Linux 16.04
python 3.7
cuda 8.0

When I compile the nms and coco tools with

./make.sh

I get this kind of error:

running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
building 'nms.cpu_nms' extension
{'gcc': ['-Wno-cpp', '-Wno-unused-function']}
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/lib/python3.7/site-packages/numpy/core/include -I/usr/local/include/python3.7m -c nms/cpu_nms.c -o build/temp.linux-x86_64-3.7/nms/cpu_nms.o -Wno-cpp -Wno-unused-function
nms/cpu_nms.c: In function ‘__pyx_pf_3nms_7cpu_nms_2cpu_soft_nms’:
nms/cpu_nms.c:3172:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
nms/cpu_nms.c:3683:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
nms/cpu_nms.c: In function ‘__Pyx_PyCFunction_FastCall’:
nms/cpu_nms.c:8431:12: error: too many arguments to function ‘(PyObject * (*)(PyObject *, PyObject * const*, Py_ssize_t))meth’
nms/cpu_nms.c: In function ‘__Pyx__ExceptionSave’:
nms/cpu_nms.c:8892:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
nms/cpu_nms.c:8893:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
nms/cpu_nms.c:8894:17: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
nms/cpu_nms.c: In function ‘__Pyx__ExceptionReset’:
nms/cpu_nms.c:8901:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
nms/cpu_nms.c:8902:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
nms/cpu_nms.c:8903:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
nms/cpu_nms.c:8904:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
nms/cpu_nms.c:8905:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
nms/cpu_nms.c:8906:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
nms/cpu_nms.c: In function ‘__Pyx__GetException’:
nms/cpu_nms.c:8961:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
nms/cpu_nms.c:8962:23: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
nms/cpu_nms.c:8963:20: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
nms/cpu_nms.c:8964:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’
nms/cpu_nms.c:8965:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’
nms/cpu_nms.c:8966:11: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’
error: command 'gcc' failed with exit status 1
Can anyone help? How can I solve this problem?

What is the difference between 'RFB_Net_E_vgg.py' and 'RFB_Net_vgg.py'?

Hello, thanks for your code. But I am confused by the files 'RFB_Net_E_vgg.py' and 'RFB_Net_vgg.py'. I've seen that there are some differences between these two files, but I don't understand their different purposes. Can you tell me when I should use the former and when the latter?

Do you have RetinaNet + Resnet101

Can this model be used together with ResNet-101, and if so, do you have the results of RetinaNet + ResNet-101?
Is it possible to share them, please?

RFB_mobile_20_7.pth load failed

log:

RuntimeError: Error(s) in loading state_dict for RFBNet:
While copying the parameter named "conf.0.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.0.weight", whose dimensions in the model are torch.Size([126, 512, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 512, 1, 1]).
While copying the parameter named "conf.1.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.1.weight", whose dimensions in the model are torch.Size([126, 1024, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 1024, 1, 1]).
While copying the parameter named "conf.2.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.2.weight", whose dimensions in the model are torch.Size([126, 512, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 512, 1, 1]).
While copying the parameter named "conf.3.bias", whose dimensions in the model are torch.Size([126]) and whose dimensions in the checkpoint are torch.Size([486]).
While copying the parameter named "conf.3.weight", whose dimensions in the model are torch.Size([126, 256, 1, 1]) and whose dimensions in the checkpoint are torch.Size([486, 256, 1, 1]).
While copying the parameter named "conf.4.bias", whose dimensions in the model are torch.Size([84]) and whose dimensions in the checkpoint are torch.Size([324]).
While copying the parameter named "conf.4.weight", whose dimensions in the model are torch.Size([84, 256, 1, 1]) and whose dimensions in the checkpoint are torch.Size([324, 256, 1, 1]).
While copying the parameter named "conf.5.bias", whose dimensions in the model are torch.Size([84]) and whose dimensions in the checkpoint are torch.Size([324]).
While copying the parameter named "conf.5.weight", whose dimensions in the model are torch.Size([84, 128, 1, 1]) and whose dimensions in the checkpoint are torch.Size([324, 128, 1, 1]).

Did you test this MobileNet + RFB?

About the FPS (RFB and SSD)

In the paper, SSD has a higher FPS than RFB.
However, in this repository RFB has a higher FPS than SSD.

Why are these different?

How to use the txt format label files

The code is based on XML-format label files, but I want to use txt-format label files... I have tried to modify voc0712.py, but I failed (crying). Please help me, I really need your help.
Thank you so much for your kindness!!

How to reproduce the COCO result

Hi, I feel excited about your amazing work and I am trying to reproduce the training process. May I ask about the training arguments for the COCO dataset? In detail: --batch_size, --max_epoch, and so on.

Thanks for your reply!

inds = torch.nonzero(scores[:,j]>0.01).view(-1) is super time consuming.

Hi, when I test the results, I found that even though the other parts are pretty fast, the NMS time cost is pretty high.

In this case, I timed it step by step and found that inds = torch.nonzero(scores[:,j]>0.01).view(-1) is super time-consuming: it takes nearly 50 ms per iteration on a K40c.

Does anyone have any ideas about that?
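
A sketch of the usual workaround: torch.nonzero() on a CUDA tensor forces a device-to-host synchronization on every call, so copying the scores to the CPU once per image and filtering there often removes most of the apparent cost (the shapes below are illustrative, not taken from the repo):

import torch

num_priors, num_classes = 11620, 21            # illustrative SSD-style shapes
scores = torch.rand(num_priors, num_classes, device='cuda')

scores_cpu = scores.cpu()                      # one device-to-host copy, one sync
for j in range(1, num_classes):                # class 0 is background
    inds = torch.nonzero(scores_cpu[:, j] > 0.01).view(-1)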

How to depict Fig.3 in your paper?

Hi,
I'm really impressed by your good work.

As shown in Fig. 3, I want to depict the effective receptive field as you did.

Could you share the method?

More details about Fig.3 in your paper?

Hi, thanks for your brilliant work.
I want to know how to draw Fig. 3 in your paper. I tried the method you described in #11, but I cannot get such a beautiful picture.
There are four questions I want to ask:
Q1: The effective receptive field drawn in Fig. 3 is which module's input gradient map (the RFB module after conv4_3, fc7, conv8 or conv9)?
Q2: You said in #11 that the input is an image; what is the number of channels of this input (conv4_3's RFB has 512 channels, fc7's RFB has 1024 channels, ...)?
Q3: The authors of "Understanding the effective receptive field in deep convolutional neural networks" only describe the case where all convolutional layers have one channel. There is no description of the multi-channel case: for example, for the RFB module following the fc7 layer, the input of the module has 1024 channels (the feature map is 19x19) and the output also has 1024 channels, so on which channel is the gradient of the central pixel set to 1.0, i.e. \frac{\partial l}{\partial y_{0,0}} in the paper?
[screenshot of the corresponding equation in the paper]
Q4: The input of the module is multi-channel, so the input gradient map is also multi-channel. Is the final effective-receptive-field image one selected channel, an average over all channels, or something else?

My code for drawing the effective-receptive-field image is as follows (the input is a randomly generated 1x1024x19x19 feature map, the module is the RFB after fc7, and the output gradient map sets Zero_grad[0][512][9][9] = 1.0 with the rest 0): temp.txt (as the uploaded code is messy, I put it in temp.txt; screenshots of it follow)
[screenshots of the code in temp.txt]

But the image I get is as follows, and it is very different from Fig. 3, so I don't know whether this is caused by the details asked about in the four questions above.
[screenshot of the resulting effective-receptive-field image]
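
For what it's worth, a sketch of one interpretation of the multi-channel case (an assumption, not necessarily the authors' method): seed the centre-pixel gradient on every output channel and average the resulting input gradient over channels. The BasicRFB constructor arguments are illustrative; check models/RFB_Net_vgg.py for the exact signature.

import torch
from models.RFB_Net_vgg import BasicRFB   # class name as referenced elsewhere on this page

rfb = BasicRFB(1024, 1024).eval()                  # an fc7-style RFB block; args are illustrative
x = torch.randn(1, 1024, 19, 19, requires_grad=True)
y = rfb(x)
seed = torch.zeros_like(y)
seed[0, :, 9, 9] = 1.0                             # centre pixel, all output channels
y.backward(seed)
erf = x.grad.abs().mean(dim=1)[0]                  # average |gradient| over input channels -> 19x19 map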

RFB-max pooling in paper

Hi, ruinmessi:
Thank you for your great code. I have some questions:
Section 4.2 (Ablation Study) of the paper mentions that by simply replacing the last convolution layer with RFB-max pooling, the result is improved to 79.1%.
What does "RFB-max pooling" refer to? The RFB (stride 2)?
Which layer is "the last convolution layer"?
Is the learning-rate strategy there the same as in the code: warmup, then 4e-3, then decay by 0.1?
If it is convenient, could I have your WeChat ID?
Thank you!

The structure of BasicRFB is not the same as Fig. 4(a) in your paper

Hello, thanks for your code releasing.

In Fig. 4(a) of your paper (arXiv:1711.07767v3) there is a branch of 1×1 conv followed by a 3×3 conv with rate=1, which does not appear in class BasicRFB of models/RFB_Net_vgg.py.
There are two other branches (self.branch0 and self.branch1 in class BasicRFB) and a shortcut connection in your code. So which choice is more helpful for performance?

check something!

@ruinmessi
Hey! Can you give me your 2015 test-dev result (the .json file) that you submitted to the server? I just want to check my results against it. Thanks, my email: [email protected]

thanks!

error in training mobilenet

@ruinmessi

I followed your instruction below to train VOC with MobileNet, but got an error:

python3 train_RFB.py -d VOC -v RFB_mobile -s 300
300 21
Traceback (most recent call last):
File "train_RFB.py", line 88, in <module>
net = build_net('train', img_dim, num_classes)
File "/home/topspin/2TB/src/RFBNet/models/RFB_Net_mobile.py", line 348, in build_net
mbox[str(size)], num_classes), num_classes)
TypeError: __init__() missing 2 required positional arguments: 'head' and 'num_classes'

Any idea why this happens?

Thanks,

Some question about L2 loss

Hello author, I found that compared with the original SSD, your code has no L2 loss; the original L2 normalization was attached after conv4_3, but now it is gone. May I ask what effect adding it back would have? Thank you for your answer!

The difference of RFBNet in RFB_Net_vgg.py and Fig.5 ?

Hello, thank you for code releasing.

There are two branches from the fc7 layer output in Fig. 5: one is the input of RFB and the other is the input of RFB (stride 2).

[Fig. 5 from the paper]

    # apply vgg up to fc7
    for k in range(23, len(self.base)):
        x = self.base[k](x)

    # apply extra layers and cache source layer outputs
    for k, v in enumerate(self.extras):
        x = v(x)
        if k < self.indicator or k%2 ==0:
            sources.append(x)

But in RFB_Net_vgg.py, the output x of the fc7 layer is the input of the RFB layer, and then the output of the RFB layer is the input of the RFB (stride 2) layer. So there might not be two branches, and the architecture of RFBNet might actually be like the picture below?

[diagram of the RFBNet-300 architecture as implied by the code]

Results reproducibility

Hello, @ruinmessi. Thank you for the great research and the open-source implementation of the paper. I tried your implementation on COCO and was disappointed by the results.

In short, I used your code (without any changes) and your weights for validation on COCO 2014 minival and got the following results.

For RFB512-E:

alexkirnas@dev:~/Projects/RFBNet$ CUDA_VISIBLE_DEVICES='0' python test_RFB.py -d COCO \
-v RFB_E_vgg -s 512 --trained_model ./weights/RFB512_E_34_4.pth --save_folder=./eval

....
(a lot of output omitted)
....

~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.273
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.204
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.137
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.271
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.271
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.287
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.430
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.484
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.250
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.521
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.686

For RFB-Mobile:

alexkirnas@dev:~/Projects/RFBNet$  CUDA_VISIBLE_DEVICES='0' python test_RFB.py -d COCO \
-v RFB_mobile -s 300 --trained_model ./weights/RFB_mobile_20_7.pth --save_folder=./eval
....
(a lot of output omitted)
....

~~~~ Summary metrics ~~~~
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.135
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.217
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.144
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.015
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.159
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.207
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.300
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.311
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.035
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.328
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.584

Note: I ran the code on Python 3.6 with the newest PyTorch (v0.3.0, installed via pip) and CUDA 8.0.

These results are too different from the ones reported in the paper for the gap to be explained by the use of different data (you report results on the trainval35k set). Can you reproduce your results with the newest PyTorch?

Thanks,
Alex

the difference of the backbone between vgg and mobilenet

Hi, Songtao Liu:
Glad to read your paper. One question that has no relationship with the code:
In your paper, it seems that VGG is much more powerful than MobileNet. For example, VGGNet300 + RFB on COCO test gets 30.3 mAP, while SSD 300 MobileNet + RFB only gets 20.7 mAP.
Is my conclusion right or not?

gpu_nms bug

Hi, I ran the following code after running ./make.sh

from utils.nms_wrapper import nms
import numpy as np

for x in range(10):
    # generate 100 random detection boxes: [x1, y1, x2, y2, confidence]
    x1y1 = np.random.randint(0, 600, (100, 2))
    x2y2 = x1y1 + np.random.randint(0, 600, (100, 2))
    conf = np.random.random((100, 1))

    dets = np.concatenate([x1y1, x2y2, conf], axis=1).astype(np.float32)

    keep_cpu = nms(dets.copy(), 0.45, force_cpu=True)
    keep_gpu = nms(dets.copy(), 0.45, force_cpu=False)
    print("CPU:%s, GPU:%s" % (len(keep_cpu), len(keep_gpu)))

But I get these results:

CPU:70, GPU:100
CPU:67, GPU:100
CPU:64, GPU:100
CPU:70, GPU:100
CPU:73, GPU:100
CPU:69, GPU:100
CPU:70, GPU:100
CPU:68, GPU:100
CPU:70, GPU:100
CPU:67, GPU:100

Is gpu_nms working properly on your machine?

The final loss

Good work! Can you tell me the final loss when you finished training, and how I should judge it? Thanks a lot!

how to deal with the bug of make.sh

Hi @ruinmessi,
I ran ./make.sh but got:
g++ -pthread -shared -B /home/liye/anaconda3/compiler_compat -L/home/liye/anaconda3/lib -Wl,-rpath=/home/liye/anaconda3/lib,--no-as-needed build/temp.linux-x86_64-3.6/nms/nms_kernel.o build/temp.linux-x86_64-3.6/nms/gpu_nms.o -L/usr/local/cuda-8.0/lib64 -L/home/liye/anaconda3/lib -R/usr/local/cuda-8.0/lib64 -lcudart -lpython3.6m -o /media/ubuntue/extdisk1/liye/RFBNet-master/utils/nms/gpu_nms.cpython-36m-x86_64-linux-gnu.so
g++: error: unrecognized command line option ‘-R’
error: command 'g++' failed with exit status 1

Why does this happen? Can you help me?
