
efficientdet-pytorch's Introduction

EfficientDet: Scalable and Efficient Object Detection, implemented in PyTorch


Contents

  1. Repository Updates (Top News)
  2. Performance
  3. Environment
  4. Attention
  5. Download
  6. Training Steps (How2train)
  7. Prediction Steps (How2predict)
  8. Evaluation Steps (How2eval)
  9. Reference

Top News

2022-04: Major update. Added step and cosine learning-rate schedules, selectable Adam/SGD optimizers, learning-rate scaling with batch size, and image cropping. Added multi-GPU training and per-class object counting.
The original repository from the BiliBili video is at: https://github.com/bubbliiiing/efficientdet-pytorch/tree/bilibili

2021-10: Major update. Added extensive comments and many adjustable parameters, restructured the code modules, and added FPS measurement, video prediction, and batch prediction.

Performance

Training dataset | Weights file | Test dataset | Input size | mAP 0.5:0.95
COCO-Train2017 efficientdet-d0.pth COCO-Val2017 512x512 33.1
COCO-Train2017 efficientdet-d1.pth COCO-Val2017 640x640 38.8
COCO-Train2017 efficientdet-d2.pth COCO-Val2017 768x768 42.1
COCO-Train2017 efficientdet-d3.pth COCO-Val2017 896x896 45.6
COCO-Train2017 efficientdet-d4.pth COCO-Val2017 1024x1024 48.8
COCO-Train2017 efficientdet-d5.pth COCO-Val2017 1280x1280 50.2
COCO-Train2017 efficientdet-d6.pth COCO-Val2017 1408x1408 50.7
COCO-Train2017 efficientdet-d7.pth COCO-Val2017 1536x1536 51.2

Environment

torch==1.2.0

Download

The .pth weights required for training can be downloaded from Baidu Netdisk.
All weights from EfficientDet-d0 through d7 are included.
Link: https://pan.baidu.com/s/1cTNR63gTizlggSgwDrmwxg
Extraction code: hk96

The VOC dataset can be downloaded below. It already contains the training, test, and validation sets (the validation set is identical to the test set), so no further splitting is needed:
Link: https://pan.baidu.com/s/1-1Ej6dayrx3g0iAA88uY5A
Extraction code: ph32

Training Steps

a. Training on the VOC07+12 dataset

  1. Dataset preparation
    Training uses the VOC format. Download the VOC07+12 dataset, unzip it, and place it in the repository root before training.

  2. Dataset processing
    Set annotation_mode=2 in voc_annotation.py, then run voc_annotation.py to generate 2007_train.txt and 2007_val.txt in the root directory (a sketch of the generated line format follows this list).

  3. Start training
    The default parameters of train.py are set up for the VOC dataset, so simply run train.py to start training.

  4. Predicting with the training results
    Prediction uses two files: efficientdet.py and predict.py. First edit model_path and classes_path in efficientdet.py; both parameters must be changed.
    model_path points to the trained weights file in the logs folder.
    classes_path points to the txt listing the detection classes.

    After these edits, run predict.py. When prompted, enter an image path to run detection.
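For reference, here is a minimal sketch of how one VOC xml annotation could become one line of 2007_train.txt. The exact line format (image path followed by space-separated x1,y1,x2,y2,class_id boxes) and the class-file path are assumptions based on common VOC-to-txt converters; check voc_annotation.py itself for the authoritative version.

# Sketch of converting one VOC xml annotation into one line of 2007_train.txt.
import xml.etree.ElementTree as ET

classes = open('model_data/voc_classes.txt', encoding='utf-8').read().split()

def convert_annotation(image_id, out_file, year='2007'):
    # encoding='utf-8' avoids the gbk decode error reported in the issues below
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id), encoding='utf-8')
    line = 'VOCdevkit/VOC%s/JPEGImages/%s.jpg' % (year, image_id)
    for obj in ET.parse(in_file).getroot().iter('object'):
        name = obj.find('name').text
        if name not in classes:
            continue
        box = obj.find('bndbox')
        coords = [int(float(box.find(k).text)) for k in ('xmin', 'ymin', 'xmax', 'ymax')]
        line += ' ' + ','.join(map(str, coords)) + ',' + str(classes.index(name))
    out_file.write(line + '\n')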

b. Training on your own dataset

  1. Dataset preparation
    Training uses the VOC format, so prepare your own dataset before training.
    Place the annotation files in VOCdevkit/VOC2007/Annotations.
    Place the image files in VOCdevkit/VOC2007/JPEGImages.

  2. Dataset processing
    With the dataset in place, use voc_annotation.py to generate the 2007_train.txt and 2007_val.txt used for training.
    Edit the parameters in voc_annotation.py. For a first training run you only need to change classes_path, which points to the txt listing the detection classes.
    When training on your own data, create a cls_classes.txt containing the classes you want to distinguish.
    The content of model_data/cls_classes.txt would be, for example:

cat
dog
...

Point classes_path in voc_annotation.py at cls_classes.txt, then run voc_annotation.py.

  3. Start training
    train.py has many training parameters, all documented with comments; read them carefully after downloading the repository. The most important one is again classes_path in train.py.
    classes_path points to the txt listing the detection classes, the same txt used by voc_annotation.py. It must be changed when training on your own dataset!
    After setting classes_path, run train.py to start training. After several epochs, the weights are written to the logs folder.

  4. Predicting with the training results
    Prediction uses two files: efficientdet.py and predict.py. Edit model_path and classes_path in efficientdet.py.
    model_path points to the trained weights file in the logs folder.
    classes_path points to the txt listing the detection classes.

    After these edits, run predict.py. When prompted, enter an image path to run detection.

Prediction Steps

a. Using pre-trained weights

  1. After downloading and unzipping the repository, download the weights from Baidu Netdisk, place them in model_data, run predict.py, and enter:
img/street.jpg
  2. predict.py also has settings that enable FPS testing and video detection.

b. Using your own trained weights

  1. Train following the training steps above.
  2. In efficientdet.py, modify model_path and classes_path in the block below so that they match your trained files; model_path points to the weights file under the logs folder, and classes_path is the class txt that model_path was trained to detect (a usage sketch follows the block).
_defaults = {
    #--------------------------------------------------------------------------#
    #   To predict with your own trained model you must change model_path and classes_path!
    #   model_path points to the weights file under the logs folder,
    #   classes_path points to the class txt under model_data.
    #   If a shape mismatch occurs, also check the model_path and classes_path
    #   parameters that were used during training.
    #--------------------------------------------------------------------------#
    "model_path"        : 'model_data/efficientdet-d0.pth',
    "classes_path"      : 'model_data/coco_classes.txt',
    #---------------------------------------------------------------------#
    #   Selects which version of the model to use, 0-7
    #---------------------------------------------------------------------#
    "phi"               : 0,
    #---------------------------------------------------------------------#
    #   Only predicted boxes whose score exceeds this confidence are kept
    #---------------------------------------------------------------------#
    "confidence"        : 0.3,
    #---------------------------------------------------------------------#
    #   The nms_iou threshold used for non-maximum suppression
    #---------------------------------------------------------------------#
    "nms_iou"           : 0.3,
    #---------------------------------------------------------------------#
    #   Controls whether letterbox_image is used to resize the input image
    #   without distortion. Repeated tests showed that a plain resize with
    #   letterbox_image disabled works better here.
    #---------------------------------------------------------------------#
    "letterbox_image"   : False,
    #---------------------------------------------------------------------#
    #   Whether to use CUDA
    #   Set this to False if you have no GPU
    #---------------------------------------------------------------------#
    "cuda"              : True
}
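The names below are taken from the tracebacks quoted in the issues further down (the EfficientDet wrapper class and its detect_image method); treat this as a minimal usage sketch rather than documented API, and verify against predict.py.

# Minimal usage sketch, assuming the EfficientDet class and detect_image()
# seen in the issue tracebacks below.
from PIL import Image
from efficientdet import EfficientDet

model = EfficientDet()                 # picks up the _defaults shown above
image = Image.open('img/street.jpg')
r_image = model.detect_image(image)    # returns the image with boxes drawn
r_image.show()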
  3. Run predict.py and enter:
img/street.jpg
  4. predict.py also has settings that enable FPS testing and video detection.

Evaluation Steps

a. Evaluating the VOC07+12 test set

  1. Evaluation uses the VOC format. VOC07+12 already includes a test split, so there is no need to run voc_annotation.py to generate the txt files under ImageSets.
  2. In efficientdet.py, set model_path and classes_path. model_path points to the trained weights file in the logs folder; classes_path points to the txt listing the detection classes.
  3. Run get_map.py to obtain the evaluation results, which are saved in the map_out folder.

b. Evaluating your own dataset

  1. Evaluation uses the VOC format.
  2. If you ran voc_annotation.py before training, the code has already split the dataset into training, validation, and test sets. To change the test-set fraction, edit trainval_percent in voc_annotation.py. trainval_percent sets the ratio of (training + validation) to test; by default (training + validation):test = 9:1. train_percent sets the ratio of training to validation within (training + validation); by default training:validation = 9:1. (A sketch of these two parameters follows this list.)
  3. After splitting off the test set with voc_annotation.py, edit classes_path in get_map.py; it points to the same class txt used for training and must be changed when evaluating your own dataset.
  4. In efficientdet.py, set model_path and classes_path. model_path points to the trained weights file in the logs folder; classes_path points to the txt listing the detection classes.
  5. Run get_map.py to obtain the evaluation results, which are saved in the map_out folder.
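A minimal sketch of the two split ratios described above; the parameter names come from the text, while the id list and handling are illustrative only.

# Sketch of the dataset split controlled by trainval_percent / train_percent.
import random

trainval_percent = 0.9   # (train + val) : test = 9 : 1
train_percent    = 0.9   # train : val = 9 : 1, within (train + val)

image_ids = ['000001', '000002', '000003', '000004']   # stand-in for the real ids
random.shuffle(image_ids)
num_trainval = int(len(image_ids) * trainval_percent)
num_train    = int(num_trainval * train_percent)
trainval     = image_ids[:num_trainval]
train_ids    = trainval[:num_train]
val_ids      = trainval[num_train:]
test_ids     = image_ids[num_trainval:]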

Reference

https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch
https://github.com/Cartucho/mAP

efficientdet-pytorch's People

Contributors

bubbliiiing

efficientdet-pytorch's Issues

Problem downloading the pre-trained weights: invalid hash value

Hi, when downloading the pre-trained weights I get the error below. Why does this happen?
RuntimeError: invalid hash value (expected "b0", got "73f3a3d3c70508a1dfc1fcb58f8ba0edb1a5aaf2f0aaa2ce4dcd34b18b1a97df")

VOC mAP

What mAP do you get training EfficientDet-D0 on the VOC dataset? The best I reached during training was in the low thirties, which does not seem good enough.

Bug when computing mAP

Hello, thank you very much for sharing. When running get_dr_txt.py I found that the generated txt files under detection were all empty. Tracing the error: at test time, during EfficientNet's forward pass (in ./nets/efficientdet.py), at line 413, x = self.model._bn0(x), i.e. self._bn0 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps), different inputs from the previous step all become identical after bn0. Whatever the input, the output is just the value of 'backbone_net.model._bn0.bias', even though the weight is not zero. The shapes from the previous step:
At test time:
[16, 3, 1024, 1024] input
[16, 48, 512, 512] after x = self.model._conv_stem(x)
At training time:
[2, 3, 1024, 1024] input
[2, 48, 512, 512] after x = self.model._conv_stem(x)

The model used at test time is from training epoch 40 (phi=4); training and validation losses were both 0.0002.

I tried adjusting the input shape at test time, cropping the outputs so that the shapes fed into the mAP computation exactly match those used in training, but the error persists.

Feeding the training data through instead gives completely correct results.

Comparing the training inputs with the mAP inputs shows no obvious difference.

Looking forward to your reply, thank you.

How to convert trained .pth weights and models to generic .onnx or .pb files

Thank you for the videos and posts; I have been learning from them for a while. However, I cannot convert my trained weights to ONNX or pb. Could you make a video or post specifically on converting trained weights and models in their various formats (.pt, .pth, .h5) to generic .onnx or .pb files, for deployment and inference on other platforms? That would help bring deep-learning vision into industrial use. Many thanks.
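No official export script is mentioned here, but a generic torch.onnx sketch may serve as a starting point. The .net attribute on the EfficientDet wrapper is inferred from the tracebacks below, the input size must match your phi, and the TracerWarnings quoted in a later issue suggest tracing this model needs care; verify everything against the repo.

# Hedged ONNX export sketch; verify attribute names and input size first.
import torch
from efficientdet import EfficientDet

wrapper = EfficientDet()
net = wrapper.net.eval().cpu()            # the underlying nn.Module
dummy = torch.randn(1, 3, 512, 512)       # 512x512 matches phi=0 (d0)
torch.onnx.export(net, dummy, 'efficientdet-d0.onnx')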

Running get_map.py

Why does D0 come out with 95% precision but only 29% recall, and an mAP of only 34.70%?

Hello, some problems generating the 2007_train.txt and 2007_val.txt used for training

Hello, when using voc_annotation.py to generate 2007_train.txt and 2007_val.txt, if the dataset contains a negative image set (images with no labeled objects), it raises xml.etree.ElementTree.ParseError: no element found: line 1, column 0, presumably because those images contain no trainable objects. Should such images be skipped during generation?
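One possible workaround, an assumption rather than the repo's fix: skip annotation files that are empty or contain no object elements before parsing.

# Filter out negative samples whose xml has no <object> entries.
import os
import xml.etree.ElementTree as ET

def has_objects(xml_path):
    if os.path.getsize(xml_path) == 0:     # empty file -> "no element found"
        return False
    root = ET.parse(xml_path).getroot()
    return root.find('object') is not None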

Why is the mAP result always 0

Hello, I am training EfficientDet on Text-COCO for text detection. The bounding boxes are drawn in the correct places. Following the instructions I ran get_map.py to compute mAP, precision, and recall,
but the result is always 0, and the vast majority of detections are counted as false positives. Is this a data problem or a problem in the get_map code?

Training problem

Hello.
During training, an error suddenly occurs; the output is shown below:
E:\py_file\efficientdet-pytorch-master\venv\Scripts\python.exe E:/py_file/efficientdet-pytorch-master/train_1.py
Loading weights into state dict...
Finished!
Start Train
Epoch 1/50: 26%|██▌ | 428/1675 [09:09<26:39, 1.28s/it, Conf Loss=994, Regression Loss=0.0386, lr=0.001]
Traceback (most recent call last):
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 214, in <module>
val_loss = fit_one_epoch(net, efficient_loss, epoch, epoch_size, epoch_size_val, gen, gen_val, Freeze_Epoch, Cuda)
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 44, in fit_one_epoch
targets = [torch.from_numpy(ann).type(torch.FloatTensor).cuda() for ann in targets]
File "E:/py_file/efficientdet-pytorch-master/train_1.py", line 44, in <listcomp>
targets = [torch.from_numpy(ann).type(torch.FloatTensor).cuda() for ann in targets]
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, int64, int32, int16, int8, uint8, and bool.

I searched but could not find a solution; I would appreciate your help, thank you.
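A hedged guess at the cause, consistent with the deprecation warning in the next issue: images with different numbers of boxes form a ragged object-dtype array, which torch.from_numpy cannot convert. Converting each image's annotations separately, with an explicit float dtype, avoids it.

# Per-image conversion with an explicit dtype; the shapes here are illustrative.
import numpy as np
import torch

anns = [np.array([[0, 0, 10, 10, 0], [5, 5, 20, 20, 1]], dtype=np.float32),  # 2 boxes
        np.array([[1, 1, 8, 8, 0]], dtype=np.float32)]                       # 1 box
targets = [torch.from_numpy(a).type(torch.FloatTensor) for a in anns]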

training warning

Epoch 3/25: 0%| | 0/204 [00:00<?, ?it/s<class 'dict'>]D:\Model\efficientDet\04\efficientdet-pytorch\Utils\dataloader.py:130: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
bboxes = np.array(bboxes)
D:\Model\efficientDet\04\efficientdet-pytorch\Utils\dataloader.py:130: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
bboxes = np.array(bboxes)
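The fix the warning itself suggests, applied to the line it points at in Utils/dataloader.py (a sketch, not a tested patch):

import numpy as np

bboxes = [[[0, 0, 10, 10, 0]], [[1, 1, 8, 8, 0], [2, 2, 9, 9, 1]]]  # ragged list
bboxes = np.array(bboxes, dtype=object)   # explicit dtype silences the warning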

Question about changing the network

Hi, I replaced this backbone with YOLOv5's backbone. After training the loss is very low, but the test mAP is poor. Could you offer any ideas on why?

Training on my own data gives a very low mAP

Hello, thank you for your work.
I trained efficientdet-d3 on a different dataset and the results are rather poor; could you help analyze why? Thank you!
The dataset is aluminum-profile surface-defect detection data.
The backbone was frozen for the first 30 epochs, then unfrozen; training ran until val_loss stopped decreasing.
The mAP results are as follows:
Get map.

2.78% = 不导电 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 喷流 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 擦花 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

38.01% = 杂色 AP || score_threhold=0.5 : F1=0.11 ; Recall=5.56% ; Precision=100.00%

13.00% = 桔皮 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

0.00% = 漆泡 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

0.00% = 漏底 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

5.01% = 脏点 AP || score_threhold=0.5 : F1=0.14 ; Recall=10.84% ; Precision=21.43%

0.00% = 角位漏底 AP || score_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%

0.00% = 起坑 AP || score_threhold=0.5 : F1=0.00 ; Recall=0.00% ; Precision=0.00%

mAP = 5.88%

Get map done.

The val_loss for each epoch (abridged from the full dump): 692.58, 18.02, 6.06, 2.93, 1.72, 1.18, 0.90, 0.74, 0.66, 0.60, ..., then flat in the 0.33-0.36 range for the remaining roughly 150 epochs, ending at 0.346.

TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect

/project/nets/layers.py:323: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_h = (math.ceil(w / self.stride[1]) - 1) * self.stride[1] - w + self.kernel_size[1]
/project/nets/layers.py:324: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_v = (math.ceil(h / self.stride[0]) - 1) * self.stride[0] - h + self.kernel_size[0]
/project/nets/layers.py:357: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_h = (math.ceil(w / self.stride[1]) - 1) * self.stride[1] - w + self.kernel_size[1]
/project/nets/layers.py:358: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
extra_v = (math.ceil(h / self.stride[0]) - 1) * self.stride[0] - h + self.kernel_size[0]
/project/utils/anchors.py:25: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if image_shape[1] % stride != 0:
/project/utils/anchors.py:30: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
x = np.arange(stride / 2, image_shape[1], stride)
/project/utils/anchors.py:30: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
x = np.arange(stride / 2, image_shape[1], stride)
/project/utils/anchors.py:31: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
y = np.arange(stride / 2, image_shape[0], stride)
/project/utils/anchors.py:31: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
y = np.arange(stride / 2, image_shape[0], stride)
/project/utils/anchors.py:49: TracerWarning: torch.from_numpy results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
anchor_boxes = torch.from_numpy(anchor_boxes).to(image.device)

Problem with get_dr_txt.py

RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([396, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([810, 64, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([396]) from checkpoint, the shape in current model is torch.Size([810]).
Why does this dimension mismatch occur?
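An inference from the shapes, not a confirmed diagnosis: the classification head's output width is num_anchors × num_classes, so a checkpoint trained with one classes_path cannot load into a model built with another.

# 9 anchors per location is the usual EfficientDet setting; verify in the repo.
num_anchors = 9
assert num_anchors * 44 == 396   # the checkpoint's class count
assert num_anchors * 90 == 810   # the freshly built model (COCO's 90 classes)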

Problem with predict

Traceback (most recent call last):
File "E:\yk\Code\efficientdet-pytorch\predict.py", line 77, in
r_image = efficientdet.detect_image(image, crop = crop, count=count)
File "E:\yk\Code\efficientdet-pytorch\efficientdet.py", line 216, in detect_image
draw.rectangle([left + i, top + i, right - i, bottom - i], outline=self.colors[c])
File "D:\Anaconda\envs\objectbox\lib\site-packages\PIL\ImageDraw.py", line 296, in rectangle
self.draw.draw_rectangle(xy, ink, 0, width)
ValueError: x1 must be greater than or equal to x0

Hi, I hit this error while predicting with the d3 pre-trained weights; what might be the cause?

train.py

Fail To Load Key: ['classifier.header.pointwise_conv.conv.weight', 'classifier.header.pointwise_conv.conv.bias'] ……
Fail To Load Key num: 2

Friendly reminder: it is normal for the head weights not to load; it is an error if the backbone weights fail to load.
The expanded size of the tensor (46917) must match the existing size (49104) at non-singleton dimension 1. Target sizes: [1, 46917, 4]. Tensor sizes: [1, 49104, 4]
Error occurs, No graph saved
This error occurred at runtime.
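A hedged observation on the tensor-size message, an inference rather than a confirmed diagnosis: 49104 is the anchor count for a 512×512 d0 input, so an anchor/prediction length of 46917 suggests the anchors and the network saw different input sizes.

# Anchor count for 512x512 with strides 8..128 and 9 anchors per location
# (standard EfficientDet settings; compare with utils/anchors.py).
strides = [8, 16, 32, 64, 128]
assert sum((512 // s) ** 2 for s in strides) * 9 == 49104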

bug

File "predict.py", line 17, in
r_image = efficientdet.detect_image(image)
File "/home/404/efficientdet-pytorch-master/efficientdet.py", line 109, in detect_image
detection = torch.cat([regression,classification],axis=-1)
TypeError: cat() got an unexpected keyword argument 'axis'
(yp) [root@localhost efficientdet-pytorch-master]#
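A hedged note on this one: dim is the canonical keyword for torch.cat, and axis is only tolerated as an alias in newer PyTorch releases, so under the pinned torch==1.2.0 the call needs dim.

# Version-safe form of the failing call in efficientdet.py; shapes illustrative.
import torch

regression = torch.zeros(1, 49104, 4)
classification = torch.zeros(1, 49104, 90)
detection = torch.cat([regression, classification], dim=-1)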

Loss question

During training, the first few validation-loss values reach several hundred thousand. After 100 epochs the loss is only around 1.x, but the mAP is 0. The same dataset format works with your yolov3 repo, but here the mAP stays at 0. What else could be the cause? The model is d1.

Problem running train.py

runfile('E:/A/efficientdet-pytorch-master/train.py', wdir='E:/A/efficientdet-pytorch-master')
Reloaded modules: nets, nets.efficientdet, utils, utils.anchors, nets.efficientnet, nets.layers, nets.efficientdet_training
Traceback (most recent call last):

File "D:\software\Anaconda\envs\torch1.2\lib\site-packages\torch\utils\tensorboard_init_.py", line 2, in
from tensorboard.summary.writer.record_writer import RecordWriter # noqa F401

ModuleNotFoundError: No module named 'tensorboard.summary'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

File "E:\A\efficientdet-pytorch-master\train.py", line 16, in
from utils.callbacks import LossHistory

File "E:\A\efficientdet-pytorch-master\utils\callbacks.py", line 9, in
from torch.utils.tensorboard import SummaryWriter

File "D:\software\Anaconda\envs\torch1.2\lib\site-packages\torch\utils\tensorboard_init_.py", line 4, in
raise ImportError('TensorBoard logging requires TensorBoard with Python summary writer installed. '

ImportError: TensorBoard logging requires TensorBoard with Python summary writer installed. This should be available in 1.14 or above.
image

mAP too low

Training d1 to detect fabric defects (images around 1280x1080, labels around 50x900 pixels, some even smaller), the resulting mAP is only about 0.04.

self._root = parser._parse_whole(source) UnicodeDecodeError: 'gbk' codec can't decode byte 0xac in position 129: illegal multibyte sequence

If you encounter the error above, open the annotation file with an explicit encoding:

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id), encoding='utf-8')

Before (fails):

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))

After (works):

in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id), encoding='utf-8')

Input image size

For different models, say d0 and d7, can the training input size not simply be set to 640*640 for both? The code only allows the size to grow with the model version; why is that?

Hello, I used your EfficientDet code with two different datasets, but the results differ greatly.

  Both datasets have images around 1000 pixels on a side.
  On the dataset with poor results: there are 10 object classes, and my training strategy was to use efficientdet-d4 as the initial weights. The resulting mAP only reaches about 75, while frcnn, ssd, retinanet and the like all reach about 85-89. When I instead use efficientdet-d3 as the initial weights, with batch size 4, the per-class APs after 100 epochs are only in the single digits, which is very strange.
  On the other dataset the results are very good, with an mAP close to or even higher than frcnn, ssd, retinanet and the like.
  I hope you can help, many thanks!

get_dr_txt.py error

RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([45, 112, 1, 1]) from checkpoint, the shape in current model is torch.Size([189, 112, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([45]) from checkpoint, the shape in current model is torch.Size([189]).
I am using phi=2 and changed the pth file accordingly; why do the dimensions still not match?

Multi-GPU training

Does this repo support multi-GPU training, and where is it configured?
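The 2022-04 update above states that multi-GPU training has been added, so newer checkouts should have a switch in train.py. For older checkouts, a generic PyTorch sketch (not the repo's own mechanism):

# Generic data-parallel sketch; CUDA_VISIBLE_DEVICES also answers the
# "pin the job to specific GPUs" question asked in a later issue.
# Set it before any CUDA work happens.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'

import torch
model = torch.nn.Sequential(torch.nn.Linear(8, 4))   # stand-in for the detector
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)
model = model.cuda()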

On BiFPN and the number of times BiFPN is repeated

I came here from your Bilibili course; great lectures! There is one thing about BiFPN I did not understand. The paper says the BiFPN block is applied multiple times, and your code notes that after the first BiFPN pass, p3_out ... p7_out are returned and become the inputs to the second BiFPN pass. Could you point out where in the code the total number of BiFPN repetitions is defined?
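For reference, the upstream implementation this repo credits (https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch) indexes the BiFPN repeat count by phi roughly as follows; check nets/efficientdet.py here for the exact list.

# BiFPN repeat counts per compound coefficient in the referenced upstream code.
fpn_cell_repeats = [3, 4, 5, 6, 7, 7, 8, 8, 8]
phi = 2
num_bifpn_blocks = fpn_cell_repeats[phi]   # 5 stacked BiFPN blocks for d2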

Data conversion

All the xml files under Annotations were converted into the train data, and the remaining txt files are all empty.

The fusion weights in BiFPN's weighted aggregation (a simple attention mechanism) do not update

Hello, I tried porting the weighted-aggregation method from the BiFPN in your code into another object-detection framework to see whether it helps. My modified code runs, but when inspecting the trained model's state dict I found that the weight w, initialized to [1, 1], is still [1, 1] after training. I hope you can help me find what is wrong with my code that keeps this parameter from updating at all.
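A common cause in such ports, offered as a guess rather than a diagnosis of this specific code: the fusion weights must be registered as nn.Parameter on the module, since a plain tensor attribute is invisible to the optimizer. A minimal sketch of learnable fast-normalized fusion:

import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, num_inputs=2, eps=1e-4):
        super().__init__()
        # nn.Parameter registers w with the optimizer; a bare tensor would never update
        self.w = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.w)              # keep the weights non-negative
        w = w / (w.sum() + self.eps)        # fast normalized fusion
        return sum(wi * x for wi, x in zip(w, inputs))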


Ground-truth box filtering

Hello, does the code filter ground-truth boxes by aspect ratio when loading labels? I find the trained model performs poorly on long, thin objects; could labels with such extreme aspect ratios be filtered out during training?

Problem computing FPS with predict.py

Traceback (most recent call last):
File "e:/deeplearning/efficientdet-pytorch-master/efficientdet-pytorch-master/predict.py", line 120, in
tact_time = efficientdet.get_FPS(img, test_interval)
File "e:\deeplearning\efficientdet-pytorch-master\efficientdet-pytorch-master\efficientdet.py", line 237, in get_FPS
image_shape, self.letterbox_image, conf_thres = self.confidence, nms_thres = self.nms_iou)
TypeError: non_max_suppression() got multiple values for argument 'conf_thres'

Training logs, PR output, specifying the training GPU

I run this program on a server, but I did not find where a training log is saved; I only see the weight files being saved.
Could the mAP computation also report recall and precision at a specified threshold (e.g. 0.5)?
Finally, I found no place to specify the GPU; once the program runs, every GPU is put to work, which affects other people's jobs. Please advise.

Comparison with other detection models

I compared the b0 network against SSD, and it came out 5 points lower. Is that expected? How have others found this model to perform?

RuntimeError: CUDA error: device-side assert triggered

This error occurred while training on my own dataset; how should I solve it?
../aten/src/ATen/native/cuda/IndexKernel.cu:91: operator(): block: [0,0,0], thread: [1,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
(the same assertion repeats for threads [2,0,0] through [29,0,0])
Traceback (most recent call last):
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
Traceback (most recent call last):
File "/mnt/disk1/data0/jxt/efficientdet/train.py", line 504, in
fit_one_epoch(model_train, model, focal_loss, loss_history, eval_callback, optimizer, epoch,
File "/mnt/disk1/data0/jxt/efficientdet/utils/utils_fit.py", line 37, in fit_one_epoch
loss_value, _, _ = focal_loss(classification, regression, anchors, targets, cuda = cuda)
File "/home/jxt/.conda/envs/efficient/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/disk1/data0/jxt/efficientdet/nets/efficientdet_training.py", line 210, in forward
if positive_indices.sum() > 0:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
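A hedged pointer based on the index-out-of-bounds assertions, not a confirmed diagnosis: this pattern often means a class id in the annotation txt falls outside [0, num_classes - 1], i.e. classes_path does not match the labels. A quick sanity check, assuming the line format sketched earlier (and image paths without spaces):

num_classes = len(open('model_data/cls_classes.txt', encoding='utf-8').read().split())
for line in open('2007_train.txt', encoding='utf-8'):
    for box in line.split()[1:]:
        cls = int(box.split(',')[4])
        assert 0 <= cls < num_classes, 'bad class id %d in: %s' % (cls, line)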

voc2007test mAP only around 30 (D0)

How many epochs did you use when training with VOC2007 trainval? I trained about 65 epochs (on top of the D0 weights) and the results are poor. Is my training time too short, or is something else wrong?

Problem predicting after training

Hello, after training 50 epochs I reloaded the model for testing and got the error below. Is it because I did not change the fc output? Thanks.
Traceback (most recent call last):
File "/home/yueyu/efficientdet-pytorch/predict.py", line 7, in <module>
efficientdet = EfficientDet()
File "/home/yueyu/efficientdet-pytorch/efficientdet.py", line 53, in __init__
self.generate()
File "/home/yueyu/efficientdet-pytorch/efficientdet.py", line 74, in generate
self.net.load_state_dict(state_dict)
File "/home/yueyu/miniconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for EfficientDetBackbone:
size mismatch for classifier.header.pointwise_conv.conv.weight: copying a param with shape torch.Size([180, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([36, 64, 1, 1]).
size mismatch for classifier.header.pointwise_conv.conv.bias: copying a param with shape torch.Size([180]) from checkpoint, the shape in current model is torch.Size([36]).
