tianxiaomo / pytorch-YOLOv4
PyTorch, ONNX and TensorRT implementation of YOLOv4
License: Apache License 2.0
The anchor values in Yolo_loss seem different from the others (from yolov4.cfg and do_detect()); they look like they come from yolov3.cfg.
Could you please double-check whether this is intended?
Thank you :)
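For reference, the two anchor sets being compared here are the ones listed in the official cfg files:

    yolov4.cfg: 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401
    yolov3.cfg: 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326

If the values hard-coded in Yolo_loss match the second line, they are indeed the YOLOv3 anchors.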
I trained YOLOv4 on my own dataset and want to evaluate the result with demo.py; I just want to confirm that the model in the demo is YOLOv4 as well.
Thank you!
I found a CosineAnnealingWarmRestarts scheduler in your training code, but I don't understand why it isn't used. Or is it applied somewhere else in the code?
I thought we should perform NMS per class per image, no?
def nms(boxes, nms_thresh): in tool/utils.py
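A minimal sketch of what per-class NMS could look like on top of the existing nms(boxes, nms_thresh) helper; the assumption that the class id sits in the last element of each box follows the box layout used elsewhere in tool/utils.py and should be verified:

    def nms_per_class(boxes, nms_thresh, cls_index=-1):
        """Run nms() separately within each class, then merge the survivors.

        Assumes each box is a list whose element at cls_index is the class id.
        """
        by_class = {}
        for box in boxes:
            by_class.setdefault(box[cls_index], []).append(box)
        kept = []
        for cls_boxes in by_class.values():
            kept.extend(nms(cls_boxes, nms_thresh))
        return kept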
Hello, I have implemented the .onnx -> .pb conversion and run object detection with the TensorFlow 2.2.0 compat.v1 graph. How should this be done? Thank you!
I run camera.py on the CPU; why is it so slow?
It is recommended to modify it as follows: move the line below inside the for loop:
all_anchors_grid = [(self.anchors[mask][0] / self.strides[i], self.anchors[mask][1] / self.strides[i]) for mask in self.anch_masks[i]]
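A hypothetical sketch of what that placement could look like, so the scaled anchors are recomputed for each output scale; the surrounding loop body is an assumption:

    for i in range(3):  # one iteration per YOLO output scale / stride
        all_anchors_grid = [(self.anchors[mask][0] / self.strides[i],
                             self.anchors[mask][1] / self.strides[i])
                            for mask in self.anch_masks[i]]
        # ...the rest of the per-scale setup then uses all_anchors_grid...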
Hi,
If I am interested in fine-tuning the YOLOv4 model, is there a way I can convert the Darknet model weights to yolov4.pth? (Any tips would be appreciated.)
And if you already have such a file, could you please share it with me?
Thank you.
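One possible conversion path, sketched under the assumption that the Darknet class in tool/darknet2pytorch.py can parse yolov4.cfg and exposes a load_weights() method (worth double-checking against the repo):

    import torch
    from tool.darknet2pytorch import Darknet

    model = Darknet('cfg/yolov4.cfg')
    model.load_weights('yolov4.weights')          # Darknet-format weights
    torch.save(model.state_dict(), 'yolov4.pth')  # PyTorch checkpoint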
Can you provide yolov4.pth?
Sorry for bothering you again :-)
nn.MaxPool2d is already enough to handle your cases.
In line 237 of darknet2pytorch.py, change

    elif block['type'] == 'maxpool':
        pool_size = int(block['size'])
        stride = int(block['stride'])
        if stride > 1:
            model = nn.MaxPool2d(pool_size, stride)
        else:
            model = MaxPoolStride1(pool_size)

to

    elif block['type'] == 'maxpool':
        pool_size = int(block['size'])
        stride = int(block['stride'])
        model = nn.MaxPool2d(kernel_size=pool_size, stride=stride, padding=pool_size // 2)
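A quick sanity check of the suggestion (a sketch, not code from the repo): with stride 1 and padding = pool_size // 2, nn.MaxPool2d preserves the spatial size for the odd SPP kernel sizes, so the separate MaxPoolStride1 module is no longer needed:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 512, 19, 19)
    for k in (5, 9, 13):  # SPP maxpool sizes in yolov4.cfg
        y = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)(x)
        print(k, y.shape)  # each prints torch.Size([1, 512, 19, 19])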
In the source code I see that pad equals 1 whether the filter size is 3 or 1. I'd like to ask: your network uses two different pad settings, will that cause any problems?
I think you have already contributed a lot to the open-source community on behalf of YOLOv4, but I think we should tighten the relationship between your repository and the ONNX standard. This would also contribute to the popularity of your repository.
I will add some Python scripts here:
Why doesn't pytorch-YOLOv4 recognize the small object in the upper-right corner of dog.jpg?
    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]
        return out_img, bboxes
===========================================================================
Suggested revision:
    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        left_shift = min(left_shift, w - cut_x)
        top_shift = min(top_shift, h - cut_y)
        right_shift = min(right_shift, cut_x)
        bot_shift = min(bot_shift, cut_y)
        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]
        return out_img, bboxes
I want to use the .pth file alone, without the .weights file; what should I do?
Thank you so much.
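A hedged sketch of loading a saved .pth checkpoint directly, with no .weights file involved; the class name Yolov4 and its n_classes argument are assumptions about models.py and should be checked against the actual constructor:

    import torch
    from models import Yolov4

    model = Yolov4(n_classes=80)
    model.load_state_dict(torch.load('yolov4.pth', map_location='cpu'))
    model.eval()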
When I feed multiple images in at once, the bboxes produced by get_region_boxes1 cannot be run through NMS correctly, but with a single image NMS works fine. With get_region_boxes, NMS works regardless of the batch size.
    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    detections = [detection.cpu().data.numpy() for detection in detections]
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes1(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))
    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on the results for each image
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]
    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))
    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on the results for each image
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]
Moreover, although get_region_boxes works, it takes more than twice as long as get_region_boxes1.
Line 169 of darknet2pytorch.py in the tool folder comes right after a continue statement, so doesn't that mean this code will never be executed?
    if self.loss:
        self.loss = self.loss + self.models[ind](x)
    else:
        self.loss = self.models[ind](x)
Shouldn't the code above be placed under the yolo-layer branch at line 176?
I'm an aspiring neural-network developer and I don't understand where to start when implementing a paper.
Can you please guide me?
Thanks
First, thank you so much for your great work.
I am currently trying to learn YOLOv4 using your implementation.
I am interested in training.
Can you share your train.txt and val.txt for me to reproduce your work?
Cfg.train_label = 'data/train.txt'
Cfg.val_label = 'data/val.txt'
Thanks!!
Could you tell me what is going on here?
I run camera.py on the CPU; why is it so slow?
Hello, I've read your code and found some places where refactoring is needed and where it differs from the official YOLO network. First, the downsample modules from DownSample2 to DownSample5 are the same code; I suggest using a single module for all of them (the CSP in the network is nice). Second, I have not found the DENSENET layer in your implementation; it should sit between the downsampling and the SPP. I hope we will find a way to deal with this.
Best regards,
Vadim.
I have run into many problems trying to convert your model into optimized models for inference.
I have found many places in pytorch-YOLOv4/utils/utils.py that use torch.tensor and are problematic for torch.jit.trace to handle.
For example, in the method get_region_boxes(), please convert the torch tensors into numpy arrays, because this method is not directly related to the YOLOv4 model itself.
Please replace as many of these tensor instances with numpy arrays as possible in pytorch-YOLOv4/utils/utils.py.
Don't use tensors to implement logic that is unrelated to the model; use numpy wherever possible.
Thank you very much.
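A minimal sketch of that separation, assuming the traced part should stop right after the forward pass and everything else (decoding, thresholding, NMS) runs on numpy arrays:

    import torch

    with torch.no_grad():
        outputs = model(img)                                # network only: traceable
    outputs = [o.detach().cpu().numpy() for o in outputs]   # hand off to numpy
    # ...box decoding, confidence filtering and NMS continue in numpy from here...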
I see that in the cfg file some convolutional layers use linear as their activation function, but I don't seem to see this handled in your demo file.
Hi, I used the yolov4.pth from your Baidu Netdisk; running demo.py reports "convalution havn't activate linear" and produces no output.
I'm using weights I trained on the original darknet repo, and I just updated the yolov3-tiny.cfg file with 3 classes (and changed the filters in the preceding layer to 24), but I am getting a mismatched tensor error. I printed out x1 and x2 and got these values:
x1: torch.Size([1, 128, 30, 30])
x2: torch.Size([1, 256, 27, 27])
Any advice?
Traceback (most recent call last):
  File "demo.py", line 213, in <module>
    detect(cfgfile, weightfile, imgfile)
  File "demo.py", line 46, in detect
    boxes = do_detect(m, sized, 0.5, 0.4, use_cuda)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/utils.py", line 420, in do_detect
    list_boxes = model(img)
  File "/Users/vkrd/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/darknet2pytorch.py", line 144, in forward
    x = torch.cat((x1, x2), 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 30 and 27 in dimension 2
Hi,
Does anyone know of a way to get the detection information from the model frame by frame and then save that information to a file? I have been trying to figure this out but have not found a good solution yet. Would love to hear your ideas.
Warm Regards
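One way this could be done, sketched around the do_detect() call that demo.py already uses; the model m and use_cuda flag are assumed to be set up as in demo.py, and the per-box layout written to the file is an assumption to check against tool/utils.py:

    import cv2
    from tool.utils import do_detect

    cap = cv2.VideoCapture(0)  # or a video file path
    with open('detections.txt', 'w') as f:
        frame_idx = 0
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            sized = cv2.resize(frame, (608, 608))
            boxes = do_detect(m, sized, 0.5, 0.4, use_cuda)  # m, use_cuda as in demo.py
            for box in boxes:
                f.write(str(frame_idx) + ' ' + ' '.join(str(v) for v in box) + '\n')
            frame_idx += 1
    cap.release()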
First of all, thanks for the code.
In models.py: the paper writes Neck, not Neek.
Is there a letter missing in a file name under the /tool/ folder: coco_annotatin.py -> coco_annotation.py?
In dataset.py, a small portion of the generated data values are greater than 255; is that allowed? They appear when RGB is converted to HSV for augmentation and then back to RGB; is this expected?
In train.py, the val_loader instance is missing collate_fn, so enumerate(val_loader) raises an error.
I noticed that the input image for inference in do_detect() is normalized (each element is divided by 255.0), but the input images for training are not normalized.
After normalizing the input images, fine-tuning seems to work better.
Basically, I loaded yolov4.pth and fine-tuned the model for one epoch with a few images (no data augmentation and a very small learning rate) to verify the functionality. Without the input normalization it couldn't predict at all, but with normalization it predicts similarly to the converted yolov4.pth without fine-tuning.
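A minimal sketch of keeping the training preprocessing consistent with do_detect(); where exactly this hook belongs in dataset.py is an assumption:

    import numpy as np

    def to_network_input(img):
        x = img.astype(np.float32) / 255.0  # same scaling as do_detect()
        x = np.transpose(x, (2, 0, 1))      # HWC -> CHW for PyTorch
        return x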
Hi, @Tianxiaomo
Nice work!
I'm attracted by your work. Could you please tell me the best result you have obtained by training the network with this code?
Hello, Xiaomo:
I'd like to ask, roughly how often will you update the YOLOv4 code?
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 375, in getitem
cut_y, i, left_shift, right_shift, top_shift, bot_shift)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 215, in blend_truth_mosaic
bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 184, in filter_truth
bboxes[:, 0] -= dx
IndexError: too many indices for array
Has anyone solved this problem? Can you tell me how? Thank you!
Dear Author:
Thank you. May I know what the difference is between yolov4.pth and yolov4.conv.137.pth?
My dataset only has two classes, so I changed the channel number from 255 to 21 in the head; is this the only part you need to take care of when training on your own dataset?
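For reference, the head width follows the usual YOLO formula filters = 3 × (5 + num_classes); with two classes that gives 3 × 7 = 21, matching the change described above, and the same value has to be set for each of the three YOLO heads.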
/pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 329, in train
    loss, loss_xy, loss_wh, loss_obj, loss_cls, loss_l2 = criterion(bboxes_pred, bboxes)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 210, in forward
    output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask
Epoch 1/500:   0%|          | 0/103 [00:05<?, ?img/s]
Traceback (most recent call last):
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 331, in train
    loss.backward()
  File "/home/qw/.local/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 3, 19, 19, 7]], which is output 0 of torch::autograd::CopySlices, is at version 7; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
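A hedged sketch of one generic way to avoid the in-place slice multiply that the forward trace points at (output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask): build a separate scale tensor and multiply out of place. Whether this resolves this particular error is an assumption.

    import numpy as np
    import torch

    def masked_scale(output, tgt_mask, n_ch):
        scale = torch.ones_like(output)             # fresh tensor, no grad history of its own
        scale[..., np.r_[0:4, 5:n_ch]] = tgt_mask   # fill channels 0-3 and 5..n_ch-1
        return output * scale                       # out-of-place, autograd-friendly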
(1) This problem has been solved: the -dir path plus the path in train.txt gives the image path. The paths inside train.txt were incomplete, so the images could not be read with imread; debugging reveals this.
(2) This program can train your own dataset. You need to modify models.py, in the same way the cfg is modified under darknet. coco_annotatin.py needs to be modified to point to your own image paths and names. If you run train.py on its own, set cfg.dataset_dir to your own path.
Hello, I wonder if this repo supports training on a custom dataset. If yes, will you update it with instructions on how to train?
Traceback (most recent call last):
  File "train.py", line 429, in <module>
    device=device, )
  File "train.py", line 305, in train
    bboxes_pred = model(images)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 417, in forward
    d3 = self.down3(d2)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 173, in forward
    x1 = self.conv1(input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 58, in forward
    x = l(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 10, in forward
    x = x * (torch.tanh(torch.nn.functional.softplus(x)))
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 4647) is killed by signal: Killed.
My CUDA version is 10.2 with 9 GB of GPU memory, and I set num_workers=1; why does training still get killed on its own?
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 19, 19, 85]], which is output 0 of AsStridedBackward, is at version 6; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
I ran into this problem while training on my own dataset; I've never encountered it before. Is there a way to solve it?
What changes should I make if I want to train on my own dataset?
Hello!
First, sorry for the DDoS :), but there is one more thing that makes me curious: SAM.
As you can see, there is a modified SAM in YOLOv4 which you haven't implemented yet. As I understand it, it should be used in x2, x10 and x18 in your implementation, because they are empty and unused now.
Best regards,
Vadims.