tianxiaomo / pytorch-YOLOv4
PyTorch, ONNX and TensorRT implementation of YOLOv4
License: Apache License 2.0
The anchor values in Yolo_loss seem different from the others (from yolov4.cfg and do_detect()); they look like they come from yolov3.cfg.
Could you please double-check whether this is intended?
Thank you :)
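For reference, the two anchor sets being compared here are the ones listed in the official cfg files:

    yolov4.cfg: 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401
    yolov3.cfg: 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326

If the values hard-coded in Yolo_loss match the second line, they are indeed the YOLOv3 anchors.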
I trained YOLOv4 on my own dataset and want to evaluate the result with demo.py; I just want to confirm that the model in the demo is YOLOv4 as well.
Thank you!
I found a CosineAnnealingWarmRestarts scheduler in your training code, but I don't understand why it isn't used. Or is it applied somewhere else in the code?
I thought we should perform NMS per class per image, no?
def nms(boxes, nms_thresh): in tool/utils.py
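A minimal sketch of what per-class NMS could look like on top of the existing nms(boxes, nms_thresh) helper; the assumption that the class id sits in the last element of each box follows the box layout used elsewhere in tool/utils.py and should be verified:

    def nms_per_class(boxes, nms_thresh, cls_index=-1):
        """Run nms() separately within each class, then merge the survivors.

        Assumes each box is a list whose element at cls_index is the class id.
        """
        by_class = {}
        for box in boxes:
            by_class.setdefault(box[cls_index], []).append(box)
        kept = []
        for cls_boxes in by_class.values():
            kept.extend(nms(cls_boxes, nms_thresh))
        return kept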
Hello, I have implemented the .onnx -> .pb conversion and run object detection with the TensorFlow 2.2.0 compat.v1 graph. How should this be done? Thank you!
I run camera.py on the CPU; why is it so slow?
It is recommended to modify it as follows: move the line below inside the for loop:
all_anchors_grid = [(self.anchors[mask][0] / self.strides[i], self.anchors[mask][1] / self.strides[i]) for mask in self.anch_masks[i]]
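A hypothetical sketch of what that placement could look like, so the scaled anchors are recomputed for each output scale; the surrounding loop body is an assumption:

    for i in range(3):  # one iteration per YOLO output scale / stride
        all_anchors_grid = [(self.anchors[mask][0] / self.strides[i],
                             self.anchors[mask][1] / self.strides[i])
                            for mask in self.anch_masks[i]]
        # ...the rest of the per-scale setup then uses all_anchors_grid...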
Hi,
If I am interested in fine-tuning the YOLOv4 model, is there a way I can convert the Darknet model weights to yolov4.pth? (Any tips would be appreciated.)
And if you already have such a file, could you please share it with me?
Thank you.
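One possible conversion path, sketched under the assumption that the Darknet class in tool/darknet2pytorch.py can parse yolov4.cfg and exposes a load_weights() method (worth double-checking against the repo):

    import torch
    from tool.darknet2pytorch import Darknet

    model = Darknet('cfg/yolov4.cfg')
    model.load_weights('yolov4.weights')          # Darknet-format weights
    torch.save(model.state_dict(), 'yolov4.pth')  # PyTorch checkpoint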
Can you provide yolov4.pth?
Sorry for bothering you again :-)
nn.MaxPool2d is already enough to handle your cases.
In line 237 of darknet2pytorch.py, change

    elif block['type'] == 'maxpool':
        pool_size = int(block['size'])
        stride = int(block['stride'])
        if stride > 1:
            model = nn.MaxPool2d(pool_size, stride)
        else:
            model = MaxPoolStride1(pool_size)

to

    elif block['type'] == 'maxpool':
        pool_size = int(block['size'])
        stride = int(block['stride'])
        model = nn.MaxPool2d(kernel_size=pool_size, stride=stride, padding=pool_size // 2)
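A quick sanity check of the suggestion (a sketch, not code from the repo): with stride 1 and padding = pool_size // 2, nn.MaxPool2d preserves the spatial size for the odd SPP kernel sizes, so the separate MaxPoolStride1 module is no longer needed:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 512, 19, 19)
    for k in (5, 9, 13):  # SPP maxpool sizes in yolov4.cfg
        y = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)(x)
        print(k, y.shape)  # each prints torch.Size([1, 512, 19, 19])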
In the source code I see that pad equals 1 whether the filter size is 3 or 1. I'd like to ask: your network uses two different pad settings, will that cause any problems?
I think you have already contributed a lot to the open-source community on behalf of YOLOv4, but I think we should tighten the relationship between your repository and the ONNX standard. This would also contribute to the popularity of your repository.
I will add some Python scripts here:
Why doesn't pytorch-YOLOv4 recognize the small object in the upper-right corner of dog.jpg?
    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]
        return out_img, bboxes
===========================================================================
Suggested revision:
    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        left_shift = min(left_shift, w - cut_x)
        top_shift = min(top_shift, h - cut_y)
        right_shift = min(right_shift, cut_x)
        bot_shift = min(bot_shift, cut_y)
        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]
        return out_img, bboxes
I want to use the .pth file alone, without the .weights file; what should I do?
Thank you so much.
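A hedged sketch of loading a saved .pth checkpoint directly, with no .weights file involved; the class name Yolov4 and its n_classes argument are assumptions about models.py and should be checked against the actual constructor:

    import torch
    from models import Yolov4

    model = Yolov4(n_classes=80)
    model.load_state_dict(torch.load('yolov4.pth', map_location='cpu'))
    model.eval()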
When I feed multiple images in at once, the bboxes produced by get_region_boxes1 cannot be run through NMS correctly, but with a single image NMS works fine. With get_region_boxes, NMS works regardless of the batch size.
    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    detections = [detection.cpu().data.numpy() for detection in detections]
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes1(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))
    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on the results for each image
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]
    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))
    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on the results for each image
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]
Moreover, although get_region_boxes works, it takes more than twice as long as get_region_boxes1.
Line 169 of darknet2pytorch.py in the tool folder comes right after a continue statement, so doesn't that mean this code will never be executed?
    if self.loss:
        self.loss = self.loss + self.models[ind](x)
    else:
        self.loss = self.models[ind](x)
Shouldn't the code above be placed under the yolo-layer branch at line 176?
I'm an aspiring neural-network developer and I don't understand where to start when implementing a paper.
Can you please guide me?
Thanks
First, thank you so much for your great work.
I am currently trying to learn YOLOv4 using your implementation.
I am interested in training.
Can you share your train.txt and val.txt for me to reproduce your work?
Cfg.train_label = 'data/train.txt'
Cfg.val_label = 'data/val.txt'
Thanks!!
Could you tell me what is going on here?
I run camera.py on the CPU; why is it so slow?
Hello, I've read your code and found some places where refactoring is needed and where it differs from the official YOLO network. First, the downsample modules from DownSample2 to DownSample5 are the same code; I suggest using a single module for all of them (the CSP in the network is nice). Second, I have not found the DENSENET layer in your implementation; it should sit between the downsampling and the SPP. I hope we will find a way to deal with this.
Best regards,
Vadim.
I have run into many problems trying to convert your model into optimized models for inference.
I have found many places in pytorch-YOLOv4/utils/utils.py that use torch.tensor and are problematic for torch.jit.trace to handle.
For example, in the method get_region_boxes(), please convert the torch tensors into numpy arrays, because this method is not directly related to the YOLOv4 model itself.
Please replace as many of these tensor instances with numpy arrays as possible in pytorch-YOLOv4/utils/utils.py.
Don't use tensors to implement logic that is unrelated to the model; use numpy wherever possible.
Thank you very much.
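A minimal sketch of that separation, assuming the traced part should stop right after the forward pass and everything else (decoding, thresholding, NMS) runs on numpy arrays:

    import torch

    with torch.no_grad():
        outputs = model(img)                                # network only: traceable
    outputs = [o.detach().cpu().numpy() for o in outputs]   # hand off to numpy
    # ...box decoding, confidence filtering and NMS continue in numpy from here...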
I see that in the cfg file some convolutional layers use linear as their activation function, but I don't seem to see this handled in your demo file.
Hi, I used the yolov4.pth from your Baidu Netdisk; running demo.py reports "convalution havn't activate linear" and produces no output.
I'm using weights I trained on the original darknet repo, and I just updated the yolov3-tiny.cfg file with 3 classes (and changed the filters in the preceding layer to 24), but I am getting a mismatched tensor error. I printed out x1 and x2 and got these values:
x1: torch.Size([1, 128, 30, 30])
x2: torch.Size([1, 256, 27, 27])
Any advice?
Traceback (most recent call last):
  File "demo.py", line 213, in <module>
    detect(cfgfile, weightfile, imgfile)
  File "demo.py", line 46, in detect
    boxes = do_detect(m, sized, 0.5, 0.4, use_cuda)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/utils.py", line 420, in do_detect
    list_boxes = model(img)
  File "/Users/vkrd/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/darknet2pytorch.py", line 144, in forward
    x = torch.cat((x1, x2), 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 30 and 27 in dimension 2
Hi,
Does anyone know of a way to get the detection information from the model frame by frame and then save that information to a file? I have been trying to figure this out but have not found a good solution yet. Would love to hear your ideas.
Warm Regards
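One way this could be done, sketched around the do_detect() call that demo.py already uses; the model m and use_cuda flag are assumed to be set up as in demo.py, and the per-box layout written to the file is an assumption to check against tool/utils.py:

    import cv2
    from tool.utils import do_detect

    cap = cv2.VideoCapture(0)  # or a video file path
    with open('detections.txt', 'w') as f:
        frame_idx = 0
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            sized = cv2.resize(frame, (608, 608))
            boxes = do_detect(m, sized, 0.5, 0.4, use_cuda)  # m, use_cuda as in demo.py
            for box in boxes:
                f.write(str(frame_idx) + ' ' + ' '.join(str(v) for v in box) + '\n')
            frame_idx += 1
    cap.release()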
First of all, thanks for the code.
In models.py: the paper writes Neck, not Neek.
Is there a letter missing in a file name under the /tool/ folder: coco_annotatin.py -> coco_annotation.py?
In dataset.py, a small portion of the generated data values are greater than 255; is that allowed? They appear when RGB is converted to HSV for augmentation and then back to RGB; is this expected?
In train.py, the val_loader instance is missing collate_fn, so enumerate(val_loader) raises an error.
I noticed that the input image for inference in do_detect() is normalized (each element is divided by 255.0), but the input images for training are not normalized.
After normalizing the input images, fine-tuning seems to work better.
Basically, I loaded yolov4.pth and fine-tuned the model for one epoch with a few images (no data augmentation and a very small learning rate) to verify the functionality. Without the input normalization it couldn't predict at all, but with normalization it predicts similarly to the converted yolov4.pth without fine-tuning.
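A minimal sketch of keeping the training preprocessing consistent with do_detect(); where exactly this hook belongs in dataset.py is an assumption:

    import numpy as np

    def to_network_input(img):
        x = img.astype(np.float32) / 255.0  # same scaling as do_detect()
        x = np.transpose(x, (2, 0, 1))      # HWC -> CHW for PyTorch
        return x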
Hi, @Tianxiaomo
Nice work!
I'm attracted by your work. Could you please tell me the best result you have obtained by training the network with this code?
Hello, Xiaomo:
I'd like to ask, roughly how often will you update the YOLOv4 code?
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 375, in getitem
cut_y, i, left_shift, right_shift, top_shift, bot_shift)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 215, in blend_truth_mosaic
bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 184, in filter_truth
bboxes[:, 0] -= dx
IndexError: too many indices for array
Has anyone solved this problem? Can you tell me how? Thank you!
Dear Author:
Thank you. May I know what the difference is between yolov4.pth and yolov4.conv.137.pth?
My dataset only has two classes, so I changed the channel number from 255 to 21 in the head; is this the only part you need to take care of when training on your own dataset?
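For reference, the head width follows the usual YOLO formula filters = 3 × (5 + num_classes); with two classes that gives 3 × 7 = 21, matching the change described above, and the same value has to be set for each of the three YOLO heads.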
/pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 329, in train
    loss, loss_xy, loss_wh, loss_obj, loss_cls, loss_l2 = criterion(bboxes_pred, bboxes)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 210, in forward
    output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask
Epoch 1/500:   0%|          | 0/103 [00:05<?, ?img/s]
Traceback (most recent call last):
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 331, in train
    loss.backward()
  File "/home/qw/.local/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 3, 19, 19, 7]], which is output 0 of torch::autograd::CopySlices, is at version 7; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
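A hedged sketch of one generic way to avoid the in-place slice multiply that the forward trace points at (output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask): build a separate scale tensor and multiply out of place. Whether this resolves this particular error is an assumption.

    import numpy as np
    import torch

    def masked_scale(output, tgt_mask, n_ch):
        scale = torch.ones_like(output)             # fresh tensor, no grad history of its own
        scale[..., np.r_[0:4, 5:n_ch]] = tgt_mask   # fill channels 0-3 and 5..n_ch-1
        return output * scale                       # out-of-place, autograd-friendly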
(1) This problem has been solved: the -dir path plus the path in train.txt gives the image path. The paths inside train.txt were incomplete, so the images could not be read with imread; debugging reveals this.
(2) This program can train your own dataset. You need to modify models.py, in the same way the cfg is modified under darknet. coco_annotatin.py needs to be modified to point to your own image paths and names. If you run train.py on its own, set cfg.dataset_dir to your own path.
Hello, I wonder if this repo supports training on a custom dataset. If yes, will you update it with instructions on how to train?
Traceback (most recent call last):
  File "train.py", line 429, in <module>
    device=device, )
  File "train.py", line 305, in train
    bboxes_pred = model(images)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 417, in forward
    d3 = self.down3(d2)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 173, in forward
    x1 = self.conv1(input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 58, in forward
    x = l(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 10, in forward
    x = x * (torch.tanh(torch.nn.functional.softplus(x)))
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 4647) is killed by signal: Killed.
My CUDA version is 10.2 with 9 GB of GPU memory, and I set num_workers=1; why does training still get killed on its own?
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 19, 19, 85]], which is output 0 of AsStridedBackward, is at version 6; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
I ran into this problem while training on my own dataset; I've never encountered it before. Is there a way to solve it?
What changes should I make if I want to train on my own dataset?
Hello!
First, sorry for the DDoS :), but there is one more thing that makes me curious: SAM.
As you can see, there is a modified SAM in YOLOv4 which you haven't implemented yet. As I understand it, it should be used in x2, x10 and x18 in your implementation, because they are empty and unused now.
Best regards,
Vadims.