syencil / mobile-yolov5-pruning-distillation

mobilev2-yolov5s pruning and distillation, with ncnn and TensorRT deployment support. Ultra-light but better performance!

License: MIT License
There is an error in your pruning code.
After much thought, I still want to make the following statement:
This repo is for learning only!!! No free-loaders.
The goal of this repo is to give you a general picture of basic engineering workflows: swapping the backbone, pruning, distillation, and quantization are all routine operations, including porting to C++ and Android with existing frameworks.
If you want to use it commercially, or for a course project or thesis, that is completely fine; just follow the open-source license of the original repo. The code was written in a hurry, so small flaws in the details are inevitable, but as long as you follow the readme you can reproduce the whole pipeline without problems. Many possible issues and details are documented in the readme and in the code comments, and I have answered implementation questions in several issues.
Good ideas are welcome, and I am happy to discuss conceptual problems, but no free-loaders.
Everyone's code environment and workflow are different; I cannot debug for you simply from a Ctrl+C/Ctrl+V of your error message.
For commercial cooperation, feel free to contact me privately.
Running test.py with a pruned model raises an error. The traceback is:
Traceback (most recent call last):
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 357, in <module>
test(opt.data,
File "/home/demon/PycharmProjects/mobile-yolov5-pruning-distillation/test.py", line 89, in test
names = {k: v for k, v in enumerate(model.names if hasattr(model, 'names') else model.module.names)}
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 771, in __getattr__
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
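The failing line in test.py falls back to model.module.names whenever names is missing, which crashes for an unwrapped pruned model. A defensive rewrite might look like the following sketch (not the repo's actual code):

```python
import torch.nn as nn

def get_class_names(model):
    # Unwrap DataParallel/DistributedDataParallel if present, then read
    # the class names; fall back to an empty list instead of crashing.
    m = model.module if hasattr(model, 'module') else model
    return {k: v for k, v in enumerate(getattr(m, 'names', []))}
```

This works whether the model is wrapped or not, so both the normal and the pruned checkpoints pass through it.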
Hello, excuse me again. I ran your code for pruning and for distilling MobileNet: one reduces the parameter count, the other improves accuracy. Why not use the pruned model as the student model for distillation?
I used MobileNet as the backbone on the latest yolov5 codebase, loading MobileNet pretrained weights, but convergence became very slow; yet when I run the author's code directly it converges quickly. May I ask what the likely reason is?
RuntimeError: The size of tensor a (3) must match the size of tensor b (5) at non-singleton dimension 0
Could you provide references for the pruning and distillation strategies you use here? Many thanks. Also, I would like to ask whether changing the dataset affects the pruning/distillation strategy and its results.
I set the test params (conf_thresh=0.3, iou_thresh=0.3, batch=24) when training on the VOC2012 and 2017 datasets, but the recall after 120 epochs is only 0.3xx. Is this normal?
When distilling with the v2 weights of yolov5 I get AttributeError: 'Detect' object has no attribute 'm'. Also, after pruning the mAP is only around 0.4. Why is that? I followed the quick start.
ubuntu1804
torch: 1.6.0
torchvision: 0.7.0
Training:
python3 train.py --type vocs
python3 train.py --type dmvocs
export: export PYTHONPATH="$PWD" && python models/onnx_export.py --weights outputs/dmvocs/weights/best_dmvocs.pt --img 640 --batch 1
Error:
Namespace(batch_size=1, img_size=[640], weights='outputs/dmvocs/weights/best_dmvocs.pt')
Fusing layers...
Model Summary: 148 layers, 3.59629e+06 parameters, 3.31818e+06 gradients
Traceback (most recent call last):
File "models/onnx_export.py", line 35, in <module>
_ = model(img) # dry run
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/data/byronnar/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
x = m(x) # run
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/data/byronnar/mobile-yolov5-pruning-distillation/models/common.py", line 24, in fuseforward
return self.act(self.conv(x))
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 419, in forward
return self._conv_forward(input, self.weight)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [1, 3, 640] instead
How can I solve this problem? Looking forward to your reply! Thank you!
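The [1, 3, 640] shape in the error suggests the dry-run tensor was built from a single --img value without expanding it to height and width. A minimal sketch of constructing a proper 4-D NCHW dummy input (helper name hypothetical):

```python
import torch

def make_dummy_input(batch_size, img_size):
    # The export dry run needs a 4-D NCHW tensor. A single --img value
    # (e.g. [640]) must first be expanded to (height, width).
    if len(img_size) == 1:
        img_size = img_size * 2  # [640] -> [640, 640]
    return torch.zeros(batch_size, 3, *img_size)
```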
Thanks for this great project.
During distillation I found that Precision drops quite a lot; in your distillation schemes, Precision is also somewhat lower than without distillation. Is there any way to mitigate this?
For single-class training, there is no big drop.
Hi, the pruning code is hard to follow. Did you write an explanatory blog post for it?
As mentioned above:
training dataset: coco2017
model weights: yolov5s.pt
classes = 80
Why do the P and R values drop during pruning?
As follows:
Epoch gpu_mem GIoU obj cls total targets img_size
3/49 2.66G 0.06193 0.0939 0.04484 0.2007 141 640: 100%|██████████████████| 7393/7393 [2:41:07<00:00, 1.31s/
Class Images Targets P R mAP@.5 mAP@.5:.95: 100%|██████████| 313/313 [02:54<00:00, 1.80it/s]
all 5e+03 3.63e+04 0.000964 0.000638 5.18e-05 1.53e-05
Hi
Could you share your pretrained weight?
(mobilenet_v2-b0353104.pth)
When I tried to download the weights, it turned out the folder was in the trash of your Google Drive.
https://drive.google.com/drive/folders/1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J
Thank you.
After distilling the pruned model, I found that while the accuracy recovered, the parameter count also went back to the pre-pruning level.
After sparsity training, running the pruning step raises FileNotFoundError: [Errno 2] No such file or directory: 'render_img/before_pruning.jpg'. Where is this file generated? Thanks.
I first trained with python train.py --type mvocs,
then ran python3 train.py --type smvocs with best_mvocs.pt from the previous step as the input model,
then ran python3 pruning.py -t 0.1 with best_smvocs.pt from the previous step as the input model,
and it crashed with a segmentation fault.
def compute_pruning_loss(p, prunable_modules, model, loss):
    ft = torch.cuda.FloatTensor if p[0].is_cuda else torch.Tensor
    ll1 = ft([0])
    h = model.hyp  # hyperparameters
    red = 'mean'  # Loss reduction (sum or mean)
    if prunable_modules is not None:
        for m in prunable_modules:
            ll1 += m.weight.norm(1)  # L1 norm of the BN layer's gamma values
        ll1 /= len(prunable_modules)
        ll1 *= h['sl']
    bs = p[0].shape[0]  # batch size
    loss += ll1 * bs
    return loss
May I ask: why is the averaged norm multiplied by batch_size in this function?
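One plausible reading (my assumption, not confirmed by the author): in ultralytics YOLOv5 the detection loss returned by compute_loss is likewise multiplied by the batch size before backward, so scaling the sparsity term by bs keeps the two terms on a comparable scale. A common alternative to adding this penalty to the loss is the "updateBN" trick from Network Slimming, which applies the L1 subgradient directly to the BN gradients after loss.backward(); a sketch (function name hypothetical):

```python
import torch
import torch.nn as nn

def add_bn_subgradient(prunable_modules, s=1e-3):
    # After loss.backward(), add an L1 subgradient that pushes every
    # BN gamma toward zero, encouraging prunable (near-zero) channels.
    for m in prunable_modules:
        if m.weight.grad is not None:
            m.weight.grad.add_(s * torch.sign(m.weight.data))
```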
A beginner question: if I set the number of finetune epochs after pruning high enough (e.g. 100 epochs, the same as normal training), I can get a very good mAP. Is there anything wrong with doing that? Is there some limit on finetune epochs, e.g. only 10 or 20?
Running train.py to train my own model the way YOLOv5 does fails with: 'Got {}, but numpy array, torch tensor, or caffe2 blob name are expected.'.format(type(x)))
NotImplementedError: Got <class 'NoneType'>, but numpy array, torch tensor, or caffe2 blob name are expected.
Why did mAP all become 0 after the second epoch when I used mobilev2-yolo5s to train my data?
Hello, thanks for open-sourcing your work. In the pruning part you set ignore_idx=[230, 260, 290]; which layers do [230, 260, 290] refer to? Or do you have a weight file you could share?
I cannot run it; it gives me this message:
Traceback (most recent call last):
File "train.py", line 762, in <module>
opt.cfg = check_file(opt.cfg) # check file
File "/home/demon/CLionProjects/mobile-yolov5-pruning-distillation/utils/utils.py", line 101, in check_file
if os.path.isfile(file):
File "/home/demon/anaconda3/envs/pytorch_gpu/lib/python3.8/genericpath.py", line 30, in isfile
st = os.stat(path)
voc_train.txt
voc_test.txt
Where can I get them?
When training the mvocs model, the error above is raised right after one epoch completes. Does anyone know how to solve it?
prunable_modules = []
prunable_module_type = (nn.BatchNorm2d, )
for i, m in enumerate(model.modules()):
    if i in ignore_idx:
        continue
    if isinstance(m, prunable_module_type):
        prunable_modules.append(m)
As mentioned above, the model structure was not changed!
In addition, the BN layers of the conv modules were not appended to prunable_modules.
Where does the pruning happen?
When I train it with train.py, the output model is non-pruned, as large as the original model.
Do I need a *.yaml for the pruned model? How do I get it?
Thanks.
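For context, BN-based channel pruning in the style of Network Slimming collects the gamma magnitudes of the prunable BN layers and picks a single global threshold below which channels are removed. A minimal sketch of that threshold step (hypothetical helper, not the repo's actual pruning.py):

```python
import torch
import torch.nn as nn

def compute_prune_threshold(prunable_modules, prune_ratio):
    # Collect every BN gamma magnitude and pick one global threshold
    # so that roughly `prune_ratio` of all channels fall below it.
    gammas = torch.cat([m.weight.data.abs().flatten() for m in prunable_modules])
    sorted_gammas, _ = torch.sort(gammas)
    idx = min(int(len(sorted_gammas) * prune_ratio), len(sorted_gammas) - 1)
    return sorted_gammas[idx].item()
```

Channels whose |gamma| falls below the threshold are masked out, and a new, physically smaller network is then rebuilt from the surviving channels; that rebuild step is where the model structure actually changes.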
Problem as above.
Your yolo.py is inconsistent with ultralytics/yolov5;
it raises an error that Detect has no attribute 'export'.
Q: Why does the training mAP sometimes become 0?
A: It depends on the situation. In most cases the loss goes to 0 while the validation mAP goes to 0, which is a clear sign of overfitting. You can refer to my earlier CSDN article.
Q: torch.nn.modules.module.ModuleAttributeError: 'Model' object has no attribute 'module'
A: This is a trap inherited from the original repo. What you think model loading is: read the weights directly, restore the model, and build the gradient graph. What it actually is: build a randomly initialized model from the yaml -> obtain the gradient graph -> load the saved weights layer by layer. The former is for pure forward inference; the latter is for whenever you need the gradient graph, e.g. pruning. Also keep the original project structure, and remember export PYTHONPATH=$PWD.
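The two loading modes described in this answer can be sketched as follows (function names are hypothetical illustrations, not the repo's API, and the sketch assumes the checkpoint stores the model object under a 'model' key, as ultralytics-style checkpoints do):

```python
import torch

def _load_ckpt(weights_path, device='cpu'):
    try:
        return torch.load(weights_path, map_location=device, weights_only=False)
    except TypeError:  # older torch without the weights_only kwarg
        return torch.load(weights_path, map_location=device)

def load_for_inference(weights_path, device='cpu'):
    # Mode 1: restore the pickled model object directly. This also works
    # for a pruned model, because the saved object carries its structure.
    ckpt = _load_ckpt(weights_path, device)
    model = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
    return model.float().eval()

def load_for_training(fresh_model, weights_path, device='cpu'):
    # Mode 2: the caller first builds a randomly initialized model from
    # the yaml, then the saved weights are copied in layer by layer.
    ckpt = _load_ckpt(weights_path, device)
    src = ckpt['model'] if isinstance(ckpt, dict) and 'model' in ckpt else ckpt
    fresh_model.load_state_dict(src.float().state_dict(), strict=False)
    return fresh_model
```

Mode 1 cannot give you a fresh gradient graph for pruning; mode 2 can, but it requires a yaml describing the (unpruned) architecture.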
Q: What if the pretrained weights for yolov5s are gone?
A: I no longer have them either. But this repo is mainly about the lighter mobilev2-yolov5s, so you can try that instead. If you need yolov5, use their latest version; adapting my code to it is easy. There is also a netdisk link in the Readme you can try.
Q: Recall or precision is low.
A: They depend on the thresholds, which is why I only track mAP as the intermediate metric here. If mAP drops during pruning or distillation, that is also normal; it depends on your dataset and your strategy. After all, every paper claims to be SOTA, so you need to judge for yourself.
Q: Can the pruned model be saved with a yaml file? Can it be distilled or converted to ncnn?
A: Yes, but there is no need for a yaml file. At that point you can just load the model directly; see the ft (finetune) mode. If you really want one, write a reverse-generation script yourself; it is not hard, and good practice.
Q: Training details, and references for the pruning and distillation methods?
A: Almost everything is in the readme; for the details just read the code. It is not hard, treat it as a learning exercise. To add: the pruning is Han Song-style BN-layer-based pruning, and the distillation combines approaches designed for one-stage, two-stage, and classification networks. The details certainly differ from the original papers, but not by much; judge for yourself. And please do not ask me why I did not use method X; I cannot run experiments on every method for you, you can try them yourself. The methods I chose are representative classics, though not necessarily the best.
Q: There are some bugs when pruning/distilling yolov5 here.
A: First, I have not specifically tested against yolov5; its authors are still updating it, so of course there are bugs. Second, this project is mainly about producing a lighter model, and I can guarantee that if you follow my steps one by one you can reproduce the results; I even gave you the random seeds >_<.
Q: mobilev2-yolov5s converges slowly, but yolov5s converges quickly?
A: Their model is pretrained on COCO and has twice my parameter count and compute, so its capacity is much larger. Mine only uses ImageNet weights for the backbone, with the head fully randomly initialized, so slower convergence is normal, and lower accuracy from the smaller capacity is also normal. The whole idea of making models lightweight rests on four points: overall architecture design, pruning, distillation, and quantization. If you want to train yolov5x first and then make it lightweight, that is also a valid approach.
Is your pipeline the following:
sparse training --> pruning --> finetuning --> distillation?
Is there a front-end demo available? The link you provided is a 404.
Hello, I tried to train with yolov5s as the model, using the command
python train.py --type vocs
but after the first epoch, evaluation on the validation set fails with:
Traceback (most recent call last):
File "train.py", line 802, in <module>
train(hyp)
File "train.py", line 453, in train
results, maps, times = test.test(opt.data,
File "/home/cwy/mobile-yolov5-pruning-distillation/test.py", line 112, in test
inf_out, train_out = model(img, augment=augment)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 621, in forward
outputs = self.parallel_apply(self._module_copies[:len(inputs)], inputs, kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 646, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 1 on device 1.
Original Traceback (most recent call last):
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 94, in forward
return self.forward_once(x, profile) # single-scale inference, train
File "/home/cwy/mobile-yolov5-pruning-distillation/models/yolo.py", line 111, in forward_once
x = m(x) # run
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 86, in forward
return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/mobile-yolov5-pruning-distillation/models/common.py", line 21, in forward
return self.act(self.bn(self.conv(x)))
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
return self._conv_forward(input, self.weight)
File "/home/cwy/.conda/envs/yolo/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same
I have tried many approaches but keep getting the same error. Could you advise how to fix it?
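One common cause of this mismatch is the test code casting images to fp16 (half) while the DataParallel replicas keep fp32 weights. A minimal sketch of keeping model and input dtypes aligned (an assumption about the cause, not a verified fix for this repo):

```python
import torch

def align_precision(model, img, device):
    # fp16 is only safe on CUDA and (here) outside DataParallel; in every
    # other case keep both the model and the input batch in fp32.
    half = device.type == 'cuda' and not isinstance(
        model, (torch.nn.DataParallel,
                torch.nn.parallel.DistributedDataParallel))
    model = model.half() if half else model.float()
    img = img.half() if half else img.float()
    return model, img
```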
Hi, running detect.py raises an error at x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous():
RuntimeError: shape '[1, 3, 85, 80, 64]' is invalid for input of size 655360
Running train.py for training also fails. Does that mean this cannot be used for training or testing?
If I follow your steps one by one, will the result be lighter than before, with smaller memory usage? What pruning ratio is typically used, and which distillation strategy would you choose? I have roughly 30,000-40,000 images in total.
I sincerely want to ask you some questions. May I contact you? [email protected]
Thanks for the repo, I really learned a lot. While reading the distillation experiments I ran into some questions I would like to ask the author.
RuntimeError: Given groups=1, weight of size [1, 255, 1, 1], expected input[1, 512, 8, 8] to have 255 channels, but got 512 channels instead
I get this error when running with mob-yolov5s.yaml!
I noticed you did not provide the weights of yolov5s trained on VOC, so the experiments in the README cannot be reproduced. Could you upload them?
I would like to ask what VOC data format your training uses. Does the VOC data need to be converted to xywh format after downloading?
First of all, thank you very much for your work.
I just used pruning.py, and at the end it only saves a whole new model. How can I get the new yaml? Could you teach me?
Thank you for sharing the code. When I run the pruning operation, an image is missing. Could you provide it to me?